Complex Social and Behavioral Systems
Marilda Sotomayor, David Pérez-Castrillo, Filippo Castiglione (Editors)
Game Theory and Agent-Based Models
A Volume in the Encyclopedia of Complexity and Systems Science, Second Edition
Encyclopedia of Complexity and Systems Science Series
Editor-in-Chief
Robert A. Meyers
Series Preface

The Encyclopedia of Complexity and Systems Science Series of topical volumes provides an authoritative source for understanding and applying the concepts of complexity theory together with the tools and measures for analyzing complex systems in all fields of science and engineering. Many phenomena at all scales in science and engineering have the characteristics of complex systems and can be fully understood only through the transdisciplinary perspectives, theories, and tools of self-organization, synergetics, dynamical systems, turbulence, catastrophes, instabilities, nonlinearity, stochastic processes, chaos, neural networks, cellular automata, adaptive systems, genetic algorithms, and so on. Examples of near-term problems and major unknowns that can be approached through complexity and systems science include: the structure, history, and future of the universe; the biological basis of consciousness; the integration of genomics, proteomics, and bioinformatics as systems biology; human longevity limits; the limits of computing; sustainability of human societies and life on earth; predictability, dynamics, and extent of earthquakes, hurricanes, tsunamis, and other natural disasters; the dynamics of turbulent flows; lasers or fluids in physics; microprocessor design; macromolecular assembly in chemistry and biophysics; brain functions in cognitive neuroscience; climate change; ecosystem management; traffic management; and business cycles. All these seemingly diverse kinds of phenomena and structure formation have a number of important features and underlying structures in common. These deep structural similarities can be exploited to transfer analytical methods and understanding from one field to another. This unique work will extend the influence of complexity and systems science to a much wider audience than has been possible to date.
Filippo Castiglione
Istituto Applicazioni del Calcolo (IAC)
Consiglio Nazionale delle Ricerche (CNR)
Rome, Italy
This Springer imprint is published by the registered company Springer Science+Business Media,
LLC, part of Springer Nature.
The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.
structure of water; control of global infectious diseases; and also evolution and quantification of (ultimately) human cooperative behavior in politics, economics, business systems, and social interactions. In fact, most of these issues have identified nonlinearities and are beginning to be addressed with nonlinear techniques, e.g., human longevity limits, the Standard Model, climate change, earthquake prediction, workings of the earth's interior, natural disaster prediction, etc.

The individual complex systems mathematical and modeling tools and scientific and engineering applications that comprised the Encyclopedia of Complexity and Systems Science are being completely updated, and the majority will be published as individual books edited by experts in each field who are eminent university faculty members.
The topics are as follows:
Each entry in each of the Series books was selected and peer reviews organized by one of our university-based book Editors with advice and consultation provided by our eminent Board Members and the Editor-in-Chief. This level of coordination assures that the reader can have a level of confidence in the relevance and accuracy of the information far exceeding that generally found on the World Wide Web. Accessibility is also a priority, and for this reason each entry includes a glossary of important terms and a concise definition of the subject. In addition, we are pleased that the mathematical portions of our Encyclopedia have been selected by Math Reviews for indexing in MathSciNet. Also, ACM, the world's largest educational and scientific computing society, recognized our Computational Complexity: Theory, Techniques, and Applications book, which contains content taken exclusively from the Encyclopedia of Complexity and Systems Science, with an award as one of the notable Computer Science publications. Clearly, we have achieved prominence at a level beyond our expectations, but consistent with the high quality of the content!
Volume Preface

Game theory is the study of decision problems which involve several individuals (the decision-makers or players) interacting rationally. The models of game theory are abstract representations of a number of real-life situations and have applications to economics, political sciences, computer sciences, evolutionary biology, social psychology, and law, among others. These applications are also important for the development of the theory, since the questions that emerge may lead to new theoretic results.

This volume provides the main features of Game Theory, covering most of the fundamental theoretical aspects under the cooperative, non-cooperative, and "general" or "mixed" approaches.

The cooperative approach focuses on the possible outcomes of the decision-makers' interaction by abstracting from the actions or decisions that may lead to these outcomes. Specifically, cooperative game theory studies the interactions among coalitions of players. Its main question is: Given the sets of feasible payoffs for each coalition, what payoff will be awarded to each player? One can take a positive or normative approach to answering this question, and different solution concepts in the theory lead towards one or the other.

The non-cooperative approach focuses on the actions that the decision-makers can take. As argued by John von Neumann and Oskar Morgenstern in their famous 1944 book titled Theory of Games and Economic Behavior, most economic questions should be analyzed as games. Some games are dynamic, stressing the sequential nature of the various decisions that agents can make. Other situations are better modeled as static games.

The volume also considers contributions of game theory to mechanism design, which has helped the development of other key research areas such as auction theory, contract theory, and two-sided matching theory. Given the importance of these areas in game theory and in economics, several chapters are devoted to their study. The reader can also appreciate the many applications of game theory to practical problems in several contributions to this volume.

Finally, a section is dedicated to the modeling and simulation paradigm known as agent-based modeling (ABM), which is markedly useful in studying complex systems made up of a large number of interdependent objects. This paradigm is relatively immature, even though it is commonly applied in a broad spectrum of disciplines (game theory included); thus, a clear-cut and widely accepted definition of the high-level concepts of agents, environment, interactions, and so on is still lacking. This section addresses the epistemological
About the Editor-in-Chief

Robert A. Meyers
President: RAMTECH Limited
Manager, Chemical Process Technology, TRW Inc.
Postdoctoral Fellow: California Institute of Technology

Education
Ph.D. Chemistry, University of California at Los Angeles
B.A. Chemistry, California State University, San Diego

Biography
Dr. Meyers has worked with more than 20 Nobel laureates during his career and is the originator and serves as Editor-in-Chief of both the Springer Nature Encyclopedia of Sustainability Science and Technology and the related and supportive Springer Nature Encyclopedia of Complexity and Systems Science.
Dr. Meyers holds more than 20 patents and is the author or Editor-in-Chief of 12 technical books, including the Handbook of Chemical Production Processes, Handbook of Synfuels Technology, and Handbook of Petroleum Refining Processes, now in its 4th edition, and the Handbook of Petrochemical Production Processes, now in its second edition (McGraw-Hill); the Handbook of Energy Technology and Economics, published by John Wiley & Sons; Coal Structure, published by Academic Press; and Coal Desulfurization as well as the Coal Handbook, published by Marcel Dekker. He served as Chairman of the Advisory Board for A Guide to Nuclear Power Technology, published by John Wiley & Sons, which won the Association of American Publishers Award as the best book in technology and engineering.
Game Theory, Introduction to

(2003), is presented in Chap. 4. Some discussion on correlated equilibrium and Bayesian games is also provided in this chapter.

The correlated equilibrium is a game-theoretic solution concept proposed by Aumann (1974, 1987) in order to capture the strategic correlation opportunities that the players face when they take into account the extraneous environment in which they interact. Chapter 5 focuses on two possible extensions of the correlated equilibrium to Bayesian games: the strategic form correlated equilibrium and the communication equilibrium. The general framework of games with incomplete information is treated in Chap. 6, with special reference to "Bayesian games."

Repeated games deal with situations in which a group of agents engage in a strategic interaction over and over. Chapter 7 is devoted to repeated games with complete information. In such games the data of the strategic interaction is fixed over time and is known by all the players. Chapter 8 discusses repeated games with incomplete information, a situation where several players repeat the same stage game, the players having different knowledge of the stage game which is repeated.

Repeated games have many equilibria, including the repetition of stage game Nash equilibria. At the same time, particularly when monitoring is imperfect, certain plausible outcomes are not consistent with equilibrium. Reputation effects is the term used for the impact upon the set of equilibria (typically of a repeated game) of perturbing the game by introducing incomplete information of a particular kind. This issue is treated in Chap. 9.

Games with two players are of particular significance. The first two-person game studied in the literature was the zero-sum two-person game, first analyzed by von Neumann and Morgenstern (1944). In such a game, one player's gain is the other player's loss. Chess, checkers, rummy, two-finger morra, and tic-tac-toe are all examples of zero-sum two-person games. The theory for such games is surveyed in Chap. 10. Recent results on stochastic zero-sum games are presented in Chap. 11. Stochastic games, discussed in Chap. 11, are used to model dynamic interactions in which the environment changes in response to the behavior of the players.

Signaling games and inspection games are also two-player games. Signaling games are the subject of Chap. 12. They are games of incomplete information in which one player is informed and the other is not. Players can use the actions of their opponents to make inferences about the hidden information. The earliest work on this subject is Spence's seminal 1972 work, in which education serves as a signal of ability. Inspection games are covered in Chap. 13. These games deal with the problem faced by an inspector who is required to control the compliance of an inspectee with some legal or otherwise formal undertaking. They started with the analysis of arms control and disarmament problems in the early 1960s and have been applied to auditing, environmental control, material accountancy, etc.

Inspections cause conflict in many real-world situations. In economics, there are services of many kinds, the fulfillment or payment of which has to be verified. One example is the problem of principal-agent relationships, discussed in detail in Chap. 14. The principal-agent models provide the theory of contracts under asymmetric information, concerning relationships between owner and manager, insurer and insured, etc. The principal, e.g., an employer, delegates work or responsibility to the agent, the employee, and chooses a payment schedule that best exploits the agent's self-interest. The agent, of course, behaves so as to maximize her own utility given the fee schedule proposed by the principal. The problem faced by the principal is to devise incentives to motivate the agent to act in the principal's interest. This generates some type of transaction cost for the principal, which includes the task of investigating and selecting appropriate agents, gaining information to set performance standards, monitoring agents, bonding payments by the agents, and residual losses.

Chapter 15 is devoted to differential games, with a focus on two-player zero-sum and antagonist differential games. These are games in which the state of the players depends on time in a continuous way. The positions of the players are solutions to differential equations. Motivated by military applications in the "Cold War", these games have a wide range of applications from economics to engineering sciences and, more recently, to biology and behavioral ecology.
Mechanism design is the subject of Chap. 16. It studies the construction of mechanisms that aim to reach a socially desirable outcome in the presence of rational but selfish players, who care only about their own private utility. More specifically, the question is how to design a mechanism such that the equilibrium behavior of the players in the game induced by the mechanism leads to the socially desired goal.

The theory of mechanism design has contributed to the development of other research areas such as, for example, auction theory, contract theory, and two-sided matching theory. "For having laid the foundations of mechanism design theory" the 2007 Nobel Prize in Economics was awarded to Leonid Hurwicz, Eric Maskin, and Roger Myerson. Chapter 17 is devoted to the presentation of auctions and to introducing major contributions. It studies various auction formats, including English (ascending-price) and Dutch (descending-price) auctions, first-price and second-price sealed-bid auctions, as well as all-pay auctions.

A related theory is the theory of implementation, the subject of Chap. 18. It reverses the usual procedure, namely, fix a mechanism and see what the outcomes are. More precisely, it investigates the correspondence between normative goals and mechanisms designed to achieve those goals.

A class of "mixed" games is that of two-sided matching games, which have been analyzed since Gale and Shapley (1962) under both cooperative and noncooperative game-theoretic approaches. Two-sided matching theory is surveyed in Chaps. 19 and 20. Chapter 19 focuses on the differences and similarities between some matching models. In their paper, Gale and Shapley formulated and solved the stable matching problem for the marriage and the college admissions markets. The solution of the college admissions problem was given by a simple deferred-acceptance algorithm which has been adapted and applied in the reorganization of admission processes of many two-sided matching markets. Chapter 20 studies the one-sided matching model and discusses applications, such as the medical residency matching, kidney exchange, and school choice.

Another class of problems that has been discussed from the perspective of cooperative and noncooperative game theory is that of cost sharing problems, treated in Chap. 21. Applications are numerous, ranging from environmental issues like pollution and fishing grounds to sharing multipurpose reservoirs, road systems, communication networks, and the Internet. The worth of a "coalition" of such activities is defined as the hypothetical cost of carrying out the activities in that coalition only.

Market games and clubs are treated in Chap. 22, with a focus on the equivalence between markets, defined as private goods economies where all participants in the economy have utility functions that are linear in the variable money, and games in characteristic function form.

Learning in games is surveyed in Chap. 23. It covers models in which players are "rational" but not necessarily in equilibrium: players forecast, possibly inaccurately, the future behavior of their opponents and optimize, or ε-optimize, with respect to their forecasts.

Fair division is reviewed in Chap. 24. It provides a rigorous analysis of procedures for allocating goods, or deciding who wins on what issues, in a dispute.

Voting methods as ways to take collective decisions have been studied since ancient times. The contributions by Arrow (1951, 1963) and Black (1948, 1958) broadened the view by considering the design of collective-decision methods in general, from an axiomatic point of view. These procedures make it possible to aggregate preferences taking into account ethical and pragmatic principles, as well as the participants' incentives. They are studied in Chap. 25.

The following two chapters deal with applications to political sciences. The first one, Chap. 26, presents a game-theoretic analysis of voting systems as procedures to choose a winner among a set of candidates from the individual preferences of the voters or, more ambitiously, to rank all the candidates or a part of them. Such a situation occurs not only in the field of elections but also in many other fields such as games, sports, artificial intelligence, spam detection, web search engines, and statistics. From a practical point of view, it is crucial to be able to announce who is the winner in a "reasonable" time. This raises the question of the complexity of the voting procedures. The second chapter, Chap. 27, details the complexity results about several voting procedures.

Chapter 28 deals with applications to biology. This field, known as evolutionary game theory, started in 1972 with the publication of a series of papers by the mathematical biologist John Maynard Smith. Maynard Smith adapted the methods of traditional game theory, which were created to model the behavior of rational economic agents, to the context of biological natural selection.

Network models have a long history in sociology, natural sciences, and engineering. However, only recently have economists begun to think of political and economic interactions as network phenomena and to model them as games of network formation. Chapter 29 is devoted to stable networks and the game-theoretic underpinnings of stable networks.

Chapter 30 deals with an aspect of bounded rationality that has generated important work, namely, the presence of constraints on the capacities of players. Various constraints could be considered, for example, limits on the ability to plan ahead in intertemporal decision-making or on the ability to compute best responses. This chapter discusses cognitive costs to players of using strategies that depend on long histories of past play. This is done mainly in the context of bargaining and markets. It is shown that such complexity considerations often enable us to make sharp predictions. The issue is also considered in the context of repeated games.

List of the Chapters

1. Cooperative Games (von Neumann-Morgenstern Stable Sets)
   Authors: Ryo Kawasaki, Jun Wako and Shigeo Muto
2. Cooperative Games
   Author: Roberto Serrano
3. Dynamic Games with an Application to Climate Change Models
   Author: Prajit K. Dutta
4. Static Games
   Author: Oscar Volij
5. Correlated Equilibria and Communication in Games
   Author: Françoise Forges
6. Bayesian Games: Games with Incomplete Information
   Author: Shmuel Zamir
7. Repeated Games with Complete Information
   Authors: Olivier Gossner and Tristan Tomala
8. Repeated Games with Incomplete Information
   Author: Jérôme Renault
9. Reputation Effects
   Author: George Mailath
10. Zero-Sum Two Person Games
    Author: T.E.S. Raghavan
11. Stochastic Games
    Authors: Yehuda John Levy and Eilon Solan
12. Signaling Games
    Author: Joel Sobel
13. Inspection Games
    Authors: Rudolf Avenhaus and Morton J. Canty
14. Principal-Agent Models
    Authors: David Pérez-Castrillo and Ines Macho-Stadler
15. Differential Games
    Author: Marc Quincampoix
16. Mechanism Design
    Author: Ron Lavi
17. Auctions
    Author: Martin Pesendorfer
18. Implementation Theory
    Author: Luis Corchon
19. Two-Sided Matching Models
    Authors: Marilda Sotomayor and Ömer Özak
20. Market Design
    Authors: Fuhito Kojima, Fanqi Shi and Akhil Vohra
21. Cost Sharing
    Author: Maurice Koster
22. Market Games and Clubs
    Author: Myrna Wooders
23. Learning in Games
    Author: John Nachbar
Cooperative Games (Von Neumann-Morgenstern Stable Sets)

Internal stability A set of imputations (outcomes, strategy combinations) satisfies internal stability if there is no domination between any two imputations in the set.
Strategic form game A strategic form game consists of a player set, each player's strategy set, and each player's payoff function. It is usually used to represent noncooperative games.
Von Neumann-Morgenstern stable set A set of imputations (outcomes, strategy combinations) is a von Neumann-Morgenstern stable set if it satisfies both internal and external stability.

Definition of the Subject

The von Neumann-Morgenstern stable set (hereafter stable set) is the first solution concept in cooperative game theory, defined by J. von Neumann and O. Morgenstern. Though it was defined for cooperative games in characteristic function form, von Neumann and Morgenstern gave a more general definition of a stable set in abstract games. Later, J. Greenberg and M. Chwe cleared a way to apply the stable set concept to the analysis of noncooperative games in strategic and extensive forms. Stable sets in a characteristic function form game may not exist, as was shown by W. F. Lucas for a ten-person game that does not admit a stable set. On the other hand, stable sets exist in many important games. In voting games, for example, stable sets exist, and they indicate in detail what coalitions can be formed. The core, on the other hand, can be empty in voting games, though it is one of the best-known solution concepts in cooperative game theory. The analysis of stable sets is not necessarily straightforward, since it can reveal a variety of possibilities. However, stable sets give us deep insights into players' behavior, such as coalition formation, in economic, political, and social situations.

Introduction

For studies of economic or social situations where players can engage in cooperative behavior, the stable set was defined by von Neumann and Morgenstern (1953) as a solution concept for characteristic function form cooperative games. They also defined the stable set in abstract games so that one can apply the concept to more general games, including noncooperative situations. Greenberg (1990) and Chwe (1994) cleared a way to apply the stable set concept to the analysis of noncooperative games in strategic and extensive forms.

The stable set is a set of outcomes satisfying two stability conditions: internal and external stability. Internal stability states that between any two outcomes in the set, there is no group of players such that all of its members prefer one to the other and they can realize the preferred outcome. External stability states that for any outcome outside the set, there is a group of players such that all of its members have a commonly preferred outcome in the set and they can realize it. Though the existence of stable sets does not hold in general, as was shown by Lucas (1968) and Lucas and Rabie (1982), the stable set has revealed many interesting behaviors of players in economic, political, and social systems.

Von Neumann and Morgenstern (and also Greenberg) took into account only a single move by a group of players. Harsanyi (1974) first pointed out that stable sets in characteristic function form games may not take into account the farsighted behavior of players, that is, their ability to foresee subsequent moves made by other groups of players. Harsanyi's work inspired Chwe (1994) to incorporate the notion of foresight in social environments into the von Neumann-Morgenstern stability.

Chwe focused on a possible chain of moves, where a move by a group of players will cause a sequence of moves from other groups of players. Then the group of players moving first should take into account the sequence of moves that may follow and evaluate their profits from the final outcome of the sequence rather than from the outcomes in the intermediate steps of the sequence. By incorporating such a sequence of moves, Chwe (1994) defined a more forward-looking concept, which we call a farsighted stable set in what follows. The farsighted stable set provides richer results than the myopic stable set in many classes of games.

The rest of the chapter is organized as follows. The first several sections cover the definitions of stable sets in abstract games and other more specific models in cooperative game theory. The examples include voting games, production games, assignment games, marriage games, and house barter games. The second set of sections then covers farsighted stable sets, which are defined in response to the criticism that the original stable set is myopic. This second part also starts with the definition of the concepts in an abstract setting and then proceeds to examples, which include strategic form games, characteristic function form games, coalition formation games, matching games, and house barter games. The chapter then concludes with some remarks on possible directions for future research.

Stable Sets in Abstract Games

An abstract game is a pair (W, ≻) of a set of outcomes W and an irreflexive binary relation ≻ on W, where irreflexivity means that x ≻ x does not hold for any x ∈ W. The relation ≻ is interpreted as follows: if x ≻ y holds, then there must exist a set of players such that they can induce x from y by themselves and all of them are better off in x.

A subset K of W is called a stable set of the abstract game (W, ≻) if the following two conditions are satisfied:

1. Internal stability: For any two elements x, y ∈ K, x ≻ y never holds.
2. External stability: For any element z ∉ K, there must exist x ∈ K such that x ≻ z.

We explain in more detail what the external and internal stability conditions imply in the definition of a stable set. Suppose that players have a common understanding that each outcome inside a stable set is "stable" and that each outcome outside the set is "unstable." Here "stability" means that no group of players has an incentive to deviate from it, and "instability" means that there is at least one group of players that has an incentive to deviate from it. Then the internal and external stability conditions guarantee that the common understanding is never disproved and thus continues to prevail. In fact, suppose that the set is both internally and externally stable, and take any outcome in the set. Then by internal stability, no group of players can be better off by deviating from it and inducing another outcome inside the set. Thus, no group of players reaches an agreement to deviate, which makes each outcome inside the set remain stable. Deviating players may be better off by inducing an outcome outside the set, but outcomes outside the set are commonly considered unstable. Thus, deviating players can never expect that such an outcome will continue. Next take any outcome outside the set. Then by external stability, there exists at least one group of players who can become better off by deviating from it and inducing an outcome inside the set. The induced outcome is considered stable since it is in the set. Hence, the group of players will deviate to a stable outcome, thereby reinforcing that the outside outcomes are unstable.

Stable Set and Core

Another solution concept that is widely known is the core. For a given abstract game G = (W, ≻), a subset C of W is called the core of G if

C = {x ∈ W | there is no y ∈ W with y ≻ x}.

From the definition, the core satisfies internal stability. Thus, the core C of G is contained in any stable set of G if the latter exists. To see this, suppose that C ⊄ K for a stable set K and C is nonempty, i.e., C ≠ ∅. (If C = ∅, then clearly C ⊆ K.) Take any element x ∈ C \ K. Since x ∉ K, by external stability there exists y ∈ K with y ≻ x, which contradicts x ∈ C. When the core of a game satisfies external stability, it has very strong stability, since the core itself is now a stable set. In this case, it is called the stable core. The core is then the unique stable set of the game.
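For a finite outcome set, both stability conditions and the core can be checked directly from these definitions. The following Python sketch is illustrative only; the three-outcome game at the bottom and its dominance relation are hypothetical, not taken from the chapter.

```python
from itertools import product

def is_internally_stable(K, dom):
    """No outcome in K dominates another outcome in K."""
    return not any(dom(x, y) for x, y in product(K, K) if x != y)

def is_externally_stable(K, W, dom):
    """Every outcome outside K is dominated by some outcome in K."""
    return all(any(dom(x, z) for x in K) for z in W if z not in K)

def is_stable_set(K, W, dom):
    return is_internally_stable(K, dom) and is_externally_stable(K, W, dom)

def core(W, dom):
    """Outcomes of W that are dominated by no outcome of W."""
    return {x for x in W if not any(dom(y, x) for y in W)}

# Hypothetical abstract game: outcomes a, b, c with a dominating b and b dominating c.
W = {"a", "b", "c"}
edges = {("a", "b"), ("b", "c")}
dom = lambda x, y: (x, y) in edges          # dom(x, y) means x ≻ y

print(core(W, dom))                         # {'a'}
print(is_stable_set({"a", "c"}, W, dom))    # True; note that the core {'a'} is contained in it
```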
The first condition says that all players cooperate and share the worth v(N) that they can produce. The second condition says that each player must receive at least the amount that he/she can gain by himself/herself. Let A be the set of all imputations. Let x, y be any two imputations and S be any coalition. We say that x dominates y via S, and write this as x dom_S y, if the following two conditions are satisfied:

1. Coalitional rationality: x_i > y_i for each i ∈ S.
2. Effectivity: Σ_{i∈S} x_i ≤ v(S).

The first condition says that every member of coalition S strictly prefers x to y. The second condition says that coalition S can guarantee the payoff x_i for each member i ∈ S by themselves.

Though this game has no stable set, it has a nonempty core. A game with no stable set and an empty core was also found by Lucas and Rabie (1982). We remark on a class of games in which a stable core exists. As mentioned before, if a stable set exists, it always contains the core, which is of course true also in characteristic function form games. Furthermore, in characteristic function form games, there is an interesting class of games, called convex games, in which the core is a stable core. A characteristic function form game (N, v) is a convex game if for any S, T ⊆ N with S ⊆ T and for any i ∉ T,

v(S ∪ {i}) − v(S) ≤ v(T ∪ {i}) − v(T),

i.e., the bigger the coalition a player joins, the larger the player's contribution becomes. In convex games, the core is large and satisfies external stability. For the details, refer to Shapley (1971).
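As a companion to these definitions, the sketch below encodes a characteristic function as a Python dictionary keyed by coalitions and tests both the domination relation dom_S and the convexity inequality; the three-player game at the bottom is a hypothetical example, not one from the chapter.

```python
from itertools import combinations

def dominates(x, y, S, v):
    """x dom_S y: coalitional rationality and effectivity."""
    rational = all(x[i] > y[i] for i in S)
    effective = sum(x[i] for i in S) <= v[frozenset(S)]
    return rational and effective

def is_convex(v, players):
    """Check v(S∪{i}) − v(S) ≤ v(T∪{i}) − v(T) for all S ⊆ T and i ∉ T."""
    subsets = [frozenset(c) for r in range(len(players) + 1)
               for c in combinations(players, r)]
    for S in subsets:
        for T in subsets:
            if S <= T:
                for i in set(players) - T:
                    if v[S | {i}] - v[S] > v[T | {i}] - v[T]:
                        return False
    return True

# Hypothetical three-player game (players indexed 0, 1, 2).
players = (0, 1, 2)
v = {frozenset(c): w for c, w in [((), 0), ((0,), 0), ((1,), 0), ((2,), 0),
                                  ((0, 1), 2), ((0, 2), 2), ((1, 2), 2),
                                  ((0, 1, 2), 3)]}
x, y = (1.0, 1.0, 1.0), (2.5, 0.5, 0.0)
print(dominates(x, y, {1, 2}, v))  # True: players 1 and 2 both gain, and x1 + x2 = 2 ≤ v({1,2})
print(is_convex(v, players))       # False: v({0,1}) − v({1}) exceeds v(N) − v({1,2})
```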
Applications of Stable Sets in Abstract and Characteristic Function Form Games

Symmetric Voting Games
This section deals with applications of stable sets to voting situations. Let us start with a simple example.

Example 1 Suppose there is a committee consisting of three players deciding on a bill. Each player has one vote, and whether to pass the bill or not is decided by the simple majority rule. That is, to pass a bill, at least two votes are necessary. Before analyzing the players' behavior, we first formulate the situation as a characteristic function form game. Let the player set be N = {1, 2, 3}. Since a coalition of a simple majority of players can pass any bill, we give value 1 to such coalitions. Other coalitions can pass no bill. We thus give them value 0. Hence, the characteristic function is given by

v(S) = 1 if |S| ≥ 2,  and  v(S) = 0 if |S| ≤ 1,

where |S| denotes the number of players in coalition S. The set of imputations is given by

A = {x = (x_1, x_2, x_3) | x_1 + x_2 + x_3 = 1, x_1, x_2, x_3 ≥ 0}.

One stable set of this game is given by the set K consisting of three imputations: (1/2, 1/2, 0), (1/2, 0, 1/2), and (0, 1/2, 1/2). A brief proof is the following. Since each of the three imputations has only two numbers, 1/2 and 0, internal stability is trivial. To show external stability, take any imputation x = (x_1, x_2, x_3) from outside K. Suppose first x_1 < 1/2. Since x ∉ K, at least one of x_2 and x_3 is less than 1/2. If x_2 < 1/2, then (1/2, 1/2, 0) dominates x via coalition {1, 2}. A similar argument can be used for when x_3 < 1/2 by using (1/2, 0, 1/2) to dominate x via {1, 3}. Next suppose x_1 = 1/2. Since x ∉ K, 0 < x_2, x_3 < 1/2. Thus, (0, 1/2, 1/2) dominates x via coalition {2, 3}. Finally suppose x_1 > 1/2. Then x_2, x_3 < 1/2, and thus, (0, 1/2, 1/2) dominates x via coalition {2, 3}. Thus, the proof of external stability is complete. This three-point stable set indicates that a two-person coalition is formed and that the players in the coalition share equally the outcome obtained by passing a bill.

This game has three other types of stable sets. First, any set K_c^1 = {x ∈ A | x_1 = c} with 0 ≤ c < 1/2 is a stable set. The internal stability of each K_c^1 is trivial. To show external stability, take any imputation x = (x_1, x_2, x_3) ∉ K_c^1. Suppose x_1 > c. Define y = (y_1, y_2, y_3) by y_1 = c, y_2 = x_2 + (x_1 − c)/2, y_3 = x_3 + (x_1 − c)/2. Then y ∈ K_c^1 and y dom_{2,3} x. Next suppose x_1 < c. Notice that at least one of x_2 and x_3 is less than 1 − c since c < 1/2. Suppose without loss of generality x_2 < 1 − c. Since c < 1/2, we have (c, 1 − c, 0) ∈ K_c^1 and (c, 1 − c, 0) dom_{1,2} x. Thus, external stability holds. This stable set indicates that player 1 gets a fixed amount c and players 2 and 3 negotiate over how to allocate the rest 1 − c. Similarly, any sets K_c^2 = {x ∈ A | x_2 = c} and K_c^3 = {x ∈ A | x_3 = c} with 0 ≤ c < 1/2 are stable sets. The three-person game of Example 1 has no other stable set. See von Neumann and Morgenstern (1953). The first type of stable set (K) is called a symmetric (or objective) stable set, while the other types K_c^1, K_c^2, K_c^3 are called discriminatory stable sets. As a generalization of the above result, symmetric stable sets are found in general n-person simple majority voting games.

An n-person characteristic function form game (N, v) with N = {1, 2, ..., n} is called a simple majority voting game if

v(S) = 1 if |S| > n/2,  and  v(S) = 0 if |S| ≤ n/2.

A coalition S with v(S) = 1, i.e., with |S| > n/2, is called a winning coalition. A winning coalition S is said to be minimal if v(T) = 0 for every strict subset T of S. In simple majority voting games, a minimal winning coalition is a coalition of (n + 1)/2 players if n is odd or (n + 2)/2 players if n is even. The following theorem holds. See Bott (1953) for the proof.

Theorem 1 Let (N, v) be a simple majority voting game. Then the following hold:

and ⟨Y⟩ = ∪_{x∈Y} ⟨x⟩ for a set Y.

It should be noted from the first proposition of Theorem 1 that when the number of players is odd, a minimal winning coalition is formed. The members of the coalition share equally the total profit. On the other hand, when the number of players is even, the second proposition of Theorem 1 shows that every player may gain a positive profit. This implies that the grand coalition of all

Thus, the core is not a useful tool for analyzing voting situations with no veto player. In simple majority voting games, no player has veto power, and thus, the core is empty. The following theorem shows that stable sets always exist.

Theorem 3 Let (N, v) be a voting game. Let S be a minimal winning coalition and define a set K by

K = {x ∈ A | Σ_{i∈S} x_i = 1, x_i = 0 for all i ∉ S}.

Then K is a stable set of (N, v).
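The stability conditions in Example 1 can also be checked numerically by discretizing the imputation set A and applying the domination test directly, as in the Python sketch below; the grid resolution is an arbitrary choice, and the check is only over grid points rather than all of A.

```python
from fractions import Fraction
from itertools import combinations

v = lambda S: 1 if len(S) >= 2 else 0          # simple majority game of Example 1

def imputation_grid(step=Fraction(1, 10)):
    """Grid points with x1 + x2 + x3 = 1 and xi >= 0, at the given resolution."""
    ticks = [k * step for k in range(int(1 / step) + 1)]
    return [(a, b, 1 - a - b) for a in ticks for b in ticks if a + b <= 1]

def dominates(x, y):
    """x dom_S y for some coalition S: coalitional rationality plus effectivity."""
    for r in (1, 2, 3):
        for S in combinations(range(3), r):
            if all(x[i] > y[i] for i in S) and sum(x[i] for i in S) <= v(S):
                return True
    return False

half, zero = Fraction(1, 2), Fraction(0)
K = [(half, half, zero), (half, zero, half), (zero, half, half)]
A = imputation_grid()

internal = not any(dominates(x, y) for x in K for y in K if x != y)
external = all(any(dominates(x, z) for x in K) for z in A if z not in K)
print(internal, external)   # True True on this grid
```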
detail in Hart (1973) and Muto (1982a). To facilitate the discussion, let us start with a simple example.

Example 2 There are four players, each having one unit of a raw material. Two units of the raw material are necessary for producing one unit of an indivisible commodity. One unit of the commodity is sold at p dollars. The situation is formulated as the following characteristic function form game. The player set is N = {1, 2, 3, 4}. Since two units of the raw material are necessary to produce one unit of the commodity, the characteristic function v is given by

v(S) = 2p if |S| = 4,  v(S) = p if |S| = 3 or 2,  v(S) = 0 if |S| = 1 or 0.

The set of imputations is

A = {x = (x_1, x_2, x_3, x_4) | x_1 + x_2 + x_3 + x_4 = 2p; x_1, x_2, x_3, x_4 ≥ 0}.

The following set K is one of the stable sets of the game:

K = ⟨{x = (x_1, x_2, x_3, x_4) ∈ A | x_1 = x_2 = x_3 ≥ x_4}⟩.

To show internal stability, take two imputations x = (x_1, x_2, x_3, x_4) with x_1 = x_2 = x_3 ≥ x_4 and y = (y_1, y_2, y_3, y_4) in K. Suppose x dominates y. Since x_1 = x_2 = x_3 ≥ p/2 ≥ x_4, the domination must hold via a coalition {i, 4} with i = 1, 2, 3. Then we have a contradiction 2p = Σ_{i=1}^4 x_i > Σ_{i=1}^4 y_i = 2p, since y ∈ K implies that the largest three elements of y are equal. To show external stability, take z = (z_1, z_2, z_3, z_4) ∉ K. Suppose z_1 ≥ z_2 ≥ z_3 ≥ z_4. Then z_1 > z_3. Define y = (y_1, y_2, y_3, y_4) by

y_i = z_3 + (z_1 + z_2 − 2z_3)/4 for i = 1, 2, 3,  and  y_4 = z_4 + (z_1 + z_2 − 2z_3)/4.

Then y ∈ K and y dom_{3,4} z, since y_3 > z_3, y_4 > z_4, and y_3 + y_4 ≤ p = v({3, 4}). This stable set shows that in the negotiation over splitting the 2p dollars, three players form a coalition and share equally the gain obtained through collaboration. At least two players are necessary to produce the commodity. Thus, a three-player coalition is the smallest coalition that can prevent its complement from producing the commodity, i.e., a minimal blocking coalition. We would claim that in the market, a minimal blocking coalition is formed and that profits are shared equally within the coalition.

An extension of the model was given by Hart (1973) and Muto (1982a). Hart considered the following production market with n players, each holding one unit of a raw material. To produce one unit of an indivisible commodity, k units of raw material are necessary. The associated production market game is defined by the player set N = {1, 2, ..., n} and the characteristic function v given by

v(S) = 0 if 0 ≤ |S| < k,  v(S) = p if k ≤ |S| < 2k,  ...,  v(S) = jp if jk ≤ |S| < (j + 1)k,  ...,  v(S) = hp if hk ≤ |S| ≤ n,

where n = hk + r and h, r are integers such that h ≥ 1 and 0 ≤ r ≤ k − 1. When h = 1,

v(S) = 0 if |S| < k,  and  v(S) = p if |S| ≥ k.

The following theorem holds.

Theorem 4 Suppose h = 1. Let t = n − k + 1 and n = tu + w, where u, w are integers such that u ≥ 1 and 0 ≤ w ≤ t − 1. Then the following set K is a stable set:

K = ⟨{x = (x_1, ..., x_n) ∈ A | x_1 = ⋯ = x_t ≥ x_{t+1} = ⋯ = x_{2t} ≥ ⋯ ≥ x_{tu+1} = ⋯ = x_n = 0}⟩,

where

A = {x = (x_1, ..., x_n) | Σ_{i=1}^n x_i = p, x_1, ..., x_n ≥ 0}

is the set of imputations.
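Since Hart's characteristic function depends only on coalition size, it can be written compactly as v(S) = ⌊|S|/k⌋·p. A minimal Python sketch, with Example 2 recovered by setting n = 4, k = 2, and an arbitrary price p:

```python
from itertools import combinations

def production_game(n, k, p):
    """Characteristic function of the production market game:
    a coalition of s players can produce floor(s / k) units, each sold at p."""
    return {frozenset(S): (len(S) // k) * p
            for r in range(n + 1) for S in combinations(range(1, n + 1), r)}

# Example 2: n = 4 players, k = 2 units of raw material per commodity, price p = 10.
v = production_game(4, 2, 10)
print(v[frozenset({1, 2, 3, 4})])  # 20 = 2p
print(v[frozenset({1, 2, 3})])     # 10 = p
print(v[frozenset({1})])           # 0
```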
Theorem 6 Suppose there are m players, 1, 2, ..., m, each holding one unit of raw material P, and n players, m + 1, m + 2, ..., m + n, each holding one unit of raw material Q. To produce one unit of an indivisible commodity, one unit of each of the raw materials P and Q is necessary. One unit of the commodity is sold at p dollars. In this market, the following set K is a stable set:

K = {x = (x_1, x_2, ..., x_{m+n}) ∈ A | x_1 = ⋯ = x_m, x_{m+1} = ⋯ = x_{m+n}},

where

A = {x = (x_1, ..., x_m, x_{m+1}, ..., x_{m+n}) | Σ_{i=1}^{m+n} x_i = p·min(m, n), x_1, ..., x_{m+n} ≥ 0}

is the set of imputations of this game.

This theorem shows that players holding the same raw material form a coalition and share equally the profit gained through collaboration. For further results on stable sets in production market games, refer to Hart (1973), Muto (1982a), and Owen (1995). Refer also to Lucas (1990), Owen (1968), and Shapley (1953) for further general studies on stable sets.

Assignment Games
The following three sections deal with applications of stable sets to two-sided markets, including matching situations and barter markets with indivisible commodities. First, we consider the assignment market game introduced by Shapley and Shubik (1972). An assignment market consists of a set of n_b (≥ 1) buyers B = {1, ..., n_b} and a set of n_s (≥ 1) sellers F = {1′, ..., n_s′}. Each seller k′ ∈ F has one indivisible unit of a commodity to sell, which we call object k′. Thus, we have n_s objects in the market, and these objects may be differentiated. Each seller k′ places a monetary value c_{k′} (≥ 0) on object k′. Hereafter, we also denote by F the set of the n_s objects. Each buyer i ∈ B wants to buy at most one object in F and places a monetary value h_{ik′} (≥ 0) on each object k′ ∈ F. If object k′ is sold to buyer i, we have a surplus u_{ik′} ≔ max(0, h_{ik′} − c_{k′}). Let c = (c_{k′})_{k′∈F}, H = (h_{ik′})_{(i,k′)∈B×F}, and U = (u_{ik′})_{(i,k′)∈B×F}. An assignment market M is defined by the five elements (B, F, H, c, U), where we will suppress H, c, or U when no confusion may arise. We remark that an assignment market with |B| ≠ |F| can be transformed into a market with |B| = |F| by adding dummy buyers or sellers and zero rows or columns correspondingly to the original valuation matrix U.

An assignment game G is a characteristic function form game associated with a given assignment market M = (B, F, U). We define the player set of G to be B ∪ F. To define the characteristic function v of G, we first consider the following assignment problem P(S) for each coalition S ⊆ B ∪ F with S ∩ B ≠ ∅ and S ∩ F ≠ ∅:

P(S):  m(S) = max_x Σ_{(i,k′)∈(S∩B)×(S∩F)} u_{ik′} x_{ik′}
       s.t.  Σ_{k′∈S∩F} x_{ik′} ≤ 1 for all i ∈ S ∩ B,
             Σ_{i∈S∩B} x_{ik′} ≤ 1 for all k′ ∈ S ∩ F,
             x_{ik′} ≥ 0 for all (i, k′) ∈ (S ∩ B) × (S ∩ F).

Each assignment problem P(S) has at least one optimal integer solution (see Simonnard (1966)), which gives an optimal matching between sellers and buyers in S that yields the highest possible surplus in S. The characteristic function v is defined by v(S) = m(S) for each S ⊆ B ∪ F with S ∩ B ≠ ∅ and S ∩ F ≠ ∅, and v(S) = 0 for each S with S ⊆ B or S ⊆ F, where the latter part means that the worth of a coalition consisting of only sellers or only buyers is zero, since those players cannot obtain any surplus by trading among themselves. We define v(∅) = 0. The set of imputations is defined as the set

A = {(y, z) ∈ R_+^B × R_+^F | Σ_{i∈B} y_i + Σ_{k′∈F} z_{k′} = v(B ∪ F)}.

Shapley and Shubik (1972) proved that for any assignment game G, the core C is nonempty and given by the set of optimal solutions to the dual problem of the assignment problem P(B ∪ F), i.e.,

C = {(y, z) ∈ A | y_i + z_{k′} ≥ u_{ik′} = v({i, k′}) for each (i, k′) ∈ B × F}.

They also showed that for each (y, z) ∈ C, the vector (z_{k′} + c_{k′})_{k′∈F} gives prices of the objects which equilibrate demand and supply of each object. A prototype of the assignment game was studied in detail by von Neumann and Morgenstern (1953). They considered stable sets of a market with two buyers and one seller having one object for sale.

Example 4 There are three players, seller 1′ and buyers 1 and 2. Seller 1′ has an object Q for sale. Seller 1′ has no monetary value for Q. Buyers 1 and 2 value Q at h_1 and h_2 dollars, respectively. We assume h_1 ≥ h_2 ≥ 0 and h_1 > 0. The object Q can be owned by only one buyer, since it is indivisible. However, payments can be freely made among players.

Let G_1 be the assignment game of the above market. The player sets of G_1 are B = {1, 2} and F = {1′}, and the characteristic function v is such that v({1, 2, 1′}) = h_1, v({1, 1′}) = h_1, v({2, 1′}) = h_2, v({1, 2}) = 0, and v(1) = v(2) = v(1′) = v(∅) = 0. The set of imputations is defined by A = {(y_1, y_2, z) ∈ R_+^B × R_+^F | y_1 + y_2 + z = h_1}. We note that a payoff z of seller 1′ also denotes a price of Q. To describe all stable sets in G_1, let ℱ be the set of continuous nondecreasing functions f of w ∈ [0, h_2] with w ≥ f(w) ≥ 0 and f(0) = 0. Von Neumann and Morgenstern (1953) showed that K is a stable set of G_1 if and only if K is a union of two subsets of imputations,

K_1 = {(y_1, y_2, z) ∈ A | y_1 = h_1 − z, y_2 = 0, h_1 ≥ z ≥ h_2},
K_2 = {(y_1, y_2, z) ∈ A | y_1 = h_1 − z − f(h_2 − z), y_2 = f(h_2 − z), h_2 ≥ z ≥ 0},

supported by some f ∈ ℱ. The set K_1 is the core of G_1. However, no imputation in K_1 dominates any imputation in A_1 = {(y_1, y_2, z) ∈ A | y_2 + z ≤ h_2}. It is K_2 that gives K the full external stability covering the area A_1. Since K_2 is depicted as a curve in the area, it is called a bargaining curve. Since ℱ contains infinitely many f, we have infinitely many stable sets in G_1.

Let us consider implications of the stable sets in G_1. We assume h_1 > h_2 for simplicity. Since a stable set K is a subset of imputations, we see that an efficient trade is made, i.e., Q is always assigned to buyer 1. More precisely, K_1 shows the possibility that buyer 2, with a lower reservation price h_2, is excluded by price competition, and buyer 1 pays z ∈ [h_2, h_1] for Q. The set K_2 shows the possibility that the two buyers form a coalition to bargain with seller 1′. In this case, buyer 1 pays z ∈ [0, h_2) for Q and gives buyer 2 part of the additional profit h_2 − z as a side payment, which is determined by a function f. A stable set does not specify a particular price z. Instead, a stable set specifies a division rule of a profit. For G_1, if a price z is determined in [h_2, h_1], buyer 1 receives the full profit h_1 − z; if z ∈ [0, h_2), a rule f specifies a portion of the additional profit h_2 − z that buyer 2 gets. In addition, multiple division rules are allowed within ℱ, i.e., the set derived from internal and external stability. Interpreting these features, von Neumann and Morgenstern (1953) explained that a division rule in a stable set shows a standard of behavior that can be established among the players and that multiple division rules can be stable standards of behavior.

Example 4 raises two questions on the existence of stable sets in assignment games. First, when does an assignment game have the stable core? This is a natural question, since if h_2 = 0, then the set K̂ = {(y_1, y_2, z) ∈ A | y_1 = h_1 − z, y_2 = 0, h_1 ≥ z ≥ 0} becomes the stable core. However, even if h_2 > 0, the set K̂ is qualified as a stable set with no payment to buyer 2. This raises the second question: does every assignment game have a stable set in which any monetary transfer is restricted within each of the efficient trading pairs? This question was answered completely by recent studies.

To consider the first question, let M = (B, F, U) be any assignment market and G the associated assignment game. We may assume without loss of generality that |B| = |F| = n and that the rows and columns of U are arranged so that the diagonal assignment x* with x*_{ii′} = 1 (i = 1, ..., n) is an optimal solution to P(B ∪ F). We say that U has a dominant diagonal if all of its diagonal entries are row and column maximums, i.e., u_{ii′} = max{u_{ik′} | k′ ∈ F} = max{u_{ji′} | j ∈ B} for each i = 1, ..., n. If U has a dominant diagonal, each pair of buyer i and seller i′ (i = 1, ..., n) can produce a maximum surplus by trading with each other. Thus, each player does not have to compete with others on the same side for a partner. It is then proved that the dominant diagonal condition is a necessary and sufficient condition for the core of G to contain the imputations (y, z) and (ȳ, z̄) with y_i = 0, z_{i′} = u_{ii′}, ȳ_i = u_{ii′}, and z̄_{i′} = 0 for i = 1, ..., n. Furthermore, the following theorem holds.

Theorem 7 Let M = (B, F, U) be any assignment market with |B| = |F|. The associated assignment game G has the stable core if and only if U has a dominant diagonal.

It is also proved that an assignment game G is convex if and only if the valuation matrix U satisfies u_{ik′} = 0 for each (i, k′) ∈ B × F with i ≠ k. The dominant diagonal condition is weaker than this condition for a convex assignment game. Hence, the core becomes the unique stable set for an assignment game in a class that includes convex assignment games. For more details, refer to Solymosi and Raghavan (2001).

To address the second question, we first show an instance of the Böhm-Bawerk market. A Böhm-Bawerk market is an assignment market (B, F, H, c) in which there is no product differentiation in the objects for sale, i.e., h_{ik′} = h_i for each (i, k′) ∈ B × F.

Example 5 There are two buyers 1 and 2 and two sellers 1′ and 2′. Each seller has the same object Q. Sellers 1′ and 2′ value Q at c_{1′} and c_{2′} dollars, respectively, while buyers 1 and 2 value Q at h_1 and h_2 dollars, respectively. We assume h_1 > h_2 > c_{2′} > c_{1′} ≥ 0.

Let G_2 be the associated assignment game. The set of imputations of G_2 is

A = {(y, z) = (y_1, y_2, z_{1′}, z_{2′}) ∈ R_+^4 | y_1 + y_2 + z_{1′} + z_{2′} = h_1 + h_2 − c_{1′} − c_{2′}}

and the core C is

C = {(y, z) ∈ A | (y_1, z_{1′}) = (h_1 − p′, p′ − c_{1′}), (y_2, z_{2′}) = (h_2 − p′, p′ − c_{2′}), p′ ∈ [c_{2′}, h_2]}.

Shubik (1985) presented an interesting transaction rule for Böhm-Bawerk markets, which is called the official price mechanism. The official price mechanism presumes that a set of efficient trading pairs is determined and transactions are conducted only within each of those pairs. In this mechanism, an official price p is first announced, and then each pair trades at p. However, if the official price is not in between a pair's reservation prices, the trade is done at the nearest reservation price. For Example 5, assume (1, 1′) and (2, 2′) to be the chosen efficient trading pairs. Then the official price mechanism derives the following set: K^o = C_1 ∪ C_2 ∪ C ∪ C_4 ∪ C_5, where

C_1 = {(y, z) ∈ A | (y_1, z_{1′}) = (0, h_1 − c_{1′}), (y_2, z_{2′}) = (0, h_2 − c_{2′})}  (obtained at p ≥ h_1),
C_2 = {(y, z) ∈ A | (y_1, z_{1′}) = (h_1 − p, p − c_{1′}), (y_2, z_{2′}) = (0, h_2 − c_{2′}), p ∈ [h_2, h_1]},
C   = {(y, z) ∈ A | (y_1, z_{1′}) = (h_1 − p, p − c_{1′}), (y_2, z_{2′}) = (h_2 − p, p − c_{2′}), p ∈ [c_{2′}, h_2]},
C_4 = {(y, z) ∈ A | (y_1, z_{1′}) = (h_1 − p, p − c_{1′}), (y_2, z_{2′}) = (h_2 − c_{2′}, 0), p ∈ [c_{1′}, c_{2′}]},
C_5 = {(y, z) ∈ A | (y_1, z_{1′}) = (h_1 − c_{1′}, 0), (y_2, z_{2′}) = (h_2 − c_{2′}, 0)}  (obtained at p ≤ c_{1′}).

The set K^o contains only imputations brought about by transactions within each of the efficient trading pairs (1, 1′) and (2, 2′). There is no payment to a third party in the sense that there is no payment between players belonging to different trading pairs. Shubik (1985) proved the following theorem.
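Stepping back to the definition of the assignment game, v(S) can be computed by enumerating the matchings between the buyers and sellers in S, as in the brute-force Python sketch below; it is illustrative only (exponential in the coalition size), and the 2×2 valuation matrix is a hypothetical instance, not one from the chapter.

```python
from itertools import permutations

def assignment_value(buyers, sellers, U):
    """v(S) for a mixed coalition S: the maximum total surplus over all
    matchings pairing each buyer with at most one seller (brute force)."""
    if not buyers or not sellers:
        return 0                              # one-sided coalitions are worth 0
    short, long_, flipped = ((buyers, sellers, False) if len(buyers) <= len(sellers)
                             else (sellers, buyers, True))
    best = 0
    for chosen in permutations(long_, len(short)):
        total = sum(U[(b, a)] if flipped else U[(a, b)]
                    for a, b in zip(short, chosen))
        best = max(best, total)
    return best

# Hypothetical 2x2 surplus matrix U (buyers 1, 2; sellers 1', 2').
U = {(1, "1'"): 5, (1, "2'"): 2, (2, "1'"): 3, (2, "2'"): 4}
print(assignment_value([1, 2], ["1'", "2'"], U))  # 9: the optimal matching pairs (1,1') and (2,2')
print(assignment_value([2], ["1'", "2'"], U))     # 4: buyer 2 alone takes object 2'
print(assignment_value([1, 2], [], U))            # 0
```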
Example 5 further suggests that a stable set with no payment to a third party can be regarded as a union of cores of subgames, defined below. For a given assignment market M = (B, F, U), let x* be any optimal solution to the associated assignment problem P(B ∪ F). A buyer i is said to be matched at x* if x*_{ik′} = 1 for some k′ ∈ F. A matched seller is defined in the same way. Let B*, F* be the sets of matched buyers and sellers at x*. A bijection m from B* onto F* is referred to as the optimal matching associated with x* if m(i) = j′ and m^{−1}(j′) = i for (i, j′) ∈ B* × F* with x*_{ij′} = 1. We simply call m an optimal matching in M if m is an optimal matching associated with some optimal solution to P(B ∪ F). For any pair of subsets I ⊆ B and J ⊆ F, a submarket M_{I∪J} is an assignment market (B\I, F\J, U_{I∪J}), where U_{I∪J} is the valuation submatrix obtained by deleting the rows in I and the columns in J from U. The subgame G_{I∪J} is the assignment game associated with M_{I∪J}. The characteristic function of G_{I∪J} is denoted by v_{I∪J}. Given an optimal matching m in M, a subgame G_{I∪J} of G is said to be m-compatible if

v(B ∪ F) = v_{I∪J}((B\I) ∪ (F\J)) + Σ_{i∈I∩B} u_{im(i)}.

Applying the above notions, the stable set K^o of Example 5 is expressed as follows:

K^o = C_{{1′,2′}} ∪ C_{{2′}} ∪ C_∅ ∪ C_{{2}} ∪ C_{{1,2}}.

In general, we have the following theorem, whose proof was outlined by Shubik (1985) and completed by Núñez and Rafels (2013).

Theorem 9 Let M = (B, F, U) be any assignment market and G its associated assignment game. For each optimal matching m in M, the union of the extended cores of m-compatible subgames, i.e., K_m = ∪_{(I,J)∈S_m} C_{I∪J}, gives a stable set of G. Furthermore, K_m is the unique stable set that we have when monetary transfer is restricted within each pair formed at m.

The existence of stable sets in the whole class of assignment games had been an unresolved question for many years, which was positively answered by Theorem 9.

Marriage Games
This section considers stable sets of one-to-one matching games, the so-called marriage games. A marriage game G is defined by a triple (M, W, R).
i I\B
i means staying single. A potential partner who is Pareto-inferiority to n for M and for W is defined in
preferred (or inferior) to i is said to be acceptable the same way by replacing m(m)Rmn(m) with n(m)
(or unacceptable) for player i. For simplicity, we Rmm(m) and m(w)Rw n(w) with n(w)Rw m(w).
also use a rank order list to present a preference The core C of a marriage game G is defined to
relation. For example, “m)w1,. . ., wk, m, wk + 1,. . ., be a subset of individually rational matchings that
wg” is the rank order list representing m’s prefer- are not blocked by any pair (m, w) M W.
ences w1Pm. . . Pmwk PmmPmwk + 1Pm. . . Pmwg. Although a matching in the core is usually called a
An outcome of a marriage game G = (M, W, R) stable matching in the literature on matching
is a matching, which is defined by a bijection m: games, we call it a core matching to avoid confu-
M [W ! M [W with m(m) W [{m} for each sion with matchings in a stable set. Gale and
m M, m(w) M [{w} for each w W, and Shapley (1962) proved that the core is nonempty
m(m) = w if and only if m(w) = m. Let A0 be the set for any marriage game by using their celebrated
of all matchings in G. Given a matching m A0, a deferred acceptance algorithm.
player i M [ W is said to be matched at m if m Let V be any nonempty set of matchings. For any
assigns i to another player in the other set. Other- pair of matchings m, n V, we define two functions
wise, player i is said to be unmatched at m. A pair m ^ n and m _ n from M W to M W by
(m, w) M W is called a matched pair at m if
m(m) = w (and m(w) = m). We denote by Mm and m ^ nðmÞ ¼ minfmðmÞ, nðmÞg for each m M ,
Wm the sets of matched men and women at m, m ^ nðwÞ ¼ maxfmðwÞ, nðwÞg for each w W ,
respectively. For simplicity, we also use set theo- m _ nðmÞ ¼ maxfmðmÞ, nðmÞg for each m M,
retic notations of a matching such as m = {(mi1 , m _ nðwÞ ¼ minfmðwÞ, nðwÞg for each w W ,
wj1 ),. . ., (mik , wjk )} and (mi1 , wj1 ) m, where (mi1 ,
wj1 ),. . ., (mik , wjk) are the matched pairs at m. where min{m(i), n(i)} and max{m(i), n(i)}, respec-
To define the core and a stable set of a marriage tively, denote a weakly inferior element and a
game G, we define a domination relation between weakly preferable element in {m(i), n(i)} for
two matchings. Let m, n be any pair of matchings. We player i M [ W. We say that V is a lattice if
say that m dominates n if (1) there exists a matched m ^ n V and m _ n V for each m, n V. We
pair (m, w) at m with m(m) = wPmn(m) and m- also say that V has invariant matched players if
(w) = mPw n(w) or (2) there exists an unmatched Mm = Mn and Wm = Wn for each m, n V.
player a M [ Wat m with m(a) = aPan(a). When m It is well known that the core C of a marriage
dominates n via a matched pair (m, w) at m, the pair game has the following properties:
(m, w) is called a blocking pair to n. If a coalition
S M [ W is effective in m’s domination of n, i.e., 1. Lattice structure: C is a lattice.
m(S) = S and m(a)Pan(a) for each a S, then 2. Invariant matched players: C has invariant
S always includes a matched pair or an unmatched matched players.
player at m that enables m to dominate n. Thus, we 3. Opposition of interests: for any m, n C, if
only consider domination by a pair or a single player. m(m)Rmn(m) for each m M, then n(w)Rw
When we do not have to specify a matching that m(w) for each w W and vice versa.
contains a blocking pair (m, w), we say that n is 4. Existence of polarized optimal core matchings:
blocked by (m, w). A matching n can also be blocked there exists a man-optimal core matching mM
by a single player i M [ W if iPin(i). A matching m and a woman-optimal core matching mW such
is said to be individually rational if m is not blocked that mM and mW are M-Pareto superior and
by any single player. Let A be the set of individually W-Pareto superior to any other m C, respec-
rational matchings of G. Finally, we say that m is tively. Here, if C is a singleton, these two
M-Pareto superior to n if m(m)Rmn(m) for each m matchings coincide and vice versa.
M with strict preference for some m, and m is
W-Pareto superior to n if m(w)Rw n(w) for each The third and fourth properties are in fact
w W with strict preference for some w. The m’s derived by the lattice property. The polarized
22 Cooperative Games (Von Neumann-Morgenstern Stable Sets)
These polarized optimal core matchings can be found by the deferred acceptance algorithm in polynomial time. See Roth and Sotomayor (1990), Gusfield and Irving (1989), and Manlove (2013) for more details of the core and the deferred acceptance algorithm.

A stable set of a marriage game is a nonempty subset K of A (the set of individually rational matchings) that satisfies internal stability (i.e., for any μ, ν ∈ K, μ does not dominate ν) and external stability (i.e., for each ν ∈ A \ K, there exists μ ∈ K that dominates ν). We use C(R) and K(R) to denote the core and the set of stable sets obtained under a preference profile R. For a stable set defined on A0 (the set of all matchings), see the remark at the end of this section. Let us examine a simple marriage game to see its core and stable set.

Example 6 Let G3 = (M, W, R) be the marriage game with M = {m1, m2, m3}, W = {w1, w2, w3}, and the following preference profile R:

m1: w2, w1, w3, m1    w1: m3, m2, m1, w1
m2: w2, w3, w1, m2    w2: m3, m1, m2, w2
m3: w3, w2, w1, m3    w3: m1, m2, m3, w3

The core of G3 is C(R) = {mM, mW}, where mM = {(m1, w1), (m2, w3), (m3, w2)} is the man-optimal core matching and mW = {(m1, w3), (m2, w1), (m3, w2)} is the woman-optimal core matching. Consider K = {mM, μ1, mW} with μ1 = {(m1, w1), (m2, w2), (m3, w3)}; no matching in K dominates another, so K is internally stable. The matchings μ2 = {(m1, w2), (m2, w3), (m3, w1)} and μ3 = {(m1, w3), (m2, w2), (m3, w1)} are dominated by mW via (m3, w2). Thus, K is externally stable and therefore a stable set of G3. We note that K has a lattice structure and invariant matched players. Furthermore, the set K is in fact the unique stable set of G3.

Ehlers (2007) gave the following two characterizations of a stable set in a marriage game, with a remark that the second type of characterization was first noted by von Neumann and Morgenstern (1953).

Theorem 10 Let G = (M, W, R) be a marriage game.

1. If K is a stable set of G, then K is a maximal set such that:
   (a) K is a superset of C(R).
   (b) K is a lattice.
   (c) K has invariant matched players.
   Furthermore, if K is a unique maximal set with (a), (b), and (c), then K is a stable set of G.
2. K is a stable set of G if and only if

   K = {μ ∈ A | μ is not blocked by any pair in T(K)},

   where T(K) denotes the set of pairs (m, w) ∈ M × W that are matched at some matching in K.
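As an illustration of the deferred acceptance algorithm mentioned above, the following minimal Python sketch computes the man-optimal core matching of Example 6; swapping the roles of the two sides yields the woman-optimal core matching. The code is an illustrative sketch only; it is not taken from the cited references.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Proposer-optimal stable matching; prefs map each agent to a ranked list
    of acceptable partners (most preferred first)."""
    rank = {r: {p: k for k, p in enumerate(lst)} for r, lst in receiver_prefs.items()}
    free = list(proposer_prefs)                 # proposers still to propose
    next_choice = {p: 0 for p in proposer_prefs}
    engaged = {}                                # receiver -> current proposer
    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue                            # p has exhausted the list and stays single
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if p not in rank[r]:
            free.append(p)                      # p is unacceptable to r
        elif r not in engaged:
            engaged[r] = p
        elif rank[r][p] < rank[r][engaged[r]]:
            free.append(engaged[r])             # r upgrades to p
            engaged[r] = p
        else:
            free.append(p)                      # r rejects p
    return {p: r for r, p in engaged.items()}

# Example 6 (acceptable partners only, own name omitted)
men = {"m1": ["w2", "w1", "w3"], "m2": ["w2", "w3", "w1"], "m3": ["w3", "w2", "w1"]}
women = {"w1": ["m3", "m2", "m1"], "w2": ["m3", "m1", "m2"], "w3": ["m1", "m2", "m3"]}

print(deferred_acceptance(men, women))   # man-optimal: m1-w1, m2-w3, m3-w2 (the matching mM)
print(deferred_acceptance(women, men))   # woman-optimal: w1-m2, w2-m3, w3-m1 (the matching mW)
```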
The first characterization means that a stable set is a maximal set satisfying conditions (a), (b), and (c). Ehlers (2007) gave an example in which we have two maximal sets with (a), (b), and (c), but only one of them is a stable set. We will discuss the existence of a stable set later. The second characterization means that a stable set K is equivalent to the set of individually rational matchings that are not dominated by any pair in T(K). Thus, a stable set in a marriage game can be regarded as a core obtained when only pairs in T(K) can block a matching.

For any preference profile R and any set of pairs S ⊆ M × W, we define the preference profile R\S to be a preference profile that is obtained by the following trimming-off operation on the rank order lists corresponding to R:

• For each m ∈ M, on the rank order list of Rm, remove all acceptable women w with (m, w) ∈ S, and put them immediately below m without changing their relative ranks.
• For each w ∈ W, on the rank order list of Rw, remove all acceptable men m with (m, w) ∈ S, and put them immediately below w without changing their relative ranks.

Given a stable set K under R, let S̄ = {(m, w) ∈ M × W | (m, w) ∉ μ for each μ ∈ K}, and call R̄ = R\S̄ the K-trimmed preference profile. Then

K = {μ ∈ A | μ is not blocked by any (m, w) ∈ T(K)} = C(R̄).

In Example 6, since (m1, w2), (m3, w1) ∉ T(K), we obtain the K-trimmed preference profile R̄ by trimming off these two pairs from R as shown below:

m1: w1, w3, m1, w2    w1: m2, m1, w1, m3
m2: w2, w3, w1, m2    w2: m3, m2, w2, m1
m3: w3, w2, m3, w1    w3: m1, m2, m3, w3

For example, m1's rank order list is obtained by removing w2 from the original list w2, w1, w3, m1 and putting it immediately below m1. The original core C(R) = {mM, mW} is extended to the core C(R̄) = {mM, μ1, mW} under R̄, and C(R̄) gives the stable set K under R̄. We can easily make the K-trimmed preference profile if a stable set K is given. However, without knowing a stable set K, can we make a preference profile R* under which the core C(R*) gives a stable set K under R? The following theorem gives a positive answer to this question.

Theorem 11 For any marriage game G = (M, W, R), there exists a preference profile R* with the property that K(R) = K(R*) = {C(R*)}. The preference profile R* can be constructed from R in polynomial time.

The equation K(R) = K(R*) = {C(R*)} means that the set of stable sets under R* is equivalent to the set of stable sets under the original preference profile R and that a marriage game (M, W, R*) has a unique stable set, which is C(R*). As mentioned in section "Stable Sets in Characteristic Function Form Games," the core existence property of marriage games does not suffice for the existence of a stable set. Since the preference profile R* can be obtained for any marriage game, this theorem shows the existence of a stable set for every marriage game. The idea behind the construction of R* is outlined in the following three steps.

1. Preliminaries: Since any stable set includes the core and has invariant matched players from Theorem 10, if some players are unmatched at some core matching, then they are also unmatched at any matching in any stable set. Thus, for finding a stable set, it suffices to consider a subgame (M′, W′, R′) of G in which (1) M′ and W′ do not include the unmatched players in G and (2) R′ = (R′x)x∈M′∪W′ is such that each R′m over W′ ∪ {m} is defined by using the relative rankings over W′ ∪ {m} under Rm, and each R′w over M′ ∪ {w} is defined in the same way.
Furthermore, since we are considering a stable set defined on the set of individually rational matchings, we can neglect a matching having a pair in which one player is unacceptable to the other. Thus, we may delete such pairs by trimming them off from the preference profile. Hence, without loss of generality, we may assume that all players in M ∪ W are matched at some core matching, and for each (m, w) ∈ M × W, w is acceptable to m if and only if m is acceptable to w.

Under the above assumption, take any preference profile R and any core matching μ ∈ C(R). Let A(R) be the set of individually rational matchings under R. It should be noted that all players are matched at μ. First, we define an unstable pair under R to be any pair (m, w) ∈ M × W that is not formed at any matching in any stable set K ∈ K(R). We say that:

• ν ∈ A(R) is a W-inferior μ-adjacent matching under R if:
  1. All players are matched at ν.
  2. For each w ∈ W, ν(w) is ranked immediately below μ(w) in Rw or ν(w) = μ(w).
  3. ν is M-Pareto superior to μ.
• (m, w) is a W-inferior μ-adjacent pair under R if w and m are mutually acceptable and m is ranked immediately below μ(w) in Rw.
• ν ∈ A(R) is a W-worst matching under R if for each w ∈ W, ν(w) is the least preferred acceptable partner in Rw.

We also define an M-inferior μ-adjacent matching, an M-inferior μ-adjacent pair, and an M-worst matching in the same manner as above by exchanging the set W and its element w for the set M and its element m symmetrically.

We note that even if μ is fixed, we have different W (or M)-inferior μ-adjacent matchings and pairs depending on R. It is well known that if μ is a core matching in C(R), then both W-inferior μ-adjacent and M-inferior μ-adjacent matchings also belong to C(R). In addition, these matchings can be found by the method of elimination of rotations. However, if μ is W-worst (M-worst), then there exists no W-inferior (M-inferior) μ-adjacent matching. The same is true for both the W-inferior and M-inferior μ-adjacent pairs. Refer to Gusfield and Irving (1989) for the details of these properties.

2. Important properties: Wako (2010) showed the following properties of core matchings, unstable pairs, and sets of stable sets.

Property 1 (1) If a core matching μ ∈ C(R) dominates each matching ν ∈ A(R) containing (m, w), then (m, w) is an unstable pair under R. (2) Let μ be a core matching in C(R) that is neither W-worst nor M-worst. (2a) If μ has no W-inferior μ-adjacent matching, then there exists a W-inferior μ-adjacent pair (m, w) such that μ dominates each matching ν ∈ A(R) containing (m, w). (2b) If μ has no M-inferior μ-adjacent matching, then there exists an M-inferior μ-adjacent pair (m, w) such that μ dominates each matching ν ∈ A(R) containing (m, w).

Property 2 If (m, w) is an unstable pair under R, then K(R) = K(R\{(m, w)}).

Property 1 means that for any core matching μ under a given preference profile R, if there is no W (M)-inferior μ-adjacent matching, then there exists an unstable pair, or μ is the W-worst or M-worst core matching. Property 2 means that even if we trim off an unstable pair from R, the set of stable sets does not change. Using these properties, we will present a basic idea to prove Theorem 11.

3. Procedure to find a stable set: First, let R(0) = R. Starting with the woman-optimal core matching μ ∈ C(R(0)), we examine whether there is a W-inferior μ-adjacent matching. If we find such a matching μ′, which belongs to C(R(0)), then we move on to μ′ and examine whether there is a W-inferior μ′-adjacent matching. If no such matching exists, then from (1) and (2a) in Property 1, there is a W-inferior μ′-adjacent pair (m1, w1) that is unstable under R(0); we then trim it off from R(0) and let R(1) := R(0)\{(m1, w1)}. From Property 2, we have K(R(0)) = K(R(1)).
By this trimming-off operation, the core gets larger, i.e., C(R(0)) ⊆ C(R(1)). Thus, μ′ ∈ C(R(1)). Applying Property 1 to μ′ and R(1), we examine whether there is a W-inferior μ′-adjacent matching under R(1). If no such matching exists, then from (1) and (2a) in Property 1, there is a W-inferior μ′-adjacent pair (m2, w2) that is unstable under R(1); we then trim it off from R(1) and let R(2) := R(1)\{(m2, w2)}. At this point, we have

K(R(0)) = K(R(1)) = K(R(2)) and C(R(0)) ⊆ C(R(1)) ⊆ C(R(2)).

Iterating this argument extends the core step by step while keeping the set of stable sets unchanged; the adjacent matchings required at each step can be found by the method of elimination of rotations, which is studied in detail by Gusfield and Irving (1989). For the details of the proof of Theorem 11, see Wako (2010).

Remark In this section, we defined a stable set on the set A of individually rational matchings. However, it can also be defined on the set A0 of all matchings. Ehlers (2007) defined a way of modifying a given preference profile so that a stable set on A0 can be obtained as a stable set on the set of individually rational matchings under the modified preference profile. Hence, the theorems in this section hold for stable sets on the set of all matchings.
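The trimming-off operation used throughout this section, and the K-trimmed profile of Example 6, can be reproduced with a short script. The following is an illustrative sketch (function names are hypothetical); it moves each trimmed partner just below the player's own name, as in the definition above.

```python
def trim(profile, pairs):
    """Return R\\S: in each player's rank order list (which contains the player's own
    name among its entries), move every partner removed by `pairs` to the position
    immediately below the player's own name, keeping relative order."""
    trimmed = {}
    for i, ranking in profile.items():
        drop = {w for (m, w) in pairs if m == i} | {m for (m, w) in pairs if w == i}
        keep = [x for x in ranking if x not in drop]
        moved = [x for x in ranking if x in drop]
        cut = keep.index(i) + 1               # position right after i's own name
        trimmed[i] = keep[:cut] + moved + keep[cut:]
    return trimmed

# Example 6: full rank order lists including the player's own name.
R = {
    "m1": ["w2", "w1", "w3", "m1"], "m2": ["w2", "w3", "w1", "m2"], "m3": ["w3", "w2", "w1", "m3"],
    "w1": ["m3", "m2", "m1", "w1"], "w2": ["m3", "m1", "m2", "w2"], "w3": ["m1", "m2", "m3", "w3"],
}
S_bar = {("m1", "w2"), ("m3", "w1")}          # the pairs matched at no matching of K
print(trim(R, S_bar)["m1"])                   # ['w1', 'w3', 'm1', 'w2'], as in the K-trimmed profile
```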
We define an outcome of the market, called an allocation, to be a bijection x from N onto N, where x(i) denotes the house assigned to player i at x. An allocation is a permutation of N. For simplicity, we also use a vector representation such as x = (x1, . . ., xn) to denote an allocation x, in which each element xi denotes x(i). Let A be the set of allocations. A market defined as above is referred to as a house barter market M = (N, R) or briefly a market M.

Let x, y be any pair of allocations in a market M = (N, R). For each nonempty coalition S ⊆ N, let x(S) be the set of houses assigned to the members of S at x, i.e.,

x(S) = {j ∈ N | j = x(i) for some i ∈ S}.

We say that x weakly dominates y and denote it by x wdom y if there exists a coalition S such that:

1. x(i) Ri y(i) for each i ∈ S with strict preference for some i ∈ S.
2. x(S) = S.

The second condition is the effectivity condition, which requires that each player i in S can obtain house x(i) by exchanging their own endowments. We say that x strongly dominates y and denote it by x sdom y if there exists a coalition S such that x(i) Pi y(i) for each i ∈ S and x(S) = S. We use the notations x wdomS y and x sdomS y when we indicate the associated coalition S.

A wdom stable set of M is a nonempty set K ⊆ A such that for any x, y ∈ K, x does not weakly dominate y, and for each y ∈ A \ K, there exists x ∈ K that weakly dominates y. The strict core of M is the set of allocations that are not weakly dominated by any other allocation; on the other hand, the core of M is the set of allocations that are not strongly dominated by any other allocations. An sdom stable set of M is a nonempty set of allocations with internal stability and external stability defined by strong domination instead of weak domination.

From the above definitions, the strict core is a subset of PA ∩ IR and the core is a subset of WPA ∩ IR, where PA, WPA, and IR denote the sets of Pareto-optimal, weakly Pareto-optimal, and individually rational allocations, respectively. A wdom stable set is a subset of PA, and an sdom stable set is a subset of WPA. However, both wdom and sdom stable sets may not be subsets of IR. Shapley and Scarf (1974) proved that the core is nonempty for all house barter markets. However, since external stability is not imposed on the core, the core does not necessarily coincide with an sdom stable set. In fact, the following example shows that there is a house barter market with no sdom stable set.

Example 7 Let M1 = (N, R) be the market with the player set N = {1, 2, 3} and the following preference profile:

1) 2 P1 3 P1 1,
2) 3 P2 1 P2 2,
3) 1 P3 2 P3 3.

Market M1 has six allocations: x1 = (2, 3, 1), x2 = (2, 1, 3), x3 = (1, 3, 2), x4 = (3, 2, 1),
x5 = (3, 1, 2), and x6 = (1, 2, 3). One can verify that M1 has no sdom stable set, although its strict core {x1} weakly dominates each of x2, . . ., x6. In addition, every player shows only strict preferences. Roth and Postlewaite (1977) in fact proved that for any house barter market, if each player's preferences are strict, then the strict core is a singleton, and it is a unique wdom stable set, i.e., the wdom stable core. Wako (1991) proved that this property is generalized as follows:

Theorem 12 For any house barter market M = (N, R), if the strict core SC is nonempty, it is a unique wdom stable set. Furthermore, for any x, y ∈ SC, we have x(i) Ii y(i) for each i ∈ N.

This theorem shows the following features of the strict core of a house barter market. First, any allocation outside the strict core is weakly dominated by some strict core allocation, since the strict core is a wdom stable set. Second, even if the strict core contains different allocations, they are indifferent for each player. However, the strict core can be empty when indifferences are allowed in preferences. Shapley and Scarf (1974) first pointed out this fact with the following example.

Example 8 Let M2 = (N, R) be the market with the player set N = {1, 2, 3} and the following preference profile:

1) 2 P1 3 I1 1,
2) 1 I2 3 P2 2,
3) 2 P3 1 I3 3.

We see that the strict core of M2 is empty and that the sets K1 = {(2, 3, 1), (2, 1, 3)} and K2 = {(1, 3, 2), (3, 1, 2)} are both wdom stable sets of M2. Thus, neither the nonemptiness of the strict core nor the uniqueness of a wdom stable set holds when indifferences are allowed in preferences.

Quint and Wako (2004) considered a necessary and sufficient condition for the strict core to be nonempty. For each player i ∈ N and each nonempty coalition S ⊆ N, let Bi(S) be the set of player i's most preferred houses in S, i.e., Bi(S) = {h ∈ S | h Ri j for each j ∈ S}. We call a partition T = {T1, . . ., Tm} of N a partition by minimal self-mapped sets (PMSS) if each Tk ∈ T satisfies the following conditions:

Tk = ∪_{i∈Tk} Bi(N \ ∪_{l=1}^{k−1} Tl), and there is no nonempty S ⊊ Tk with S = ∪_{i∈S} Bi(N \ ∪_{l=1}^{k−1} Tl).

We say that Tk ∈ T is a lower (higher) set of Tl ∈ T if k > l (k < l). The fact that T = {T1, . . ., Tm} is a PMSS means that for each player i in Tk ∈ T, player i's most preferred houses among Tk and its lower sets are endowed in Tk. Quint and Wako (2004) showed that any house barter market has at least one PMSS and that even if more than one PMSS exists, each PMSS consists of the same sets with only the orders of some sets being different. Then the following theorem was proved.

Theorem 13 Let T = {T1, . . ., Tm} be a PMSS of a house barter market M = (N, R). Then the strict core is nonempty if and only if there exists an allocation x ∈ A such that

x(Tk) = Tk and x(i) ∈ Bi(N \ ∪_{l=1}^{k−1} Tl) for each i ∈ Tk and each Tk ∈ T.

The necessary and sufficient condition given above requires that in each Tk ∈ T, each player i ∈ Tk can obtain his/her most preferred house (among those owned in Tk and its lower sets) through a feasible exchange within Tk. We refer to this condition as segmentability. Suppose that T is a PMSS of a house barter market with segmentability. Then, even if a player in a set Tk ∈ T has more preferable houses in a higher set Th, those houses are exchanged within Th in a mutually beneficial way. In addition, from the definition of a PMSS, no player in Th has an incentive to trade with a player in lower sets Tk with k > h. Since the strict core is a wdom stable set, segmentability is also a sufficient condition for the existence of a wdom stable set. Quint and Wako (2004) gave a polynomial-time algorithm to examine segmentability of a house barter market. We show an example of a house barter market with segmentability.

Example 9 Let M3 = (N, R) be the market with the player set N = {1, 2, 3, 4, 5, 6} and the following preference profile:
1) 2P1 3P1 5P1 4P1 1P1 6, Throughout the many different situations that it
2) 1 I 2 3P2 4P2 6P2 5P2 2, covers, including strategic form games, the solu-
3) 1P3 2P3 3P3 4P3 5P3 6, tion concept is kept constant, namely, that of von
4) 2P4 5P4 6P4 3P4 4P4 1, Neumann-Morgenstern stability.) Chwe’s model
5) 1 I 5 4P5 5P5 3P5 6P5 2, is a simplification of one of the many models – or
6) 3P6 6P6 1P6 2P6 4P6 5: as Greenberg called them situations – introduced
there. Farsighted stable sets are defined as von
Although M3 has two PMSSs, T = {T1 = {1, 2, Neumann-Morgenstern stable sets defined by
3}, T2 = {4, 5}, T3 = {6}} and using Chwe’s indirect domination relation. To
T 0 ¼ T 01 ¼ f1,2,3g, T 02 ¼ f6g, T 03 ¼ f4,5g , the distinguish between farsighted stable sets and the
differences are only in the orders of sets in T and stable sets discussed earlier, we will refer at times
T0. In this market, the set K = {(2, 3, 1, 5, 4, 6)} is to the latter as classical stable sets.
the strict core. While the focus of this section is on farsighted
The house barter market was also discussed by stable sets, it should be noted that Harsanyi’s
Moulin (1995) from a wide perspective of coop- original version of indirect domination has also
erative microeconomics and game theory. Recent been used in the literature.
studies have shown that interesting economic Greenberg et al. (2002) used Harsanyi’s indi-
implications are derived from this market model rect domination to define a stable set, which they
when we assume the farsighted von Neumann- call a sophisticated stable set, and applied them to
Morgenstern stability. an exchange economy. They show that there is a
one-to-one correspondence property between the
sophisticated stable sets defined on the set of
payoffs in the economy and the sophisticated sta-
Farsighted Stable Sets in a General ble sets defined on the set of allocations. This
Setting property is shared by the core but not the classical
stable set.
The Model Chwe defines a game as the collection of the
We have so far looked at stable sets using a dom- following primitives: (N, X, (≲i)i N, (!S)S N,
ination relation that involved only a one-step devi- S 6¼ ø) where N = {1, 2,. . ., n} represents the set of
ation by a coalition. Harsanyi (1974) argued that players, X is the set of possible outcomes, and ≼i is
these stable sets use domination relations that do player i’s preferences over the set X. These three
not take into account possible subsequent devia- elements are related to the components of a strate-
tions by some other coalitions and defined a new gic form game, while the fourth component may
domination relation called indirect domination. not be as familiar. For each nonempty subset of
Chwe (1994), based on this critique, modified players, S N, !S is a binary relation on X called
the indirect domination concept based on a ver- the enforceability relation, or the effectiveness rela-
sion laid out in the postscript of Harsanyi (1974), tion, where x ! S y indicates that when the status
and laid out a general framework that includes quo is x, the coalition S can induce outcome y by
both noncooperative and cooperative game themselves. That is, the enforceability relation con-
models on which these concepts can be defined. tains information about what coalitions can do. As
Greenberg (1990) had already laid down the we will see in the subsequent sections, this model is
groundwork to apply von Neumann-Morgenstern general enough that games in the game theoretic
stable sets to models outside of the characteristic literature can be formulated in this manner. It will
function form games, such as strategic form also be apparent that how the enforceability condi-
games. (The theory of social situations, as laid tion is defined affects the definition of indirect
out in Greenberg (1990), starts with an abstract domination and the farsighted stable set.
framework, which Greenberg calls a situation, Let x and y be two outcomes in X. We say that
where the rules of the situation have been defined. x indirectly dominates y and denote this by x
y
if there exists a sequence of outcomes y = x0, x1, , game forms outside of the games in characteristic
xp = x and coalitions S1, S2, , Sp such that for each function form. We call the stable set defined by the
j = 1, 2, , p, (i) xj1 !S j xj and (ii) xj1 ≺S j x. relation d as the myopic stable set. The myopic
Condition (i) states that each move from xj 1 to xj is stable set is very similar to the classical stable set,
a feasible one by the coalition Sj, and these moves except that the former is built off a model with an
start from y and end at x. Condition (ii) states that all explicitly defined enforceability relation, while
these coalitions are better off at the final outcome the latter is built off a domination relation that is
than at the outcome when they make their move. implicitly myopic. See Kawasaki (2010) for
Note that it needs not be the case that all members details in the house barter game and Herings
in, for example, Sj are better off in outcome xj than in et al. (2017) for two-sided matching where those
xj 1. two concepts can be different. We mention briefly
The notation ≺S denotes the preference relation some other domination relations closely related to
of the coalition S and is implicitly assumed that it indirect domination. In Harsanyi’s original defini-
can be represented in terms of the preferences of tion of indirect domination, the first condition
the individual, ≾i. In practice, x ≺S y if x ≺i y for all (i) is replaced by the following condition: (i’) for
i S, or x ≺ S y if x ≾i y for all i S and x ≺i y for each j, xj 1 is directly dominated by xj. There-
some i S. Unless specified otherwise, in the parts fore, Harsanyi’s indirect domination is stronger
that follow, we employ the former interpretation for than the indirect domination relation defined in
the notation ≺S and call the indirect domination Chwe (1994). Another domination relation can
using this version as the usual indirect domination. also be defined by using just condition (i’) and
When we use the usual indirect domination in a without condition (ii). Page and Wooders (2009)
given circumstance, we will not write out the def- used the term path dominance to describe this
inition of indirect domination. For those relations domination relation in the network formation
that are not usual, the indirect domination relation model.
will be defined explicitly.
We hereupon remark that in the definition of The Largest Consistent Set and the Largest
indirect domination, it is implicitly assumed that Farsighted Conservative Stable Set
joint moves by groups of players are neither once- Because this entry focuses entirely on the stable
and-for-all nor binding, i.e., some players in a set of von Neumann and Morgenstern, the major-
deviating group may later make another move ity of what follows focuses on stable sets defined
with players within or even outside the group. In on indirect domination relations defined in several
some models, this assumption may not be valid, models. However, Chwe (1994) also defined a
and in those circumstances, we impose restric- solution concept called the largest consistent set.
tions on the relation !S. This is done in nonco- To obtain a better perspective on this solution
operative strategic form games, in which concept, we first introduce an equivalent formula-
coalitional deviations are not allowed, and in net- tion of a farsighted stable set.
work formation games and matching games, in Let K X be a set of outcomes and for each
which deviations are conducted through only x X, define Kx = {y K| y = x or y
x} to be
pairs or singletons. the set of “likely outcomes” in K when the status
When p = 1 in the definition of indirect dom- quo is x. Then, K is a farsighted stable set if and
ination, we simply say that x directly dominates y, only if the following two conditions are satisfied
which is denoted by x d y. When we want to (see Diamantoudi and Xue (2003) and Xue (1998)
specify a deviating coalition, we say that x directly for details):
dominates y via coalition S, which is denoted by
x dS y. This form of direct domination originates • x K ) There do not exist y X and S N
from the theory of social situations in Greenberg with x ! S y such that for some z Ky, z S x.
(1990) and allows us to define the myopic domi- • x2= K ) There exist y X and S N with
nation relation to strategic form games and other x ! S y such that for some z Ky, z S x.
This characterization sheds light to the opti- In practice, the LCS is generally too inclusive
mistic behavior that is implicitly assumed in the to give a meaningful prediction to the model. In
farsighted stable set. The deviating coalition almost all strategic form games considered in the
S carries out the deviation if there is one outcome following, the LCS is the set of all individually
that is a likely outcome that makes the coalition rational outcomes. One exception is the game in
S better off. Kawasaki and Muto (2009), but that is because
On the other hand, the consistent set defined in every outcome is individually rational. Therefore,
Chwe (1994) assumes a more conservative behav- the general focus on the sections that follow is on
ior in the following sense. When referring to the farsighted stable sets.
set Ky in the above definition, the phrase “for
some” is replaced by “for all.” Applications of Farsighted Stable Sets in
Formally, K X is a consistent set if the Strategic Form Games
following conditions hold: Let G = (N, (Xi)i N, (ui)i N) be a game in strate-
gic form, where N is the set of players, Xi is the set
• x K ) There do not exist y X and S N of strategies for player i, and ui is player i’s payoff
with x ! S y such that for all z Ky, z S x. function. In order to apply farsighted stable sets to
• x2= K ) There exist y X and S N with this framework, we need to reformulate this stra-
x ! S y such that for all z Ky, z S x. tegic form game into the language of Chwe’s
framework.
The set L X is called the largest consistent set The first three components are relatively
(LCS) if it itself is a consistent set and for every straightforward; N is the same for both models,
consistent set K, K L. Chwe (1994) showed that X = X1 X2 Xn, and the ordinal prefer-
this concept is well defined, that is, there exists ences ≾i can be obtained from the payoff func-
one and only one such set of outcomes that can be tions very easily. The enforceability relation is
called the LCS. Moreover, it was shown that the given by the following: for two outcomes x and y,
LCS contains all farsighted stable sets.
A careful look at the definition reveals the pos- x!S y , xi ¼ yi 8i N nS
sibility that the empty set can be a consistent set for
any environment, as the condition involving “for The enforceability relation for strategic form
all Ky” can be satisfied vacuously. This possibility games states that when a coalition S deviates from
can also lead to the possibility of the LCS being the status quo x, the other players are assumed to
empty as well. Chwe (1994) provided a sufficient be choosing their strategies in x. This assumption
condition that was later weakened by Xue (1997), is closely related to the assumptions made in equi-
guaranteeing the nonemptiness of the LCS. librium concepts. Also, note that in this model, we
Another approach around the possible empti- allow coalitions to form freely. However, if we
ness of the solution concept is to modify the want to stick to the assumptions made in the
phrase to “for all z Ky such that Ky 6¼ ∅” in concept of Nash equilibrium, we can similarly
the two conditions. Such is the approach taken in define an enforceability relation for such situa-
Greenberg (1990) for defining a conservative ver- tions by the following:
sion of the (myopic) stable set and in Diamantoudi
and Xue (2003) for the conservative version of the x!S y , xi ¼ yi 8i N nS and j S j¼ 1
farsighted stable set in coalition formation games,
which they call a farsighted conservative stable An indirect domination relation based on this
set. They also define the analogue to the LCS enforceability relation can be defined. A stable set
called the largest farsighted conservative stable defined by this indirect domination is called a
set (LFCSS). When such a set exists, it coincides noncooperative farsighted stable set. Compared
with the LCS, because the only part that separates to myopic stable sets, in most games, farsighted
the two concepts is nonemptiness. stable sets give much sharper insights into
players’ behavior in economic, political, and via the sequence (C, D) ! 1 (D, D) ! 1,2 (C, C)
social situations. In the following, we first review and (D, C) ! 2 (D, D) ! 1,2 (C, C), respectively.
the results for prisoner’s dilemma games. For Moreover, no other indirect domination relation
these games, the main result is that in most exists. Hence, if the two players are farsighted
cases, only Pareto-efficient outcomes can be and make a joint but not binding move, the far-
supported by farsighted stable sets. Next, we sighted stable set succeeds in showing that cooper-
look at a public good provision game where the ation of the players results in the unique stable
production level of the public good is either 0 or 1, outcome.
and players either choose whether to contribute or We now study the farsighted stable sets in the
not. This game is closely related to the prisoner’s mixed extension of the prisoner’s dilemma, i.e.,
dilemma, but the results differ greatly as almost all the prisoner’s dilemma with mixed strategies
strictly individually rational outcomes can be played. Let X1 = X2 = [0, 1] be the sets of
supported by a farsighted stable set. Then, we mixed strategies of players 1 and 2, respectively,
review the results for duopoly markets, which and let t1 X1 (resp. t2 X2) denote the proba-
are closely related to the results in prisoner’s bility that player 1 (resp. 2) plays Cooperate. It is
dilemma. easily seen that the minimax payoffs to players
1 and 2 are both 1 in this game. We say that a
strategy combination is individually rational
Prisoner’s Dilemma
(resp. strictly individually rational) if both
To make the discussion as clear as possible, we
players’ payoffs are at least (resp. exceed) their
will focus on a particular example of the pris-
minimax payoffs. We then have the following
oner’s dilemma, which is given below. Similar
theorem in Suzuki and Muto (2000).
results hold in general prisoner’s dilemma games.
Theorem 14 Let
Prisoner’s Dilemma:
T ¼ fðt 1 , t 2 Þj 1=4 < t 1 1, t 2 ¼ 1g[
Player 2
fðt 1 , t 2 Þj t 1 ¼ 1, 1=4 < t 2 1g,
Cooperate Defect
Player 1 Cooperate 4,4 0,5 and define the singleton set K1 (t1, t2) = {(t1, t2)}
Defect 5,0 1,1 for each (t1, t2) T. Let K2 = {(0, 0), (1, 1/4)} and
K3 = {(0, 0), (1/4, 1)}. Then the sets K2 and K3 and
the singleton sets K1(t1, t2) with (t1, t2) T are the
farsighted stable sets of the mixed extension of the
the set T pertain to the example above, and these
values may depend on the payoffs of the prisoner’s
We first present a farsighted stable set derived dilemma. The overall characterization result of the
when two players use only pure strategies. For farsighted stable sets nonetheless holds in general
shorthand, let C denote Cooperate and D denote prisoner’s dilemma games.)
Defect. In this case, the set of strategy combina- This theorem shows that if the two players are
tions is X = {(C, C), (C, D), (D, C), (D, D)}, farsighted and make a joint but not binding move
where in each combination, the former (resp. the in the prisoner’s dilemma, then essentially a single
latter) is player 1’s (resp. 2’s) strategy. Pareto-efficient and strictly individually rational
A myopic stable set does not exist in this game. strategy combination results as a stable outcome,
On the other hand, the singleton {(C, C)} is the i.e., K1(t1, t2). We, however, have two exceptional
unique farsighted stable set with respect to
. To cases as shown by the sets K2 and K3 in which
see this, note that (C, C)
(C, D), (C, C)
(D, C) (D, D) could be stable together with one Pareto-
efficient point at which one player gains the same to (D,. . ., D). Together with Property (3), (C,. . .,
payoff as in (D, D). C) is Pareto efficient.
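The claim that {(C, C)} is the unique farsighted stable set of the pure-strategy prisoner's dilemma can be confirmed by brute force. The following sketch is an added illustration using the payoffs of the example table above, with the strategic-form enforceability relation and unrestricted coalitions; it checks that (C, C) indirectly dominates every other strategy combination, so that the singleton is externally stable (internal stability is immediate).

```python
players = (1, 2)
payoff = {("C", "C"): (4, 4), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def enforce(x, S):
    """Outcomes coalition S can induce from x: members of S change strategies,
    all other players keep the strategies they play in x."""
    return [y for y in payoff if y != x
            and all(y[i - 1] == x[i - 1] for i in players if i not in S)]

def indirectly_dominates(x, y):
    reachable, frontier = {y}, [y]
    while frontier:
        z = frontier.pop()
        for S in ({1}, {2}, {1, 2}):
            if not all(payoff[x][i - 1] > payoff[z][i - 1] for i in S):
                continue
            for z2 in enforce(z, S):
                if z2 == x:
                    return True
                if z2 not in reachable:
                    reachable.add(z2)
                    frontier.append(z2)
    return False

dominated_by_CC = [y for y in payoff if y != ("C", "C")
                   and indirectly_dominates(("C", "C"), y)]
print(dominated_by_CC)   # all three remaining outcomes, so {(C, C)} is externally stable
```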
Given a state x, we say that x is individually
n-Person Prisoner’s Dilemma rational if for all i N , ui ðxÞ minyi X i maxyi X i
We consider an n-person prisoner’s dilemma. Let ui ðyÞ . If a strict inequality holds, we say that x is
N = {1,. . ., n} be the player set. Each player i has strictly individually rational. From (1), (3) of
two strategies: C (Cooperate) and D (Defect). Let Assumption 1 minyi X i maxyi X i ui ðyÞ ¼ f ðD, 0Þ.
Xi = {C, D} for each i N. Sometimes, we will The following theorem from Suzuki and Muto
refer to a strategy combination as a state. For (2005) shows that any strategy combination that is
each coalition S N, let XS = ∏i SXi and strictly individually rational and Pareto efficient is
XS = ∏i N\S Xi. Let xS and xS denote generic itself a singleton farsighted stable set. Moreover,
elements of XS and XS, respectively. Player i’s there are no other farsighted stable sets except in
payoff depends not only on his/her strategy the rarity that there exists a strategy combination
but also on the number of other cooperators. that is Pareto efficient and individually rational but
Player i’s payoff function ui : X ! R is given by not strictly individually rational. In such a case, we
ui(x) = fi (xi, h), where x X, xi Xi (player i’s have one more farsighted stable set, which also
choice in x), and h is the number of players includes the outcome (D,. . ., D). To state the
other than i playing C. We call the strategic form result, we define the set C(x) = {i N| xi = C}
game thus defined an n-person prisoner’s as the set of players who choose the strategy C in
dilemma game. x X.
To make the arguments simple, we assume that
all players are homogeneous and each player has Theorem 15 For the n-person prisoner’s
an identical payoff function. That is, fi’s are iden- dilemma game, if x is a strictly individually ratio-
tical and simply written as f unless any confusion nal and Pareto-efficient state, then {x} is a far-
arises. We assume the following properties on the sighted stable set. Moreover, there are no other
function f. types of farsighted stable sets except in the fol-
lowing situation. If there exists a number s
such
Assumption 1 that f (C, s
1) = f (D, 0) and each strategy
1. f (D, h) > f (C, h) for all h = 0, 1,. . ., n – 1. combination y with |C(y)| = s
is Pareto efficient,
2. f (C, n 1) > f (D, 0). then there exists exactly one more farsighted sta-
3. f (C, h) and f (D, h) are strictly increasing in h. ble set given by {x0 X: |C (x0 )| = s
} [ {(D, ,
D)}.
Property (1) states that every player prefers
playing D to playing C regardless of which strat-
egies other players play. Property (2) states that if Provision of Discrete Public Goods
all players play C, then each of them gains a A common economic application of the prisoner’s
payoff higher than the one in (D,. . ., D). Property dilemma is the provision of public goods. Con-
(3) states that if the number of cooperators sider a simple game in which every player has
increases, every player becomes better off regard- only two strategies: to “contribute (C)” and to “not
less of which strategy is played. contribute (D).” Contributing to the production of
a public good comes with a cost. Suppose that a
It holds from Property (1) that (D,. . ., D) is the positive amount of the public good can be pro-
unique Nash equilibrium of the game. Here for x, duced even if only one player contributes.
y X, we say that y is Pareto superior to x if Because everyone can enjoy the benefits of the
ui(y) ui(x) for all i N and ui(y) > ui(x) for public good once it is produced, every player has a
some i N. The state x X is said to be Pareto dominant strategy to not contribute. Thus, this
efficient if there is no y X that is Pareto superior situation would be modeled as a prisoner’s
to x. By Property (2), (C,. . ., C) is Pareto superior dilemma game.
Now, suppose that the public good is provided in Duopoly Market Games
discrete amounts and that it requires a minimum of We consider two types of duopoly markets:
r
players contributing to produce the good where Cournot quantity-setting duopoly and Bertrand
r
2. Using the function f defined in the previous price-setting duopoly. For simplicity, we will con-
subsection, we consider a game that satisfies the sider a simple duopoly model in which firms’ cost
following conditions. Recall that h represents the functions and the market demand function are
number of other players choosing C. linear. Similar results, however, hold in more gen-
eral duopoly models.
Assumption 2 There are two firms, 1 and 2, each producing a
1. f (D, h) > f (C, h) for all h r
, f (D, h) = f (C, homogeneous good with the same marginal cost
h) for all h r
2, f (D, r
1) < f (C, r
1) c > 0. No fixed cost is assumed.
2. f (C, n 1) > f (D, 0)
3. f (C, 0) = = f (C, r
2) < f (C, r
1) = 1. Cournot duopoly: Firms’ strategic variables
= f (C, n 1) and f (D, 0) = = f (D, r
1) < f are their production levels. Let x1 and x2 be
(D, r
) = = f (D, n 1) production levels of firms 1 and 2, respectively.
The market price p(x1, x2) for x1 and x2 is given
The first condition states that choosing D is by
strictly better than C if the number of other contrib-
utors is at least r
, which is when the public good pðx1 , x2 Þ ¼ maxða ðx1 þ x2 Þ,0Þ,
can be produced without any additional contribu-
tion. Therefore, there is some incentive to free ride, where a > c. We restrict the domain of produc-
but choosing D is strictly worse than C when there tion of both firms to 0 xi a c, i = 1, 2. This is
are exactly r
1 other players choosing C. We also reasonable since a firm would not overproduce to
assume that choosing C or D leads to identical make a nonpositive profit. When x1 and x2 are
results when the public good is not produced. The produced, firm i’s profit is given by
second condition is the same as the prisoner’s
dilemma. The third condition states that the benefit pi ðx1 , x2 Þ ¼ ðpðx1 , x2 Þ cÞxi :
of choosing C or D depends on the number of other
contributors only through whether the public good Thus, Cournot duopoly is formulated as the
is produced or not. Thus, instead of strictly increas- following strategic form game
ing in h, the function f is mostly flat with respect to h
Just as in the prisoner’s dilemma, every strategy where the player set is N = {1, 2}; each player’s
combination that is Pareto efficient and strictly indi- strategy set is a closed interval between 0 and
vidually rational constitutes a singleton farsighted a c, i.e., X1 = X2 = [0, a c]; and their payoff
stable set. Unlike the prisoner’s dilemma, however, functions are pi, i = 1, 2. Let X = X1 X2. The
many other types of farsighted stable sets may exist. joint profit of two firms is maximized when
In fact, every outcome that is strictly individually x1 + x2 = (a c) /2.
rational but not Pareto efficient, except for (C, C,. . .,
C), is included in some farsighted stable set. See 2. Bertrand duopoly: Firms’ strategic variables
Kawasaki and Muto (2009) for details. are their price levels. Let
Y
ðpÞ ¼ ðp cÞDðpÞ: von Neumann-Morgenstern stability together
with firms’ farsighted behavior attains efficiency
We restrict the domain of price level p of both (from the standpoint of firms) also in Bertrand
firms to c p a. This assumption is also duopoly. We refer the reader to Suzuki and Muto
reasonable since a firm would avoid a negative (2006) for the details.
profit. The total profit ∏(p) is maximized at
p = (a + c)/2, which is called the monopoly Theorem 18 Let p = (p1, p2) be the pair of
price. Let p1 and p2 be prices chosen by firms monopoly prices, i.e., p1 = p2 = (a + c)/2. Then
1 and 2, respectively. We assume that if firms’ the singleton {p} is the unique farsighted stable set.
prices are equal, then they share equally the total
profit; otherwise, all sales go to the lower-pricing Some General Results for Strategic Form Games
firm of the two. Thus, firm i’s profit is given by We have shown in the previous parts some appli-
cations of farsighted stable sets to the prisoner’s
8Q dilemma and duopoly games. Below we explain
< Q ð pÞ if pi < pj
ri pi , pj ¼ ð p Þ=2 if pi ¼ pj very briefly some general findings regarding far-
:0 i if pi > pj sighted stable sets of strategic form games.
for i,j ¼ 1,2,i 6¼ j: Kawasaki (2015) considers general strategic
form games with two players and provides a suffi-
Hence, Bertrand duopoly is formulated as the cient condition for a strictly individually rational and
strategic form game Pareto-efficient outcome to be a singleton farsighted
stable set. The term individually rational used in this
article is defined using the minimax value instead of
GB ¼ N , fY i gi¼1,2 , fri gi¼1,2 ,
the maximin value (obtained by interchanging the
max and min operations), but in the games intro-
where N = {1, 2}, Y1 = Y2 = [c, a] and ri (i = 1, 2) duced here, the maximin value and the minimax
is i’s payoff function. Let Y = Y1 Y2. value coincide, which may not be the case in gen-
It is well known that a Nash equilibrium is eral. We have seen that similar results hold for the
uniquely determined in either market: x1 = x2 = prisoner’s dilemma game with possibly more than
(a c)/3 in the Cournot market and p1 = p2 = c in two players, but Kawasaki (2015) notes that similar
the Bertrand market. The following theorem holds results do not generally hold in those instances.
for the farsighted stable sets in a Cournot duopoly. Hirai (2017) shows that these results can be
recovered for three of more players under a certain
Theorem 17 Let (x1, x2) X be any strategy pair class of games, connected with the concept of
with x1 + x2 = (a c)/2. Then the singleton punishment dominance in Nakayama (1998).
{(x1, x2)} is a farsighted stable set. Furthermore, Informally, a strategy is punishment dominant
every farsighted stable set is of the form {(x1, x2)} toward the set of other players if that particular
with x1 + x2 = (a c)/2 and x1, x2 0. strategy brings the payoffs of the other players
collectively lower than any other strategy. In the
As mentioned before, any strategy pair (x1, x2) prisoner’s dilemma and in the public goods provi-
with x1 + x2 = (a c)/2 and x1, x2 0 maximizes sion model, D is a punishment dominant strategy. It
the firms’ joint profit. This suggests that the von is also known in that in these classes of games, the
Neumann-Morgenstern stability together with maximin and the minimax values coincide.
firms’ farsighted behavior yields joint profit maxi-
mization even if firms’ collaboration is not binding. Further Research on (Myopic and Farsighted)
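The joint-profit claim for the linear Cournot duopoly is easy to confirm numerically. The following sketch is an added illustration with arbitrary parameter values; it searches a grid of output pairs and shows that the total profit is maximized exactly when x1 + x2 = (a − c)/2.

```python
a, c = 10.0, 2.0          # demand intercept and common marginal cost (arbitrary values)

def profits(x1, x2):
    p = max(a - (x1 + x2), 0.0)
    return (p - c) * x1, (p - c) * x2

step = 0.05
grid = [round(k * step, 2) for k in range(int((a - c) / step) + 1)]
best = max(((x1, x2) for x1 in grid for x2 in grid),
           key=lambda q: sum(profits(*q)))
print(best, sum(best))        # a maximizing pair; total output equals (a - c)/2 = 4.0
print(sum(profits(*best)))    # maximal joint profit (a - c)**2 / 4 = 16.0
```

Any division of the monopoly output (a − c)/2 between the two firms attains the same joint profit, which is why every farsighted stable set in Theorem 17 is a singleton of this form.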
As for Bertrand duopoly, we have the follow- Stable Sets in Strategic Form Games
ing theorem, which claims that the monopoly The literature on farsighted stable sets introduced
price pair is itself a farsighted stable set, and no here is by no means exhaustive. Here we briefly
other farsighted stable set exists. Therefore, the list other papers that studied farsighted stable sets
in strategic form games and mention a few papers Myopic stable sets for the prisoner’s dilemma
that studied myopic stable sets in strategic form and the duopoly markets have been applied in
games. Masuda (2002) analyzed farsighted stable Nakanishi (2001) and in Muto and Okada (1996,
sets in average return games and obtain similar 1998), respectively. Myopic stable sets have also
results to the prisoner’s dilemma. Diamantoudi been applied to other economic models such as
(2005) defined a farsighted stable set for a cartel international trade. Nakanishi (1999) analyzed
price leadership model and proves its existence export quota games, while Oladi (2005) consid-
nonconstructively; on the other hand, in order to ered tariff retaliation games. For further studies on
characterize these farsighted stable sets, Kamijo other solution concepts related to stable sets for
and Muto (2010) reformulated the model into a strategic form games, see Kaneko (1987) and
strategic form game similar to the prisoner’s Mariotti (1997).
dilemma and obtained similar results to Suzuki
and Muto (2005). Their results imply that some of Farsighted Stable Sets in Cooperative Games
the conditions in the payoff function can be weak- In this section, we review some results on far-
ened to obtain the results of Suzuki and Muto sighted stable sets and related solution concepts
(2005). In location games, Shino and Kawasaki to games other than strategic form games. Such
(2012) showed the existence of a farsighted stable games include not only characteristic function
set that supports two different location profiles: form games, both transferable utility (TU) and
minimum differentiation and local monopoly. nontransferable utility (NTU), but also network
Kawasaki et al. (2015) apply the solution concept formation games, coalition form games, and
to an international trade model similar to that of matching markets. The literature on these models
Oladi (2005) and Nakanishi (1999) with two has grown very rapidly.
countries where each chooses a tariff rate on Part of the challenge in applying Chwe’s
imports. The game resembles closely to that of framework in this setting is defining an appropri-
the duopoly games introduced above, and they ate enforceability condition. Because unlike stra-
obtain very similar results in that model. tegic form games, some cooperative game models
The enforceability relation for strategic form do not specify every detail about what players can
games states that when a coalition S deviates from or cannot do. For example, when a coalition devi-
the status quo x, the other players are assumed to ates, nothing is stated about the outcome
be choosing their strategies in x. This assumption pertaining to the other players. This information
is very closely related to the assumptions made in was not necessary in defining the domination rela-
equilibrium concepts. Also, note that in this tion for the core and classical stable set. In the
model, we allow coalitions to form freely. How- following, we look at several models from the
ever, if we want to stick to the assumptions made perspective of how the enforceability relations
in noncooperative game theory, we can similarly are defined.
define an enforceability relation that explicitly
forbid joint moves by two or more players. Characteristic Function Form Games and
An indirect domination relation based on the Coalitional Sovereignty
enforceability relation with the restriction This section focuses on the farsighted stable sets
explained in the previous paragraph can be of characteristic function form games. Recall that
defined. A stable set defined by this indirect dom- the original stable sets were first defined for char-
ination is called a noncooperative farsighted sta- acteristic function form games. The results intro-
ble set. Nakanishi (2009) considered the duced here then allow us to make some
noncooperative farsighted stable sets of the pris- comparisons between the classical stable set and
oner’s dilemma and shows the existence of a the farsighted stable set. First, we introduce some
unique noncooperative farsighted stable set, results from Beal et al. (2008), which analyzed
which includes the outcome in which all players farsighted stable sets of transferable utility
choose D and some Pareto-efficient outcome. (TU) characteristic function form games. Then,
we review a result from Bhattacharya and Brosi generalizing the existence portion of the result by
(2011), which focused on the nontransferable util- Beal et al. (2008).
ity (NTU) characteristic function form games: In both TU games and NTU games, the respec-
tive papers derive what can be seen as existence
1. TU games: Let (N, v) be a TU game. We first results suggesting that farsighted stable sets are
formulate this in terms of Chwe’s framework. more likely to exist than classical stable sets. The
The set of outcomes X corresponds to the set of main reason for this is that it is easier for one
imputations, typically labeled by I(N, v). The outcome to indirectly dominate the other. One
preferences of a player i over the set of out- possibly problematic byproduct of such indirect
comes is such that player i likes the imputation domination relation is the seeming arbitrariness of
that gives this player a higher amount. That is, the imputations supported by farsighted stable
for two imputations x, y, x ≺ i y if and only if sets. In particular, for TU games, Beal et al.
xi < yi. As for the enforceability condition, the (2008) give the following characterization result.
relation !S is defined in the following way:
Theorem 19 Suppose that for a TU game (N, v),
X
x!S y , yi vð S Þ v(N) > Si N v({i}) holds. Then, all farsighted
iS stable sets are singleton sets containing an
imputation x such that for some coalition
Beal et al. (2008) show the existence of far- S, Si S xi v(S) and xi > v({i}) for all i S.
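As an added illustration of this characterization (a sketch of ours, not code from the cited papers), the script below computes the Shapley value of a small superadditive three-player game with an empty core and checks the condition of Theorem 19: the resulting imputation is feasible for some coalition S and gives every member of S strictly more than standing alone, consistent with the Shapley value forming a singleton farsighted stable set in such games.

```python
from itertools import permutations, chain, combinations

# A superadditive 3-player TU game with an empty core.
v = {(): 0, (1,): 0, (2,): 0, (3,): 0,
     (1, 2): 8, (1, 3): 8, (2, 3): 8, (1, 2, 3): 9}
players = (1, 2, 3)

def worth(coal):
    return v[tuple(sorted(coal))]

def shapley(players, worth):
    """Average marginal contributions over all orders of arrival."""
    phi = {i: 0.0 for i in players}
    orders = list(permutations(players))
    for order in orders:
        so_far = []
        for i in order:
            phi[i] += worth(so_far + [i]) - worth(so_far)
            so_far.append(i)
    return {i: phi[i] / len(orders) for i in players}

phi = shapley(players, worth)
print(phi)    # symmetric game: each player gets 3.0

# Theorem 19 condition: some coalition S with sum_{i in S} phi_i <= v(S)
# and phi_i > v({i}) for every i in S.
subsets = chain.from_iterable(combinations(players, r) for r in range(1, 4))
ok = any(sum(phi[i] for i in S) <= worth(S) and all(phi[i] > worth((i,)) for i in S)
         for S in subsets)
print(ok)     # True: e.g. S = {1, 2} with phi_1 + phi_2 = 6 <= v({1, 2}) = 8
```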
sighted stable sets in TU game and in fact charac-
terize the farsighted stable sets of TU games (more This result then implies the surprising result
details to be laid out later). that for superadditive games in which the Shapley
value is not in the core, the Shapley value consti-
2. NTU games: Bhattacharya and Brosi (2011) tutes a singleton farsighted stable set. (This result
also established the existence of a farsighted does not exclude the possibility that when the
stable set for NTU games under mild condi- Shapley value is in the core, it is a farsighted
tions. The relation !S is defined in a way stable set. We refer the reader to Beal et al.
similar to that in TU games. Formally, let (N, (2008) for details.) Another implication is that
V) be an NTU game, where the characteristic imputations in the interior of the core cannot be
function V is now a set-valued function that supported by a farsighted stable set.
maps to each coalition S a region of the Ray and Vohra (2015a) argue that the peculiar-
n-dimensional payoff space representing the ity of these farsighted stable sets stems from the
feasible payoffs of a coalition S. fact that the enforceability condition does not
satisfy what they call coalitional sovereignty. Spe-
To formulate NTU games in terms of the prim- cifically, taking the TU game as an example, when
itives of Chwe’s framework, let X = V (N); pref- x ! S y holds for some coalition S, the only condi-
erences are defined in the same way as TU games, tion that needs to be satisfied is Si S yi v(S).
and the relation !S is defined in the following Therefore, yj for j 2
= S can be chosen arbitrarily by
way: players in S for their liking. In fact, this situation is
unavoidable, since imputations must satisfy the con-
x!S y , y V ðS Þ dition Si N xi = v(N), so that if members in S were
to move to an imputation that benefits each member,
Let bi denote the maximum payoff for i that can then someone outside S must receive a smaller
be attained in the set V ({i}) and define a vector amount.
x V (N) to be individually rational if xi bi for In the words of Ray and Vohra (2015a),
all i N. Bhattacharya and Brosi (2011) showed coalitional sovereignty is violated in such models,
that there exists a farsighted stable set if the set of and the main purpose of their paper is to construct
individually rational vectors is bounded, thereby a model that respects coalitional sovereignty that
value would be split among the players depending 3. Bala-Goyal enforceability: The enforceability
upon which network is formed. condition is built off of the noncooperative
In terms of Chwe’s framework, we now have model of network formation in Bala and Goyal
the set of players, the set of outcomes being the set (2000). The main feature of this enforceability
of possible networks, and the preferences that are relation is that links can be formed and severed
defined directly on the networks. What remains to by only one player so that no consent of the
be defined is the enforceability relation !S. We other player is needed to form a link. This
introduce three versions of !S, as classified in relation seems a bit forceful for undirected net-
Page and Wooders (2009), which are defined in works, but for directed networks, because the
the network formation literature. It should be link ij can be seen as the link from i to j, it may
noted that the original formulations in their not be unnatural to think of models in which this
paper are for directed networks, but they can be link can be set up by i alone. However, it may be
translated to undirected networks as well: a bit of a stretch for i to have the power to form
link ji, so this move is not allowed in the defi-
1. Jackson-Wolinsky enforceability: The main nition for directed networks. Moreover,
building blocks for the enforceability relation coalitional deviations are not allowed.
are singletons and pairs. Let i, j N and g be a
network. If g0 = g + ij, then we have g ! i,j g0 An appropriate indirect domination relation
where i 6¼ j. If g0 = g ij, we have g ! i g0 and can be defined for each enforceability relation.
g ! j g0 , and for notational purposes, we allow Page et al. (2005) defined a framework called
g ! i,j g0 . To form a link between two players, supernetworks which translate these concepts
consent is needed from both players, while the visually into a directed network model defined in
deletion of a link can be carried out by just one the following way. First, each node represents a
player. Also, coalitional deviations of size network g, so that in essence, a network is built on
greater than two are not allowed so that the top of a network, hence the terminology super-
only coalitions that can deviate are singletons network. There are two types of directed edges,
and pairs. Therefore, g ! S g0 cannot hold for defined for each coalition S: a move edge and a
any pair of networks g, g0 when |S| 3. This preference edge. A move edge corresponding to a
enforceability relation is taken from the con- coalition S from network g to g0 is drawn if g ! S
cept of pairwise stability in Jackson and g0 . A preference edge corresponding to S is drawn
Wolinsky (1996). from g to g0 if g0 S g, depending upon how S is
2. Jackson-van den Nouweland enforceability: defined.
This enforceability relation, taken from the The most commonly used of the three enforce-
concept of strong stability defined by Jackson ability relations is the Jackson-Wolinsky enforce-
and van den Nouweland (2005), now allows ability, and for the remainder of the section, we
coalitions of size greater than two to deviate so employ this enforceability relation to define an
that multiple links can be formed and/or elim- indirect domination relation. Instead of defining
inated in one step. It is essentially the indirect domination in the usual way in which S
coalitional extension of the Jackson-Wolinsky is typically defined, we look at an indirect domi-
enforceability relation. Formally, g ! S g0 if nation that uses the weaker notion of S –, i.e., ≿i
and only if the following conditions are satis- for all i S and i for some i S – as Jackson
fied. If ij g0 \g, then {i, j} S. If ij g\g0 , and Wolinsky (1996) defined their concept of
then {i, j} \ S 6¼ ∅. The first condition states pairwise stability using this weaker version. We
that the coalition S must contain all players give the formal definition below. Using the
involved in forming a link that was not there Jackson-Wolinsky enforceability relation, we say
before. The second condition states that S must that a network g is indirectly dominated by g and
contain at least one of the players involved in is denoted by g g0 if there exists a sequence of
the link that is destroyed. networks g = g0, g1,. . ., gp = g0 and pairs
(i1, j1),. . ., (ip, jp), where we include the possibility farsighted stable set. Below are several facts that
that for some k we have ik = jk; the following relate the two solution concepts that were proved
conditions are satisfied: (i) gk1 !ik , jk gk , in Herings et al. (2009).
(ii) gk1 ≾l g0 for all l {ik, jk}, and (iii) gk1 ≺l g0
for some l {ik, jk}. Theorem 21 The following statements hold:
In general, a stable set defined by the indirect
domination described above may not exist. 1. Every vNM farsighted stable set is an HMV
Herings et al. (2009) proposed an alternative far- farsighted stable set.
sighted solution concept, which we call the 2. Every singleton HMV farsighted stable set is a
Herings-Mauleon-Vannetelbosch (HMV) far- vNM farsighted stable set.
sighted stable set. To distinguish from the far- 3. If K is the unique HMV farsighted stable set,
sighted stable set defined in the earlier sections, then it is also the unique vNM farsighted
we will at times refer to it as the vNM farsighted stable set.
stable set. We present the formal definition in the
following paragraphs, but we make one remark on In the various games introduced thus far, the
the intuition behind the solution concept. The implicit assumption was that players were suffi-
main modification in the HMV farsighted stable cient farsighted in the sense that players can fore-
set is that internal stability is weakened to guaran- see the sequence of outcomes in an indirect
tee the existence of an HMV farsighted stable set. domination of arbitrary (but finite) length. How-
Due to the multiplicity of such sets, they then took ever, it may be the case that players that we
the minimal set with respect to set inclusion that observe may not have such unlimited foresight,
satisfies their modified internal stability and exter- and in the framework of network formation, a
nal stability. We once again use the notation Kg = solution concept that takes into account limited
{g0 K | g0
g} [ {g} as the set of likely foresight is called “horizon-k” farsighted stable
outcomes when the status quo is g. A set K is an set defined in Herings et al. (2018). The motiva-
HMV farsighted stable set if it satisfies the fol- tion behind considering players with limited fore-
lowing conditions (where we allow i=j in the sight originates from Kirchsteiger et al. (2016)
following): which analyze by experiments whether players
are indeed farsighted or myopic in a network
1. g K ) there does not exist g0 2 = K with formation game. They find that in many examples,
g !i,j g0 and g00 Kg0 such that g ≲k g00 players form a network that is supported by an
holds for k {i, j} and with strict preference intermediate level of farsighted behavior over
for either i or j. myopic behavior. To our knowledge, this paper
= K ) there exists g0 with g !i,j g0 and g00
2. g 2 is the only one that looks at the issue of farsighted
Kg0 such that g ≲k g00 holds for k {i, j} and versus myopic through experiments, and this
with strict preference for either i or j. direction could be of interest going forward.
3. There is no set K0 ⊊, K that satisfies 1 and 2.
Coalition Formation Games
The set of all networks satisfies the first two While in network formation games, the objective
conditions vacuously. Therefore, because the set of is to form bilateral links between two players, in
networks is finite as long as the set of players is coalition formation games, players partition them-
finite, there exists at least one HMV farsighted stable selves into coalitions so that players interact with
set. (This property has been one of the reasons why other players multilaterally. The objective in this
this version of the stable set has been used in the free game is to divide players into a stable partition,
trade agreement model of Zhang et al. 2013.) where stability can be defined in many ways.
Also, if the condition g0 2 = K is removed from Let N be the set of players. A coalition struc-
the first condition, then the first two conditions are ture is defined as simply a partition of N. That is, a
equivalent to the conditions specified by the vNM coalition structure is given by P = {S1, S2,. . ., Sk}
40 Cooperative Games (Von Neumann-Morgenstern Stable Sets)
stable set. A corollary of the above fact is that the extended EBA (EEBA). A coalition structure is
LFCSS contains the core. an EEBA if it is an element of a stable set of an
A sufficient condition for the LFCSS to coin- indirect domination relation very similar to what
cide with the core is that the hedonic game sat- they had defined in their earlier paper
isfies what is called the top coalition property, (Diamantoudi and Xue 2003). This concept is
introduced in Banerjee et al. (2001). Informally, built off of the EBA in that it considers sequences
the top coalition property is such that there exists a of deviations by coalitions, but unlike the EBA,
coalition structure {S1, S2,. . ., Sk} such that those the deviating coalitions no longer need to be
in S1 agree that S1 is the most preferred coalition nested and can also now merge in some steps of
among subsets of N, those in S2 agree that S2 is the the deviation.
most preferred coalition among subsets of N\S1, Diamantoudi and Xue (2007) then analyzed
etc. Under this condition, the coalition structure the relationship between the coalition structures
{S1, S2,. . ., Sk} is the only coalition structure in the that are in the EEBA and efficient coalition struc-
core. Diamantoudi and Xue (2003) showed that tures. In their result, they find a sufficient condi-
this coalition structure as a singleton is the tion for the grand coalition to be an EEBA.
LFCSS. Herings et al. (2010), just as in Herings et al.
(2009) for the network formation model, also
General Model considered an HMV farsighted stable set for the
Diamantoudi and Xue (2007) considered far- coalition formation games. Their results in this
sighted solution concepts for general coalition framework are very similar to those that they
formation games in which there are externalities. obtain for network formation games.
They first gave an alternative definition of the Funaki and Yamato (2014) took another
notion of equilibrium binding agreements approach and considered a different enforceability
(EBA), defined by Ray and Vohra (1997). The relation P !S P0. The relation defined earlier
original definition of an EBA is defined recur- includes, in one step, disintegration of some coa-
sively and involves a nestedness assumption on litions when members of S leave the coalitions
the coalitions that can deviate in the sequence of that they were part of in P and integration when
deviations, much like how the CPNE is defined forming the coalition S P0. Funaki and Yamato
for strategic form games. (2014) argued that these two moves should be
Diamantoudi and Xue (2007) first formulated treated separately, since in certain situations, dis-
the EBA as an element of a stable set defined by a integration and integration cannot both occur
suitably defined domination relation, which they simultaneously as was assumed in the EEBA.
call R&V domination. This formulation allows Moreover, in their formulation, they only allow
the recursive nature of the EBA to be incorporated one disintegration or integration to occur in each
into the domination relation. Within this domina- step so that the number of coalitions either
tion relation, the coalitional deviations that are decreases by one by the integration of two sepa-
considered are only those that involve a coalition rate coalitions or increases by one by the disinte-
splitting from one so that the resulting coalition gration of one coalition into two separate
structure now includes the deviating coalition and coalitions. This way of interpreting !S as
the remaining coalition. Thus, coalitions can only representing one basic step is similar to the rea-
become smaller in this sequence of deviations, soning behind allowing only individual devia-
and in this sense, it is said that the deviations tions in strategic form games. Formally, their
have a nested structure. Therefore, for the EBA enforceability relation can be stated as follows:
to make sense, the starting coalition structure P ! S P0 if (i) {T P|T N \S} = {T0 P0 |
needs to be the grand coalition structure {N}. T0 N \S}; (ii) S P0 is such that S = S1 [ S2 for
By relaxing this nested structure when consid- some S1, S2 P and |P0 | = |P | 1; and (iii) S
ering the EBA, Diamantoudi and Xue (2007) P0 is such that there is some T P with T \S P0
defined a new solution concept called the and |P0 | = |P | + 1.
42 Cooperative Games (Von Neumann-Morgenstern Stable Sets)
The first condition simply states that the coali- that those unaffected by this change in partnership
tions unaffected by the deviation by S are intact. are matched to the same partner, under m. Condi-
The second condition states that the deviation by tion (iii) states that all other k, which would be the
S must either result in two coalitions S1 and S2 partners of i and j, if they were matched in m, are
merging to form S (the first case) or a coalition single in m0 . This transition of m to m0 is borrowed
T splitting into two coalitions S and T \ S (the essentially from the process of “satisfying
second case). Hence, in the first case, the number blocking pairs” in Roth and Vande Vate (1990),
of coalitions in P0 must be one less than the num- although we do not assume that (i, j) is a blocking
ber of coalitions in P, while in the second case, the pair so that they need not be better off in the
number of coalitions in P must be one more than matching m0 . Also, we allow for the possibility
the number of coalitions in P. that i = j in the definition, so that we cover the
Using the above relation, an indirect domina- case in which only a single agent deviates.
tion relation can be defined in the usual way. For simplicity, we do not allow for simulta-
Funaki and Yamato (2014) focused on coalition neous deviations by groups of three or more
structures with the property that each coalition agents. Therefore, m !S m0 holds only if S =
structure indirectly dominates all other coalition {i, j} for some i, j. The results presented in this
structures. A coalition structure is said to be section occur when coalition deviations are allo-
sequentially stable in that instance. Equivalently, wed and defined as a suitable extension of the
each sequentially stable coalition structure consti- enforceability condition defined for pairs. See
tutes a singleton stable set defined by the indirect Mauleon et al. (2011) and Klaus et al. (2011) for
domination relation that uses the stepwise details.
enforceability relation defined in the previous As in marriage games, the objective in room-
paragraph. mate games is to form pairs. The difference, how-
They obtained a sufficient condition when the ever, is that in roommate games, we do not have
grand coalition as a coalition structure is sequen- the set of players being partitioned into two dis-
tially stable and an algorithm to check that the joint sets, as in marriage games. In such sense,
grand coalition is a sequentially stable coalition sometimes roommate games are called one-sided
structure. matching games, as opposed to the two sidedness
in marriage games. When considering the map-
Marriage Games and Roommate Games ping m, for describing a matching, we do not have
In the next two sections, we revisit two models: the restriction that m(i) be a member of the oppo-
the marriage game and the house barter game. site group if not matched to itself. However, this
Also in this section, we review results on room- restriction is also lifted when considering
mate games, which can be seen as a generalized blocking pairs, and this relative ease of forming
model of marriage games. The results for mar- a blocking pair contributes to the lack of general
riage games were first obtained by Mauleon existence of a core matching in roommate games,
et al. (2011), and the results for roommate games as is shown in Gale and Shapley (1962).
were shown by Klaus et al. (2011). The roommate game is given by the compo-
Let us first consider the marriage game (M, W, nents (N, R) where N is the set of players, no
R). All of the components in the Chwe’s model, longer partitioned into two disjoint sets, and
with the exception of !S, are apparent, so it each i N has a preference relation Ri over N.
remains to define the enforceability relation !S. As in the marriage game, a matching is a function
Given two matchings m and m0 and the coalition m: N ! N such that it is a bijection and satisfies the
S = {i, j}, m ! i,j m0 , if (i) m0 (i) = j, (ii) m0 (k) = m(k) condition m(i) = j if and only if m(j) = i. A pair (i,
for k 2= {i, j, m (i), m (j)}, and (iii) m0 (k) = k for all j) N N is said to block m if and only if j Pi m(i)
other k. and i Pj m(j) hold. We allow the case in which i = j
Condition (i) states that i is matched with so that the previous definition also covers the case
j under the new matching m0 . Condition (ii) states in which m is blocked by a single person.
Cooperative Games (Von Neumann-Morgenstern Stable Sets) 43
A matching m is said to be a core matching if it is (2011) proved that there are no other types of
not blocked by any pair or single agent. farsighted stable sets in marriage games. Klaus
A matching m is said to be individually rational et al. (2011) showed through an example the pos-
if it is not blocked by a single agent. sible existence of a farsighted stable set other than
The indirect domination relation considered in that described in the theorem, but they prove that
both Mauleon et al. (2011) and Klaus et al. (2011) there are no farsighted stable sets with exactly two
was defined in the usual way using the enforce- matchings. In their example, a core matching does
ability condition. Therefore, unlike the network not exist, thus showing that a farsighted stable set
formation model, we require in the indirect dom- can exist more often than core matchings.
ination relation that when a pair deviates, both The results for roommate markets are thus
members must be better off in the final matching much less clearer, and it seems unlikely to give
that is achieved in the sequence. any general results on the existence of farsighted
Mauleon et al. (2011) and Klaus et al. (2011) stable sets other than when a core matching exists.
both proved the following lemma for the marriage One direction that is taken in Mauleon et al.
game and roommate game, respectively. The (2014) is to find out when the indirect domination
lemma provides a relationship between indirect relation and the domination relation coincide. If
domination and blocking by pairs. such a condition on preferences can be found, then
a core matching would simultaneously be a
Lemma Let m and m0 be individually rational matching that cannot be indirectly dominated.
matchings. A matching m indirectly dominates m0
if and only if there does not exist a pair (i, j) that
House Barter Games
blocks m such that m0 (i) = j.
Next, we consider the house barter model of
Shapley and Scarf (1974) which appeared in an
The previous lemma then leads to the connec-
earlier section. For this model, the enforceability
tion between core matchings and farsighted stable
relation is defined in a way that is similar to the
sets of matching games and roommate games. We
coalition formation model. The following version,
state the results of the two games in the same
defined in Kawasaki (2010), was given in Klaus
theorem below.
et al. (2010).
For an allocation x, draw a directed graph
Theorem 23 The following hold for both the
where the set of nodes is the set of players and
marriage game and the roommate game:
there is a directed edge (i, j) if x(i) = j. Because x is
a bijection on N, each i is in exactly one cycle of
1. Let m be a core matching, and then {m} is a
this graph which is called the trading cycle of
farsighted stable set.
i under allocation x. Denote by Cx,i the unique
2. If m is a singleton farsighted stable set, then {m}
trading cycle of allocation x that includes agent i.
must be a core matching.
For any two allocations, x and y, the relation x ! S
y holds if and only if the following three condi-
The first part establishes the existence of a
tions are satisfied:
farsighted stable set if a core matching exists –
which is always satisfied in marriage games but
• yð S Þ ¼ S
not so in roommate games. The classic three-
• yðiÞ ¼ xðiÞif i [j S C x,j
that the allocation of the goods to those unaffected by adom should lie between the cores defined by
by the coalition S should be unchanged in the new sdom and wdom. Wako (1999) showed that the
allocation y. Finally, those who are affected are core defined by adom coincides with the set of
then assigned to their original endowment. This is competitive allocations. Furthermore, Toda
not the only way to define the enforceability con- (1997) showed that the core is the unique stable
dition for these games. For another example, see set defined by adom. Kawasaki (2010) showed
Serrano and Volij (2008), although this is in a that the analogue of these two results also holds
different framework. for the respective sets defined by an indirect dom-
We can now define indirect domination and ination relation based on adom. The definition is
direct domination for this framework. It should as follows.
be noted that in the previous sections, we had An allocation x is said to indirectly and anti-
treated direct domination as if it were equivalent symmetrically weakly dominate another alloca-
to the myopic domination relations that have tion y and denote this by x iadom y if there exists
already been defined. This is not the case for the a sequence of allocations y = x0, x1,. . ., xp = x and
house barter model because of the details on the coalitions S1, S2,. . ., Sp such that for each j = 1,
assignments to N\S. For an example, see 2,. . ., p, (i) xj1 !Sj xj, (ii) x(i)Rixj 1(i) for all i
Kawasaki (2010). Sj with strict preference for some i Sj and (iii) if
An allocation x is said to be a competitive for i Sj xj 1Ii x, then xj 1(i) = xj (i).
allocation if there exists a price system (pi)i N, The interpretation for the first two conditions is
where pi denotes the price of the good initially the same as in the indirect domination relation
owned by agent i, satisfying the following condi- described in the earlier sections. The third condi-
tion: j i xi ) pj > pi. Shapley and Scarf (1974) tion incorporates the behavioral assumption in the
showed that the set of competitive allocations adom relation into the iadom relation here. If an
coincides with the allocations that can be obtained agent is indifferent between the good that is cur-
by the top trading cycle method. rently allocated and the one that will be allocated
Shapley and Scarf (1974) and Roth and Post- in the end, then that good that is allocated to that
lewaite (1977) showed that a competitive alloca- agent is unchanged in the reallocation described in
tion exists, and every competitive allocation is in (i). Kawasaki (2010) showed that the set of com-
the core. Meanwhile, Wako (1984) showed that petitive allocations is the unique stable set defined
the strict core is a subset of the set of competitive by the iadom relation.
allocations, and this inclusion can be strict. Thus, Because the iadom relation is quite compli-
the set of competitive allocations can lie strictly cated, Klaus et al. (2010) analyzed the stable sets
between the strict core and the core. using the usual indirect domination relation used
Wako (1999) defined a domination relation, in the literature built off of the enforceability
called antisymmetric weak domination (adom), condition in Kawasaki (2010). The definition of
such that the set of competitive allocations is the this indirect domination relation involves the exis-
core defined by this domination relation. The for- tence of a sequence of allocations and coalitions
mal definition is as follows. satisfying (i) above, and conditions (ii) and (iii)
An allocation x is said to antisymmetrically are replaced by the condition x(i) Pi xj1(i).
weakly dominate another allocation y if there Because this indirect domination is essentially
exists a coalition S such that x weakly dominates the indirect domination in the Chwe’s framework
y via S and x(i) Ii y(i) holds only when x(i) = y(i) defined for this game, we will still use the term
and denote this relationship by x adom y. The farsighted stable set to describe a stable set
difference between adom and wdom is that in defined by this indirect domination relation.
adom, the agents who find the two allocations Klaus et al. (2010) showed that Kawasaki’s
indifferent do not receive a different good. By (2010) result can be obtained using indirect dom-
definition, adom is stronger than wdom but ination relation for markets that satisfy the follow-
weaker than sdom. Therefore, the core defined ing condition: for each i 6¼ j,
Cooperative Games (Von Neumann-Morgenstern Stable Sets) 45
will in turn have impacts on developments of Ehlers L (2007) Von Neumann-Morgenstern stable sets in
economics, politics, sociology, and many applied matching problems. J Econ Theory 134:537–547
Einy E, Shitovitz B (1996) Convex games and stable sets.
social sciences. Games Econ Behav 16:192–201
Einy E, Holzman R, Monderer D, Shitovitz B (1996) Core
Acknowledgments This work was supported by the and stable sets of large games arising in economics.
Japan Society for the Promotion of Science (JSPS) Grant J Econ Theory 68:200–211
Numbers JP16H03121 and JP17K13696. Funaki Y, Yamato T (2014) Stable coalition structures
under restricted coalitional changes. Int Game Theory
Rev 16. https://doi.org/10.1142/S0219198914500066
Gale D, Shapley LS (1962) College admissions and the
stability of marriage. Am Math Mon 69:9–15
Bibliography Gomes A (2005) Multilateral contracting with externali-
ties. Econometrica 73:1329–1350
Anesi V (2006) Committee with farsighted voters: a new Gomes A, Jehiel P (2005) Dynamic process of social and
interpretation of stable sets. Soc Choice Welf economic interactions: on the persistence of inefficien-
27:595–610 cies. J Pol Econ 113:626–667
Anesi V (2010) Noncooperative foundations of stable sets Graziano MG, Meo C, Yannelis NC (2015) Stable sets for
in voting games. Games Econ Behav 70:488–493 asymmetric information economies. Int J Econ Theory
Aumann R, Peleg B (1960) Von Neumann-Morgenstern 11:137–154
solutions to cooperative games without side payments. Greenberg J (1990) The theory of social situations: an
Bull Am Math Soc 66:173–179 alternative game theoretic approach. Cambridge Uni-
Bala V, Goyal S (2000) A noncooperative model of net- versity Press, Cambridge
work formation. Econometrica 68:1181–1229 Greenberg J, Monderer D, Shitovitz B (1996) Multistage
Bando K (2014) On the existence of a strictly strong Nash situations. Econometrica 64:1415–1437
equilibrium under the student-optimal deferred accep- Greenberg J, Luo X, Oladi R, Shitovitz B (2002)
tance algorithm. Games Econ Behav 87:269–287 (Sophisticated) stable sets in exchange economies.
Banerjee S, Konishi H, Sonmez T (2001) Core in a simple Games Econ Behav 39:54–70
coalition formation game. Soc Choice Welf Griesmer JH (1959) Extreme games with three values. In:
18:135–153 Tucker AW, Luce RD (eds) Contribution to the theory
Barberá S, Gerber A (2003) On coalition formation: dura- of games, vol IV. Annals of mathematics studies,
ble coalition structures. Math Soc Sci 45:185–203 vol 40. Princeton University Press, Princeton,
Beal S, Durieu J, Solal P (2008) Farsighted coalitional pp 189–212
stability in TU-games. Math Soc Sci 56:303–313 Gusfield D, Irving R (1989) The stable marriage problem:
Bernheim D, Peleg B, Whinston M (1987) Coalition-proof structure and algorithms. MIT Press, Boston
Nash equilibria: Concepts. J Econ Theory 42:1–12 Harsanyi J (1974) An equilibrium-point interpretation of
Bhattacharya A, Brosi V (2011) An existence result for stable sets and a proposed alternative definition. Manag
farsighted stable sets of games in characteristic function Sci 20:1472–1495
form. Int J Game Theory 40:393–401 Hart S (1973) Symmetric solutions of some production
Bogomolnaia A, Jackson MO (2002) The stability of economies. Int J Game Theory 2:53–62
hedonic coalition structures. Games Econ Behav Hart S (1974) Formation of cartels in large markets. J Econ
38:201–230 Theory 7:453–466
Bott R (1953) Symmetric solutions to majority games. In: Heijmans J (1991) Discriminatory von Neumann-
Kuhn HW, Tucker AW (eds) Contribution to the theory Morgenstern solutions. Games Econ Behav 3:438–452
of games, volume II, Annals of mathematics studies, Herings PJJ, Mauleon A, Vannetelbosch V (2009) Far-
vol 28. Princeton University Press, Princeton, sightedly stable networks. Games Econ Behav
pp 319–323 67:526–541
Chwe MS-Y (1994) Farsighted coalitional stability. J Econ Herings PJJ, Mauleon A, Vannetelbosch V (2010) Coali-
Theory 63:299–325 tion formation among farsighted agents. Games
Diamantoudi E (2005) Stable cartels revisited. Econ The- 1:286–298
ory 26:907–921 Herings PJJ, Mauleon A, Vannetelbosch V (2017) Stable
Diamantoudi E, Xue L (2003) Farsighted stability in sets in matching problems with coalitional sovereignty
hedonic games. Soc Choice Welf 21:39–61 and path dominance. J Math Econ 71:14–19
Diamantoudi E, Xue L (2007) Coalitions, agreements and Herings PJJ, Mauleon A, Vannetelbosch V (2018) Stability
efficiency. J Econ Theory 136:105–125 of networks under horizon-K farsightedness. Econ The-
Diermeier D, Fong P (2012) Characterization of the von ory. https://doi.org/10.1007/s00199-018-1119-7
Neumann-Morgenstern stable set in a non-cooperative Hirai T (2017) Single payoff farsighted stable sets in stra-
model of dynamic policy-making with a persistent tegic games with punishment strategies. Int J Game
agenda setter. Games Econ Behav 76:349–353 Theory. https://doi.org/10.1007/s00182-017-0597-3
Dutta B, Vohra R (2017) Rational expectations and far- Jackson MO, van den Nouweland A (2005) Strongly stable
sighted stability. Theor Econ 12:1191–1227 networks. Games Econ Behav 51:420–444
Cooperative Games (Von Neumann-Morgenstern Stable Sets) 47
Jackson MO, Wolinsky A (1996) A strategic model of social Mauleon A, Vannetelbosch V, Vergote W (2011) Von
and economic networks. J Econ Theory 71:44–74 Neumann-Morgenstern farsightedly stable sets in two-
Jordan JS (2006) Pillage and property. J Econ Theory sided matching. Theor Econ 6:499–521
131:26–44 Mauleon A, Molis E, Vannetelbosch V, Vergote W (2014)
Kamijo Y, Muto S (2010) Farsighted coalitional stability of Dominance invariant one-to-one matching problems.
a price leadership cartel. Jpn Econ Rev 61:455–465 Int J Game Theory 43:925–943
Kaneko M (1987) The conventionally stable sets in non- Moulin H (1995) Cooperative microeconomics: a game-
cooperative games with limited observations I: defini- theoretic introduction. Princeton University Press, Princeton
tion and introductory argument. Math Soc Sci Muto S (1979) Symmetric solutions for symmetric
13:93–128 constant-sum extreme games with four values. Int
Kawasaki R (2010) Farsighted stability of the competitive J Game Theory 8:115–123
allocations in an exchange economy with indivisible Muto S (1982a) On Hart production games. Math Oper Res
goods. Math Soc Sci 59:46–52 7:319–333
Kawasaki R (2015) Maximin, minimax, and von Muto S (1982b) Symmetric solutions for (n, k) games. Int
Neumann-Morgenstern farsighted stable sets. Math J Game Theory 11:195–201
Soc Sci 74:8–12 Muto S, Okada D (1996) Von Neumann-Morgenstern sta-
Kawasaki R, Muto S (2009) Farsighted stability in provi- ble sets in a price-setting duopoly. Econ Econ 81:1–14
sion of perfectly lumpy public goods. Math Soc Sci Muto S, Okada D (1998) Von Neumann-Morgenstern sta-
58:98–109 ble sets in Cournot competition. Econ Econ 85:37–57
Kawasaki R, Sato T, Muto S (2015) Farsightedly stable Nakanishi N (1999) Reexamination of the international
tariffs. Math Soc Sci 76:118–124 export quota game through the theory of social situa-
Kerber M, Rowat C (2011) A Ramsey bound on stable sets tions. Games Econ Behav 27:132–152
in Jordan pillage games. Int J Game Theory Nakanishi N (2001) On the existence and efficiency of the
40:461–466 von Neumann-Morgenstern stable set in an n-player
Kirchsteiger G, Mantovani M, Mauleon A, Vannetelbosch prisoner’s dilemma. Int J Game Theory 30:291–307
V (2016) Limited farsightedness in network formation. Nakanishi N (2009) Noncooperative farsighted stable set
J Econ Behav Organ 128:97–120 in an n-player prisoners’ dilemma. Int J Game Theory
Konishi H, Ray D (2003) Coalition formation as a dynamic 38:249–261
process. J Econ Theory 110:1–41 Nakayama M (1998) Self-binding coalitions. Keio Econ
Klaus B, Klijn F, Walzl M (2010) Farsighted house alloca- Stud 35:1–8
tion. J Math Econ 46:817–824 Núñez M, Rafels C (2013) Von Neumann-Morgenstern
Klaus B, Klijn F, Walzl M (2011) Farsighted stability for solutions in the assignment market. J Econ Theory
roommate markets. J Pub Econ Theory 13:921–933 148:1282–1291
Lucas WF (1968) A game with no solution. Bull Am Math Oladi R (2005) Stable tariffs and retaliation. Rev Int Econ
Soc 74:237–239 13:205–215
Lucas WF (1990) Developments in stable set theory. In: Owen G (1965) A class of discriminatory solutions to
Ichiishi T et al (eds) Game theory and applications. simple N-person games. Duke Math J 32:545–553
Academic, New York, pp 300–316 Owen G (1968) n-Person games with only l, n-l, and
Lucas WF, Rabie M (1982) Games with no solutions and n-person coalitions. Proc Am Math Soc 19:1258–1261
empty cores. Math Oper Res 7:491–500 Owen G (1995) Game theory, 3rd edn. Academic, New York
Lucas WF, Michaelis K, Muto S, Rabie M (1982) A new Page FH, Wooders M (2009) Strategic basins of attraction,
family of finite solutions. Int J Game Theory the path dominance core, and network formation
11:117–127 games. Games Econ Behav 66:462–487
Lucas WF (1992) Von Neumann-Morgenstern stable sets. Page FH, Wooders MH, Kamat S (2005) Networks and
In: Aumann RJ, Hart S (eds) Handbook of game theory farsighted stability. J Econ Theory 120:257–269
with economic applications, vol 1. North-Holland, Peleg B (1986) A proof that the core of an ordinal convex
Amsterdam, pp 543–590 game is a von Neumann-Morgenstern solution. Math
Luo X (2001) General systems and f-stable sets: a formal Soc Sci 11:83–87
analysis of socioeconomic environments. J Math Econ Quint T, Wako J (2004) On houseswapping, the strict core,
36:95–109 segmentation, and linear programming. Math Oper Res
Luo X (2009) On the foundation of stability. Econ Theory 29:861–877
40:185–201 Ray D, Vohra R (1997) Equilibrium binding agreements.
Manlove D (2013) Algorithmics of matching under pref- J Econ Theory 73:30–78
erences. World Scientific, Singapore Ray D, Vohra R (2015a) The farsighted stable set.
Mariotti M (1997) A model of agreements in strategic form Econometrica 83:977–1011
games. J Econ Theory 74:196–217 Ray D, Vohra R (2015b) Coalition formation. In: Young
Masuda T (2002) Farsighted stability in average return HP, Zamir S (eds) Handbook of game theory,
games. Math Soc Sci 44:169–181 vol 4. Elsevier/North Holland, pp 239–326
Mauleon A, Vannetelbosch V (2004) Farsighted and cau- Rosenmüller J (1977) Extreme games and their solutions,
tiousness in coalition formation games with positive Lecture notes in economics and mathematical systems,
spillovers. Theory Dec 56:291–324 vol 145. Springer, Berlin
48 Cooperative Games (Von Neumann-Morgenstern Stable Sets)
Rosenmüller J, Shitovitz B (2000) A characterization of Shubik M (1982) Game theory in the social sciences:
vNM-stable sets for linear production games. Int concepts and solutions. MIT Press, Boston
J Game Theory 29:39–61 Shubik M (1985) A game-theoretic approach to political
Roth AE, Postlewaite A (1977) Weak versus strong dom- economy. MIT Press, Boston
ination in a market with indivisible goods. J Math Econ Simonnard M (1966) Linear programming. Prentice-Hall,
4:131–137 Englewood Cliffs
Roth AE, Sotomayor MO (1990) Two-sided matching: a Solymosi T, Raghavan TES (2001) Assignment games
study in game-theoretic modeling and analysis. Cam- with stable core. Int J Game Theory 30:177–185
bridge University Press, Cambridge Sung SC, Dimitrov D (2007) On myopic stability concepts
Roth AE, Vande Vate JH (1990) Random paths to stability for hedonic games. Theory Dec 62:31–45
in two-sided matching. Econometrica 58:1475–1480 Suzuki A, Muto S (2000) Farsighted stability in prisoner’s
Serrano R, Volij O (2008) Mistakes in cooperation: the dilemma. J Oper Res Soc Jpn 43:249–265
stochastic stability of Edgeworth’s recontracting. Suzuki A, Muto S (2005) Farsighted stability in n-person
Econ J 118:1719–1741 prisoner’s dilemma. Int J Game Theory 33:431–445
Shapley LS (1953) Quota solutions of n-person games. In: Suzuki A, Muto S (2006) Farsighted behavior leads to
Kuhn HW, Tucker TW (eds) Contribution to the theory efficiency in duopoly markets. In: Haurie A et al (eds)
of games, vol II. Annals of mathematics studies, vol 28. Advances in dynamic games. Birkhauser, Boston,
Princeton University Press, Princeton, pp 343–359 pp 379–395
Shapley LS (1959) The solutions of a symmetric market Toda M (1997) Implementation and characterizations of
game. In: Tucker AW, Luce RD (eds) Contribution to the competitive solution with indivisibility. Mimeo
the theory of games, vol IV. Annals of mathematics von Neumann J, Morgenstern O (1953) Theory of games
studies, vol 40. Princeton University Press, Princeton, and economic behavior, 3rd edn. Princeton University
pp 145–162 Press, Princeton
Shapley LS (1962) Simple games: an outline of the Wako J (1984) A note on the strong core of a market with
descriptive theory. Behav Sci 7:59–66 indivisible goods. J Math Econ 13:189–194
Shapley LS (1964) Solutions of compound simple games. Wako J (1991) Some properties of weak domination in an
In: Tucker AW et al (eds) Advances in game theory. exchange market with indivisible goods. Jpn Econ Rev
Annals of mathematics studies, vol 52. Princeton Uni- 42:303–314
versity Press, Princeton, pp 267–305 Wako J (1999) Coalitional-proofness of the competitive
Shapley LS (1971) Cores of convex games. Int J Game allocations in an indivisible goods market. Fields Inst
Theory 1:11–26 Commun 23:277–283
Shapley LS, Scarf H (1974) On cores and indivisibilities. Wako J (2010) A polynomial-time algorithm to find von
J Math Econ 1:23–37 Neumann-Morgenstern stable matchings in marriage
Shapley LS, Shubik M (1972) The assignment game I: the games. Algorithmica 58:188–220
core. Int J Game Theory 1:111–130 Xue L (1997) Nonemptiness of the largest consistent set.
Shino J, Kawasaki R (2012) Farsighted stable sets in J Econ Theory 73:453–459
Hotelling’s location games. Math Soc Sci 63:23–30 Xue L (1998) Coalitional stability under perfect foresight.
Shitovitz B, Weber S (1997) The graph of Lindahl corre- Econ Theory 11:603–627
spondence as the unique von Neumann-Morgenstern Zhang J, Xue L, Zu L (2013) Farsighted free trade net-
abstract stable set. J Math Econ 27:375–387 works. Int J Game Theory 42:375–398
feasible payoffs for each coalition,
Cooperative Games what payoff will be awarded to
each player? One can take a
Roberto Serrano positive or normative approach to
Department of Economics, Brown University, answering this question, and
Providence, RI, USA different solution concepts in the
theory lean towards one or the
other.
Article Outline Core It is a solution concept that assigns
to each cooperative game the set of
Glossary payoffs that no coalition can
Definition of the Subject improve upon or block. In a
Introduction context in which there is
Cooperative Games unfettered coalitional interaction,
The Core the core arises as a good positive
The Shapley Value answer to the question posed in
Future Directions cooperative game theory. In other
Bibliography words, if a payoff does not belong
to the core, one should not expect
Glossary to see it as the prediction of the
theory if there is full cooperation.
Characteristic or coalitional function The Shapley It is a solution that prescribes a
most usual way to represent a value single payoff for each player,
cooperative game. which is the average of all
Cooperative game Strategic situation involving marginal contributions of that
coalitions, whose formation assumes the exis- player to each coalition he or she is
tence of binding agreements among players. a member of. It is usually viewed
Core Solution concept that assigns the set of as a good normative answer to the
payoffs that cannot be improved upon by any question posed in cooperative
coalition. game theory. That is, those who
Game theory Discipline that studies strategic contribute more to the groups that
situations. include them should be paid more.
Shapley value Solution concept that assigns the
average of marginal contributions to coalitions.
Solution concept Mapping that assigns predic-
tions to each game. Although there were some earlier contributions,
the official date of birth of game theory is usually
taken to be 1944, year of publication of the first
Definition of the Subject edition of the Theory of Games and Economic
Behavior, by John von Neumann and Oskar
Cooperative It is one of the two counterparts of Morgenstern (1944). The core was first proposed
game theory. It studies the by Francis Ysidro Edgeworth in 1881 (Edgeworth
interactions among coalitions of 1881), and later reinvented and defined in game
game theory players. Its main theoretic terms in Gillies (1959). The Shapley
question is this: Given the sets of value was proposed by Lloyd Shapley in his 1953
© Springer Science+Business Media, LLC, part of Springer Nature 2020 49
M. Sotomayor et al. (eds.), Complex Social and Behavioral Systems,
https://doi.org/10.1007/978-1-0716-0368-0_98
Originally published in
R. A. Meyers (ed.), Encyclopedia of Complexity and Systems Science, © Springer Science+Business Media LLC 2017
https://doi.org/10.1007/978-3-642-27737-5_98-2
50 Cooperative Games
Ph.D. dissertation (Shapley 1953). Both the core one can take a normative or prescriptive approach,
and the Shapley value have been applied widely, to set up a number of normative goals, typically
shed light on problems in different disciplines, embodied in axioms, and try to derive their logical
including economics and political science. implications. Although authors sometimes dis-
agree on the classification of the different solution
concepts according to these two criteria – as we
Introduction shall see, the understanding of each solution con-
cept is enhanced if one can view it from very
Game theory is the study of games, also called distinct approaches – in this article we shall exem-
strategic situations. These are decision problems plify the positive approach with the core and the
with multiple decision makers, whose decisions normative approach with the Shapley value.
impact one another. It is divided into two branches: While this may oversimplify the issues, it should
non-cooperative game theory and cooperative be helpful to a reader new to the subject.
game theory. The actors in non-cooperative game The rest of the article is organized as follows.
theory are individual players, who may reach Section “Cooperative Games” introduces the basic
agreements only if they are self-enforcing. The model of a cooperative game, and discusses its
non-cooperative approach provides a rich language assumptions as well as the notion of solution con-
and develops useful tools to analyze games. One cepts. Section “The Core” is devoted to the core,
clear advantage of the approach is that it is able to and section “The Shapley Value” to the Shapley
model how specific details of the interaction value. In each case, some of the main results for
among individual players may impact the final each of the two are described, and examples are
outcome. One limitation, however, is that its pre- provided. Section “Future Directions” discusses
dictions may be highly sensitive to those details. some directions for future research.
For this reason it is worth also analyzing more
abstract approaches that attempt to obtain conclu-
sions that are independent of such details. The Cooperative Games
cooperative approach is one such attempt, and it
is the subject of this article. Representations of Games. The Characteristic
The actors in cooperative game theory are coa- Function
litions, that is, groups of players. For the most Let us begin by presenting the different ways to
part, two facts, that coalitions can form and that describe a game. The first two are the usual ways
each coalition has a feasible set of payoffs avail- employed in non-cooperative game theory.
able to its members, are taken as given the coali- The most informative way to describe a game is
tions and their sets of feasible payoffs as called its extensive form. It consists of a game tree,
primitives, the question tackled is the identifica- specifying the timing of moves for each player and
tion of final payoffs awarded to each player. That the information available to each of them at the
is, given a collection of feasible sets of payoffs, time of making a move. At the end of each path of
one for each coalition, can one predict or recom- moves, a final outcome is reached and a payoff
mend a payoff (or set of payoffs) to be awarded to vector is specified. For each player, one can define
each player? Such predictions or recommenda- a strategy, i.e., a complete contingent plan of action
tions are embodied in different solution concepts. to play the game. That is, a strategy is a function
Indeed, one can take several approaches to that specifies a feasible move each time a player is
answering the question just posed. From a posi- called upon to make a move in the game.
tive or descriptive point of view, one may want to One can abstract from details of the interaction
get a prediction of the likely outcome of the inter- (such as timing of moves and information available
action among the players, and hence, the resulting at each move), and focus on the concept of strate-
payoff be understood as the natural consequence gies. That is, one can list down the set of strategies
of the forces at work in the system. Alternatively, available to each player, and arrive at the strategic
Cooperative Games 51
or normal form of the game. For two players, for 3. For each x ℝ|S|,
example, the normal form is represented in a
jSj
bimatrix table. One player controls the rows, and @V ðSÞ \ fxg þ ℝþ
the other the columns. Each cell of the bimatrix is
occupied with an ordered pair, specifying the pay- is bounded.
off to each player if each of them chooses the
strategy corresponding to that cell. 4. For each S N, there exists a continuously
One can further abstract from the notion of differentiable representation of V(S), i.e.,
strategies, which will lead to the characteristic a continuously differentiable function gS:
function form of representing a game. From the ℝ|S| ! ℝ such that
strategic form, one makes assumptions about the
V ðSÞ ¼ x ℝjSj gS ðxÞ 0 :
strategies used by the complement of a coalition
of players to determine the feasible payoffs for the
5. For each S N, V(S) is non-leveled, i.e., for
coalition (see, for example, the derivations in
every x @V(S) the gradient of g S at x is
Aumann and Peleg (1960), von Neumann and
positive in all its coordinates.
Morgenstern (1944)). This is the representation
most often used in cooperative game theory.
With the assumptions made, @V(S) is its Pareto
Thus, here are the primitives of the basic model
frontier, i.e., the set of vectors xS V(S) such that
in cooperative game theory. Let N = {1,. . ., n} be a
there does not exist yS V(S) satisfying that yi xi
finite set of players. Each non-empty subset of N is
for all i S with at least one strict inequality.
called a coalition. The set N is referred to as the
Other assumptions usually made relate the possi-
grand coalition. For each coalition S, we shall
bilities available to different coalitions. Among
specify a set V(S) ℝ|S| containing |S| -dimensional
them, a very important one is balancedness,
payoff vectors that are feasible for coalition S. This
which we shall define next:
is called the characteristic function, and the pair
A collection T of coalitions is balanced if there
(N, V) is called a cooperative game. Note how a
exists a set of weights w(S) [0, l] for each S T
reduced form approach is taken because one does P
such that for every i N, S T , Sfig wðSÞ ¼ 1
not explain what strategic choices are behind each
one can think of these weights as the fraction of
of the payoff vectors in V(S). In addition, in this
time that each player devotes to each coalition he is
formulation, it is implicitly assumed that the
a member of, with a given coalition representing
actions taken by the complement coalition (those
the same fraction of time for each player. The game
players in N\S) cannot prevent S from achieving
(N, V) is balanced if xN V(N) whenever (xS)
each of the payoff vectors in V(S). There are more
V(S) for every S in a balanced collection T . That is,
general models in which these sorts of externalities
the grand coalition can always implement any
across coalitions are considered, but we shall
“time-sharing arrangement” that the different sub-
ignore them in this article.
coalitions may come up with.
The characteristic function defined so far is
Assumptions on the Characteristic Function often referred to as a non-transferable utility
Some of the most common technical assumptions (NTU) game. A particular case is the transferable
made on the characteristic function are the utility (TU) game case, in which for each coalition
following: S N, there exists a real number v(S) such that
( )
1. For each S N, V(S) is closed. Denote by X
jSj
V ð SÞ ¼ xℝ : xi vðSÞ :
@V(S) the boundary of V(S). Hence, iS
@V(S) V(S)
2. For each S N, V(S) is comprehensive, i.e., for Abusing notation slightly, we shall denote a TU
jSj
each x V(S), fxg ℝþ V ðSÞ game by (N, v). In the TU case there is an underlying
52 Cooperative Games
nummeraire – money – that can transfer utility or (2005) for a recent survey). Today, there are inter-
payoff at a one-to-one rate from one player to any esting results of these different kinds for many
other. Technically, the theory of NTU games is far solution concepts, which include axiomatic char-
more complex: it uses convex analysis and fixed acterizations and non-cooperative foundations.
point theorems, whereas the TU theory is based on Thus, one can evaluate the appeal of the axioms
linear inequalities and combinatorics. and the non-cooperative procedures behind each
solution to defend a more normative or positive
Solution Concepts interpretation in each case.
Given a characteristic function, i.e., a collection of
sets V(S), one for each S, the theory formulates its
predictions on the basis of different solution con- The Core
cepts. We shall concentrate on the case in which
the grand coalition forms, that is, cooperation is The idea of agreements that are immune to
totally successful. Of course, solution concepts coalitional deviations was first introduced to eco-
can be adapted to take care of the case in which nomic theory by Edgeworth in (Edgeworth 1881),
this does not happen. which defined the set of coalitionally stable alloca-
A solution is a mapping that assigns a set of tions of an economy under the name “final settle-
payoff vectors in V(N) to each characteristic func- ments.” Edgeworth envisioned this concept as an
tion game (N, V). Thus, a solution in general pre- alternative to competitive equilibrium (Walras
scribes a set, which can be empty, or a singleton 1874), of central importance in economic theory,
(when it assigns a unique payoff vector as a frac- and was also the first to investigate the connections
tion of the fundamentals of the problem). The between the two concepts. Edgeworth’s notion,
leading set-valued cooperative solution concept which today we refer to as the core, was
is the core, while one of the most used single- rediscovered and introduced to game theory in Gil-
valued ones is the Shapley value for TU games. lies (1959). The origins of the core were not
There are several criteria to evaluate the reason- axiomatic. Rather, its simple and appealing defini-
ableness or appeal of a cooperative solution As tion appropriately describes stable outcomes in a
outlined above, in a normative approach, one can context of unfettered coalitional interaction.
propose axioms, abstract principles that one would The core of the game (N, V) is the set of payoff
like the solution to satisfy, and the next step is to vectors
pursue their logical consequences. Historically, this
was the first argument to justify the Shapley value. CðN, V Þ ¼ fx V ðN Þ : ∄S N, xS V ðSÞn@V ðSÞg:
Alternatively, one could start by defending a solu-
tion on the basis of its definition alone. In the case In words, it is the set of feasible payoff vectors
of the core, this will be especially natural: in a for the grand coalition that no coalition can upset.
context in which players can freely get together in If such a coalition S exists, we shall say that S can
groups, the prediction should be payoff vectors that improve upon or block x, and x is deemed unsta-
cannot be improved upon by any coalition. One ble. That is, in a context where any coalition can
can further enhance one’s positive understanding get together, when S has a blocking move, coali-
of the solution concept by proposing games in tion S will form and abandon the grand coalition
extensive form or in normal form played non- and its payoffs x S in order to get to a better payoff
cooperatively by players whose self-enforcing for each of the members of the coalition, a plan
agreements lead to a given solution. This is simply that is feasible for them.
to provide non-cooperative foundations or non-
cooperative implementation to the cooperative Non-Emptiness
solution in question, and it is an important research The core can prescribe the empty set in some
agenda initiated by John Nash in (Nash 1953), games. A game with an empty core is to be under-
referred to as the Nash program (see Serrano stood as a situation of strong instability, as any
Cooperative Games 53
payoffs proposed to the grand coalition are vul- • for every i N, zi is top-ranked for agent
nerable to coalitional blocking. i among all bundles z satisfying that pz poi,
Example Consider the following simple major- • and i Nzi = i Noi
ity 3-player TU game, in which the votes of at
least two players makes the coalition winning. In words, this is what the concept expresses.
That is, we represent the situation by the follow- First, at the equilibrium prices, each agent
ing characteristic function: v(S) = 1 for any demands zi, i.e., wishes to purchase this bundle
S containing at least two members, v({i}) = 0 among the set of affordable bundles, the budget
for all i N. Clearly, C(N,v) = Ø. Any feasible set. And second, these demands are such that all
payoff agreement proposed to the grand coalition markets clear, i.e., total demand equals total
will be blocked by at least one coalition. supply.
An important sufficient condition for the non- Note how the notion of a competitive equilib-
emptiness of the core of NTU games is rium relies on the principle of private ownership
balancedness, as shown in Scarf (1967): (each individual owns his or her endowment,
which allows him or her to access markets and
Theorem 1 (Scarf (1967)) Let the game (N, V) be purchase things). Moreover, each agent is a price-
balanced. Then C(N, V) 6¼ Ø. taker in all markets. That is, no single individual
For the TU case, balancedness is not only can affect the market prices with his or her actions;
sufficient, but it becomes also necessary for the prices are fixed parameters in each individual’s
non-emptiness of the core: consumption decision. The usual justification for
the price-taking assumption is that each individual
Theorem 2 (Bondareva (1963); Shapley (1967)) is “very small” with respect to the size of the
Let (N, v) be a TU game. Then, (N, v) is balanced economy, and hence, has no market power.
if and only of C(N, V) 6¼ Ø. One difficulty with the competitive equilib-
rium concept is that it does not explain where
The Connections with Competitive prices come from. There is no single agent in the
Equilibrium model responsible for coming up with them.
In economics, the institution of markets and the Walras in (Walras 1874) told the story of an auc-
notion of prices are essential to the understanding tioneer calling out prices until demand and supply
of the allocation of goods and the distribution of coincide, but in many real-world markets there is
wealth among individuals. For simplicity in the no auctioneer. More generally, economists attri-
presentation, we shall concentrate on exchange bute the equilibrium prices to the workings of the
economies, and disregard production aspects. forces of demand and supply, but this appears to
That is, we shall assume that the goods in question be simply repeating the definition. So, is there a
have already been produced in some fixed different way one can explain competitive equi-
amounts, and now they are to be allocated to librium prices?
individuals to satisfy their consumption needs. As it turns out, there is a very robust result that
An exchange economy is a system in answers this question. We refer to it as the equiv-
which each agent i in the set N has a consumption alence principle (see, e.g., Aumann 1987), by
set Z i ℝlþ of commodity bundles, as well as a which, under certain regularity conditions, the
preference relation over Z i and an initial endow- predictions provided by different game-theoretic
ment oi Zi of the commodities. A feasible solution concepts, when applied to an economy
allocation of goods in the economy is a list of with a large enough set of agents, tend to converge
bundles (Zi)i N such that Zi Zi and i Nzi to the set of competitive equilibrium allocations.
i Noi An allocation is competitive if it is One of the first results in this tradition was pro-
supported by a competitive equilibrium. vided by Edgeworth in 1881 for the core. Note
A competitive equilibrium is a price-allocation how the core of the economy can be defined in the
pair (p, (Ζi)i N), where p ℝl\{0}is such that space of allocations, using the same definition as
54 Cooperative Games
above. Namely, a feasible allocation is in the core if it cannot be blocked by any coalition of agents when making use of the coalition's endowments. Edgeworth's result was generalized later by Debreu and Scarf in (Debreu and Scarf 1963) for the case in which an exchange economy is replicated an arbitrary number of times (Anderson studies in (Anderson 1978) the more general case of arbitrary sequences of economies, not necessarily replicas). An informal statement of the Debreu-Scarf theorem follows:

Theorem 3 (Debreu and Scarf (1963)) Consider an exchange economy. Then,

1. The set of competitive equilibrium allocations is contained in the core.
2. For each non-competitive core allocation of the original economy, there exists a sufficiently large replica of the economy for which the replica of the allocation is blocked.

The first part states a very appealing property of competitive allocations, i.e., their coalitional stability. The second part, known as the core convergence theorem, states that the core "shrinks" to the set of competitive allocations as the economy grows large.

In Aumann (1964), Aumann models the economy as an atomless measure space, and demonstrates the following core equivalence theorem:

Theorem 4 (Aumann (1964)) Let the economy consist of an atomless continuum of agents. Then, the core coincides with the set of competitive allocations.

For readers who wish to pursue the topic further, (Anderson 2008) provides a recent survey.

Axiomatic Characterizations
The axiomatic foundations of the core were provided much later than the concept was proposed. These characterizations are all inspired by Peleg's work. They include (Peleg 1985, 1986) and (Serrano and Volij 1998) – the latter paper also provides an axiomatization of competitive allocations in which core convergence insights are exploited.

In all these characterizations, the key axiom is that of consistency, also referred to as the reduced game property. Consistency means that the outcomes prescribed by a solution should be "invariant" to the number of players in the game. More formally, let (N, V) be a game, and let σ be a solution. Let x ∈ σ(N, V). Then, the solution σ is consistent if for every S ⊆ N, x_S ∈ σ(S, V_{x,S}), where (S, V_{x,S}) is the reduced game for S given payoffs x, defined as follows. The feasible set for S in this reduced game is the projection of V(N) at x_{N\S}, i.e., what remains after paying those outside of S:

V_{x,S}(S) = {y_S : (y_S, x_{N\S}) ∈ V(N)}.

However, the feasible set of T ⊂ S, T ≠ S, allows T to make deals with any coalition outside of S, provided that those services are paid at the rate prescribed by x_{N\S}:

V_{x,S}(T) = ∪_{Q ⊆ N\S} {y_T : (y_T, x_Q) ∈ V(T ∪ Q)}.

It can be shown that the core satisfies consistency with respect to this reduced game. Moreover, consistency is the central axiom in the characterization of the core, which, depending on the version one looks at, uses a host of other axioms; see Peleg (1985, 1986), Serrano and Volij (1998).

Non-cooperative Implementation
To obtain a non-cooperative implementation of the core, the procedure must embody some feature of anonymity, since the core is usually a large set and it contains payoffs where different players are treated very differently. For instance, if the procedure always had a fixed set of moves, typically the prediction would favor the first mover, making it impossible to obtain an implementation of the entire set of payoffs.

The model in Perry and Reny (1994) builds in this anonymity by assuming that negotiations take place in continuous time, so that anyone can speak at the beginning of the game, and at any point in time, instead of having a fixed order. The player that gets to speak first makes a proposal consisting of naming a coalition that contains him and a
feasible payoff for that coalition. Next, the players in that coalition get to respond. If they all accept the proposal, the coalition leaves and the game continues among the other players. Otherwise, a new proposal may come from any player in N. It is shown that, if the TU game has a non-empty core (as well as any of its subgames), a class of stationary self-enforcing predictions of this procedure coincide with the core. If a core payoff is proposed to the grand coalition, there are no incentives for individual players to reject it. Conversely, a non-core payoff cannot be sustained because any player in a blocking coalition has an incentive to make a proposal to that coalition, who will accept it (knowing that the alternative, given stationarity, would be to go back to the non-core status quo). Moldovanu and Winter (1995) offers a discrete-time version of the mechanism: in this work, the anonymity required is imposed on the solution concept, by looking at the order-independent equilibria of the procedure.

The model in Serrano (1995) sets up a market to implement the core. The anonymity of the procedure stems from the random choice of broker. The broker announces a vector (x_1, . . ., x_n), where the components add up to v(N). One can interpret x_i as the price for the productive asset held by player i. Following an arbitrary order, the remaining players either accept or reject these prices. If player i accepts, he sells his asset to the broker for the price x_i and leaves the game. Those who reject get to buy from the broker, at the called-out prices, the portfolio of assets of their choice if the broker still has them. If a player rejects, but does not get to buy the portfolio of assets he would like because someone else took them before, he can always leave the market with his own asset. The broker's payoff is the worth of the final portfolio of assets that he holds, plus the net monetary transfers that he has received. It is shown in Serrano (1995) that the prices announced by the broker will always be his top-ranked vectors in the core. If the TU game is such that gains from cooperation increase with the size of coalitions, a beautiful theorem of Shapley in (Shapley 1971) is used to prove that the set of all equilibrium payoffs of this procedure will coincide with the core. Core payoffs are here understood as those price vectors where all arbitrage opportunities in the market have been wiped out. Also, procedures in Serrano and Vohra (1997) implement the core, but do not rely on the TU assumption, and they use a procedure in which the order of moves can be endogenously changed by players. Finally, yet another way to build anonymity in the procedure is by allowing the proposal to be made by brokers outside of the set N, as done in Pérez-Castrillo (1994).

An Application
Consider majority games within a parliament. Suppose there are 100 seats, and decisions are made by simple majority so that 51 votes are required to pass a piece of legislation.

In the first specification, suppose there is a very large party – player 1 – who has 90 seats. There are five small parties, with 2 seats each. Given the simple majority rules, this problem can be represented by the following TU characteristic function: v(S) = 1 if S contains player 1, and v(S) = 0 otherwise. The interpretation is that each winning coalition can get the entire surplus – pass the desired proposal. Here, a coalition is winning if and only if player 1 is in it. For this problem, the core is a singleton: the entire unit of surplus is allocated to player 1, who has all the power. Any split of the unit surplus of the grand coalition (v(N) = 1) that gives some positive fraction of surplus to any of the small parties can be blocked by the coalition of player 1 alone.

Consider now a second problem, in which player 1, who continues to be the large party, has 35 seats, and each of the other five parties has 13 seats. Now, the characteristic function is as follows: v(S) = 1 if and only if S either contains player 1 and two small parties, or it contains four of the small parties; v(S) = 0 otherwise. It is easy to see that now the core is empty: any split of the unit surplus will be blocked by at least one coalition. For example, the entire unit going to player 1 is blocked by the coalition of all five small parties, which can award 0.2 to each of them. But this arrangement, in which each small party gets 0.2 and player 1 nothing, is blocked as well, because player 1 can bribe two of the small parties (say, players 2 and 3) and promise them 1/3 each,
keeping the other third for itself, and so on. The emptiness of the core is a way to describe the fragility of any agreement, due to the inherent instability of this coalition formation game.

Theorem 5 (Shapley (1953)) There is a unique single-valued solution to TU games satisfying efficiency, symmetry, additivity and dummy. It is what today we call the Shapley value, the function that assigns to each player i the payoff

Sh_i(N, v) = Σ_{S ⊆ N\{i}} [ |S|! (|N| − |S| − 1)! / |N|! ] [ v(S ∪ {i}) − v(S) ].
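The formula averages each player's marginal contribution over all orders in which the players might arrive. The following is a minimal Python sketch of that computation — the function name `shapley_value` and the seat encodings are ours, introduced only for illustration — applied to the majority games used as applications in this article. It returns the whole surplus to the 90-seat party in the first specification, and 1/3 to the large party and 2/15 to each small party in the 35/13 specification.

```python
from itertools import permutations
from fractions import Fraction

def shapley_value(players, v):
    """Average marginal contribution of each player over all arrival orders."""
    value = {i: Fraction(0) for i in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = set()
        for i in order:
            before = v(frozenset(coalition))
            coalition.add(i)
            value[i] += Fraction(v(frozenset(coalition)) - before)
    return {i: value[i] / len(orders) for i in players}

def majority_game(seats, quota=51):
    # v(S) = 1 if the coalition controls at least `quota` seats, 0 otherwise
    return lambda S: 1 if sum(seats[i] for i in S) >= quota else 0

seats1 = {1: 90, 2: 2, 3: 2, 4: 2, 5: 2, 6: 2}     # one dominant party
seats2 = {1: 35, 2: 13, 3: 13, 4: 13, 5: 13, 6: 13}  # one large, five mid-size parties

print(shapley_value(list(seats1), majority_game(seats1)))  # player 1 gets 1, others 0
print(shapley_value(list(seats2), majority_game(seats2)))  # player 1 gets 1/3, others 2/15
```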
main extensions that have been proposed: the Shapley λ-transfer value (Shapley 1969), the Harsanyi value (Harsanyi 1963), and the Maschler-Owen consistent value (Maschler and Owen 1992). They were axiomatized in Aumann (1985), de Clippel et al. (2004), and Hart (1985), respectively.

The Connections with Competitive Equilibrium
As was the case for the core, there is a value equivalence theorem. The result holds for the TU domain (see Aumann 1975; Aumann and Shapley 1974; Shapley 1964). It can be shown that the Shapley value payoffs can be supported by competitive prices. Furthermore, in large enough economies, the set of competitive payoffs "shrinks" to approximate the Shapley value. However, the result cannot be easily extended to the NTU domain. While it holds for the λ-transfer value, it need not obtain for the other extensions. For further details, the interested reader is referred to Hart (2008) and the references therein.

Non-cooperative Implementation
Reference Gul (1989) was the first to propose a procedure that provided some non-cooperative foundations of the Shapley value. Later, other authors have provided alternative procedures and techniques to the same end, including (Hart and Mas-Colell 1996; Krishna and Serrano 1995; Pérez-Castrillo and Wettstein 2001; Winter 1994). We shall concentrate on the description of the procedure proposed by Hart and Mas-Colell in Hart and Mas-Colell (1996). Generalizing an idea found in Mas-Colell (1988), which studies the case of d = 0 – see below –, Hart and Mas-Colell propose the following non-cooperative procedure. With equal probability, each player i ∈ N is chosen to publicly make a feasible proposal to the others: (x_1, . . ., x_n) is such that the sum of its components cannot exceed v(N). The other players get to respond to it in sequence, following a prespecified order. If all accept, the proposal is implemented; otherwise, a random device is triggered. With probability 0 ≤ d < 1 the same game continues being played among the same n players (and thus, a new proposer will be chosen again at random among them), but with probability 1 − d, the proposer leaves the game. He is paid 0 and his resources are removed, so that in the next period, proposals to the remaining n − 1 players cannot add up to more than v(N \ {i}). A new proposer is chosen at random among the set N \ {i}, and so on.

As shown in Hart and Mas-Colell (1996), there exists a unique stationary self-enforcing prediction of this procedure, and it actually coincides with the Shapley value payoffs for any value of d. (Stationarity means that strategies cannot be history dependent). As d → 1, the Shapley value payoffs are also obtained not only in expectation, but with independence of who is the proposer. One way to understand this result, as done in Hart and Mas-Colell (1996), is to check that the rules of the procedure and stationary behavior in it are in agreement with Shapley's axioms. That is, the equilibrium relies on immediate acceptances of proposals, stationary strategies treat substitute players similarly, the equations describing the equilibrium have an additive structure, and dummy players will have to receive 0 because no resources are destroyed if they are asked to leave. It is also worth stressing the important role in the procedure of players' marginal contributions to coalitions: following a rejection, a proposer incurs the risk of being thrown out and the others of losing his resources, which seems to suggest a "price" for them.

In Krishna and Serrano (1995), the authors study the conditions under which stationarity can be removed to obtain the result. Also, Pérez-Castrillo and Wettstein (2001) uses a variant of the Hart and Mas-Colell procedure, by replacing the random choice of proposers with a bidding stage, in which players bid to obtain the right to make proposals.

An Application
Consider again the class of majority problems in a parliament consisting of 100 seats. As we shall see, the Shapley value is a good way to understand the power that each party has in the legislature.

Let us begin by considering again the problem in which player 1 has 90 seats, while each of the five small parties has 2 seats. It is easy to see that
the Shapley value, like the core in this case, awards the entire unit of surplus to player 1: effectively, each of the small parties is a dummy player, and hence, the Shapley value awards zero to each of them.

Consider a second problem, in which player 1 is a big party with 35 seats, and there are 5 small parties, with 13 seats each. The Shapley value awards 1/3 to the large party, and, by symmetry, 2/15 to each of the small parties. To see this, we need to see when the marginal contributions of player 1 to any coalition are positive. Recall that there are 6! possible orders of players. Note how, if player 1 arrives first or second in the room in which the coalition is forming, his marginal contribution is zero: the coalition was losing before he arrived and continues to be a losing coalition after his arrival. Similarly, his marginal contribution is also zero if he arrives fifth or sixth to the coalition; indeed, in this case, before he arrives the coalition is already winning, so he adds nothing to it. Thus, only when he arrives third or fourth, which happens a third of the times, does he change the nature of the coalition, from losing to winning. This explains his Shapley value share of 1/3. In this game, the Shapley value payoffs roughly correspond to the proportion of seats that each party has.

Next, consider a third problem in which there are two large parties, while the other four parties are very small. For example, let each of the large parties have 48 seats (say, players 1 and 2), while each of the four small parties has only one seat. Now, the Shapley value payoffs are 0.3 to each of the two large parties, and 0.1 to each of the small ones. To see this, note that the marginal contribution of a small party is only positive when he comes fourth in line, and out of the preceding three parties in the coalition, exactly one of them is a large party, i.e., 72 orders out of the 5! orders in which he is fourth. That is, (72/5!) · (1/6) = 1/10. In this case, the competition between the large parties for the votes of the small parties increases the power of the latter quite significantly, with respect to the proportion of seats that each of them holds.

Finally, consider a fourth problem with two large parties (players 1 and 2) with 46 seats each, one mid-size party (player 3) with 5 seats, and three small parties, each with one seat. First, note that each of the three small parties has become a dummy player: no winning coalition where he belongs becomes losing if he leaves the coalition, and so players 4, 5 and 6 are paid zero by the Shapley value. Now, note that, despite the substantial difference of seats between each large party and the mid-size party, each of them is identical in terms of marginal contributions to a winning coalition. Indeed, for i = 1, 2, 3 player i's marginal contribution to a coalition is positive only if he arrives second or third or fourth or fifth (and out of the preceding players in the coalition, exactly one is one of the non-dummy players). Note how the Shapley value captures nicely the changes in the allocation of power due to each different political scenario. In this case, the fierce competition between the large parties for the votes of player 3, the swinging party to form a majority, explains the equal share of power among the three.

Future Directions

This article has been a first approach to cooperative game theory, and has emphasized two of its most important solution concepts. The literature on these topics is vast, and the interested reader is encouraged to consult the general references listed below. For the future, one should expect to see progress of the theory into areas that have been less explored, including games with asymmetric information and games with coalitional externalities. In both cases, the characteristic function model must be enriched to take care of the added complexities.

Relevant to this encyclopedia are issues of complexity. The complexity of cooperative solution concepts has been studied (see, for instance, Deng and Papadimitriou 1994). In terms of computational complexity, the Shapley value seems to be easy to compute, while the core is harder, although some classes of games have been identified in which this task is also simple.

Finally, one should insist on the importance of novel and fruitful applications of the theory to
shed new light on concrete problems. In the case of the core, for example, the insights of core stability in matching markets have been successfully applied by Alvin Roth and his collaborators to the design of matching markets in the "real world" (e.g., the job market for medical interns and hospitals, the allocation of organs from donors to patients, and so on) – see Roth (2002).

Bibliography

Primary Literature
Anderson RM (1978) An elementary core equivalence theorem. Econometrica 46:1483–1487
Anderson RM (2008) Core convergence. In: Durlauf S, Blume L (eds) The new Palgrave dictionary of economics, 2nd edn. McMillan, London
Aumann RJ (1964) Markets with a continuum of traders. Econometrica 32:39–50
Aumann RJ (1975) Values of markets with a continuum of traders. Econometrica 43:611–646
Aumann RJ (1985) An axiomatization of the non-transferable utility value. Econometrica 53:599–612
Aumann RJ (1987) Game theory. In: Eatwell J, Milgate M, Newman P (eds) The new Palgrave dictionary of economics. Norton, New York
Aumann RJ, Peleg B (1960) Von Neumann-Morgenstern solutions to cooperative games without side payments. Bull Am Math Soc 66:173–179
Aumann RJ, Shapley LS (1974) Values of non-atomic games. Princeton University Press, Princeton
Bondareva ON (1963) Some applications of linear programming methods to the theory of cooperative games (in Russian). Problemy Kibernetiki 10:119–139
de Clippel G, Peters H, Zank H (2004) Axiomatizing the Harsanyi solution, the symmetric egalitarian solution and the consistent solution for NTU-games. Int J Game Theory 33:145–158
Debreu G, Scarf H (1963) A limit theorem on the core of an economy. Int Econ Rev 4:235–246
Deng X, Papadimitriou CH (1994) On the complexity of cooperative solution concepts. Math Oper Res 19:257–266
Edgeworth FY (1881) Mathematical psychics. Kegan Paul, London. Reprinted in 2003: Newman P (ed) F. Y. Edgeworth's Mathematical Psychics and Further Papers on Political Economy. Oxford University Press, Oxford
Gillies DB (1959) Solutions to general non-zero-sum games. In: Tucker AW, Luce RD (eds) Contributions to the theory of games IV. Princeton University Press, Princeton, pp 47–85
Gul F (1989) Bargaining foundations of Shapley value. Econometrica 57:81–95
Harsanyi JC (1963) A simplified bargaining model for the n-person cooperative game. Int Econ Rev 4:194–220
Hart S (1985) An axiomatization of Harsanyi's non-transferable utility solution. Econometrica 53:1295–1314
Hart S (2008) Shapley value. In: Durlauf S, Blume L (eds) The new Palgrave dictionary of economics, 2nd edn. McMillan, London
Hart S, Mas-Colell A (1989) Potential, value and consistency. Econometrica 57:589–614
Hart S, Mas-Colell A (1996) Bargaining and value. Econometrica 64:357–380
Krishna V, Serrano R (1995) Perfect equilibria of a model of n-person non-cooperative bargaining. Int J Game Theory 24:259–272
Mas-Colell A (1988) Algunos comentarios sobre la teoria cooperativa de los juegos. Cuadernos Economicos 40:143–161
Maschler M, Owen G (1992) The consistent Shapley value for games without side payments. In: Selten R (ed) Rational interaction: essays in honor of John Harsanyi. Springer, New York
Moldovanu B, Winter E (1995) Order independent equilibria. Games Econ Behav 9:21–34
Nash JF (1953) Two person cooperative games. Econometrica 21:128–140
Peleg B (1985) An axiomatization of the core of cooperative games without side payments. J Math Econ 14:203–214
Peleg B (1986) On the reduced game property and its converse. Int J Game Theory 15:187–200
Pérez-Castrillo D (1994) Cooperative outcomes through non-cooperative games. Games Econ Behav 7:428–440
Pérez-Castrillo D, Wettstein D (2001) Bidding for the surplus: a non-cooperative approach to the Shapley value. J Econ Theory 100:274–294
Perry M, Reny P (1994) A non-cooperative view of coalition formation and the core. Econometrica 62:795–817
Roth AE (2002) The economist as engineer: game theory, experimentation and computation as tools for design economics. Econometrica 70:1341–1378
Scarf H (1967) The core of an N person game. Econometrica 38:50–69
Serrano R (1995) A market to implement the core. J Econ Theory 67:285–294
Serrano R (2005) Fifty years of the Nash program, 1953–2003. Investigaciones Económicas 29:219–258
Serrano R, Vohra R (1997) Non-cooperative implementation of the core. Soc Choice Welf 14:513–525
Serrano R, Volij O (1998) Axiomatizations of neoclassical concepts for economies. J Math Econ 30:87–108
Shapley LS (1953) A value for n-person games. In: Tucker AW, Luce RD (eds) Contributions to the theory of games II. Princeton University Press, Princeton, pp 307–317
Shapley LS (1964) Values of large games VII: a general exchange economy with money. Research Memorandum 4248-PR. RAND Corporation, Santa Monica
Shapley LS (1967) On balanced sets and cores. Nav Res Logist Q 14:453–460
Shapley LS (1969) Utility comparison and the theory of games. In: La Décision: Agrégation et Dynamique des Ordres de Préférence. CNRS, Paris
Shapley LS (1971) Cores of convex games. Int J Game Theory 1:11–26
von Neumann J, Morgenstern O (1944) Theory of games and economic behavior. Princeton University Press, Princeton
Walras L (1874) Elements of pure economics, or the theory of social wealth. English edition: Jaffé W (ed), reprinted 1984 by Orion Editions, Philadelphia
Winter E (1994) The demand commitment bargaining and snowballing of cooperation. Econ Theory 4:255–273
Young HP (1985) Monotonic solutions of cooperative games. Int J Game Theory 14:65–72

Books and Reviews
Myerson RB (1991) Game theory: an analysis of conflict. Harvard University Press, Cambridge
Osborne MJ, Rubinstein A (1994) A course in game theory. MIT Press, Cambridge
Peleg B, Sudholter P (2003) Introduction to the theory of cooperative games, 2nd edn. Kluwer/Springer, Amsterdam/Berlin
Roth AE, Sotomayor M (1990) Two-sided matching: a study in game-theoretic modeling and analysis. Cambridge University Press, Cambridge
Dynamic Games with an Application to Climate Change Models

Prajit K. Dutta
Department of Economics, Columbia University, New York, NY, USA

Article Outline

Glossary
Definition of the Subject
Introduction
The Dynamic – or Stochastic – Game Model
Equilibrium
The Dynamic – or Stochastic – Game: Results
Existence
Characterization
Feasible Payoffs
Individually Rational Payoffs
Dynamics
Global Climate Change – Issues, Models
Models
Global Climate Change – Results
Global Pareto Optima
A Markov-Perfect Equilibrium: "Business as Usual"
All SPE
Generalizations
Future Directions
Bibliography

Glossary

Players The agents who take actions. These actions can be – depending on application – the choice of capital stock, greenhouse emissions, level of savings, level of Research & Development expenditures, price level, quality and quantity of effort, etc.
Strategies Full contingent plans for the actions that players take. Each strategy incorporates a choice of action not just once but rather a choice of action for every possible decision node for the player concerned.
Payoffs The utility or returns to a player from playing a game. These payoffs typically depend on the strategies chosen – and the consequent actions taken – by the player herself as well as those chosen by the other players in the game.
Game horizon The length of time over which the game is played, i.e., over which the players take actions. The horizon may be finite – if there are only a finite number of opportunities for decision-making – or infinite – when there are an infinite number of decision-making opportunities.
Equilibrium A vector of strategies, one for each player in the game, such that no player can unilaterally improve her payoffs by altering her strategy, if the others' strategies are kept fixed.
Climate change The consequence to the earth's atmosphere of economic activities such as the production and consumption of energy that result in a build-up of greenhouse gases such as carbon dioxide.

Definition of the Subject

The study of dynamic games is an important topic within game theory. Dynamic games involve the study of problems that are (a) inherently dynamic in nature (even without a game-theoretic angle) and (b) are naturally studied from a strategic perspective. Towards that end the structure generalizes dynamic programming – which is the most popular model within which inherently dynamic but non-strategic problems are studied. It also generalizes the model of repeated games within which strategic interaction is often studied but which structure cannot handle dynamic problems. A large number of economic problems fit these two requirements.
player receives in each period based on the action vector that was picked and the state.

The basic variables are:

Definition
t  Time period (0, 1, 2, . . ., T).
i  Players (1, . . ., I).
s(t)  State at the beginning of period t, s(t) ∈ S.
a_i(t)  Action taken by player i in period t, a_i(t) ∈ A_i.
a(t) = (a_1(t), a_2(t), . . ., a_I(t))  Vector of actions taken in period t.
p_i(t) = p_i(s(t), a(t))  Payoff of player i in period t.
q(t) = q(s(t + 1) | s(t), a(t))  Conditional distribution of the state at the beginning of period t + 1.
d  The discount factor, d ∈ [0, 1).

The state variable affects play in two ways as stated above. In any given period, the payoff to a player depends not only on the actions that she and other players take but it also depends on the state in that period. Furthermore, the state casts a shadow on future payoffs in that it evolves in a Markovian fashion with the state in the next period being determined – possibly stochastically – by the state in the current period and the action vector played currently.

The initial value of the state, s(0), is exogenous. So is the discount factor d and the game horizon, T. Note that the horizon can be finite or infinite. All the rest of the variables are endogenous, with each player controlling its own endogenous variable, the actions. Needless to add, both state as well as action variables can be multi-dimensional and when we turn to the climate change application it will be seen to be multi-dimensional in natural ways.

Example 1 S infinite – The state space can be countably or uncountably infinite. It will be seen that the infinite case, especially the uncountably infinite one, has embedded within it a number of technical complications and – partly as a consequence – much less is known about this case.
S finite – In this case, imagine that we have a repeated game like situation except that there are a finite number of stage games any one of which gets played at a time.

Example 2 When the number of players is one, i.e., I = 1, then we have a dynamic programming problem. When the number of states is one, i.e., #(S) = 1, then we have a repeated game problem. (Alternatively, repeated games constitute the special case where the conditional distribution brings a state s always back to itself, regardless of action). Hence these two very familiar models are embedded within the framework of dynamic games.

Histories and Strategies
Preliminaries – A history at time t, h(t), is a list of prior states and action vectors up to time t (but not including a(t)):

h(t) = (s(0), a(0), s(1), a(1), . . ., s(t)).

Let the set of histories be denoted H(t). A strategy for player i at time t, σ_i(t), is a complete conditional plan that specifies a choice of action for every history. The choice may be probabilistic, i.e., may be an element of P(A_i), the set of distributions over A_i. So a strategy at time t is

σ_i(t) : H(t) → P(A_i).

A strategy for the entire game for player i, σ_i, is a list of strategies, one for every period: σ_i = (σ_i(0), σ_i(1), . . ., σ_i(t), . . .). Let σ = (σ_1, σ_2, . . ., σ_I) denote a vector of strategies, one for each player.

A particular example of a strategy for player i is a pure strategy σ_i where σ_i(t) is a deterministic choice (from A_i). This choice may, of course, be conditional on history, i.e., may be a map from H(t) to A_i. Another example of a strategy for player i is one where the player's choice σ_i(t) may be probabilistic but the conditioning variables are not the entire history but rather only the current state. In other words such a strategy is described by a map from S to P(A_i) – and is called a Markovian strategy. Additionally, when the map is independent of time, the strategy is called a stationary Markovian strategy, i.e., a stationary
Markovian strategy for player i is described by a mapping: f_i : S → P(A_i).

Example 3 Consider, for starters, a pure strategy vector σ, i.e., a pure strategy choice for every i. Suppose further that q, the conditional distribution on states, is also deterministic. In that case, there is, in a natural way, a unique history that is generated by σ:

h(t; σ) = (s(0), a(0; σ), s(1; σ), a(1; σ), . . ., s(t; σ)),

where a(t; σ) = σ(t; h(t; σ)) and s(t + 1; σ) = q(s(t + 1) | s(t; σ), a(t; σ)). This unique history associated with the strategy vector σ is also called the outcome path for that strategy. To every such outcome path there is an associated lifetime payoff

R_i(σ) = Σ_{t=0}^{T} d^t p_i(s(t; σ), a(t; σ)).   (1)

If σ is a mixed strategy, or if the conditional distribution q is not deterministic, then there will be a joint distribution on the set of histories H(t) generated by the strategy vector σ and the conditional distribution q in the obvious way. Moreover, there will be a marginal distribution on the state and action in period t, and under that marginal, an expected payoff p_i(s(t; σ), a(t; σ)). Thereafter lifetime payoffs can be written exactly as in Eq. 1. Consider the game that remains after every history h(t). This remainder is called a subgame. The restriction of the strategy vector σ to the subgame that starts after history h(t) is denoted σ | h(t).

Equilibrium

A strategy vector σ* is said to be a Nash Equilibrium (or NE) of the game if

R_i(σ*) ≥ R_i(σ_i, σ*_{-i}), for all i, σ_i.   (2)

A strategy vector σ* is said to be a Subgame Perfect (Nash) Equilibrium of the game – referred to in short as SPE – if not only is Eq. 2 true for σ* but it is true for every restriction of the strategy vector σ* to every subgame h(t), i.e., is true for σ* | h(t) as well. In other words, σ* is a SPE if

R_i(σ* | h(t)) ≥ R_i((σ_i, σ*_{-i}) | h(t)), for all i, σ_i, h(t).   (3)

As is well-known, not all NE satisfy the further requirement of being a SPE. This is because a NE only considers the outcome path associated with that strategy vector σ* – or, when the outcome path is probabilistic, only considers those outcome paths that have a positive probability of occurrence. That follows from the inequality Eq. 2. However, that does not preclude the possibility that players may have no incentive to follow through with σ* if some zero probability history associated with that strategy is reached. (Such a history may be reached either by accident or because of deviation/experimentation by some player). In turn that may have material relevance because how players behave when such a history is reached will have significance for whether or not a player wishes to deviate against σ*. Equation 3 ensures that – even after a deviation – σ* will get played and that deviations are unprofitable.

Recall the definition of a stationary Markovian strategy (SMS) above. Associated with that class of strategies is the following definition of equilibrium. A stationary Markov strategy vector f* is a Markov Perfect Equilibrium (MPE) if

R_i(f*) ≥ R_i(f_i, f*_{-i}), for all i, f_i.

Hence, a MPE restricts attention to SMS both on and off the outcome path. Furthermore, it only considers – implicitly – histories that have a positive probability of occurrence under f*. Neither "restriction" is a restriction when T is infinite because when all other players play a SMS player i has a stationary dynamic programming problem to solve in finding his most profitable strategy and – as is well-known – he loses no payoff possibilities in restricting himself to SMS as well. And that best strategy is a best strategy on histories that have zero probabilities of occurrence
as well as histories that have a positive probability of occurrence. In particular therefore, when T is infinite, a MPE is also a SPE.

The Dynamic – or Stochastic – Game: Results

The main questions that we will now turn to are:

1. Is there always a SPE in a dynamic – or stochastic – game?
2. Is there a characterization for the set of SPE akin to the Bellman optimality equation of dynamic programming? If yes, what properties can be deduced of the SPE payoff set?
3. Is there a Folk Theorem for dynamic games – akin to that in Repeated Games?
4. What are the properties of SPE outcome paths?

The answers to questions 1–3 are very complete for finite dynamic games, i.e., games where the state space S is finite. The answer is also complete for questions 1 and 2 when S is countably infinite but when the state space is uncountably infinite, the question is substantively technically difficult and there is reason to believe that there may not always be a SPE. The finite game arguments for question 3 are conceptually applicable when S is (countably or uncountably) infinite provided some technical difficulties can be overcome. That and extending the first two answers to uncountably infinite S remain open questions at this point. Not a lot is known about Question 4.

Existence

Theorem 1 Suppose that the state space S is countable and that each action set A_i is finite. Then the dynamic game has a Markov Perfect Equilibrium.

Proof The proof will be presented by way of a fixed point argument. The domain for the fixed point will be the set of stationary Markovian strategies:

M_i = { f_i : S → P(A_i), s.t. for all s, Σ_{a_i} f_i(a_i; s) = 1, f_i(a_i; s) ≥ 0 }.

Properties of M_i: In the pointwise convergence topology, M_i is compact. That this is so follows from a standard diagonalization argument by way of which a subsequence can be constructed from any sequence of SMS f_i^n such that the subsequence, call it f_i^{n'}, has the property that f_i^{n'}(s) converges to some f_i^0(s) for every s. Clearly, f_i^0 ∈ M_i. The diagonalization argument requires S to be countable and A_i to be finite.

M_i is also clearly convex since its elements are probability distributions on A_i at every state.

The mapping for which we shall seek a fixed point is the best response mapping:

B_i(f) = { g_i ∈ M_i : R_i(g_i, f_{-i}) ≥ R_i(f_i, f_{-i}), for all f_i }.

Since the best response problem for player i is a stationary dynamic programming problem, it follows that there is an associated value function for the problem, say v_i, such that it solves the optimality equation of dynamic programming

v_i(s) = max_{λ_i} { p_i(s, λ_i, f_{-i}(s)) + d Σ_{s'} v_i(s') q(s' | s, λ_i, f_{-i}(s)) }.   (4)
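To make Eq. 4 concrete, the following is a minimal numerical sketch — not taken from the text — of the stationary dynamic programming problem that player i solves against a fixed stationary Markov profile of the others, for a small finite state and action space. The arrays p and q are illustrative placeholders in which the opponents' play has already been averaged in; a pure maximizer attains the maximum in Eq. 4 here, and it is the best response g of Eq. 7 below.

```python
import numpy as np

# Illustrative finite problem: states s = 0..S-1, own actions a = 0..A-1.
S, A, d = 3, 2, 0.9
rng = np.random.default_rng(0)
p = rng.uniform(size=(S, A))                 # p[s, a]: expected period payoff to player i
q = rng.dirichlet(np.ones(S), size=(S, A))   # q[s, a, s']: transition law given the others' play

v = np.zeros(S)
for _ in range(2000):                        # value iteration on the optimality equation (4)
    v_new = np.max(p + d * (q @ v), axis=1)
    if np.max(np.abs(v_new - v)) < 1e-10:
        v = v_new
        break
    v = v_new

g = np.argmax(p + d * (q @ v), axis=1)       # a (pure) stationary Markov best response
print(v, g)
```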
Here p_i(s, λ_i, f_{-i}(s)) denotes the expected current payoff to player i when he picks the (possibly mixed) action λ_i and the others play according to f_{-i}(s), with f_{-i}(a_{-i}; s) = Π_{j≠i} f_j(a_j; s) denoting the probability of players other than i picking the action vector a_{-i}. Similarly,

q(s' | s, λ_i, f_{-i}(s)) = Σ_{a_{-i}} [ Σ_{a_i} q(s' | s, a_i, a_{-i}) λ_i(a_i) ] f_{-i}(a_{-i}; s).   (6)

Additionally, it follows that the best response, i.e., g_i, solves the optimality equation, i.e.,

v_i(s) = p_i(s, g_i, f_{-i}(s)) + d Σ_{s'} v_i(s') q(s' | s, g_i, f_{-i}(s)).   (7)

In particular, along a sequence of profiles f^n with best responses g_i^n and associated value functions v_i^n,

v_i^n(s) = p_i(s, g_i^n, f_{-i}^n(s)) + d Σ_{s'} v_i^n(s') q(s' | s, g_i^n, f_{-i}^n(s)).   (10)

Clearly the left-hand side of Eq. 10 converges to the left-hand side of Eq. 9. Let us check the right-hand side of each equation. Evidently

Σ_{a_{-i}} [ Σ_{a_i} p_i(s, a_i, a_{-i}) g_i^n(a_i) ] f_{-i}^n(a_{-i}; s) → Σ_{a_{-i}} [ Σ_{a_i} p_i(s, a_i, a_{-i}) g_i^0(a_i) ] f_{-i}^0(a_{-i}; s).

Remark 1 Note that the finiteness of A_i is crucial. Else, the very last argument would not go through, i.e., knowing that g_i^n(a_i) f_{-i}^n(a_{-i}; s) − g_i^0(a_i) f_{-i}^0(a_{-i}; s) → 0 for every action vector a would not guarantee that the sum would converge to zero as well.

Remark 2 If the horizon were finite one could use the same argument to prove that there exists a Markovian strategy equilibrium, though not a stationary Markovian equilibrium. That proof would combine the arguments above with backward induction. In other words, one would first use the arguments above to show that there is an equilibrium at every state in the last period T. Then the value function so generated, v_i^T, would be used to show that there is an equilibrium in period T − 1 using the methods above, thereby generating the relevant value function for the last two periods, v_i^{T−1}. And so on.

The natural question to ask at this point is whether the restriction of countable finiteness of S can be dropped (and – eventually – the finiteness restriction on A_i). The answer, unfortunately, is not easily. The problems are two-fold:

1. Sequential Compactness of the Domain Problem – If S is uncountably infinite, then it is difficult to find a domain M_i that is sequentially compact. In particular, diagonalization arguments do not work to extract candidate strategy and value function limits.
2. Integration to the Limit Problem – Note as the other players change their strategies, f_{-i}^n, continuation payoffs to player i change in two ways. They change first because the value function v_i^n changes, i.e., v_i^n ≠ v_i^m if n ≠ m. Second, the expected continuation value changes because the measure over which the value function is being integrated, q(s' | s, λ_i, f_{-i}^n(s)), itself changes, i.e., q(s' | s, λ_i, f_{-i}^n(s)) ≠ q(s' | s, λ_i, f_{-i}^m(s)). This is the well-known – and difficult – integration to the limit problem: simply knowing that v_i^n "converges" to v_i^0 in some sense – such as pointwise – and knowing that the integrating measure q^n "converges" to q^0 in some sense – such as in the weak topology – does not, in general, imply that

∫ v_i^n dq^n → ∫ v_i^0 dq^0.   (12)

(Of course, in the previous sentence q^n is a more compact stand-in for q(s' | s, λ_i, f_{-i}^n(s)) and q^0 for q(s' | s, λ_i, f_{-i}^0(s)).) There are a limited number of cases where Eq. 12 is known to be true. These results typically require q^n to converge to q^0 in some strong sense. In the dynamic game context what this means is that very strong convergence restrictions need to be placed on the transition probability q. This is the underlying logic behind results reported in (Duffie et al. 1994; Mertens and Parthasarathy 1987; Nowak 1985; Rieder 1979).

Such strong convergence properties are typically not satisfied when q is deterministic – which case comprises the bulk of the applications of the theory. Indeed simply imposing continuity when q is deterministic appears not to be enough to generate an existence result. Harris et al. (1995) and Dutta and Sundaram (1993) contain results that show that there may not be a SPE in finite horizon dynamic games when the transition function q is continuous. Whether other often used properties of q and p_i – such as concavity and monotonicity – can be used to rescue the issue remains an open question.

Characterization

The Bellman optimality equation has become a workhorse for dynamic programming analysis. It is used to derive properties of the value function and the optimal strategies. Moreover it provides an attractive and conceptually simple way to view a multiple horizon problem as a series of one-stage programming problems by exploiting the recursive structure of the optimization set-up. A natural question to ask, since dynamic games are really multi-player versions of dynamic programming, is whether there is an analog of the Bellman equation for these games. Abreu et al. – APS (1990), in an important and influential paper, showed that this is indeed the case for repeated games. They defined an operator, hereafter the APS operator, whose largest fixed point is the set of SPE payoffs in a
repeated game and whose every fixed point is a subset of the set of SPE payoffs. (Thereby providing a necessary and sufficient condition for SPE equilibrium payoffs in much the same way that the unique fixed point of the Bellman operator constitutes the value function for a dynamic programming problem). As with the Bellman equation, the key idea is to reduce the multiple horizon problem to a (seemingly) static problem.

In going from repeated to dynamic games there are some technical issues that arise. We turn now to that analysis pointing out along the way where the technical pitfalls are. Again we start with the infinite horizon model, i.e., where T = ∞. When T is finite, the result and arguments can be modified in a straightforward way as will be indicated in a remark following the proof.

But first, some definitions. Suppose for now that S is countable.

APS Operator – Consider a compact-valued correspondence W defined on domain S which takes values that are subsets of ℝ^I. Define the APS operator on W, call it LW, as follows:

LW(s) = { v ∈ ℝ^I : ∃ f̂ ∈ P(A) and w : S × A × S → W, uniformly bounded, s.t.
  v_i = p_i(s, f̂) + d Σ_{s'} w_i(s, f̂, s') q(s' | s, f̂)
      ≥ p_i(s, a_i, f̂_{-i}) + d Σ_{s'} w_i(s, a_i, f̂_{-i}, s') q(s' | s, a_i, f̂_{-i}), for all a_i, i }.   (13)
holds. By repeated application of this idea, we can create a sequence of strategies for periods t = 0, 1, 2, . . ., (f̃, f(s'), f(s, s'), . . .), such that at each period Eq. 13 holds. Call the strategy so formed, f. This strategy can then not be improved upon by a single-period deviation. A standard argument shows that if a strategy cannot be profitably deviated against in one period then it cannot be profitably deviated against even by deviations in multiple periods. (This idea of "unimprovability" is already present in dynamic programming). Within the context of repeated games, it was articulated by Abreu (1988).

Proof of c: Note two properties of the APS operator:

Lemma 1 LW is a compact-valued correspondence (whenever W is compact-valued).

Proof Consider Eq. 13. Suppose that v^n ∈ LW(s) for all n, with associated f̂^n and w^n. By diagonalization, there exists a subsequence s.t. v^n → v^0, f̂^n → f̂^0 and w^n → w^0. This argument uses the countability of S and the finiteness of A_i. From Eq. 14, evidently p_i(s, f̂^n) → p_i(s, f̂^0), and similarly from Eq. 15, Σ_{s'} w_i^n(s, f̂^n, s') q(s' | s, f̂^n) goes to Σ_{s'} w_i^0(s, f̂^0, s') q(s' | s, f̂^0). Hence the inequality in Eq. 13 is preserved and v^0 ∈ LW(s).□

It is not difficult to see that – on account of the boundedness of p_i – if W has a uniformly bounded selection, then so does LW. Note that the operator is also monotone in the set-inclusion sense, i.e., if W'(s) ⊆ W(s) for all s then LW' ⊆ LW.

The APS algorithm finds the set of SPE payoffs by starting from a particular starting point, an initial set W^0(s) that is taken to be the set of all feasible payoffs from initial state s. (And hence the correspondence W^0 is so defined for every initial state). Then define W^1 = LW^0. More generally, W^{n+1} = LW^n, n ≥ 0. It follows that W^1 ⊆ W^0. This is because W^1 requires a payoff that is not only feasible but additionally satisfies the incentive inequality of Eq. 13 as well. From the monotone inclusion property above it then follows that, more generally, W^{n+1} ⊆ W^n, n ≥ 0. Furthermore, W^n(s) is a non-empty, compact set for all n (and s). Hence, W^∞(s) = ∩_n W^n(s) = lim_{n→∞} W^n(s) is non-empty and compact.

Let us now show that W^∞ is a fixed point of the APS operator, i.e., that LW^∞ = W^∞.

Lemma 2 LW^∞ = W^∞, or, equivalently, L(lim_{n→∞} W^n) = lim_{n→∞} LW^n.

Proof Clearly, by monotonicity, L(lim_{n→∞} W^n) ⊆ lim_{n→∞} LW^n. So consider a v ∈ LW^n(s) for all n. By Eq. 13 there is at each n an associated first-period play f^n and a continuation payoff w^n(s, f^n, s') such that the inequality is satisfied and

v_i = p_i(s, f^n) + d Σ_{s'} w_i^n(s, f^n, s') q(s' | s, f^n).

By the diagonalization argument, and using the countability of S, we can extract a (subsequential) limit f^∞ = lim_{n→∞} f^n and w^∞ = lim_{n→∞} w^n. Clearly, w^∞ ∈ W^∞. Since equalities and inequalities are maintained in the limit, equally clearly

v_i = p_i(s, f^∞) + d Σ_{s'} w_i^∞(s, f^∞, s') q(s' | s, f^∞)

and

v_i ≥ p_i(s, a_i, f_{-i}^∞) + d Σ_{s'} w_i^∞(s, a_i, f_{-i}^∞, s') q(s' | s, a_i, f_{-i}^∞), for all a_i, i,

thereby proving that v ∈ L(lim_{n→∞} W^n)(s). The lemma is proved.□

Since the set of SPE payoffs, V(s), is a subset of W^0(s) – and LV(s) = V(s) – it further follows that V(s) ⊆ W^∞(s), for all s. From the previous lemma, and part b), it follows that V(s) ⊇ W^∞(s), for all s. Hence, V = W^∞. The theorem is proved.□

A few remarks are in order.

Remark 1 If the game horizon T is finite, there is an immediate modification of the above arguments. In the algorithm above, take W^0 to be the set of SPE payoffs in the one-period game (with payoffs p_i for player i).
Use the APS operator thereafter to define W^{n+1} = LW^n, n ≥ 0. It is not too difficult to show that W^n is the set of SPE payoffs for a game that lasts n + 1 periods (or has n remaining periods after the first one).

Remark 2 Of course an immediate corollary of the above theorem is that the set of SPE payoffs V(s) is a compact set for every initial state s. Indeed one can go further and show that V is in fact an upper hemi-continuous correspondence. The arguments are very similar to those used above – plus the Maximum Theorem.

Remark 3 Another way to think of Theorem 2 is that it is also an existence theorem. Under the conditions outlined in the result, the SPE equilibrium set has been shown to be non-empty. Of course this is not a generalization of Theorem 1 since Theorem 2 does not assert the existence of a MPE.

Remark 4 When the action space A_i is infinite or the state space S is uncountably infinite we run into technical difficulties. The complications arise from not being able to take limits. Also, as in the discussion of the Integration to the Limit problem, integrals can fail to be continuous thereby rendering void some of the arguments used above.

Folk Theorem
The folk theorem for Repeated Games – Fudenberg and Maskin (1986) following up on earlier contributions – is very well-known and the most cited result of that theory. It proves that the necessary conditions for a payoff to be a SPE payoff – feasibility and individual rationality – are also (almost) sufficient provided the discount factor d is close enough to 1. This is the result that has become the defining result of Repeated Games. For supporters, the result and its logic of proof are a compelling demonstration of the power of reciprocity, the power of long-term relationships in fostering cooperation through the lurking power of "punishments" when cooperation breaks down. It is considered equally important and significant that such long-term relationships and behaviors are sustained through implicit promises and threats which therefore do not violate any legal prohibitions against explicit contracts that specify such behavior. For detractors, the "anything goes" implication of the Folk Theorem is a clear sign of its weakness – or the weakness of the SPE concept – in that it robs the theory of all predictive content. Moreover there is a criticism, not entirely correct, that the strategies required to sustain certain behaviors are so complex that no player in a "real-world" setting could be expected to implement them.

Be that as it may, the Folk Theorem question in the context of Dynamic Games then is: is it the case that feasibility and individual rationality are also (almost) enough to guarantee that a payoff is a SPE payoff at high enough d? Two sets of obstacles arise in settling this question. Both emanate from the same source, the fact that the state does not remain fixed in the play of the games, as it does in the case of Repeated Games. First, one has to think long and hard as to how one should define individual rationality. Relatedly, how does one track feasibility? In both cases, the problem is that what payoff is feasible and individually rational depends on the state and hence changes after every history h(t). Moreover, it also changes with the discount factor d. The second set of problems stems from the fact that a deviation play can unalterably change the future in a dynamic game – unlike a repeated game where the basic game environment is identical every period. Consequently one cannot immediately invoke the logic of repeated game folk theorems which basically work because any deviation has only short-term consequences while the punishment of the deviation is long-term. (And so if players are patient they will not deviate).

Despite all this, there are some positive results that are around. Of these, the most comprehensive is one due to Dutta (1995). To set the stage for that result, we need a few crucial preliminary results. For this sub-section we will assume that S is finite – in addition to A_i.

Feasible Payoffs

Role of Markovian Strategies – Let F(s, d) denote the set of "average" feasible payoffs from initial state s and for discount factor d. By that I mean
F(s, d) = { v ∈ ℝ^I : ∃ strategy σ s.t. v_i = (1 − d) Σ_{t=0}^{T} d^t p_i(s(t; σ), a(t; σ)), i = 1, . . ., I }.

Let F̂(s, d) denote the set of "average" feasible payoffs from initial state s and for discount factor d that are generated by pure stationary Markovian strategies – PSMS. Recall that a SMS is given by a map f_i from S to the probability distributions over A_i, so that at state s(t) player i chooses the mixed strategy f_i(s(t)). A pure SMS is one where the map f_i is from S to A_i. In other words,

F̂(s, d) = { v ∈ ℝ^I : ∃ PSMS f s.t. v_i = (1 − d) Σ_{t=0}^{T} d^t p_i(s(t; f), a(t; f)), i = 1, . . ., I }.

Lemma 3 Any feasible payoff in a dynamic game can be generated by averaging over payoffs to stationary Markov strategies, i.e., F(s, d) = co F̂(s, d), for all (s, d).

Proof Note that F(s, d) = co [extreme points of F(s, d)]. In turn, all extreme points of F(s, d) are generated by an optimization problem of the form: max_σ Σ_{i=1}^{I} α_i v_i(s, d). That optimization problem is a dynamic programming problem. Standard results in dynamic programming show that the optimum is achieved by some stationary Markovian strategy.□

Let F(s) denote the set of feasible payoffs under the long-run average criterion. The next result will show that this is the set to which discounted average payoffs converge:

Lemma 4 F(s, d) → F(s), as d → 1, for all s.

Proof Follows from the fact that (a) F(s) = co F̂(s), where F̂(s) is the set of feasible long-run average payoffs generated by stationary Markovian strategies, and (b) F̂(s, d) → F̂(s). Part (b) exploits the finiteness of S (and A_i).□

The lemmas above simplify the answer to the question: What is a feasible payoff in a dynamic game? Note that they also afford a dimensional reduction in the complexity and number of strategies that one needs to keep track of to answer the question. Whilst there are an uncountably infinite number of strategies – even with finite S and A_i – including the many that condition on histories in arbitrarily complex ways – the lemmas establish that all we need to track are the finite number of PSMS. Furthermore, whilst payoffs do depend on d, if the discount factor is high enough then the set of feasible payoffs is well-approximated by the set of feasible long-run average payoffs to PSMS.

One further preliminary step is required however. This has to do with the fact that while v ∈ F(s, d) can be exactly reproduced by a period 0 average over PSMS payoffs, after that period continuation payoffs to the various component strategies may generate payoffs that could be arbitrarily distant from v. This, in turn, can be problematical since one would need to check for deviations at every one of these (very different) payoffs. The next lemma addresses this problem by showing that there is an averaging over the component PSMS that is ongoing, i.e., happens periodically and not just at period 0, but which, consequently, generates payoffs that after all histories stay arbitrarily close to v.

For any two PSMS f^1 and f^2 denote a time-cycle strategy as follows: for T^1 periods play proceeds along f^1, then it moves for T^2 periods to f^2. After the elapse of the T^1 + T^2 periods play comes back to f^1 for T^1 periods and f^2 for T^2 periods. And so on. Define λ^1 = T^1/(T^1 + T^2). In the obvious way, denote a general time-cycle strategy to be one that cycles over any finite number of PSMS f^k, where the proportion of time spent at strategy f^k is λ^k, and allow the lengths of time to depend on the initial state at the beginning of the cycle.

Lemma 5 Pick any v ∈ ∩_s F(s). Then for all ε > 0 there is a time-cycle strategy such that its long-run average payoff is within ε of v after all histories.

Proof Suppose that v = Σ_k λ^k(s) v^k(s) where v^k(s) is the long-run average payoff of the kth PSMS
when the initial state is s. Ensure that T^k is chosen such that (a) the average payoff over those periods under that PSMS, (1/T^k) Σ_{t=0}^{T^k − 1} p(s(t; f^k), a(t; f^k)), is within ε of v^k(s) for all s, and (b) that T^k(s)/Σ_l T^l(s) is arbitrarily close to λ^k(s) for all s.□

Since F(s, d) → F(s) it further follows that the above result also holds under discounting:

Lemma 6 Pick any v ∈ ∩_s F(s). Then for all ε > 0 there is a time-cycle strategy and a discount cut-off d(ε) < 1 such that the discounted average payoffs to that strategy are within ε of v for all d > d(ε) and after all histories.

Proof Follows from the fact that (1 − d)/(1 − d^{T^k}) Σ_{t=0}^{T^k − 1} d^t p_i(s(t; f^k), a(t; f^k)) goes to (1/T^k) Σ_{t=0}^{T^k − 1} p_i(s(t; f^k), a(t; f^k)) as d → 1.□

Individually Rational Payoffs

Recall that a min-max payoff is a payoff level that a player can guarantee by playing a best response. In a Repeated Game that is defined at the level of the component stage game. Since there is no analog of that in a dynamic game, the min-max needs to be defined over the entire game – and hence is sensitive to initial state and discount factor:

m_i(s, d) = min_{σ_{-i}} max_{σ_i} R_i(σ | s, d).

Evidently, given (s, d), in a SPE it cannot be that player i gets a payoff v_i(s, d) that is less than m_i(s, d). Indeed that inequality must hold at all states for a strategy to be a SPE, i.e., for all s(t) it must be the case that v_i(s(t), d) ≥ m_i(s(t), d). But, whilst necessary, even that might not be a sufficient condition for the strategy to be a SPE. The reason is that if player i can deviate and take the game to, say, s' at t + 1, rather than s(t + 1), he would do so if v_i(s(t), d) < m_i(s', d), since continuation payoffs from s' have to be at least as large as the latter level and this deviation would be worth essentially that continuation when d is close to 1. So sufficiency will require a condition such as v_i(s(t), d) > max_s m_i(s, d) for all s(t). Call such a strategy dynamically Individually Rational.

From the previous lemmas, and the fact that m_i(s, d) → m_i(s), as d → 1, where m_i(s) is the long-run average min-max level for player i, the following result is obvious. The min-max limiting result is due to Mertens and Neyman (1983).

Lemma 7 Pick any v ∈ ∩_s F(s) such that v_i > max_s m_i(s) for all s. Then there is a time-cycle strategy which is dynamically Individually Rational for high d.

We are now ready to state and prove the main result:

Theorem 3 (Folk Theorem) Suppose that S and A_i are finite sets. Suppose furthermore that T is infinite and that ∩_s F(s) has dimension I (where I is the number of players). Pick any v ∈ ∩_s F(s) such that v_i > max_s m_i(s) for all s. Then, for all ε > 0, there is a discount cut-off d(ε) < 1 and a time-cycle strategy that for d > d(ε) is a SPE with payoffs that are within ε of v.

Proof Without loss of generality, let us set max_s m_i(s) = 0 for all i. From the fact that ∩_s F(s) has dimension I it follows that we can find I payoff vectors in that set – v^i, i = 1, . . ., I – such that for all i: (a) v^i ≫ 0, (b) v^j_i > v^i_i for j ≠ i, and (c) v_i > v^i_i. That we can find these vectors such that (b) is satisfied follows from the dimensionality of the set. That we can additionally get the vectors to satisfy (a) and (c) follows from the fact that it is a convex set and hence an appropriate "averaging" with a vector such as v achieves (a) while an "averaging" with i's worst payoff achieves (c).

Now consider the following strategy: Norm – Start with a time-cycle strategy that generates payoffs after all histories that are within ε of v. Choose a high enough d as required. Continue with that strategy if there are no deviations against it. Punishment – If there is, say if player i deviates, then min-max i for T periods and thereafter proceed to the time-cycle strategy that yields payoffs within ε of v^i after all histories. Re-start the punishment whenever there is a deviation.
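The norm/punishment construction just described can be summarized as a small phase automaton. The sketch below is only illustrative: the class and method names are ours, and the `norm`, `reward[i]` and `minmax[i]` callables stand in for the time-cycle strategies and min-max profiles whose existence the preceding lemmas establish.

```python
class FolkTheoremStrategy:
    """Phase automaton for the norm/punishment strategy used in the proof sketch."""

    def __init__(self, norm, reward, minmax, T):
        # norm: time-cycle strategy near v; reward[i]: time-cycle strategy near v^i;
        # minmax[i]: profile that min-maxes player i; T: length of the punishment phase.
        self.norm, self.reward, self.minmax, self.T = norm, reward, minmax, T
        self.phase, self.target, self.clock = "norm", None, 0

    def play(self, state):
        if self.phase == "punish":
            if self.clock < self.T:
                self.clock += 1
                return self.minmax[self.target](state)
            return self.reward[self.target](state)   # post-punishment reward phase near v^i
        return self.norm(state)

    def observe(self, prescribed, actual):
        deviators = [i for i, (x, y) in enumerate(zip(prescribed, actual)) if x != y]
        if len(deviators) == 1:                      # unilateral deviation: (re)start punishment of i
            self.phase, self.target, self.clock = "punish", deviators[0], 0
```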
Choose T in such a fashion that the payoff to the min-max periods plus v^i_i is strictly less than v_i. That ensures there is no incentive to deviate against the norm provided the punishment is carried out. That there is incentive for players j ≠ i to punish player i follows from the fact that v^i_j > v^j_j, the former payoff being what they get from punishing and the latter from not punishing i. That there is incentive for player i not to deviate against his own punishment follows from the fact that re-starting the punishment only lowers his payoffs. The theorem is proved.

A few remarks are in order.

Remark 1 If the game horizon T is finite, there is likely a Folk Theorem along the lines of the result proved for Repeated Games by Benoit and Krishna (1987). To the best of my knowledge it remains, however, an open question.

Remark 2 When the action space A_i is infinite or the state space S is uncountably infinite we again run into technical difficulties. There is an analog to Lemmas 3 and 4 in this instance and under appropriate richer assumptions the results can be generalized – see Dutta (1993). Lemmas 5–7 and the Folk Theorem itself do use the finiteness of S to apply uniform bounds to various approximations and those become problematical when the state space is infinite. It is our belief that nevertheless the Folk Theorem can be proved in this setting. It remains, however, to be done.

Dynamics

Recall that the fourth question is: what can be said about the dynamics of SPE outcome paths? The analogy that might be made is to the various convergence theorems – sometimes also called "turnpike theorems" – that are known to be true in single-player dynamic programming models. Now even within those models – as has become clear from the literature of the past 20 years in chaos and cycles theory for example – it is not always the case that there are regularities exhibited by the optimal solutions. Matters are worse in dynamic games.

Even within some special models where the single-player optima are well-behaved, the SPE of the corresponding dynamic game need not be. A classic instance is the neo-classical aggregative growth model. In that model, results going back 50 years show that the optimal solutions converge monotonically to a steady-state, the so-called "golden rule" (For references, see Majumdar et al. (2000)). However, examples can be constructed – and may be found in Dutta and Sundaram (1996) and Dockner et al. (1998) – where there are SPE in these models that can have arbitrarily complex state dynamics which for some range of discount factor values descend into chaos. And that may happen with Stationary Markov Perfect Equilibrium. (It would be less of a stretch to believe that SPE in general can have complex dynamics. The Folk Theorem already suggests that it might be so).

There are, however, many questions that remain including the breadth of SPE that have regular dynamics. One may care less for complex dynamic SPE if it can be shown that the "good ones" have regular dynamics. What also remains to be explored is whether adding some noise in the transition equation can remove most complex dynamics SPE.

Global Climate Change – Issues, Models

Issues
The dramatic rise of the world's population in the last three centuries, coupled with an even more dramatic acceleration of economic development in many parts of the world, has led to a transformation of the natural environment by humans that is unprecedented in scale. In particular, on account of the greenhouse effect, global warming has emerged as a central problem, unrivaled in its potential for harm to life as we know it on planet Earth. Seemingly the consequences are everywhere: melting and break-up of the world's ice-belts whether it be in the Arctic or the Antarctic; heat-waves that set all-time temperature highs whether it be in Western Europe or sub-Saharan Africa; storms increased in frequency and ferocity whether it be Hurricane Katrina or typhoons in
74 Dynamic Games with an Application to Climate Change Models
Japan or flooding in Mumbai. In addition to Al problem one needs a dynamic and fully strategic
Gore’s eminently readable book, “An Inconve- approach. A natural methodology for this then is
nient Truth”, two authoritative recent treatments the theory of Subgame Perfect (Nash) equilibria of
are the Stern Review on the Economics of Climate dynamic games – which we have discussed at some
Change, October, 2006 and the IPCC Synthesis length in the preceding sections.
Report, November, 2007. Here are three – Although there is considerable uncertainty
additional – facts drawn from the IPCC Report: about the exact costs of global warming, the
two principal sources will be a rise in the sea-
1. Eleven of the last 12 years (1995–2006) have level and climate changes. The former may wash
been amongst the 12 warmest years in the away low-lying coastal areas such as Bangladesh
instrumental record of global surface tempera- and the Netherlands. Climate changes are more
tures (since 1850). difficult to predict; tropical countries will
2. If we go on with “Business as Usual”, by 2100 become more arid and less productive agricultur-
global sea levels will probably have risen ally; there will be an increased likelihood of
by 9 to 88 cm and average temperatures by hurricanes, fires and forest loss; and there will
between 1.5 and 5.5 C. be the unpredictable consequences of damage to
Various factors contribute to global the natural habitat of many living organisms. On
warming, but the major one is an increase in the other hand, emission abatement imposes its
greenhouse gases (GHGs) – primarily, carbon own costs. Higher emissions are typically asso-
dioxide – so called because they are transpar- ciated with greater GDP and consumer amenities
ent to incoming shortwave solar radiation but (via increased energy usage). Reducing emis-
trap outgoing longwave infrared radiation. sions will require many or all of the following
Increased carbon emissions due to the burning costly activities: cutbacks in energy production,
of fossil fuel is commonly cited as the principal switches to alternative modes of production,
immediate cause of global warming. A third investment in more energy-efficient equipment,
relevant fact is: investment in R&D to generate alternative
3. Before the Industrial Revolution, atmospheric sources of energy, etc.
CO2 concentrations were about 270–280 parts The principal features of the global warming
per million (ppm). They now stand at almost problem are:
380 ppm, and have been rising at about
1.5 ppm annually. • The Global Common – although the sources of
carbon buildup are localized, it is the total
The IPCC Synthesis (2007) says “Warming of stock of GHGs in the global environment that
the climate system is unequivocal, as is now evi- will determine the amount of warming.
dent from observations of increases in global aver- • Near-irreversibility – since the stock of green-
age air and ocean temperatures, widespread house gases depletes slowly, the effect of cur-
melting of snow and ice, and rising global average rent emissions can be felt into the distant future.
sea level” (IPCC Synthesis Report 2007). • Asymmetry – some regions will suffer more
It is clear that addressing the global warming than others.
problem will require the coordinated efforts of the • Nonlinearity – the costs can be very nonlinear;
world’s nations. a rise in one degree may have little effect but a
In the absence of an international government, rise in several degrees may be catastrophic.
that coordination will have to be achieved by way • Strategic Setting – Although the players
of an international environmental treaty. For a (countries) are relatively numerous, there are
treaty to be implemented, it will have to align the some very large players, and blocks of like-
incentives of the signatories by way of rewards for minded countries, like the US, Western Europe,
cutting greenhouse emissions and punishments for China, and Japan. That warrants a strategic
not doing so. For an adequate analysis of this analysis.
Dynamic Games with an Application to Climate Change Models 75
The theoretical framework that accommodates including the problem of global warming. We
all of these features is an asymmetric dynamic shall describe the Dutta and Radner work in detail
commons model with the global stock of green- and also discuss some of the Dockner, Long and
house gases as the (common) state variable. The Sorger research. In particular, the transition equa-
next sub-section will discuss a few models which tion is identical.in the two models (and described
have most of the above characteristics. below). What is different is the payoff functions.
We turn now to a simplified climate change
model to illustrate the basic strategic ideas. The
Models model is drawn from Dutta and Radner (2008a).
In the basic model there is no population growth
Before presenting specific models, let us briefly and no possibility of changing the emissions pro-
relate the climate change problem to the general ducing technologies in each country. (Population
dynamic game model that we have seen so far, and growth is studied in Dutta and Radner (2006)
provide a historical outline of its study. GHGs while certain kinds of technological changes are
form – as we saw above – a global common. allowed in Dutta and Radner (2004). These
The study of global commons is embedded in models will be discussed later). However, the
dynamic commons game (DCG). In such a game countries may differ in their “sizes”, their emis-
the state space S is a single-dimensional variable sions technologies, and their preferences.
with a “commons” structure meaning that each There are I countries. The emission of (a scalar
player is able to change the (common) state. In index of) greenhouse gases during period t by
particular, the transition function is of the form country i is denoted by ai(t). [Time is discrete,
! with t = 0 , 1 , 2 , . . . ad inf.] Let A(t) denote
X
I the global (total) emission during period t;
sðt þ 1Þ ¼ q sðtÞ ai ðtÞ :
i¼1 X
I
AðtÞ ¼ ai ðtÞ: (16)
The first analysis of a DCG may be found in i¼1
(Levhari and Mirman 1980). That paper consid-
The total (global) stock of greenhouse gases
ered
the P particular functional
h form is iwhich
I PI a (GHGs) at the beginning of period t is denoted
q sðtÞ i¼1 ai ðtÞ ¼ sðtÞ i¼1 ai ðtÞ for by g(t). (Note, for mnemonic purposes we are
a fixed fraction a. (And, additionally, Levhari and denoting the state variable – the amount of
Mirman assumed the payoffs pi to be logarith- “gas” – g). The law of motion – or transition
mic). Consequently, the paper was able to derive function q in the notation above – is
in closed form a (linear) MPE and was able to
analyze its characteristics. gðt þ 1Þ ¼ AðtÞ þ sgðtÞ, (17)
Subsequently several authors – Sundaram
(1989), Sobel (1990), Benhabib and Radner where s is a given parameter (0 < s < 1). We may
(1992), Rustichini (1992), Dutta and Sundaram interpret (1 s) as the fraction of the beginning-of-
(1993), Sorger (1998) – studied this model in period stock of GHG that is dissipated from the
great generality, without making the specific func- atmosphere during the period. The “surviving”
tional form assumption of Levhari and Mirman, stock, sg(t), is augmented by the quantity of global
and established several interesting qualitative emissions, A(t), during the same period.
properties relating to existence of equilibria, wel- Suppose that the payoff of country i in
fare consequences and dynamic paths. period t is
More recently in a series of papers by Dutta
pi ðtÞ ¼ hi ½ai ðtÞ ci gðtÞ: (18)
and Radner on the one hand and Dockner and
his co-authors on the other, the DCG model has The function hi represents, for example, what
been directly applied to environmental problems country i's gross national product would be at
76 Dynamic Games with an Application to Climate Change Models
different levels of its own emissions, holding the Dockner et al. (1998) impose linearity in the
global level of GHG constant. This function emissions payoff function h (whereas in Dutta and
reflects the costs and benefits of producing and Radner it is assumed to be strictly concave) while
using energy as well as the costs and benefits of their cost to g is strictly convex (as opposed to
other activities that have an impact on the emis- the above specification in which it is linear).
sions of GHGs, e. g, the extent of forestation. It The consequent differences in results we will discuss
therefore seems natural to assume that hi is a later.
strictly concave C2 function that reaches a maxi-
mum and then decreases thereafter. Global Climate Change – Results
The parameter ci > 0 represents the marginal
cost to the country of increasing the global stock In this section we present two sets of results from
of GHG. Of course, it is not the stock of GHG the Dutta and Radner (2008a) paper. The first set
itself that is costly, but the associated climatic of results characterize two benchmarks – the
conditions. As discussed below, in a more general global Pareto optima, and a simple MPE, called
model, the cost would be nonlinear. “Business As Usual” and compares them. The
Histories, strategies – Markovian strategies – second set of results then characterizes the entire
and outcomes are defined in exactly the same way SPE correspondence and – relatedly – the best and
as in the general theory above – and will, hence, worst equilibria. Readers are referred to that paper
not be repeated. Thus associated with each strat- for further results from this model and for a
egy vector s is a total discounted payoff for each numerical calibration of the model. Furthermore,
player for the results that are presented, the proofs are
merely sketched.
X
1
vi ðs, g0 Þ
dt pi ðt; s, g0 Þ:
t¼0 Global Pareto Optima
Similarly, SPE and MPE can be defined in Let x = (xi) be a vector of positive numbers, one for
exactly the same way as in the general theory. each country. A Global Pareto Optimum (GPO)
The linearity of the model is undoubtedly restric- corresponding to x is a profile of strategies that
maximizes the weighted sum of country payoffs,
tive in several ways. It implies that the model is X
unable to analyze catastrophes or certain kinds of v¼ xiV i , (19)
i
feedback effects running back from climate change
which we shall call global welfare. Without loss of
to economic costs. It has, however, two advantages:
generality, we may take the weights, xi, to sum to I.
first, its conclusions are simple, can be derived in
closed-form and can be numerically calibrated;
Theorem 4 Let V ^ ðgÞ be the maximum attainable
hence may have a chance of informing policy-
makers. Second, there is little consensus on what global welfare starting with an initial GHG stock
is the correct form of non-linearity in costs. Partly equal to g. That function is linear in g;
the problem stems from the fact that some costs are ^ ðgÞ ¼ ^u wg,
V
not going to be felt for another 50 to 100 years and
forecasting the nature of costs on that horizon 1 X
w¼ xi ci ,
length is at best a hazardous exercise. Hence, 1 ds i (20)
X
instead of postulating one of many possible non- x h ð^a Þ dwA ^
i i i i
linear cost functions, all of which may turn out to be ^u ¼ :
1d
incorrect for the long-run, one can opt instead to
work with a cost function which may be thought of The optimal strategy is to pick a constant
as a linear approximation to any number of actual action – emission – every period and after all
non-linear specifications. histories, ^a i where its level is determined by
Dynamic Games with an Application to Climate Change Models 77
P
j xj cj
d unclear in the Dockner, Long and Sorger model is
GPO : h0i ð^a i Þ ¼ , why the multiple players would have the same
xi ð1 dsÞ (26)
dci target steady-state g. It would appear natural
BAU : h0i ai ¼ : that, with asymmetric payoffs, each player would
1 ds
have a different steady-state. The existence of a
Since MRAP equilibrium would appear problematical
X consequently. The authors impose a condition
xi ci < xj cj , that implies that there is not too much asymmetry.
j
All SPE
it follows that
P We now turn to the second set of results – a full
dci d j xj cj characterization of SPE in Dutta and Radner
< :
1 ds xi ð1 dsÞ (2008a). We will show that the SPE payoff corre-
spondence has a surprising simplicity; the set of
Since hi is concave, it follows that equilibrium payoffs at a level g is a simple linear
translate of the set of equilibrium payoffs from
ai > ^a i : (27) some benchmark level, say, g = 0. Consequently,
it will be seen that the set of emission levels that
Note that this inequality holds except in the can arise in equilibrium from level g is identical to
trivial case in which all welfare weights are zero those that can arise from equilibrium play at a
(except one). This result is known as the tragedy of GHG level of 0. Note that the fact that the set of
the commons – whenever there is some externality equilibrium possibilities is invariant to the level of
to emissions, countries tend to over-emit in equi- g is perfectly consistent with the possibility that,
librium. In turn, all this follows from the fact that in in a particular equilibrium, emission levels vary
the BAU equilibrium each country only considers with g. However, the invariance property will
its own marginal cost and ignores the cost imposed make for a particularly simple characterization of
on other countries on account of its emissions; in the best and worst equilibria.
the GPO solution that additional cost is, of course, Let X(g) denote the set of equilibrium payoff
accounted for. It follows that the GPO is strictly vectors with initial state g, i. e., each element of
Pareto superior to the MPE for an open set of X(g) is the payoff to some SPE starting from g.
welfare weights xi (and leads to a strictly lower
steady-state GHG level for all welfare weights). Theorem 6 The equilibrium payoff correspon-
One can contrast these results with those in dence X is linear; there is a compact set U ℜI
Dockner, Long and Sorger (1998) that studies a such that for every initial state g
model in which the benefits are linear in emission –
i. e., hi is linear – but convex in costs ci(.). The XðgÞ ¼ U fw1 g, w2 g, . . . wI gg
consequence of linearity in the benefit function h is
that the GPO and BAU solutions have a “most where wi = ci/(1 sd), i = 1 , . . . I. In partic-
rapid approach “(MRAP) property – if (1 s)g, ular, consider any SPE, any period t and any
the depreciated stock in the next period, is less than history of play up until t. Then the payoff vector
a most preferred g, it is optimal to jump the system for the continuation strategies must necessarily be
to g. Else it is optimal to wait for depreciation to of the form
bring the stock down to g. In other words, linearity
in benefits implies a “one-shot” move to a desired v ðw1 gt , w2 gt , . . . wI gt Þ:
level of gas g, which is thereafter maintained,
while linearity in cost (as in the Dutta and Radner The theorem is proved by way of a bootstrap
model) implies a constant emission rate. What is argument. We presume that a (candidate) payoff
Dynamic Games with an Application to Climate Change Models 79
set has this invariance and show that the linear Third, the second-best is exactly realized at high
structure of the model confirms the conjecture. discount factors, rather than asymptotically
Consequently, we generate another candidate pay- approached as the discount factor tends to 1.
off set – which is also state-invariant. Then we Sanctions will be required if countries break
look for a fixed point of that operator. In other with the second-best policy and without loss of
words, we employ the APS operator to generate generality we can restrict attention to the worst
the SPE correspondence. Since that has already such sanction. We turn now to a characterization
been discussed in the previous section, it is of this worst equilibrium (for, say, country i). One
skipped here. definition will be useful for this purpose:
We will now use the above result to character-
ize the best – and the worst – equilibria in Definition 1 An i-less second-best equilibrium is
the global climate change game. Consider the the solution to a second-best problem in which the
second-best problem (from initial state g and for welfare weight of i is set equal to zero, i. e., xi = 0.
a given vector of welfare weights x = By the previous theorem, every such problem
(xi; i = 1, . . . I)), i. e., the problem of maximiz- has a solution in which on the equilibrium path,
ing a weighted sum of equilibrium payoffs: emissions are a constant. Denote that emission
level a(xi):
X
I
max xi V i ðgÞ, V ðgÞ XðgÞ: Theorem 8 There exists a “high” emission level
i¼1 P P
aðiÞ (with j6¼i aj ðiÞ > j6¼i aj ) and an i-less
Note that we consider all possible equilibria, second-best equilibrium a(xi) such that country
i. e., we consider equilibria that choose to condi- i0s worst equilibrium is:
tion on current and past GHG levels as well as
equilibria that do not. The result states that the
1. Each country emits at rate aj ðiÞ for one period
best equilibrium need not condition on GHG
(no matter what g is), j = 1, . . . I.
levels:
2. From the second period onwards, each country
emits at the constant rate aj(xi), j = 1, . . . I.
Theorem 7 There exists a constant emission level
a
a1 , a2 , . . . aI – such that no matter what the
And if any country k deviates at either stages
initial level of GHG, the second-best policy is to 1 or 2, play switches to k0s worst equilibrium from
emit at the constant rate a . In the event of a the very next period after the deviation.
deviation from this constant emissions policy by Put another way, for every country i, a sanction is
country i, play proceeds to i0s worst equilibrium. made up of two emission rates, a(i) and a(xi). The
Furthermore, the second-best emission rate is former imposes immediate costs on country i. The
always strictly lower than the BAU rate, i. e., a way it does so is by increasing the emission levels of
< a . Above a critical discount factor (less than countries j 6¼ i. The effect of this is a temporary
1), the second-best rate coincides with the GPO increase in incremental GHG but due to the irrevers-
emission rate ^ a. ibility of gas accumulation, a permanent increase in
The theorem is attractive for three reasons: country i0scosts, enough of an increase to wipe out
first, it says that the best possible equilibrium any immediate gains that the country might have
behavior is no more complicated than BAU obtained from the deviation. Of course this
behavior; so there is no argument for delaying a additional emission also increases country j0s
treaty (to cut emissions) merely because the status costs. For the punishing countries, however,
quo is simple. Second, the cut required to imple- this increase is offset by the subsequent perma-
ment the second-best policy is an across the board nent change, the switch to the emission vector
cut – independently of anything else, country a(xi), which permanently increases their quota
i should cut its emissions by the amount ai ai . at the expense of country i0s.
80 Dynamic Games with an Application to Climate Change Models
The models discussed thus far are base-line where the function hi has all of the standard prop-
models and do not deal with two important issues erties mentioned above. The damage due to the
relating to climate change – technological change stock of GHG, g(t), is assumed to be (in units of
and capital accumulation. Technological change GDP):
is important because that opens access to technol-
ogies that do not currently exist, technologies that ci Pi ðtÞgðtÞ:
may have considerably lower “emissions to
energy” ratios, i. e., cleaner technologies. Capital The cost of reducing the emission factor from
accumulation is important because an important fi(t) to fi(t + 1) is assumed to be:
question is whether or not curbing GHGs is inim-
ical to growth. The position articulated by both ’i ½f i ðtÞ f i ðt þ 1Þ:
developing countries like Indian and China as
well as by developed economies like the United Immediately it is clear that the state variable
States is that it is: placing curbs on emissions now encompasses not just the common stock
would restrict economic activities and hence g but, additionally, the emission factor profile as
restrain the competitiveness of the economy. well as the sizes of population and capital stock. In
In Dutta and Radner (2004; 2008b) the follow- other words, s = (g, f, K, P). Whilst this signif-
ing modification was made to the model studied in icant increase in dimensionality might suggest
the previous section. It was presumed that the that it would be difficult to obtain clean character-
actual emission level associated with energy izations, the papers show that there is some sepa-
usage ei is fiei where fi is an index of (un) rability. The MPE “Business as Usual” has a
cleanliness – or emission factor – higher values separable structure – energy usage ei(t) and emis-
implying larger emissions for the same level of sion factor choice fi(t + 1) – depend solely on
energy usage. It was presumed that the emission country i0s capital stock and population alone. It
factor could be changed at cost but driven no varies by period – unlike in the base-line model
lower than some minimum mi In other words, discussed above – as the exogenous variables
vary. Furthermore, the emission factor fi(t + 1)
0 ei ðtÞ, (28) stays unchanged till the population and capital
mi f i ðt þ 1Þ f i ðtÞ: (29) stock cross a threshold level beyond which the
cleanest technology gets picked. (This bang-bang
Capital accumulation and population growth is character follows from the linearity of the model).
also allowed in the model but taken to be exoge- The Global Pareto Optimal solution has similar
nous. The dynamics of those two variables are features – the energy usage in country i is directly
governed by: driven by the capital stock and population of that
country. Furthermore the emission factor choice
X
I
follows the same bang-bang character as for the
gðtÞ ¼ sgðt 1Þ þ f i ðtÞei ðtÞ, (30)
i¼1 MPE. However, there is a tragedy of the common
in that in the MPE (versus the Pareto optimum) the
K i ðt þ 1Þ ¼ H ½K i ðtÞ, K i ðtÞ↗and unbounded in t, energy usage is higher – at every state – and the
(31) switch to the cleanest technology happens later.
The output (gross-domestic product) of coun- Within the general theory of dynamic games
try i in period t is there are several open questions and possible
Dynamic Games with an Application to Climate Change Models 81
directions for future research to take. On the Benoit J-P, Krishna V (1987) Finitely repeated games.
existence question, there needs to be a better Econometrica 53:905–922
Dockner E, Nishimura K (1999) Transboundary boundary
resolution of the case where the state space S is problems in a dynamic game model. Jpn Econ Rev
uncountably infinite. This is not just a technical 50:443–456
curiosity. In applications, typically, in order to Dockner E, Long N, Sorger G (1996) Analysis of Nash
apply calculus techniques, we take the state var- equilibria in a class of capital accumulation games.
J Econ Dyn Control 20:1209–1235
iable to be a subset of some real space. The Duffie D, Geanakoplos J, Mas-Colell A, Mclennan
problem is difficult but one hopes that ancillary A (1994) Stationary Markov equilibria. Econometrica
assumptions – such as concavity and 62(4):745–781
monotonicity – will be helpful. These assump- Dutta P (1991) What do discounted optima converge to?
A theory of discount rate asymptotics in economic
tions come “cheaply” because they are routinely models. J Econ Theory 55:64–94
invoked in economic applications. Dutta P (1995) A folk theorem for stochastic games. JET
The characterization result via APS techniques 66:1–32
has a similar technical difficulty blocking its path, Dutta P, Radner R (2004) Self-enforcing climate change
treaties. Proc Nat Acad Sci USA 101(14):5174–5179
as the existence question. The folk theorem needs Dutta P, Radner R (2006) Population growth and techno-
to be generalized as well to the S infinite case. logical change in a global warming model. Econ The-
Here it is our belief though that the difficulty is not ory 29:251–270
conceptual but rather one where the appropriate Dutta P, Radner R (2008a) A strategic model of global
warming model: Theory and some numbers. J Econ
result needs to be systematically worked out. As Behav Organ (forthcoming)
indicated above, the study of the dynamics of SPE Dutta P, Radner R (2008b) Choosing cleaner technologies:
paths is in its infancy and much remains to be Global warming and technological change
done here. (in preparation)
Dutta P, Sundaram R (1993) How different can strategic
Turning to the global climate change applica- models be? J Econ Theory 60:42–61
tion, this is clearly a question of utmost social Fudenberg D, Maskin E (1986) The Folk theorem in
importance. The subject here is very much in the repeated games with discounting or incomplete infor-
public consciousness yet academic study espe- mation. Econometrica 54:533–554
Harris C, Reny P, Robson A (1995) The existence of
cially within economics is only a few years old. subgame perfect equilibrium in continuous games
Many questions remain: generalizing the models with almost perfect information: a case for extensive-
to account for technological change and endoge- form correlation. Econometrica 63:507–544
nous capital accumulation, examination of a car- Inter-Governmental Panel on Climate Change (2007) Cli-
mate change, the synthesis report. IPCC, Geneva
bon tax, of cap and trade systems for emission Levhari D, Mirman L (1980) The great fish war: an exam-
permits, of an international bank that can selec- ple using a dynamic cournot-Nash solution. Bell J Econ
tively foster technological change, . . . There are – 11:322–334
as should be immediately clear – enough interest- Long N, Sorger G (2006) Insecure property rights and
growth: the role of appropriation costs, wealth effects
ing important questions to exhaust many disserta- and heterogenity. Economic Theory 28:513–529
tions and research projects! Mertens, Neyman (1983)
Mertens J-F, Parthasarathy T (1987) Equilibria for
Discounted Stochastic Games. Research Paper 8750,
CORE. University Catholique de Louvain
Nowak A (1985) Existence of equilibrium stationary strat-
Bibliography egies in discounted noncooperative stochastic games
with uncountable state space. J Optim Theory Appl
Abreu D (1988) On the theory of infinitely repeated games 45:591–603
with discounting. Econometrica 56:383–396 Parthasarathy T (1973) Discounted, positive and non-
Abreu D, Pearce D, Stachetti E (1990) Towards a general cooperative stochastic games. Int J Game
theory of discounted repeated games with discounting. Theory:2–1
Econometrica 58:1041–1065 Rieder U (1979) Equilibrium plans for non-zero sum
Benhabib J, Radner R (1992) The joint exploitation of a Markov games. In: Moeschlin O, Pallasche D (eds)
productive asset: A game-theoretic approach. Econ Game theory and related topics. North-Holland,
Theory 2:155–190 Amsterdam
82 Dynamic Games with an Application to Climate Change Models
Rustichini A (1992) Second-best equilibria for games of Stern N (2006) Review on the economics of climate change.
joint exploitation of a productive asset. Economic The- HM Treasury, London. www.sternreview.org.uk
ory 2:191–196 Stern Review on the Economics of Climate Change Oct
Shapley L (1953) Stochastic games. In: Proceedings of (2006)
national academy of sciences, Jan 1953 Sundaram R (1989) Perfect equilibrium in a class of
Sobel M (1990) Myopic solutions of affine dynamic symmetric dynamic games. J Econ Theory
models. Oper Res 38:847–853 47:153–177
Sorger G (1998) Markov-perfect Nash equilibria in a class
of resource games. Econ Theory 11:79–100
constructing games. One approach is to focus on
Static Games the possible outcomes of the decision-makers’
interaction by abstracting from the actions or deci-
Oscar Volij sions that may lead to these outcomes. The main
Ben-Gurion University, Beer-Sheva, Israel tool used to implement this approach is the coop-
erative game. Another approach is to focus on the
actions that the decision-makers can take, the
Article Outline main tool being the non-cooperative game.
Within this approach, strategic interactions are
Glossary modeled in two ways. One is by means of
Definition of the Subject dynamic, or extensive form games, and the other
Introduction is by means of static, or strategic games. Dynamic
Nash Equilibrium games stress the sequentiality of the various deci-
Analysis of Some Finite Games sions that agents can make. An essential compo-
Existence nent of a dynamic game is the description of who
Mixed Strategies moves first, who moves second, etc. Static games,
The War of Attrition (cont.) on the other hand, abstract from the sequentiality
Equilibrium in Beliefs of the possible moves, and model interactions as
Correlated Equilibrium simultaneous decisions, where the decisions may
Rationality, Correlated Equilibrium and well be complicated plans of actions that dictate
Equilibrium in Beliefs different moves for different situations that may
Rationality and Correlated Equilibrium arise. All extensive form games can be modeled as
Bayesian Games static games, and all strategic form games can be
The Asymmetric Information Version of the War modeled as extensive form games. But some sit-
of Attrition uations may be more conveniently modeled as
Evolutionary Stable Strategies one or the other kind of game.
Future Directions This chapter reviews the main ideas and results
Bibliography related to static games, as well as some interesting
relationships that connect equilibrium concepts with
Glossary the idea of rationality. The objective is to introduce
the reader to the area of static games and to stimulate
Player A participant in a game his interest for further knowledge of game theory in
Action set The set of actions that a player may general. For a comprehensive exposition of some
choose results not covered in this chapter, the reader is
Action profile A list of actions, one for each referred to the many excellent textbooks available
player on game theory. Binmore (2007), Fudenberg and
Payoff The utility a player obtains from a given Tirole (1991), Osborne (2004), Osborne and Rubin-
action profile stein (1994) constitute only a partial list.
Although the definition of a static game is a
very simple one, static games are a very flexible
Definition of the Subject model which allows us to analyze many different
situations. In particular, one can use them to ana-
Game theory concerns the interaction of decision lyze strategic interactions that involve either com-
makers. This interaction is modeled by means of mon interests or diametrically opposed interests.
games. There are various approaches to Similarly, one can also use static games to model
© Springer-Verlag 2009 83
M. Sotomayor et al. (eds.), Complex Social and Behavioral Systems,
https://doi.org/10.1007/978-1-0716-0368-0_517
Originally published in
R. A. Meyers (ed.), Encyclopedia of Complexity and Systems Science, © Springer-Verlag 2009
https://doi.org/10.1007/978-3-642-27737-5_517
84 Static Games
simultaneously. But the interaction is modeled by players to follow, or simply as equilibrium out-
defining actions in such a way that lets us think of comes in the sense that if they occur, the players
the players as acting simultaneously. do not wish that they had acted differently. These
All of the above examples involve a set of action profiles are formally given by solution con-
players, and for each player there is a set of avail- cepts, which are functions that associate each stra-
able actions and a function that associates a payoff tegic game with the selected set of action profiles.
level to each of the profiles of actions that may The central solution concept in game theory is
result from the players’ choices. These are the known as Nash equilibrium. The hypothesis
three essential components of a static game, as behind this solution concept is that each player
formalized in the following definition. chooses his actions so as to maximize his utility,
given the profile of actions chosen by the other
Definition 1 A static game is a triple hN, players. To give a formal definition of the Nash
(Ai)i N, (ui)i Ni where N is a finite set of players, equilibrium concept, we first introduce some useful
and for each player iN, Ai is i’s set actions, and notation. For each player i N, let Ai ¼ k N
ui : kNAk ! ℝ is player i’s utility function. \{i}Ak be the set of the other players’ profiles
of actions. Then we can write A ¼ Ai Ai,
and each action profile can be written as
In the prisoner’s dilemma the set of players is
a ¼ (ai, ai) Ai Ai, thereby distinguishing
N ¼ {University I, University II}; the sets of
player i’s action from the other players’ profile of
actions are AI ¼ AII ¼ {Give me 1, Give him 2};
actions.
the utility function of University I is uI (Give me 1,
Give me 1) ¼ 1, uI(Give me 1, Give him 2) ¼ 3,
u1(Give him 2, Give me 1) ¼ 0, uI(Give him 2, Definition 2 The action profile a ¼ ai i N A
Give him 2) ¼ 2; and the utility function of Uni- in a game hN, (Ai)i N, (ui)iNi, is a Nash equi-
versity II is uII(Give me 1, Give me 1) ¼ 1, uII(Give librium if for each player, i N, and every action
me 1, Give him 2) ¼ 0, uII(Give him 2, Give me ai Ai of player i, ais at least
as good for player
1) ¼ 3, uI(Give him 1, Give him 1) ¼ 1. i as the action profile ai , ai . That is, if
In this chapter we sometimes refer to static
ui ða Þ ui ai , ai for all ai Ai and for all
games simply as games. For any game hN,
(Ai)iN, (ui)iNi, the set of action profiles i N.
kNAk is denoted by A, and a typical action It is a strict Nash equilibrium if the above
profile is denoted by a ¼ (ai)iN A. If A is a finite inequality is strict for all alternative actions
set, then we say that the game is finite. Player i’s
ai Ai ∖ ai .
utility function represents his preferences over the
set of action profiles. For instance, for any two
action profiles a and a0 in A, ui(a) ui(a0) means
that player i prefers action profile a to action Analysis of Some Finite Games
profile a0. Clearly, although player i has prefer-
ences over action profiles, he can only affect his Prisoner’s Dilemma Recall that the prisoner’s
own component, ai, of the profile. dilemma can be described by the following
matrix.
University II
Nash Equilibrium Give him 2 Give me 1
Give him2 2, 2 0, 3
University I
One objective of game theory is to select, for each Give me1 3, 0 1, 1
game, a set of action profiles that are interesting in
some way. These action profiles may be interpreted The action profile (Give me 1, Give me 1) is a
as predictions of the theory, or prescriptions for the Nash equilibrium. Indeed,
86 Static Games
One can check that (Box, Box) is a Nash equi- The War of Attrition Two animals, 1 and 2, are
librium and (Ballet, Ballet) is a Nash equilibrium fighting over a prey. Each animal chooses a time at
as well. It can also be checked that these are the which it intends to give up. Once one animal has
only two action profiles that constitute a Nash given up, the other obtains the prey; if both animals
equilibrium. give up at the same time then they split the prey
equally. For each i ¼ 1, 2, animal i’s willingness to
Matching Pennies The reader can check that fight for the prey is given by vi > 0. The value vi is
Matching Pennies has no Nash equilibrium. the maximum amount of time that animal i is willing
to spend to obtain the prey. Since fighting is costly,
Before we analyze the next example, we intro- each animal prefers as short a fight as possible. If
duce a technical tool that allows us to reformulate animal i obtains the prey after a fight of length t, his
the definition of Nash equilibrium more conve- utility will be vi t. We can model the situation as
niently. More importantly, this alternative defini- the game, (G ¼ h{1, 2}, (A1, A2), (u1, u2)i where
tion is the key to the standard proof of the
existence of Nash equilibrium.
• A1 ¼ [0, 1] ¼ A2 (an element t Ai repre-
sents a time at which player i plans to give up)
Definition 3 Let G ¼ hN, (Ai)iN, (ui)iNi be a 8
> t if t1 < t2
strategic game and let i N be a player. Consider a < 1
1
list of actions ai ¼ (a, . . ., ai1, ai+1, . . ., • u1 ð t 1 , t 2 Þ ¼ v1 t2 if t1 ¼ t2
>
:2
ani k N\{i}Ak of all the players other than i. v1 t2 if t1 > t2
The set of player i’s best responses to ai is 8
> t2 if t2 < t1
<
1
B i ðai Þ ¼ fai Ai : ui ðai , ai Þ ui ðbi , ai Þ • u2 ð t 1 , t 2 Þ ¼ v2 t1 if t1 ¼ t2 :
>
:2
for all bi Ai g: v2 t1 if t2 > t1
Static Games 87
8
< ðt 1 , 1Þ
> if t1 < v2
We are interested in the best response corre-
spondences. First, we calculate player l’s best B 2 ðt1 Þ ¼ f0g [ ðt1 , 1Þ if t1 ¼ v2 :
response correspondence, B 1 ðt2 Þ. There are three >
:
f0g if t1 > v2
cases to consider.
Combining the two best response correspon-
1
Case 1 t2 < v1 In this case, v1 t2 > 2 v1 t2 dences we get that t1 , t2 is a Nash equilibrium if
and v1 t2 > t1. Consequently, given and only if either t1 ¼ 0 and t2 v1 or t2 ¼ 0 and
that player 2’s action is t2, player l’s utility t1 v2 . Figure 2 depicts the set of all the Nash
function has a maximum value of v1 t2, equilibria as the intersection of the two best
which is attained at any t1 > t2. Therefore, response correspondences. Two things are worth
B 1 ðt2 Þ ¼ ðt2 , 1Þ. noting. First, it is not necessarily the case that the
Case 2 t2 ¼ v1 In this case, 0 ¼ v1 t2 > 12 v1 t2 : player who values the prey most wins the war.
Therefore, player’s 1 utility function u1(, t2) That is, there are Nash equilibria of the war of
has a maximum value of 0, which is attrition where the player with the highest willing-
attained at t1 ¼ 0 and at t1 > t2. Therefore, ness to fight for the prey gives in first, and as a
B 1 ðt2 Þ ¼ f0g [ ðt2 , 1Þ. result the object goes to the other player. Second,
Case 3 t2 > v1 In this case 12 v1 t2 < v1 t2 < 0: in none of the Nash equilibria is there a physical
As a result, player l’s utility function u1(, t2) fight. All Nash equilibria involve one player giv-
has a maximum value of 0, which is ing in immediately to the other. This second fea-
attained at t1 ¼ 0. Therefore, B 1 ðt2 Þ ¼ f0g ture seems rather unrealistic, since fights in war of
attrition -like situations are commonly observed.
. If one wants to obtain a fight of positive length in
the war of attrition one needs to either drop the
Summarizing, player l’s best response corre- Nash equilibrium concept and adopt an alternative
spondence is: one, or model the war of attrition differently. We
will adopt this second course of action later.
8
< ðt2 , 1Þ
> if t2 < v1
B 1 ðt2 Þ ¼ f0g [ ðt2 , 1Þ if t2 ¼ v1 Existence
>
:
f0g if t2 > v1
As the matching pennies example shows, not all
games have a Nash equilibrium. The following the-
which is depicted in Fig. 1.
orem, which dates back to Nash (1950) and
Similarly, player 2’s best response correspon-
Glicksberg (1952), states sufficient conditions on a
dence is:
game for it to have a Nash equilibrium. An earlier
version of this theorem for the smaller but prominent ’i ðan Þ ¼ ain ! ai ¼ ’i ðaÞ, which means that ’
class of zero-sum games can be found in von Neu- is continuous.
mann (1928) (translated in von Neumann (1959)). Now define ’ : A ! A by ’ ¼ (’1, . . ., ’Ni.
The standard proofs use Kakutani’s fixed point the- Clearly, ’ is a continuous function mapping a
orem. We present here an alternative proof, due to compact set to itself. Therefore, by Brouwer’s
Geanakoplos (2003), which uses Brouwer’s fixed fixed point theorem, it has a fixed point: ’ðaÞ ¼
point theorem instead. a. We now show that a is a Nash equilibrium of
the game. Assume not. Then, there is some i N
Theorem 1 The game hN, (Ai)i N, (ui)iNi has a with ai Ai such that U i ðai , ai Þ Ui ðaÞ ¼
Nash equilibrium if for all i N E > 0. Then, by concavity of Ui, for all 0 < e < 1,
U i ðeai þ ð1 eÞai , ai Þ U i ðaÞ
• the set Ai of actions of player i is a nonempty
compact convex subset of an Euclidean space, eU i ðai , ai Þ þ ð1 eÞU i ðaÞ Ui ðaÞ
• the utility function ui is continuous, eE > 0,
• the utility function ui is concave in Ai.
while k eai þ ð1 eÞai ai k2 ¼ e2 k ai ai k2
< eE; for small enough e. Therefore, for such
Proof (Geanakoplos) Define the correspondence
small e, the action eai þ ð1 eÞai satisfies
’i: A ↠ Ai by
’i ðaÞ ¼ arg max U i ðai , ai Þ kai ai k2 , Ui ðeai þ ð1 eÞai , ai Þ
ai Ai
where, k k denotes a norm in the relevant Euclid- k eai þ ð1 eÞai ai k2 > U i ðaÞ
ean space. Note first that ’i is a nonempty valued
correspondence because the maximand is a con- which contradicts the fact that ’i ðaÞ ¼ ai . □
tinuous function and Ai is compact. Second, note
that the function k ai ai k is convex:
equilibrium altogether, but to modify the way we Definition 4 An equilibrium in mixed strategies
model the problematic situations. The idea behind of the game, hN, (Ai)iN, (ui)iNi is a Nash equi-
mixed strategies is to first modify the game by librium of the mixed extension of the game. In
extending the set of actions available to the players, other words, it is a list of mixed strategies
and then to apply the concept of Nash equilibrium xk kN X such that for all players i N and
to this extended game. In this way one may obtain for all his mixed strategies xi,
additional Nash equilibria, some of which may
provide reasonable predictions to the game. U i xk kN U i xi , xi
Let G ¼ (N, (Ai)iN, (ui)iNi, be a finite
game. For any Ai, a probability distribution on Ai
is a function Alternatively, xk kN X is a mixed strategy
equilibrium if
xi : Ai ! ℝþ
such that xi B i xi for all i N:
X
xi ðai Þ ¼ 1: Note that for every finite game G ¼ hN,
ai Ai
(Ai)iN, (ui)iNi, its mixed extension is a strategic
The set of all probability distributions on Ai is game that satisfies the conditions Theorem 1. As a
denoted by D(Ai). A mixed strategy on Ai is a result, every finite game has a mixed strategy
random choice over elements of Ai, namely an equilibrium.
element of D(Ai). If xi is a mixed strategy on Ai,
xi(ai) denotes the probability that action ai Ai is Example 1 Consider again Matching Pennies. Its
selected when xi is adopted. Since elements of mixed extension is the game hN, (Xi)iN, (Ui)iNi,
D(Ai) can have an alternative interpretation, such where the set of players is N ¼ {1, 2}, the sets of
as beliefs about the choice of player i, we denote mixed strategies are X1 ¼ {(pH, pT)
the set of mixed strategies by Xi to distinguish it (0, 0): pH + pT ¼ 1}, and X2 ¼
from the more abstract set of probability distribu- {(qH, qT) (0, 0): qH + qT ¼ 1}, and the utility
tions on Ai. Also, we denote the set of mixed functions are given by U1((pH, pT), (qH, qT))
strategy problem as X ¼ iNXi. Denoting for ¼ pHqH + pTqT pHqT pTqH and U2((pH, pT),
each player iN, Xi¼ kN\{i}Xk, a typical (qH, qT)) ¼ pHqT + pTqH pHqH pTqT. It can be
mixed strategy profile can be written as checked that the only Nash equilibrium of this
(xk)kN ¼ (xi, xi) Xi Xi. The mixed exten- mixed extension is ((1/2, 1/2), (1/2, 1/2)). Indeed,
sion of the strategic game G is the strategic game since U1((pH, pT), (1/2;1/2)) is identically 0, it
hN, (Xi)iN, (Ui)iNi where the set of actions of attains its maximum at, among other strategies,
player i is the set of mixed strategies, Xi, and the (1/2;1/2). The same is true for U2((1/2, 1/2),
payoff function Ui : iNXi ! ℝ of player i is (qH, qT)). To see that there is no other equilibrium,
defined by note that for (qH, qT) with qH > qT, player l’s best
X response is (0, 1). But player 2’s best response to
U i ðxk ÞkN ¼ ui ðaÞPkN xk ðak Þ: (0, 1), is (1, 0). Since 0 1, (qH, qT) with qH > qT
a¼ðak ÞkN A cannot be part of an equilibrium. Similarly, for any
(qH, qT) with qH < qT, player l’s best response is
Remark 1 Since each mixed strategy of player (0, 1). But player 2’s best response to (0,1) is (1, 0).
i, xi, can be identified with a vector xi ¼ Since 1 0, (qH, qT) with qH < qT cannot be part
of an equilibrium.
ðxi ðai ÞÞaiAj ℝjAi j , the function Ui is multino-
mial in the coordinates of its variables, and, as a
result, it is continuous as a function of the We next present a characterization of the mixed
players’ mixed strategies. strategy equilibria of a game that will sometimes
90 Static Games
X
allow us to compute them in an easy way. Further, Ui xi , xi ¼ xi ðai ÞU i ai , xi
this characterization serves as the basis of an inter- ai Ai
esting interpretation of the mixed strategy equilib- X
xi ðai ÞUi ðx Þ ¼ U i ðx Þ
rium concept that we will discuss later. For this
ai Ai
purpose, we identify the action ai Ai of player
i with the mixed strategy of player i that assigns and therefore x is an equilibrium.
probability 1 to action ai, and 0 to all other actions.
Assume now that x ¼ xk k N is an equilib-
Therefore, given a player i, one of his actions
rium. Let i N. Then
ai Ai, and a profile x ¼ (xk)kN of the players’
mixed strategies, (ai, xi) denotes the mixed strat-
egy profile obtained from x by replacing i’s mixed U i ðx Þ U i ai , xi 8ai Ai ð4Þ
strategy xi by the mixed strategy of player i that
assigns probability 1 to action ai. With this notation and, in particular, condition (3) holds for all ai Ai
we can state the following identity: such that xi(ai) ¼ 0. Also, using (1) we can write
X X
X xi ðai ÞU i ðx Þ ¼ xi ðai ÞUi ai , xi :
U i ððxk Þk N Þ ¼ xi ðai ÞU i ððai , xi ÞÞ: ð1Þ ai Ai ai Ai
ai Ai
ð5Þ
Indeed,
If there is ai Ai such that xi ðai Þ > 0 and
X
U i ðx k Þk N ¼ ui ðaÞPk N xk ðak Þ U i ðx Þ > Ui ai , xi then, using (4),
a¼ðak Þk N A X X
X X xi ðai ÞU i ðx Þ > xi ðai ÞUi ai , xi
¼ ui ðaÞPk N xk ðak Þ ai Ai ai Ai
ai Ai ai Ai
X X
¼ x i ð ai Þ ui ðaÞPk N∖figxk ðak Þ in contradiction to (5). □
ai Ai ai Ai
X
¼ xi ðai ÞU i ððai , xi ÞÞ: Corollary 1 The strategy profile x ¼ xk k N is
ai Ai an equilibrium of the mixed extension of
hN, (Ai)iN, (ui)iN) and only if for all players
Identity (1) is useful to prove the following i N and for all ai Ai,
characterization of the mixed strategy Nash
equilibria. xi ðai Þ > 0 implies ai B i xi :
Lemma 1 The strategy profile x ¼ xk k N is
According to the standard interpretation, a
an equilibrium of the mixed extension of hN,
player’s mixed strategy in a game G is an action,
(Ai)i N, (ui)iN) if and only if for all players
but in a different game, namely in the mixed
i N and for all ai Ai,
extension of G. According to this interpretation,
a mixed strategy is a deliberate choice of a player
If xi ðai Þ > 0 then U i ai , xi to use a random device. A mixed strategy equilib-
¼ Ui ðx Þ ð2Þ rium then is a profile of independent random
devices, each of which is a best response to the
If xi ðai Þ ¼ 0 then U i ai , xi
Ui ðx Þ: ð3Þ others. Corollary 1 provides an alternative inter-
pretation of a mixed strategy equilibrium.
Proof Assume that x ¼ xk k N satisfies condi- According to this interpretation, a player’s mixed
tions (2) and (3). Let i N, and let xi be a mixed strategy represents the uncertainty in the minds of
strategy of player i. Then, by (1) the other players concerning the player’s action.
Static Games 91
In other words, a player’s mixed strategy is Therefore, the corresponding expected utility
interpreted not as a deliberate choice of the player of choosing time t is
but the belief, shared by all the other players, Z t
about the player’s choice. That is, if (xk)k N is Ui t, F j ¼ 1 F j ðtÞ ðtÞ þ F j ðtÞ vi t j
0
a profile of mixed strategies, then xi is the conjec- Z t
Fj tj
ture, shared by all the players other than i, about d ¼ ð1 F j ðtÞÞðtÞ þ vi t j dF j t j :
F j ðt Þ 0
i’s ultimate choice action. Consequently, xi are
the conjectures entertained by player i about his Since in the equilibrium we are looking for,
opponents’ actions. According to this interpreta- player i is indifferent among all his actions, the
tion, Corollary
1 says that a mixed strategy equi- above expression is independent of t. Namely,
librium xk k N is a profile of beliefs about each Ui(t, Fj) c. As a result, the derivative of the
player’s actions (entertained by the other players) above utility with respect to t equals 0. Formally,
according to which each player chooses an action
that is a best response to his own beliefs.
@U i t, F j
¼ t f j ðtÞ 1 F j ðtÞ þ ðvi tÞ f j ðtÞ
@t
The War of Attrition (cont.) ¼ 1 F j ðtÞ þ vi f j ðtÞ ¼ 0:
willingness to fight wins the war than the other strategic game hN, (Xi)i N, (Ui)i Ni where, as
way around. In particular, the probability that in Sect. “Mixed Strategies”, Xi is the set of proba-
player 1 gets the object is given by bility distributions over the actions in Ai, for i N,
but unlike there, the utility function Ui : X ! ℝN is
Z 1
not necessarily a multilinear function of the proba-
F2 ðtÞdF1 ðtÞ
0 bilities, but a general continuous function of the
mixed strategies. The only requirement on Ui is
which can be checked to be equal to v1vþv 2
2
> 1=2. that for all profiles of degenerate mixed strategies
In order to obtain the more intuitive result that the (ak)kN, we have Ui((ak)kN) ¼ ui((ak)kN). As
higher the willingness to fight for the prey, the before, a mixed strategy Nash equilibrium of
higher is the probability to obtain it, we will need hN, (Ai)iN, (ui)iNi is a Nash equilibrium of its
to model the war of attrition in yet a different way. mixed extension hN, (Xi)iN, (Ui)iN i. In other
Weil return to this when we introduce asymmetric words, it is a list of mixed strategies xk k N such
information to the games. that for all players i N and for all of his mixed
strategies xi,
U i xk k N U i xi , xi :
Equilibrium in Beliefs
Alternatively, xk k N is a mixed strategy
The mixed extension of the game hN, (Ai)iN, equilibrium if
(ui)iNi is constructed in two steps. First, we
enlarge the set of actions available to each player xi B i xi for all i N:
by allowing him to choose any mixed strategy on
his original action set. Second, since the action Observation 1 It is important to note that two
choices are now probability distributions over different actions of a player may be best responses
actions, we extend the players’ original prefer- to a given mixed strategy profile of the other
ences to preferences over profiles of mixed strat- players, and yet no probability mixture of the
egies. We do so by evaluating each mixed strategy two actions will be a best response to the given
profile according to the expected value of the mixed strategy profile. This will typically be the
original utilities with respect to the probability case when the function Ui is strictly convex in Xi,
distribution over action profiles induced by the since strictly convex functions attain their maxi-
mixed strategy. mum at boundary points.
The first step seems uncontroversial since it is
certainly possible for players to use random Theorem 1 shows that Nash equilibria exist
devices. But the second step is somewhat prob- when the extended utility function Ui is concave
lematic because, by evaluating mixed strategies in Xi. However, Observation 1 indicates that a
according to the expected utility of the resulting Nash equilibrium may fail to exist when Ui
lotteries, one is implicitly imposing on the players is strictly convex in Xi. Indeed, take a game,
a certain kind of risk preferences. One may won- G ¼ hN, (Ai)iN, (ui)iNi with no pure strategy
der what the implications would be if instead of Nash equilibrium, like Matching Pennies, and con-
extending the preferences by assuming that sider its mixed extension G ¼ hN, (Xi)iN, (Ui)iNi,
players are expected utility maximizers, we where for all players, their extended utility function
assume that players have more general prefer- is strictly convex. Then, for any player i N and
ences over profiles of mixed strategies. In partic- for any profile of mixed strategies xi of the other
ular, we would like to know if there is a suitable players, the set of i’s best responses B i ðxi Þ consists
generalization of Corollary 1. of only degenerate mixed strategies. Since G has no
Let G ¼ hN, (Ai)i N, (ui)i Ni be a finite pure strategy Nash equilibrium, we conclude that G
game. We define the mixed extension of G as the does not have a Nash equilibrium.
Static Games 93
Observation 2 It is also important to note that, unlike in the standard expected utility case, a player's mixed strategy x_i may very well be a best response to some profile x_{-i} of the other players' mixed strategies and at the same time assign positive probability to an action that (when regarded as a degenerate mixed strategy) is not a best response to x_{-i}. Formally, it may very well be the case that

U_i((x_k)_{k∈N}) ≥ U_i(x'_i, x_{-i})   for all x'_i ∈ X_i

and yet

U_i(a_i, x_{-i}) < U_i((x_k)_{k∈N})

for some a_i such that x_i(a_i) > 0. This will typically occur when the function U_i is strictly concave in X_i.

The definition of mixed strategy equilibrium requires from each strategy in the equilibrium profile that it be a best response to the other strategies. Corollary 1 stated that when preferences have the expected utility form, each mixed strategy in a mixed strategy equilibrium is also a probability mixture over best responses to the other strategies in the profile. This result allowed us to interpret a mixed strategy Nash equilibrium as a profile of beliefs, rather than as a profile of probability mixtures. As explained in Observation 2, however, when preferences over mixed strategies are not expected utility preferences, a mixture over best responses is not necessarily a best response. Therefore, Corollary 1 does not extend to the mixed extension where preferences are not of the expected utility form.

In this setup, however, one can still interpret a player's mixed strategy as a belief entertained by the other players about the actions chosen by that player. And a profile of such beliefs will be in equilibrium if the probability distribution over the player's actions that represents i's beliefs is obtained as a mixture of best responses of this player to his beliefs about the other players' actions. With this idea in mind, Crawford (1990) defined the notion of an equilibrium in beliefs. Before we formally present his definition we need to introduce some notation.

Since, when the extended utility functions U_i are concave in i's own strategy, a best response to a given profile of the other players' strategies may be a non-degenerate mixed strategy, a mixture of best responses will typically be a mixture over non-degenerate mixed strategies. This mixture induces a probability distribution over actions in a natural way by reducing the compound mixture to a simple mixture. This induced probability distribution can be interpreted as a belief over the actions ultimately chosen. For example, in Matching Pennies, if player 1 believes that there is a probability of 1/2 that player 2 will choose the mixed strategy (1/3, 2/3) and a probability of 1/2 that player 2 will choose the mixed strategy (2/3, 1/3), then player 1 believes that player 2 will choose each one of his two actions with equal probability. More generally, if player i assigns probability p^k to the event that player j will choose mixed strategy x^k ∈ X_j, for k = 1, ..., K, then player i's beliefs about player j's actions are given by Σ_{k=1}^K p^k x^k ∈ X_j. That is, for each action a_j ∈ A_j of player j, player i believes that player j will choose a_j with probability Σ_{k=1}^K p^k x^k(a_j). For each set T ⊆ X_i of mixed strategies, let Δ[T] ⊆ X_i denote the set of probability distributions over i's actions that are induced by mixtures over elements of T. With this notation in hand, we can define the concept of equilibrium in beliefs.

Definition 5 Let G = ⟨N, (A_i)_{i∈N}, (u_i)_{i∈N}⟩ be a game. For each i ∈ N, let B_i : X → X_i be the best response correspondence in the mixed extension of G. The profile of beliefs (x_k)_{k∈N} ∈ ∏_{k∈N} Δ(A_k) is an equilibrium in beliefs if

x_i ∈ Δ[B_i(x_{-i})]   for all i ∈ N.

An equilibrium in beliefs is a profile of beliefs (x_k)_{k∈N}. For each i ∈ N, x_i is the common belief of the players other than i about player i's choice of actions. In order for this profile of beliefs to be in equilibrium, we require that for each player i ∈ N all the other players believe that i chooses a mixed strategy that is a best response to his
beliefs, which are given by (x_k)_{k∈N∖{i}}, about the other players' choices of actions. In other words, x_i must be a convex combination of best responses of i to (x_k)_{k∈N∖{i}}.

Example 2 Consider again the mixed extension of Matching Pennies ⟨N, (X_i)_{i∈N}, (U_i)_{i∈N}⟩, where the set of players is N = {1, 2}, the sets of mixed strategies are X_1 = {(p_H, p_T) ≥ (0, 0) : p_H + p_T = 1} and X_2 = {(q_H, q_T) ≥ (0, 0) : q_H + q_T = 1}, and the utility functions are now given by

U_1((p_H, p_T), (q_H, q_T)) = (p_H q_H)² + (p_T q_T)² − p_H q_T − p_T q_H

and

U_2((p_H, p_T), (q_H, q_T)) = (p_H q_T)² + (p_T q_H)² − p_H q_H − p_T q_T.

Since the utility functions are strictly convex in the players' own mixed strategies, the best response to any strategy of the opponent is a pure strategy. In particular, one can verify that

B_1(q_H, q_T) = (0, 1) if q_H > q_T;  {(1, 0), (0, 1)} if q_H = q_T;  (1, 0) if q_H < q_T,

and

B_2(p_H, p_T) = (0, 1) if p_H > p_T;  {(1, 0), (0, 1)} if p_H = p_T;  (1, 0) if p_H < p_T.

It can also be verified that ((p_H, p_T), (q_H, q_T)) = ((1/2, 1/2), (1/2, 1/2)) is an equilibrium in beliefs. Indeed, for both i = 1, 2, (1/2, 1/2) ∈ X_i is a convex combination of (1, 0) and (0, 1), which are both in B_j(1/2, 1/2), j ≠ i. In this equilibrium,

1. Player 1 believes that player 2 will choose (1, 0) with probability 1/2, and (0, 1) with probability 1/2.
2. Player 1 believes that player 2 will ultimately choose H and T each with probability 1/2.
3. Given these beliefs, player 1's only best replies are (1, 0) and (0, 1), and (1/2, 1/2) is a convex combination of these best replies.
4. Player 2 believes that player 1 will choose (1, 0) with probability 1/2, and (0, 1) with probability 1/2.
5. Player 2 believes that player 1 will ultimately choose H and T each with probability 1/2.
6. Given these beliefs, player 2's only best replies are (1, 0) and (0, 1), and (1/2, 1/2) is a convex combination of these best replies.

The following result is a direct implication of the definition of an equilibrium in beliefs.

Proposition 2 (Crawford 1990) Let G = ⟨N, (A_i)_{i∈N}, (u_i)_{i∈N}⟩ be a strategic game, and let ⟨N, (X_i)_{i∈N}, (U_i)_{i∈N}⟩ be the mixed extension of G, where U_i is continuous but not necessarily multilinear.

1. Every mixed strategy Nash equilibrium of the mixed extension is an equilibrium in beliefs.
2. If, for all i ∈ N, U_i is quasiconcave in X_i, then every equilibrium in beliefs is a mixed strategy Nash equilibrium of the mixed extension.

Proof
1. Since B_i(x_{-i}) ⊆ Δ[B_i(x_{-i})] for all i ∈ N, every Nash equilibrium is an equilibrium in beliefs.
2. When the utility function U_i is quasiconcave in i's mixed strategy, the set of best responses B_i(x_{-i}) is a convex set. Therefore, Δ[B_i(x_{-i})] = B_i(x_{-i}), and any equilibrium in beliefs is a Nash equilibrium. □

Crawford (1990) shows that although some games have no Nash equilibrium, every game has an equilibrium in beliefs.
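The claim in Example 2 is easy to check numerically. The following sketch (an illustration of mine, assuming the utility functions written above) verifies that, against the belief (1/2, 1/2) about the opponent, both pure strategies are best responses for each player, so that (1/2, 1/2) is a convex combination of best responses.

def U1(p, q):
    pH, pT = p; qH, qT = q
    return (pH*qH)**2 + (pT*qT)**2 - pH*qT - pT*qH

def U2(p, q):
    pH, pT = p; qH, qT = q
    return (pH*qT)**2 + (pT*qH)**2 - pH*qH - pT*qT

def best_responses(U, opp, own_strategies):
    # Strict convexity in a player's own strategy means the maximum over the simplex
    # is attained at a vertex, so it suffices to compare the two pure strategies.
    vals = {s: U(s, opp) for s in own_strategies}
    m = max(vals.values())
    return [s for s, v in vals.items() if abs(v - m) < 1e-12]

pure = [(1.0, 0.0), (0.0, 1.0)]
belief = (0.5, 0.5)
br1 = best_responses(U1, belief, pure)                      # player 1's best responses to x_2 = (1/2, 1/2)
br2 = best_responses(lambda q, p: U2(p, q), belief, pure)   # player 2's best responses to x_1 = (1/2, 1/2)
print(br1, br2)   # both pure strategies are best responses for each player,
                  # and (1/2, 1/2) is a convex combination of them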
realization of the other players' random variables. There is nothing in the bare notion of equilibrium, however, that requires players' behavior to be independent. The basic feature of an equilibrium is that each player is best responding to the behavior of others, and that each player is free to choose any action in his action set. But one thing is that players can, if they so wish, change their behavior without the consent of others, and another different thing is to expect players' choices to be independent. Therefore, one could ask what would happen if the random devices players use to ultimately choose their actions were correlated. In that case, knowledge of the realization of one's random device would provide some partial information about the realization of the other players' random devices, and therefore of their choices. In equilibrium, a player should take this information into account. To illustrate this point, consider the game of Chicken.

                              Driver 2
                              Slow Down    Speed Up
Driver 1     Slow Down          6, 6         2, 7
             Speed Up           7, 2         0, 0

This game has two pure action Nash equilibria and one equilibrium in mixed strategies. According to the mixed strategy Nash equilibrium, each player chooses Slow Down with probability 2/3 and Speed Up with probability 1/3. This mixed strategy equilibrium can be implemented by the following random device. Consider two random variables S1 and S2, whose joint distribution is given by the following table:

Static Games, Table 1  A random device

                  S2
                  1       2
S1       1       4/9     2/9
         2       2/9     1/9

Driver 1 chooses his action as a function of the realization of S1 and Driver 2 chooses his action as a function of the realization of S2. (Neither player is informed of the realization of the other player's random variable.) In particular, Driver 1 chooses Slow Down if S1 = 1 and Speed Up otherwise. Similarly, Driver 2 chooses Slow Down if S2 = 1, and Speed Up otherwise. Note that according to this pattern of behavior, each player chooses to slow down with probability 2/3. But more importantly, since S1 and S2 are independent random variables, knowledge of the realization of one random variable does not give any information about the realization of the other one. Therefore, after Driver 1 learns the realization of S1, he still believes that Driver 2 will choose Slow Down with probability 2/3 and consequently any choice is optimal, in particular the one described above. Similarly, after Driver 2 learns the realization of S2, he still believes that Driver 1 will choose to slow down with probability 2/3, and his planned behavior continues to be optimal.

But what would happen if the joint distribution of S1 and S2 was not as presented in Table 1, but rather as follows?

                  S2
                  1       2
S1       1       1/3     1/3
         2       1/3      0

To answer this question, assume that both players still choose their actions according to the previous pattern of behavior: Driver 1 chooses Slow Down if S1 = 1, and Speed Up otherwise. The same holds for Driver 2. As a result, it is still true that each player chooses Slow Down with probability 2/3 and Speed Up with probability 1/3. However, since this time the conditioning random variables S1 and S2 are not independent, knowledge of the realization of S1 affects the beliefs of Driver 1 about the probability with which Driver 2 chooses his actions. In particular, if S1 = 1, Driver 1 updates his beliefs and assigns probability 1/2 to Driver 2 choosing either action, and consequently Driver 1's only optimal action is Slow Down, which is precisely the choice dictated by the above pattern of behavior. Similarly, if S1 = 2, Driver 1 should update his beliefs and assign probability one to Driver 2 choosing Slow Down. Consequently, Driver 1's best reply is to follow the above pattern of behavior and choose Speed Up. One can see that, given that the players know that the random variables S1 and S2 are correlated and they use this information accordingly, there is no incentive for either of them to deviate from the proposed pattern of behavior. Therefore, we can say that this pattern of behavior is an equilibrium.
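As a quick numerical check of this argument, the following sketch (my own illustration, not part of the original text) computes Driver 1's conditional beliefs and expected payoffs under the correlated table:

u1 = {("S", "S"): 6, ("S", "F"): 2, ("F", "S"): 7, ("F", "F"): 0}   # S = Slow Down, F = Speed Up
joint = {(1, 1): 1/3, (1, 2): 1/3, (2, 1): 1/3, (2, 2): 0.0}        # P(S1 = s1, S2 = s2)
act = {1: "S", 2: "F"}                                              # the proposed pattern of behavior

for s1 in (1, 2):
    total = sum(joint[(s1, s2)] for s2 in (1, 2))
    cond = {s2: joint[(s1, s2)] / total for s2 in (1, 2)}           # Driver 1's updated beliefs about S2
    payoff = {a: sum(cond[s2] * u1[(a, act[s2])] for s2 in (1, 2)) for a in ("S", "F")}
    print(s1, payoff)   # S1 = 1: Slow Down (4.0) beats Speed Up (3.5);
                        # S1 = 2: Speed Up (7.0) beats Slow Down (6.0)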
This notion of a correlated equilibrium was introduced in Aumann (1974). Before we give a formal definition we introduce the concept of a correlated strategy profile, which will play a central role not only in this section, but in the next one as well.

Definition 6 Let G = ⟨N, (A_i)_{i∈N}, (u_i)_{i∈N}⟩ be a game. A correlated strategy profile in G consists of a finite probability space (Ω, p) and, for each player i ∈ N, a partition P_i of Ω into events of positive probability and a function s_i : Ω → A_i that is measurable with respect to P_i.

For instance, the correlated device constructed above for the game of Chicken corresponds to the correlated strategy profile with

• Ω = {(1, 1), (1, 2), (2, 1)}
• p(ω) = 1/3 for all ω ∈ Ω
• P_I = {{(1, 1), (1, 2)}, {(2, 1)}} and P_II = {{(1, 1), (2, 1)}, {(1, 2)}}
• s_I(ω) = Slow Down if ω ∈ {(1, 1), (1, 2)}, and Speed Up if ω ∈ {(2, 1)}
• s_II(ω) = Slow Down if ω ∈ {(1, 1), (2, 1)}, and Speed Up if ω ∈ {(1, 2)}.
A correlated strategy profile is a correlated equilibrium if no player can, by switching to another strategy (another P_i-measurable function from Ω to A_i), obtain a profile of strategies under which his expected utility is increased. Note that the player presumably chooses his strategy (his way to condition his actions on the outcomes of the random device) before he learns the realization of the device. Nonetheless, he evaluates the outcomes generated by the players' strategies by taking into account the precise correlation of the random devices on which players are conditioning their behavior. Although strictly speaking mixed strategy Nash equilibria are not correlated equilibria, they do induce a correlated equilibrium distribution over action profiles. In order to state this claim, we need the following definition.

Definition 8 Let ⟨(Ω, p), (P_i, s_i)_{i∈N}⟩ be a correlated strategy profile for G. Its induced probability distribution over action profiles is given by the function π : A → [0, 1] defined by

π(a) = p({ω ∈ Ω : s(ω) = a}) = Σ_{ω∈Ω : s(ω)=a} p(ω)   for all a ∈ A.

Proposition 3 Let G = ⟨N, (A_i)_{i∈N}, (u_i)_{i∈N}⟩ be a strategic game, and let x = (x_1, ..., x_n) be a mixed strategy Nash equilibrium of G. Then, there is a correlated equilibrium ⟨(Ω, p), (P_i)_{i∈N}, (s_i)_{i∈N}⟩ whose induced probability distribution over action profiles is the same as x's distribution.

Proof Let ⟨(Ω, p), (P_i)_{i∈N}, (s_i)_{i∈N}⟩ be defined as follows:

Ω = A,
p(a) = ∏_{i∈N} x_i(a_i),
P_i(a) = {b ∈ A : b_i = a_i},
s_i(a) = a_i.

We claim that ⟨(Ω, p), (P_i)_{i∈N}, (s_i)_{i∈N}⟩ is a correlated equilibrium. Since x is a mixed strategy Nash equilibrium of G, for every player i and every a_i ∈ A_i,

if x_i(a_i) > 0 then U_i(x) = U_i(a_i, x_{-i}),
if x_i(a_i) = 0 then U_i(x) ≥ U_i(a_i, x_{-i}).

Consequently, for all a_i ∈ A_i,

x_i(a_i) U_i(a_i, x_{-i}) ≥ x_i(a_i) U_i(b_i, x_{-i})   for all b_i ∈ A_i.   (7)

Now let t_i : A → A_i be a function that is measurable with respect to P_i. Let a_{-i} ∈ A_{-i} be a fixed profile of actions for players other than i. Letting b_i = t_i(a_i, a_{-i}), Eq. (7) implies that

x_i(a_i) U_i(a_i, x_{-i}) ≥ x_i(a_i) U_i(t_i(a_i, a_{-i}), x_{-i})   for all a_i ∈ A_i.

Adding over all a_i ∈ A_i,

Σ_{a_i∈A_i} x_i(a_i) U_i(a_i, x_{-i}) ≥ Σ_{a_i∈A_i} x_i(a_i) U_i(t_i(a_i, a_{-i}), x_{-i}).

Taking into account the definition of U_i(a_i, x_{-i}) and U_i(t_i(a), x_{-i}), and using the measurability of t_i with respect to P_i, we get

Σ_{a_i∈A_i} x_i(a_i) ( Σ_{a_{-i}∈A_{-i}} ∏_{j∈N∖{i}} x_j(a_j) ) u_i(a_i, a_{-i}) ≥ Σ_{a_i∈A_i} x_i(a_i) ( Σ_{a_{-i}∈A_{-i}} ∏_{j∈N∖{i}} x_j(a_j) ) u_i(t_i(a), a_{-i}),

that is,

Σ_{a∈A} ( ∏_{j∈N} x_j(a_j) ) u_i(a_i, a_{-i}) ≥ Σ_{a∈A} ( ∏_{j∈N} x_j(a_j) ) u_i(t_i(a), a_{-i}),

Σ_{a∈A} p(a) u_i(a_i, a_{-i}) ≥ Σ_{a∈A} p(a) u_i(t_i(a), a_{-i}),

Σ_{a∈A} p(a) u_i(s_i(a), s_{-i}(a)) ≥ Σ_{a∈A} p(a) u_i(t_i(a), s_{-i}(a)).
Finally, the induced probability distribution over action profiles satisfies

π(a) = p({b ∈ A : s(b) = a}) = p({b ∈ A : b = a}) = p(a) = ∏_{i∈N} x_i(a_i),

which is exactly the distribution over action profiles induced by x. □
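The construction in the proof is easy to carry out explicitly. The sketch below (my own illustration for standard Matching Pennies, not part of the original text) builds the device Ω = A with the product distribution and checks that no P_i-measurable deviation, which here amounts to a map from player i's own recommendation to an action, is profitable.

from itertools import product

A = ["H", "T"]
u = {1: lambda a1, a2: 1 if a1 == a2 else -1,
     2: lambda a1, a2: -1 if a1 == a2 else 1}
x = {1: {"H": 0.5, "T": 0.5}, 2: {"H": 0.5, "T": 0.5}}   # the mixed Nash equilibrium of Matching Pennies

# Omega = A, p is the product of the equilibrium mixtures, s_i(a) = a_i (Proposition 3).
p = {(a1, a2): x[1][a1] * x[2][a2] for a1, a2 in product(A, A)}

def expected_payoff(i, t):
    # t maps player i's recommended action to the action actually played; a P_i-measurable
    # deviation t_i : A -> A_i depends only on a_i, i.e., on the recommendation.
    return sum(q * (u[i](t(a1), a2) if i == 1 else u[i](a1, t(a2))) for (a1, a2), q in p.items())

for i in (1, 2):
    honest = expected_payoff(i, lambda r: r)
    for img in product(A, repeat=len(A)):
        dev = dict(zip(A, img))
        assert honest >= expected_payoff(i, lambda r, dev=dev: dev[r]) - 1e-12
print("no profitable measurable deviation")   # the product device is a correlated equilibrium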
Although a correlated strategy profile consists of a randomizing device used by the players, it turns out that the only feature of the device that determines whether or not the correlated strategy profile constitutes a correlated equilibrium is its induced probability distribution over the action profiles. This is shown by the next proposition.

Proposition 4 Let G = ⟨N, (A_i)_{i∈N}, (u_i)_{i∈N}⟩ be a finite strategic game. Every correlated equilibrium probability distribution over action profiles can be obtained in a correlated equilibrium of G in which

• Ω = A
• P_i(a) = {b ∈ A : b_i = a_i}.

Proof Let ⟨(Ω′, p′), (P′_i, s′_i)_{i∈N}⟩ be a correlated equilibrium of G. Consider the correlated strategy profile ⟨(Ω, p), (P_i, s_i)_{i∈N}⟩ defined by

• Ω = A
• p(a) = p′({ω ∈ Ω′ : s′(ω) = a}) for each a ∈ A
• P_i(a) = {b ∈ A : b_i = a_i} for each i ∈ N and each a ∈ A
• s_i(a) = a_i for each i ∈ N.

It is clear that this correlated strategy profile induces the required distribution over action profiles. Indeed,

π(a) = p({ω ∈ Ω : s(ω) = a}) = p({a′ ∈ A : a′ = a}) = p(a) = p′({ω ∈ Ω′ : s′(ω) = a}).

It remains to show that this profile is a correlated equilibrium. Take a function t_i : A → A_i that is measurable with respect to P_i. Define t′_i : Ω′ → A_i by t′_i(ω) = t_i(s′(ω)) = t_i(s′_i(ω), s′_{-i}(ω)). The function t′_i is measurable with respect to P′_i. Indeed, if ω′ ∈ P′_i(ω), then s′_i(ω′) = s′_i(ω) by measurability of s′_i with respect to P′_i. Therefore, by definition of P_i, P_i(s′(ω′)) = P_i(s′(ω)), and both s′(ω′) and s′(ω) belong to the same element of P_i. Since t_i is measurable with respect to P_i, we conclude that t′_i(ω′) = t_i(s′(ω′)) = t_i(s′(ω)) = t′_i(ω).

Also,

Σ_{ω∈Ω} p(ω) u_i(t_i(ω), s_{-i}(ω)) = Σ_{a∈A} p(a) u_i(t_i(a), a_{-i})
   = Σ_{a∈A} Σ_{ω∈Ω′ : s′(ω)=a} p′(ω) u_i(t_i(s′(ω)), s′_{-i}(ω))
   = Σ_{a∈A} Σ_{ω∈Ω′ : s′(ω)=a} p′(ω) u_i(t′_i(ω), s′_{-i}(ω))
   = Σ_{ω∈Ω′} p′(ω) u_i(t′_i(ω), s′_{-i}(ω)).

In particular, for t_i = s_i,

Σ_{ω∈Ω} p(ω) u_i(s_i(ω), s_{-i}(ω)) = Σ_{ω∈Ω′} p′(ω) u_i(s′_i(ω), s′_{-i}(ω)).

Since ⟨(Ω′, p′), (P′_i, s′_i)_{i∈N}⟩ is a correlated equilibrium,

Σ_{ω∈Ω′} p′(ω) u_i(s′_i(ω), s′_{-i}(ω)) ≥ Σ_{ω∈Ω′} p′(ω) u_i(t′_i(ω), s′_{-i}(ω)),

and therefore

Σ_{ω∈Ω} p(ω) u_i(s_i(ω), s_{-i}(ω)) ≥ Σ_{ω∈Ω} p(ω) u_i(t_i(ω), s_{-i}(ω)). □

Rationality, Correlated Equilibrium and Equilibrium in Beliefs

As mentioned earlier, Nash equilibrium and correlated equilibrium are two examples of what is known as solution concepts. Solution concepts assign to each game a pattern of behavior for the players in the game. The interpretation of these
patterns of behavior is not always explicit, but it is fair to say that they are usually interpreted either as descriptions of what rational people do, or as prescriptions of what rational people should do. There is a growing literature that tries to connect various game theoretic solution concepts to the idea of rationality. Rationality is generally understood as the characteristic of a player who chooses an action that maximizes his preferences, given his information about the environment in which he acts. Part of the information a player has is represented by his beliefs about the behavior of other players, their beliefs about the behavior of other players, and so on. So when one speaks of the rationality of players, one needs to take into account their epistemic state. There is a formal framework which is appropriate for discussing the actions, knowledge, beliefs and rationality of players, namely, the framework of a correlated strategy profile. As defined in Sect. "Correlated Equilibrium", a correlated strategy profile in a game G consists of

• A finite probability space (Ω, p)
• For each player i ∈ N a partition P_i of Ω into events of positive probability
• For each player i ∈ N a function s_i : Ω → A_i which is measurable with respect to P_i.

For the present discussion we interpret a correlated strategy profile ⟨(Ω, p), (P_i)_{i∈N}, (s_i)_{i∈N}⟩ as a description of the players' behavior and beliefs, as observed by an outside observer. The set Ω is the set of possible states of the world and p is the prior probability on Ω shared by all the players. For each player i ∈ N, P_i is a partition of Ω that represents i's information. At state ω ∈ Ω, player i is informed not of the state that actually occurred, but of the element P_i(ω) of his partition that contains ω. Player i then uses this information and his prior p to update his beliefs about the true state of the world. Finally, the function s_i represents the actions taken by player i at each state. In particular, s_i(ω) is the action chosen by i at state ω. Although a correlated equilibrium can be interpreted as a correlated strategy profile prescribed by a given solution concept (that of a correlated equilibrium), here we want to interpret a correlated strategy profile as a description of what players actually do and believe. Although players cannot freely choose their beliefs (in the same way as they cannot choose their preferences), they can choose their actions. Furthermore, they have no obligation to behave according to the specified correlated strategy profile. However, ultimately players do behave in a certain way, and that behavior is what is represented by the given correlated strategy profile.

Once we fix a correlated strategy profile we can address the rationality of the players. Formally,

Definition 9 Player i ∈ N is Bayes rational at ω ∈ Ω if his expected payoff at ω, E(u_i(s) | P_i)(ω), is at least as large as the amount E(u_i(s_{-i}, a_i) | P_i)(ω) that he would have got had he chosen action a_i ∈ A_i instead of s_i(ω).

In other words, player i is rational at a given state of the world if the action s_i(ω) he chooses at that state maximizes his expected utility given his information, P_i(ω), and, in particular, given his beliefs about the actions of the other players.

As before, for any finite set T, let Δ(T) be the set of all probability distributions on T. The beliefs of player i about the actions of the other players are represented by his conjectures. A conjecture of i is a probability distribution c_i ∈ Δ(A_{-i}) over the elements of A_{-i}. For any j ≠ i, the marginal of c_i on A_j is the conjecture of i about j induced by c_i. Given a correlated strategy profile ⟨(Ω, p), (P_i)_{i∈N}, (s_i)_{i∈N}⟩, one can determine the conjectures that each player is entertaining at each state of the world about the actions of the other players. These conjectures are given by the following definition.

Definition 10 Given a correlated strategy profile ⟨(Ω, p), (P_i, s_i)_{i∈N}⟩, the conjectures of i ∈ N about the other players' actions are given by the function f_i : Ω → Δ(A_{-i}) defined by

f_i(ω)(a_{-i}) = p[{ω′ ∈ P_i(ω) : s_{-i}(ω′) = a_{-i}}] / p[P_i(ω)].

For each ω, f_i(ω) ∈ Δ(A_{-i}) is the conjecture of i at ω. For j ≠ i, the marginal of f_i(ω) on A_j is the conjecture of i at ω about j's actions.
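Definition 10 is straightforward to compute. The following sketch (my own illustration, not part of the original text) computes Driver 1's conjecture about Driver 2's action in the Chicken correlated strategy profile constructed earlier:

p = {(1, 1): 1/3, (1, 2): 1/3, (2, 1): 1/3}
partition_I = [{(1, 1), (1, 2)}, {(2, 1)}]
s_II = {(1, 1): "Slow Down", (2, 1): "Slow Down", (1, 2): "Speed Up"}

def conjecture_I(omega):
    cell = next(c for c in partition_I if omega in c)     # the information set P_I(omega)
    total = sum(p[w] for w in cell)
    actions = set(s_II.values())
    return {a: sum(p[w] for w in cell if s_II[w] == a) / total for a in actions}

print(conjecture_I((1, 1)))   # {'Slow Down': 0.5, 'Speed Up': 0.5}
print(conjecture_I((2, 1)))   # {'Slow Down': 1.0, 'Speed Up': 0.0}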
Given a correlated strategy profile, we can also speak about what each player knows. The objects of knowledge are called events, which are the subsets of the set of states of the world Ω. We say that player i knows event E ⊆ Ω at state ω if P_i(ω) ⊆ E. That is, i knows E at ω if every state he deems possible at ω is in E.

The next result was proved by Aumann and Brandenburger (1995).

Theorem 3 Let G = ⟨N, (A_i)_{i∈N}, (u_i)_{i∈N}⟩ be a strategic game, and let ⟨(Ω, p), (P_i)_{i∈N}, (s_i)_{i∈N}⟩ be a correlated strategy profile for G. Also let (c_i)_{i∈N} ∈ ∏_{i∈N} Δ(A_{-i}) be a profile of conjectures, one for each player. Assume that at some state ω ∈ Ω each player knows that the others are rational. Further, assume that at ω their conjectures are commonly known to be (c_i)_{i∈N}. Then, for each j, the conjectures c_i of all players i other than j induce the same belief φ_j ∈ Δ(A_j) about j's actions, and the resulting profile of beliefs, (φ_i)_{i∈N}, is an equilibrium in beliefs.

If player i is Bayes rational at every state ω′ ∈ P_i(ω), then

E[u_i(s) | P_i](ω′) ≥ E[u_i(s_{-i}, a_i) | P_i](ω′)   for all a_i ∈ A_i,

and since s_i : Ω → A_i is measurable with respect to P_i, s_i(ω′) is the action that player i chooses at all states in P_i(ω′). Then we can write

E(u_i(s) | P_i)(ω) ≥ E(u_i(s_{-i}, a_i) | P_i)(ω)   for all a_i ∈ A_i.

That is,

Σ_{ω′∈P_i(ω)} [p(ω′) / p(P_i(ω))] u_i(s_i(ω′), s_{-i}(ω′)) ≥ Σ_{ω′∈P_i(ω)} [p(ω′) / p(P_i(ω))] u_i(a_i, s_{-i}(ω′))   for all a_i ∈ A_i.

In particular, these inequalities hold for a_i = t_i(ω) = t_i(ω′) for all ω′ ∈ P_i(ω), where t_i is any deviation that is measurable with respect to P_i.
cards dealt to the player. The probability measure μ represents the players' prior belief about the state of nature. This prior belief will be used along with the information obtained by each player to form beliefs about the other players' information. The set of actions of player i is A_i. Note that there is no loss of generality in assuming that this set does not depend on the state of nature: one can always add unavailable actions and assign them intolerable disutility. Finally, u_i is the payoff function that associates to each state of nature and action profile a utility level. Note that since the state of the world is unknown to the player at the time of making his choice, a player faces a lottery for any given action profile. The assumption is that the player evaluates this lottery according to the expected value of u_i with respect to that lottery.

Let ⟨N, (Ω, μ), (A_i, P_i, u_i)_{i∈N}⟩ be a Bayesian game. A strategy for player i ∈ N is a function s_i : Ω → A_i that is measurable with respect to P_i. We denote the set of strategies for player i by B_i. That is, B_i = {s_i : Ω → A_i : s_i is measurable w.r.t. P_i}. The interpretation of a strategy in a Bayesian game is the usual one. For each state of nature ω ∈ Ω, s_i(ω) is the action chosen by player i at ω. The measurability requirement imposes that player i's actions depend only on his information: if player i cannot distinguish between two states of nature, then he must choose the same action at both states. Player i evaluates a profile s : Ω → A of strategies according to the expected value of u_i with respect to μ.

In order to define an equilibrium notion for Bayesian games we follow the same idea used for the definition of a mixed strategy equilibrium. Namely, we translate the Bayesian game into a standard game, and then define an equilibrium of the Bayesian game as a Nash equilibrium of the induced game.

Definition 12 A Bayesian equilibrium of a Bayesian game ⟨N, (Ω, μ), (A_i, P_i, u_i)_{i∈N}⟩ is a Nash equilibrium of the strategic game ⟨N, (B_i)_{i∈N}, (U_i)_{i∈N}⟩, where for each profile s : Ω → A of strategies, U_i(s) = E_μ[u_i(s(ω), ω)] is i's expected utility with respect to μ.

A Bayesian equilibrium of a Bayesian game is thus a Nash equilibrium of a properly defined static game. As such, conditions for its existence can be derived from Theorem 1. However, in many situations one is interested in particular kinds of equilibria. Specifically, in the analysis of auctions or of the war of attrition, one is often interested in efficient outcomes. In a single object auction, efficient outcomes are characterized by the fact that in equilibrium the object is allocated to the buyer who values it most. According to many standard auction rules, the object goes to the highest bidder. Therefore, in such auctions, to guarantee an efficient outcome, one would need a monotone equilibrium, namely, one in which bidders' bids are higher the higher their valuations for the object are. Athey (2001) shows conditions under which a Bayesian equilibrium exists in which strategies are non-decreasing. The crucial conditions are that the players' types can be represented by a one-dimensional variable, and that, fixing a nondecreasing strategy for each of a player's opponents, this player's expected payoff satisfies a single-crossing property. This single-crossing property roughly says that if a high action is preferred to a low action for a given type t, then the same must be true for all types higher than t. McAdams (2003) extended Athey's result to the case where types and actions are multidimensional and partially ordered.

The Asymmetric Information Version of the War of Attrition

We have seen that, when applied to the war of attrition, as modeled by a standard strategic game or by its mixed extension, the notion of Nash equilibrium does not yield a satisfactory prediction. (The war of attrition was analyzed in Maynard Smith (1974). For an analysis of the asymmetric information version of the war of attrition, see Krishna and Morgan (1997).) In the former case all the equilibria involve no fight, and in the latter case the equilibrium dictates a more aggressive behavior to the player who
values the contested object less. In what follows, we analyze the war of attrition as a Bayesian game. That is, we assume that the players are ex-ante symmetric but they have private information about their value for the contested object.

A Bayesian game that represents the war of attrition is given by ⟨N, Ω, (A_i, μ_i, P_i, u_i)_{i∈N}⟩, where

• N = {1, 2}
• Ω = [0, ∞)² = {(v_1, v_2) : 0 ≤ v_i < ∞, i = 1, 2}
• A_i = [0, ∞) for i = 1, 2
• P_i(v̂_1, v̂_2) = {(v_1, v_2) ∈ Ω : v_i = v̂_i} for i = 1, 2
• μ((v_1, v_2) ≤ (v̂_1, v̂_2)) = F(v̂_1) F(v̂_2)
• u_i((a_1, a_2), (v_1, v_2)) = −a_i if a_i ≤ a_j, and v_i − a_j if a_i > a_j.

Here the set of types of player i, for i = 1, 2, is represented by the player's willingness to fight, v_i. The players' willingness to fight are drawn independently from the same distribution F. A state of the world is, therefore, a realization (v_1, v_2) of the players' types, and at that state each player is informed only of his own type. Finally, the utility of a player is his valuation for the prey, if he obtains it, net of the time spent fighting for it. We are interested in a symmetric equilibrium in which both players use the same strictly increasing strategy b : [0, ∞) → [0, ∞), where b(v_i) is the time at which a player with willingness to fight v_i is dictated by the equilibrium to give up. Such an equilibrium would imply that types who value the prey more are willing to fight more. Further, the probability of observing a fight in equilibrium would not be 0 (in fact, it would be 1).

It turns out that a symmetric equilibrium strategy is given by

b(v) = ∫_0^v x f(x) / (1 − F(x)) dx,

where f denotes the derivative of F. To see this, assume that player j behaves according to b and that player i chooses to give up at t. Letting z be the type such that b(z) = t, the expected utility of player i from choosing t is

U(v_i, z) = ∫_0^z (v_i − b(y)) f(y) dy − b(z)(1 − F(z)).

Taking derivatives with respect to z, and using the fact that b′(z) = z f(z) / [1 − F(z)], we obtain

∂U/∂z (v_i, z) = v_i f(z) − b′(z)(1 − F(z)) = (v_i − z) f(z),

which is positive for z < v_i and negative for z > v_i. As a result, the expected utility of player i with willingness to fight v_i is maximized at z = v_i, which implies that the optimal choice is b(v_i). Thus, modeling the war of attrition as a game of asymmetric information has allowed us to find an equilibrium in which players with a higher willingness to fight fight longer, and there is a non-negligible probability of observing a fight.
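As a concrete illustration (my own worked example, assuming a uniform distribution of types, which the text does not specify), if F(x) = x on [0, 1] then f(x) = 1 and the equilibrium strategy has the closed form b(v) = −v − ln(1 − v):

import math

def b(v):
    # b(v) = integral from 0 to v of x/(1 - x) dx = -v - ln(1 - v) for F uniform on [0, 1]
    return -v - math.log(1.0 - v)

for v in (0.1, 0.5, 0.9, 0.99):
    print(v, round(b(v), 3))   # b is strictly increasing and grows without bound as v -> 1:
                               # high-value types are prepared to fight for a very long time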
Evolutionary Stable Strategies

The notion of Nash equilibrium involves players choosing actions that maximize their payoffs given the choices of the other players. The usual interpretation of a Nash equilibrium is as a pattern of behavior that rational players should adopt. However, Nash equilibria are sometimes interpreted more descriptively as patterns of behavior that rational players do adopt. Certainly, rationality of players is neither a necessary condition nor a sufficient one for players to play a Nash equilibrium. The relationship between rationality and the various solution concepts is not apparent and has been the focus of an extensive literature (see, for example, Aumann 1987; Aumann 1995; Aumann and Brandenburger 1995; Brandenburger and Dekel 1987). Nonetheless, the notion of a Nash equilibrium evokes the idea of players consciously making choices with the deliberate objective of maximizing their payoffs. It is therefore quite remarkable that a concept almost identical to that of Nash equilibrium has emerged from the biology literature. This concept describes a population equilibrium where unconscious organisms are programmed to choose actions with no deliberate aim. In this equilibrium, members of the population meet at random over and over again to interact. At each interaction, these players act in a pre-programmed way and the result of their actions is a gain in biological fitness. Fitness is a concept related to the reproductive value or survival capacity of an organism. In a temporary equilibrium, the fitness gains are such that the proportions of individuals that choose each one of the possible actions remain constant. However, this temporary equilibrium may be disturbed by the appearance of a mutation, which is a new kind of behavior. This mutation may upset the temporary equilibrium if its fitness gains are such that the new behavior spreads over the population. Alternatively, if the fitness gains of the original population outweigh those of the mutation, then the new behavior will fail to propagate and will eventually disappear. In a population equilibrium, the interaction of any mutant with the whole population awards the mutant insufficient fitness gains, and as a result the mutants disappear. The notion of a population equilibrium is formalized by means of the concept of an evolutionary stable strategy, introduced by Maynard Smith and Price (1973).

In what follows we restrict our attention to symmetric two-player games. So let G = ⟨{1, 2}, {A_1, A_2}, {u_1, u_2}⟩ be a game such that A_1 = A_2 = A, and such that for all a, b ∈ A, u_1(a, b) = u_2(b, a). An evolutionary stable strategy is an action in A such that if all members of the population were to choose that action, no sufficiently small proportion of mutants choosing an alternative action would succeed in invading the population. Alternatively, an evolutionary stable strategy is an action in A such that if all the members of the population were to choose that action, the population would reject all sufficiently small mutations involving a different action.

More specifically, suppose that all members of the population are programmed to choose a ∈ A, and then a proportion ε of the population mutates and adopts action b ∈ A. In that case, the probability that a given member of the population meets a mutant is ε, while the probability of meeting a member that plays a is 1 − ε. Therefore, the mutation will not propagate and will vanish if the expected payoff of a mutant is less than the expected payoff of a member of the majority. Otherwise it will propagate. This leads to the following definition.

Definition 13 An action a ∈ A is an evolutionary stable strategy of G if there is an ε̄ ∈ (0, 1) such that for all ε ∈ (0, ε̄) and for all b ∈ A, b ≠ a,

(1 − ε) u_1(a, a) + ε u_1(a, b) > (1 − ε) u_1(b, a) + ε u_1(b, b).   (10)

The following result shows that the concept of an evolutionary stable strategy is very close to the notion of a Nash equilibrium.

Proposition 5 If a ∈ A is an evolutionary stable strategy of G, then (a, a) is a Nash equilibrium. And if (a, a) is a strict Nash equilibrium, then a is an evolutionary stable strategy.

Proof If u_1(a, a) > u_1(b, a) for all b ∈ A∖{a}, then inequality (10) holds for all sufficiently small ε > 0. If u_1(b, a) > u_1(a, a) for some b ∈ A, the reverse inequality holds for all sufficiently small ε. □
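Condition (10) can be checked directly for pure actions. The sketch below (my own illustration, not part of the original text) uses the symmetric Chicken payoffs from the earlier example (Slow Down/Speed Up) and checks the inequality for one small value of ε; it reports that neither pure action is an evolutionary stable strategy, in line with Proposition 5, since neither symmetric pure profile is a Nash equilibrium of Chicken.

u1 = {("S", "S"): 6, ("S", "F"): 2, ("F", "S"): 7, ("F", "F"): 0}   # S = Slow Down, F = Speed Up
A = ["S", "F"]

def is_ess(a, eps=1e-3):
    # Checks inequality (10) for one small fixed epsilon and every mutant action b != a
    # (a necessary condition for a to be an ESS).
    return all((1 - eps) * u1[(a, a)] + eps * u1[(a, b)]
               > (1 - eps) * u1[(b, a)] + eps * u1[(b, b)]
               for b in A if b != a)

print({a: is_ess(a) for a in A})   # {'S': False, 'F': False}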
Future Directions

Static games have been shown to be a useful framework for analyzing and understanding many situations that involve strategic interaction. At present, a large body of literature is available that develops various solution concepts, some of which are refinements of Nash equilibrium and some of which are coarsenings of it. Nonetheless, several areas for future research remain. One is the application of the theory to particular games to better understand the situations they model, for example auctions. In many markets trade is conducted by auctions of one kind or another, including markets for small domestic products as well as some centralized electricity markets where generators and distributors buy and sell electric power on a daily basis. Also, auctions are used to allocate large amounts of valuable spectrum
shown in section "Correlated Equilibrium and Communication," this property can be given several precise statements according to the constraints imposed on the players' communication, which can go from plain conversation to exchange of messages through noisy channels. Originally designed for static games with complete information, the correlated equilibrium applies to any strategic form game. It is geometrically and computationally more tractable than the better known Nash equilibrium. The solution concept has been extended to dynamic games, possibly with incomplete information. As an illustration, we define in detail the communication equilibrium for Bayesian games in section "Correlated Equilibrium in Bayesian Games."

Introduction

Example
Consider the two-person game known as "chicken," in which each player i can take a "pacific" action (denoted as p_i) or an "aggressive" action (denoted as a_i):

            p_2        a_2
p_1       (8, 8)     (3, 10)
a_1      (10, 3)      (0, 0)

The interpretation is that player 1 and player 2 simultaneously choose an action and then get a payoff, which is determined by the pair of chosen actions according to the previous matrix. If both players are pacific, they both get 8. If both are aggressive, they both get 0. If one player is aggressive and the other is pacific, the aggressive player gets 10 and the pacific one gets 3. This game has two pure Nash equilibria (p_1, a_2), (a_1, p_2) and one mixed Nash equilibrium in which both players choose the pacific action with probability 3/5, resulting in the expected payoff 6 for both players. A possible justification for the latter solution is that the players make their choices as a function of independent extraneous random signals. The assumption of independence is strong. Indeed, there may be no way to prevent the players' signals from being correlated.

Consider a random signal which has no effect on the players' payoffs and takes three possible values: low, medium, or high, occurring each with probability 1/3. Assume that, before the beginning of the game, player 1 distinguishes whether the signal is high or not, while player 2 distinguishes whether the signal is low or not. The relevant interactive decision problem is then the extended game in which the players can base their action on the private information they get on the random signal, while the payoffs only depend on the players' actions. In this game, suppose that player 1 chooses the aggressive action when the signal is high and the pacific action otherwise. Similarly, suppose that player 2 chooses the aggressive action when the signal is low and the pacific action otherwise. We show that these strategies form an equilibrium in the extended game. Given player 2's strategy, assume that player 1 observes a high signal. Player 1 deduces that the signal cannot be low, so that player 2 chooses the pacific action; hence, player 1's best response is to play aggressively. Assume now that player 1 is informed that the signal is not high; he deduces that, with probability 1/2, the signal is medium (i.e., not low), so that player 2 plays pacific, and, with probability 1/2, the signal is low, so that player 2 plays aggressive. The expected payoff of player 1 is 5.5 if he plays pacific and 5 if he plays aggressive; hence, the pacific action is a best response. The equilibrium conditions for player 2 are symmetric. To sum up, the strategies based on the players' private information form a Nash equilibrium in the extended game in which an extraneous signal is first selected. We shall say that these strategies form a "correlated equilibrium." The corresponding probability distribution over the players' actions is

            p_2     a_2
p_1        1/3     1/3
a_1        1/3      0          (1)

and the expected payoff of every player is 7. This probability distribution can be used directly to make private recommendations to the players
before the beginning of the game (see the section "Canonical Representation" below).

In an extension of the game by a correlation device ⟨Ω, q, (P_i)_{i∈N}⟩, with α_i denoting player i's (P_i-measurable) strategy, the equilibrium conditions state that no player i can gain by replacing his prescribed action with some other action t_i:

Σ_{ω′∈P_i(ω)} q(ω′ | P_i(ω)) u_i(α(ω′)) ≥ Σ_{ω′∈P_i(ω)} q(ω′ | P_i(ω)) u_i(t_i, α_{-i}(ω′)).   (2)

In the canonical representation, where the players' signals are recommended strategies, the conditions become

Σ_{s_{-i}∈S_{-i}} q(s_{-i} | s_i) u_i(s_i, s_{-i}) ≥ Σ_{s_{-i}∈S_{-i}} q(s_{-i} | s_i) u_i(t_i, s_{-i}),
   for all i ∈ N, all s_i ∈ S_i with q(s_i) > 0, and all t_i ∈ S_i,

or, equivalently,

Σ_{s_{-i}∈S_{-i}} q(s_i, s_{-i}) u_i(s_i, s_{-i}) ≥ Σ_{s_{-i}∈S_{-i}} q(s_i, s_{-i}) u_i(t_i, s_{-i}),   (3)
   for all i ∈ N and all s_i, t_i ∈ S_i.

The equilibrium conditions can also be formulated ex ante:

Σ_{s∈S} q(s) u_i(s) ≥ Σ_{s∈S} q(s) u_i(α_i(s_i), s_{-i}),   for all i ∈ N and all α_i : S_i → S_i.
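Because conditions (3) are finitely many linear inequalities in q, they are easy to verify or solve for numerically. The sketch below (my own illustration, not part of the original text) checks that the distribution (1) for the game of "chicken" above satisfies (3) and yields the expected payoff 7:

from itertools import product

u = {1: {("p", "p"): 8, ("p", "a"): 3, ("a", "p"): 10, ("a", "a"): 0},
     2: {("p", "p"): 8, ("p", "a"): 10, ("a", "p"): 3, ("a", "a"): 0}}
q = {("p", "p"): 1/3, ("p", "a"): 1/3, ("a", "p"): 1/3, ("a", "a"): 0.0}
S = ["p", "a"]

def obeys_eq3(i):
    # For each recommended action s_i and each deviation t_i, compare the two sides of (3).
    for s_i, t_i in product(S, S):
        lhs = sum(q[pair] * u[i][pair] for pair in q if pair[i - 1] == s_i)
        rhs = sum(q[pair] * u[i][(t_i, pair[1]) if i == 1 else (pair[0], t_i)]
                  for pair in q if pair[i - 1] == s_i)
        if lhs < rhs - 1e-12:
            return False
    return True

print(all(obeys_eq3(i) for i in (1, 2)))          # True: (1) is a correlated equilibrium distribution
print(sum(q[pair] * u[1][pair] for pair in q))    # 7.0, player 1's expected payoff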
Nau and McCardle (1990) derive the existence of correlated equilibria from the "no arbitrage opportunities" axiom that underlies subjective probability theory. They introduce jointly coherent strategy profiles, which do not expose the players as a group to arbitrage from an outside observer. They show that a strategy profile is jointly coherent if and only if it occurs with positive probability in some correlated equilibrium. From a technical point of view, both proofs turn out to be similar. Myerson (1997) makes further use of the linear structure of correlated equilibria by introducing dual reduction, a technique to replace a finite game with a game with fewer strategies, in such a way that any correlated equilibrium of the reduced game induces a correlated equilibrium of the original game.
Geometric Properties

Gilboa and Zemel (1989) show more precisely that the complexity of standard computational problems is "NP-hard" for the Nash equilibrium and polynomial for the correlated equilibrium. Examples of such problems are "Does the game G have a Nash (resp., correlated) equilibrium which yields a payoff greater than r to every player (for some given number r)?" and "Does the game G have a unique Nash (resp., correlated) equilibrium?" Papadimitriou (2005) develops a polynomial-time algorithm for finding correlated equilibria, which is based on a variant of the existence proof of Hart and Schmeidler (1989).

Foundations

By reinterpreting the previous canonical representation, Aumann (1987) proposes a decision theoretic foundation for the correlated equilibrium in games with complete information, in which S_i, for i ∈ N, stands merely for a set of actions of player i. Let Ω be the space of all states of the world; an element ω of Ω thus specifies all the parameters which may be relevant to the players' choices. In particular, the action profile in the underlying game G is part of the state of the world. A partition P_i describes player i's information on Ω. In addition, every player i has a prior belief, i.e., a probability distribution q_i over Ω. Formally, the framework is similar to the one above except that the players possibly hold different beliefs over Ω. Let α_i(ω) denote player i's action at ω; a natural assumption is that player i knows the action he chooses, namely, that α_i is P_i-measurable. According to Aumann (1987), player i is Bayes rational at ω if his action α_i(ω) maximizes his expected payoff (with respect to q_i) given his information P_i(ω). Note that this is a separate rationality condition for every player, not an equilibrium condition. Aumann (1987) proves the following result: under the common prior assumption (namely, q_i = q, i ∈ N), if every player is Bayes rational at every state of the world, the distribution of the corresponding action profile α is a correlated equilibrium distribution. The key to this decision theoretic foundation of the correlated equilibrium is that, under the common prior assumption, Bayesian rationality amounts to Eq. 2.

If the common prior assumption is relaxed, the previous result still holds, with subjective prior probability distributions, for the subjective correlated equilibrium, which was also introduced by Aumann (1974). The latter solution concept is defined in the same way as above by considering a device ⟨Ω, (q_i)_{i∈N}, (P_i)_{i∈N}⟩, with a probability distribution q_i for every player i, and by writing Eq. 2 in terms of q_i instead of q.

Brandenburger and Dekel (1987) show that (a refinement of) the subjective correlated equilibrium is equivalent to (correlated) rationalizability, another well-established solution concept which captures players' minimal rationality. Rationalizable strategies reflect that the players commonly know that each of them makes an optimal choice given some belief. Nau and McCardle (1991) reconcile objective and subjective correlated equilibrium by proposing the no arbitrage principle as a unified approach to individual and interactive decision problems. They argue that the objective correlated equilibrium concept applies to a game that is revealed by the players' choices, while the subjective correlated equilibrium concept applies to the "true game"; both lead to the same set of jointly coherent outcomes.

Correlated Equilibrium and Communication

As seen in the previous section, correlated equilibria can be achieved in practice with the help of a mediator and emerge in a Bayesian framework embedding the game in a full description of the world. Both approaches require extending the game by taking into account information which is not generated by the players themselves. Can the players reach a correlated equilibrium without relying on any extraneous correlation device, by just communicating with each other before the beginning of the game?

Consider the game of "chicken" presented in the introduction. The probability distribution
players to publicly check the record of communication under some circumstances. The equilibria of ext(G) constructed by Bárány involve that a receiver gets the same message from two different senders; the message is nevertheless not public, thanks to the assumption on the number of players. At every stage of ext(G), every player can ask for the revelation of all past messages, which are assumed to be recorded. Typically, a receiver can claim that the two senders' messages differ. In this case, the record of communication surely reveals that either one of the senders or the receiver himself has cheated; the deviator can be punished (at his minimax level in G) by the other players.

The punishments in Bárány's (1992) Nash equilibria of ext(G) need not be credible threats. Instead of using double senders in the communication protocols, Ben-Porath (1998, 2003) proposes a procedure of random monitoring, which prescribes a given behavior to every player in such a way that unilateral deviations can be detected with probability arbitrarily close to 1. This procedure applies if there are at least three players, which yields an analog of Bárány's result already in this case. If the number of players is exactly three, Ben-Porath (2003) needs to assume, as Bárány (1992), that public verification of the record of communication is possible in ext(G) (see Ben-Porath 2006). However, Ben-Porath concentrates on (rational) correlated equilibrium distributions which allow for strict punishment on a Nash equilibrium of G; he constructs sequential equilibria which generate these distributions in ext(G), thus dispensing with incredible threats. At the price of raising the number of players to five or more, Gerardi (2004) proves that every (rational) correlated equilibrium distribution of G can be realized as a sequential equilibrium of a cheap talk extension of G which does not require any message recording. For this, he builds protocols of communication in which the players base their decisions on majority rule, so that no punishment is necessary.

We have concentrated on two extreme forms of communication: mediated communication, in which a mediator performs lotteries and sends private messages to the players, and cheap talk, in which the players just exchange messages. Many intermediate schemes of communication are obviously conceivable. For instance, Lehrer (1996) introduces (possibly multistage) "mediated talk": the players send private messages to a mediator, but the latter can only make deterministic public announcements. Mediated talk captures real-life communication procedures, like elections, especially if it lasts only for a few stages. Lehrer and Sorin (1997) establish that, whatever the number of players of G, every (rational) correlated equilibrium distribution of G can be realized as a Nash equilibrium of a single-stage mediated talk extension of G. Ben-Porath (1998) proposes a variant of cheap talk in which the players do not only exchange verbal messages but also "hard" devices such as urns containing balls. This extension is particularly useful in two-person games to circumvent the equivalence between the equilibria achieved by cheap talk and the convex hull of Nash equilibria. More precisely, the result of Ben-Porath (1998) stated above holds for two-person games if the players first check together the content of different urns and then each player draws a ball from an urn that was chosen by the other player, so as to guarantee that one player only knows the outcome of a lottery, while the other one only knows the probabilities of this lottery.

The various extensions of the basic game G considered up to now, with or without a mediator, implicitly assume that the players are fully rational. In particular, they have unlimited computational abilities. By relaxing that assumption, Urbano and Vila (2002) and Dodis et al. (2000) build on earlier results from cryptography so as to implement any (rational) correlated equilibrium distribution through unmediated communication, including in two-person games.

As the previous paragraphs illustrate, the players can modify their initial distribution of information by means of many different communication protocols. Gossner (1998) proposes a general criterion to classify them: a protocol is "secure" if, under all circumstances, the players can neither mislead each other nor spy on each other. For instance, given a cheap talk extension ext(G), a protocol P describes, for every player, a strategy in ext(G) and a way to interpret his information after the communication phase of ext(G). P induces a correlation device d(P) (in the sense of section "Correlated Equilibrium: Definition and Basic Properties"). P is secure if, for every game G and every Nash equilibrium α of G_d(P), the following procedure is a Nash
framework. The other results mentioned at the end of section "Correlated Equilibrium and Communication" have also been generalized to Bayesian games (see Gossner 1998; Lehrer and Sorin 1997; Urbano and Vila 2004a).

Related Topics and Future Directions

In this brief entry, we concentrated on two solution concepts: the strategic form correlated equilibrium, which is applicable to any game, and the communication equilibrium, which we defined for Bayesian games. Other extensions of Aumann's (1974) solution concept have been proposed for Bayesian games, such as the agent normal form correlated equilibrium and the (possibly belief invariant) Bayesian solution (see Forges (1993, 2006) for definitions and references). The Bayesian solution is intended to capture the players' rationality in games with incomplete information in the spirit of Aumann (1987) (see Nau 1992; Forges 1993). Lehrer et al. (2006) open a new perspective in the understanding of the Bayesian solution and other equilibrium concepts for Bayesian games by characterizing the classes of equivalent information structures with respect to each of them. Comparison of information structures, which goes back to Blackwell (1951, 1953) for individual decision problems, was introduced by Gossner (2000) in the context of games, both with complete and incomplete information. In the latter model, information structures basically describe how extraneous signals are selected as a function of the players' types; two information structures are equivalent with respect to an equilibrium concept if, in every game, they generate the same equilibrium distributions over outcomes.

Correlated equilibria, communication equilibria, and related solution concepts have been studied in many other classes of games, like multistage games (see, e.g., Forges 1986; Myerson 1986a), repeated games with incomplete information (see, e.g., Forges 1985, 1988), and stochastic games (see, e.g., Solan 2001; Solan and Vieille 2002). The study of correlated equilibrium in repeated games with imperfect monitoring, initiated by Lehrer (1991, 1992), proved to be particularly useful and is still ongoing. Lehrer (1991) showed that if players either are fully informed of past actions or get no information ("standard-trivial" information structure), correlated equilibria are equivalent to Nash equilibria. In other words, all correlations can be generated internally, namely, by the past histories, on which players have differential information. The schemes of internal correlation introduced to establish this result are widely applicable and inspired those of Lehrer (1996) (see section "Correlated Equilibrium and Communication"). In general repeated games with imperfect monitoring, Renault and Tomala (2004) characterize communication equilibria, but the amount of correlation that the players can achieve in a Nash equilibrium is still an open problem (see, e.g., Gossner and Tomala 2007; Urbano and Vila 2004b for recent advances).

Throughout this entry, we defined a correlated equilibrium as a Nash equilibrium of an extension of the game under consideration. The solution concept can be strengthened by imposing some refinement, i.e., further rationality conditions, on the Nash equilibrium in this definition (see, e.g., Dhillon and Mertens 1996; Myerson 1986b). Refinements of communication equilibria have also been proposed (see, e.g., Gerardi 2004; Gerardi and Myerson 2007; Myerson 1986a). Some authors (see, e.g., Milgrom and Roberts 1996; Moreno and Wooders 1996; Ray 1996) have also developed notions of coalition proof correlated equilibria, which resist not only unilateral deviations, as in this entry, but even multilateral ones. A recurrent difficulty is that, for many of these stronger solution concepts, a useful canonical representation (as derived in section "Correlated Equilibrium: Definition and Basic Properties") is not available.

Except for two or three references, we deliberately concentrated on the results published in the game theory and mathematical economics literature, while substantial achievements in computer science would fit in this survey. Both streams of research pursue similar goals but rely on different formalisms and techniques. For instance, computer scientists often make use of cryptographic tools which are not familiar in game theory.
Halpern (2007) gives an idea of recent developments at the interface of computer science and game theory (see in particular the section "Implementing Mediators") and contains a number of references.

Finally, the assumption of full rationality of the players can also be relaxed. Evolutionary game theory has developed models of learning in order to study the long-term behavior of players with bounded rationality. Many possible dynamics are conceivable to represent more or less myopic attitudes with respect to optimization. Under appropriate learning procedures, which express, for instance, that agents want to minimize the regret of their strategic choices, the empirical distribution of actions converges to correlated equilibrium distributions (see, e.g., Foster and Vohra 1997; Hart and Mas-Colell 2000; Hart 2005 for a survey). However, standard procedures, such as the "replicator dynamics," may even eliminate all the strategies which have positive probability in a correlated equilibrium (see Viossat 2007).

Bibliography

Primary Literature
Aumann RJ (1974) Subjectivity and correlation in randomized strategies. J Math Econ 1:67–96
Aumann RJ (1987) Correlated equilibrium as an expression of Bayesian rationality. Econometrica 55:1–18
Aumann RJ, Maschler M, Stearns R (1968) Repeated games with incomplete information: an approach to the nonzero sum case. Reports to the US Arms Control and Disarmament Agency, ST-143, Chapter IV, 117–216 (reprinted In: Aumann RJ, Maschler M (1995) Repeated games of incomplete information. M.I.T. Press, Cambridge)
Bárány I (1992) Fair distribution protocols or how players replace fortune. Math Oper Res 17:327–340
Ben-Porath E (1998) Correlation without mediation: expanding the set of equilibrium outcomes by cheap pre-play procedures. J Econ Theory 80:108–122
Ben-Porath E (2003) Cheap talk in games with incomplete information. J Econ Theory 108:45–71
Ben-Porath E (2006) A correction to "Cheap talk in games with incomplete information". Mimeo, Hebrew University of Jerusalem, Jerusalem
Blackwell D (1951) Comparison of experiments. In: Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, pp 93–102
Blackwell D (1953) Equivalent comparison of experiments. Ann Math Stat 24:265–272
Brandenburger A, Dekel E (1987) Rationalizability and correlated equilibria. Econometrica 55:1391–1402
Dhillon A, Mertens JF (1996) Perfect correlated equilibria. J Econ Theory 68:279–302
Dodis Y, Halevi S, Rabin T (2000) A cryptographic solution to a game theoretic problem. In: CRYPTO 2000: 20th international cryptology conference. Springer, Berlin, pp 112–130
Evangelista F, Raghavan TES (1996) A note on correlated equilibrium. Int J Game Theory 25:35–41
Forges F (1985) Correlated equilibria in a class of repeated games with incomplete information. Int J Game Theory 14:129–150
Forges F (1986) An approach to communication equilibrium. Econometrica 54:1375–1385
Forges F (1988) Communication equilibria in repeated games with incomplete information. Math Oper Res 13:191–231
Forges F (1990) Universal mechanisms. Econometrica 58:1341–1364
Forges F (1993) Five legitimate definitions of correlated equilibrium in games with incomplete information. Theor Decis 35:277–310
Forges F (2006) Correlated equilibrium in games with incomplete information revisited. Theor Decis 61:329–344
Foster D, Vohra R (1997) Calibrated learning and correlated equilibrium. Games Econ Behav 21:40–55
Gerardi D (2000) Interim pre-play communication. Mimeo, Yale University, New Haven
Gerardi D (2004) Unmediated communication in games with complete and incomplete information. J Econ Theory 114:104–131
Gerardi D, Myerson R (2007) Sequential equilibria in Bayesian games with communication. Games Econ Behav 60:104–134
Gilboa I, Zemel E (1989) Nash and correlated equilibria: some complexity considerations. Games Econ Behav 1:80–93
Gomez-Canovas S, Hansen P, Jaumard B (1999) Nash equilibria from the correlated equilibria viewpoint. Int Game Theory Rev 1:33–44
Gossner O (1998) Secure protocols or how communication generates correlation. J Econ Theory 83:69–89
Gossner O (2000) Comparison of information structures. Games Econ Behav 30:44–63
Gossner O, Tomala T (2007) Secret correlation in repeated games with signals. Math Oper Res 32:413–424
Halpern JY (2007) Computer science and game theory. In: Durlauf SN, Blume LE (eds) The New Palgrave dictionary of economics, 2nd edn. Palgrave Macmillan. The New Palgrave dictionary of economics online. http://www.dictionaryofeconomics.com/article?id=pde2008_C000566. Accessed 24 May 2008
Hart S, Schmeidler D (1989) Existence of correlated equilibria. Math Oper Res 14:18–25
Hart S, Mas-Colell A (2000) A simple adaptive procedure leading to correlated equilibrium. Econometrica 68:1127–1150
Hart S (2005) Adaptative heuristics. Econometrica 73:1401–1430
Krishna RV (2007) Communication in games of incomplete information: two players. J Econ Theory 132:584–592
Lehrer E (1991) Internal correlation in repeated games. Int J Game Theory 19:431–456
Lehrer E (1992) Correlated equilibria in two-player repeated games with non-observable actions. Math Oper Res 17:175–199
Lehrer E (1996) Mediated talk. Int J Game Theory 25:177–188
Lehrer E, Sorin S (1997) One-shot public mediated talk. Games Econ Behav 20:131–148
Lehrer E, Rosenberg D, Shmaya E (2006) Signaling and mediation in Bayesian games. Mimeo, Tel Aviv University, Tel Aviv
Milgrom P, Roberts J (1996) Coalition-proofness and correlation with arbitrary communication possibilities. Games Econ Behav 17:113–128
Moreno D, Wooders J (1996) Coalition-proof equilibrium. Games Econ Behav 17:80–112
Myerson R (1982) Optimal coordination mechanisms in generalized principal-agent problems. J Math Econ 10:67–81
Myerson R (1986a) Multistage games with communication. Econometrica 54:323–358
Myerson R (1986b) Acceptable and predominant correlated equilibria. Int J Game Theory 15:133–154
Myerson R (1997) Dual reduction and elementary games. Games Econ Behav 21:183–202
Nash J (1951) Non-cooperative games. Ann Math 54:286–295
Nau RF (1992) Joint coherence in games with incomplete information. Manag Sci 38:374–387
Nau RF, McCardle KF (1990) Coherent behavior in noncooperative games. J Econ Theory 50(2):424–444
Nau RF, McCardle KF (1991) Arbitrage, rationality and equilibrium. Theor Decis 31:199–240
Nau RF, Gomez-Canovas S, Hansen P (2004) On the geometry of Nash equilibria and correlated equilibria. Int J Game Theory 32:443–453
Papadimitriou CH (2005) Computing correlated equilibria in multiplayer games. In: Proceedings of the 37th ACM symposium on theory of computing. STOC, Baltimore, pp 49–56
Ray I (1996) Coalition-proof correlated equilibrium: a definition. Games Econ Behav 17:56–79
Renault J, Tomala T (2004) Communication equilibrium payoffs in repeated games with imperfect monitoring. Games Econ Behav 49:313–344
Solan E (2001) Characterization of correlated equilibrium in stochastic games. Int J Game Theory 30:259–277
Solan E, Vieille N (2002) Correlated equilibrium in stochastic games. Games Econ Behav 38:362–399
Urbano A, Vila J (2002) Computational complexity and communication: coordination in two-player games. Econometrica 70:1893–1927
Urbano A, Vila J (2004a) Computationally restricted unmediated talk under incomplete information. J Econ Theory 23:283–320
Urbano A, Vila J (2004b) Unmediated communication in repeated games with imperfect monitoring. Games Econ Behav 46:143–173
Vida P (2007) From communication equilibria to correlated equilibria. Mimeo, University of Vienna, Vienna
Viossat Y (2006) The geometry of Nash equilibria and correlated equilibria and a generalization of zero-sum games. Mimeo, S-WoPEc working paper 641. Stockholm School of Economics, Stockholm
Viossat Y (2007) The replicator dynamics does not lead to correlated equilibria. Games Econ Behav 59:397–407
Viossat Y (2008) Is having a unique equilibrium robust? J Math Econ 44:1152–1160

Books and Reviews
Forges F (1994) Non-zero sum repeated games and information transmission. In: Megiddo N (ed) Essays in game theory in honor of Michael Maschler. Springer, Berlin, pp 65–95
Mertens JF (1994) Correlated- and communication equilibria. In: Mertens JF, Sorin S (eds) Game theoretic methods in general equilibrium analysis. Kluwer, Dordrecht, pp 243–248
Myerson R (1985) Bayesian equilibrium and incentive compatibility. In: Hurwicz L, Schmeidler D, Sonnenschein H (eds) Social goals and social organization. Cambridge University Press, Cambridge, pp 229–259
Myerson R (1994) Communication, correlated equilibria and incentive compatibility. In: Aumann R, Hart S (eds) Handbook of game theory, vol 2. Elsevier, Amsterdam, pp 827–847
Sorin S (1997) Communication, correlation and cooperation. In: Mas Colell A, Hart S (eds) Cooperation: game theoretic approaches. Springer, Berlin, pp 198–218
Correlated equilibrium A Nash equilibrium in
Bayesian Games: Games with an extension of the game in which there is a
Incomplete Information chance move, and each player has only partial
information about its outcome.
Shmuel Zamir State of nature Payoff relevant data of the game
Center for the Study of Rationality, Hebrew such as payoff functions, value of a random
University, Jerusalem, Israel variable, etc. It is convenient to think of a state
of nature as a full description of a “game-form”
Article Outline (actions and payoff functions).
State of the world A specification of the state of
Definition nature (payoff relevant parameters) and the
Introduction players’ types (belief of all levels). That is, a
Harsanyi’s Model: The Notion of Type state of the world is a state of nature and a list of
Aumann’s Model the states of mind of all players.
Harsanyi’s Model and Hierarchies of Beliefs Type Also known as state of mind and is a full
The Universal Belief Space description of player’s beliefs (about the state
Belief Subspaces of nature), beliefs about beliefs of the other
Consistent Beliefs and Common Priors players, beliefs about the beliefs about his
Bayesian Games and Bayesian Equilibrium beliefs, etc. ad infinitum.
Bayesian Equilibrium and Correlated Equilibrium
Concluding Remarks and Future Directions
Bibliography Definition
statistics, we encounter the need to deal with an which looks rather intractable. The natural emer-
infinite hierarchy of beliefs: what does each gence of hierarchies of beliefs is illustrated in the
player believe that the other player believes following example:
about what he believes. . . is the actual payoff
associated with a certain outcome? It is not sur- Example 1 Two players, P1 and P2, play a 2 2
prising that this methodological difficulty was a game whose payoffs depend on an unknown state
major obstacle in the development of the theory, of nature s {1, 2}. Player P1’s actions are {T, B},
and this article is largely devoted to explaining player P2’s actions are {L, R}, and the payoffs are
and resolving this methodological difficulty. given in the following matrices:
P2
a L R
Introduction
T 0, 1 1, 0
A game is a mathematical model for an interactive P1
B 1, 0 0, 1
decision situation involving several decision
makers (players) whose decisions affect each Payoffs when s = 1
other. A basic, often implicit, assumption is that b P2
L R
the data of the game, which we call the state of
nature, are common knowledge (CK) among the T 1, 0 0, 1
P1
players. In particular the actions available to the B 1, 0
0, 1
players and the payoff functions are CK. This is a
rather strong assumption that says that every Payoffs when s = 2
player knows all actions and payoff functions of
all players, every player knows that all other Assume that the belief (prior) of P1 about the
players know all actions and payoff functions, event {s ¼ 1} is p and the belief of P2 about the
every player knows that every player knows that same event is q. The best action of P1 depends
every player knows, etc. ad infinitum. Bayesian both on his prior and on the action of P2 and
games (also known as games with incomplete similarly for the best action of P2. This is given
information), which is the subject of this article, in the following tables:
are models of interactive decision situations in
a P2’s action
which each player has only partial information L R
about the payoff relevant parameters of the given
p < 0.5 T B
situation.
Adopting the Bayesian approach, we assume p > 0.5 B T
that a player who has only partial knowledge
about the state of nature has some beliefs, namely, Best reply of P1
prior distribution, about the parameters which he
b q < 0.5 q > 0.5
does not know or he is uncertain about. However,
unlike in a statistical problem which involves a T R L
single decision maker, this is not enough in an P1’s action
B L R
interactive situation: As the decisions of other
players are relevant, so are their beliefs, since Best reply of P2
they affect their decisions. Thus, a player must
have beliefs about the beliefs of other players. For Now, since the optimal action of P1 depends not
the same reason, a player needs beliefs about the only on his belief p but also on the, unknown to him,
beliefs of other players about his beliefs and so action of P2, which depends on his belief q, player
on. This interactive reasoning about beliefs leads P1 must therefore have beliefs about q. These are his
unavoidably to infinite hierarchies of beliefs second-level beliefs, namely, beliefs about beliefs.
Bayesian Games: Games with Incomplete Information 121
But then, since this is relevant and unknown to P2, player’s beliefs about the state of nature (the data of
he must have beliefs about that which will be third- the game), beliefs about the beliefs of other players
level beliefs of P2 and so on. The whole infinite about the state of nature and about his own beliefs,
hierarchies of beliefs of the two players pop out etc. One may think of a player’s type as his state of
naturally in the analysis of this simple two-person mind: a specific configuration of his brain that con-
game of incomplete information. tains an answer to any question regarding beliefs
The objective of this article is to model this about the state of nature and about the types of the
kind of situation. Most of the effort will be other players. Note that this implies self-reference
devoted to the modeling of the mutual beliefs (of a type to itself through the types of other players)
structure, and only then we add the underlying which is unavoidable in an interactive decision sit-
game which, together with the beliefs structure, uation. A Harsanyi game of incomplete information
defines a Bayesian game for which we define the consists of the following ingredients (to simplify
notion of Bayesian equilibrium. notations, assume all sets to be finite):
• I – Player’s set.
Harsanyi’s Model: The Notion of Type
• S – The set of states of nature.
• Ti – The type set of player i I. Let T ¼ i ITi
As suggested by our introductory example, the
denote the type set, that is, the set type profiles.
straightforward way to describe the mutual beliefs
• Y S T – A set of states of the world.
structure in a situation of incomplete information is
• p D(Y) – Probability distribution on Y, called
to specify explicitly the whole hierarchies of beliefs
the common prior.
of the players, that is, the beliefs of each player about
the unknown parameters of the game, each player’s (For a set A, we denote the set of probability
beliefs about the other players’ beliefs about these distributions on A by D(A)).
parameters, each player’s beliefs about the other
players’ beliefs about his beliefs about the parame-
ters, and so on ad infinitum. This may be called the Remark A state of the world o thus consists of a
explicit approach and is in fact feasible and was state of nature and a list of the types of the players.
explored and developed at a later stage of the theory We denote it as
(see Aumann 1999a, b; Aumann and Heifetz 2002;
o ¼ ðsðoÞ; t 1 ðoÞ, . . . , t n ðoÞÞ:
Mertens and Zamir 1985). We will come back to it
when we discuss the universal belief space. How- We think of the state of nature as a full descrip-
ever, for obvious reasons, the explicit approach is tion of the game which we call a game-form. So, if
mathematically rather cumbersome and hardly man- it is a game in strategic form, we write the state of
ageable. Indeed this was a major obstacle to the nature at state of the world o as
development of the theory of games with incom-
plete information at its early stages. The break- sðoÞ ¼ I, ðAi ðoÞÞi I, ðui ð; oÞÞi I :
through was provided by John Harsanyi (1967) in
a seminal work that earned him the Nobel Prize The payoff functions ui depend only on the state
some 30 years later. While Harsanyi actually formu- of nature and not on the types. That is, for all i I,
lated the problem verbally, in an explicit way, he
suggested a solution that “avoided” the difficulty of sðoÞ ¼ sðo0 Þ ) ui ð; oÞ ¼ ui ð; o0 Þ:
having to deal with infinite hierarchies of beliefs, by
providing a much more workable implicit, encapsu- The game with incomplete information is
lated model which we present now. played as follows:
The key notion in Harsanyi’s model is that of
type. Each player can be of several types where a 1. A chance move chooses o ¼ (s(o);t1(o),. . .,
type is to be thought of as a full description of the tn(o)) Y using the probability distribution p.
122 Bayesian Games: Games with Incomplete Information
2. Each player is told his chosen type ti(o) (but • For i I, pi is a partition of Y.
not the chosen state of nature s(o) and not the • P is a probability distribution on Y, also called
other players’ types t i(o) ¼ (tj(o))j6¼i). the common prior.
3. The players choose simultaneously an action:
player i chooses ai Ai(o) and receives a pay- In this model, a state of the world o Y is
off ui(a;o) where a ¼ (a1,. . .,an) is the vector chosen according to the probability distribution
of chosen actions and o is the state of the world P, and each player i is informed of pi(o), the
chosen by the chance move. element of his partition that contains the chosen
state of the world o. This is the informational
Remark The set Ai(o) of actions available to structure which becomes a game with incomplete
player i in state of the world o must be known to information if we add a mapping s:Y ! S.
him. Since his only information is his type ti(o), The state of nature s(o) is the game-form
we must impose that Ai(o) is Ti measurable, i.e., corresponding to the state of the world o (with
the requirement that the action sets Ai(o) are pi
t i ðoÞ ¼ t i ðo0 Þ ) Ai ðoÞ ¼ Ai ðo0 Þ: measurable).
It is readily seen that Aumann’s model is a
Note that if s(o) was commonly known Harsanyi model in which the type set Ti of player
among the players, it would be a regular game i is the set of his partition elements, i.e., Ti ¼ {pi(-
in strategic form. We use the term “game-form” o)|o Y}, and the common prior on Y is P. Con-
to indicate that the players have only partial versely, any Harsanyi model is an Aumann model
information about s(o). The players do not in which the partitions are those defined by the
know which s(o) is being played. In other types, i.e., pi(o) ¼ {o0 Y | ti(o0) ¼ ti(o)}.
words, in the extensive form game of Harsanyi,
the game-forms (s(o))o Y are not subgames
since they are interconnected by information
sets: Player i does not know which s(o) is Harsanyi’s Model and Hierarchies of
being played since he does not know o; he Beliefs
knows only his own type ti(o).
An important application of Harsanyi’s model is As our starting point in modeling incomplete
made in auction theory, as an auction is a clear information situations was the appearance of hier-
situation of incomplete information. For example, archies of beliefs, one may ask how is the Har-
in a closed private-value auction of a single indivis- sanyi (or Aumann) model related to hierarchies of
ible object, the type of a player is his private value beliefs and how does it capture this unavoidable
for the object, which is typically known to him and feature of incomplete information situations? The
not to other players. We come back to this in the main observation towards answering this question
section entitled “Examples of Bayesian Equilibria.” is the following:
Definition 2 An Aumann model of incomplete Let us illustrate the idea of the proof by the
information is (I,Y, (pi)i I, P) where: following example:
• I is the players’ set.
• Y is a (finite) set whose elements are called Example Consider a Harsanyi model with two
states of the world. players, I and II, each of which can be of two
Bayesian Games: Games with Incomplete Information 123
types: TI ¼ {I1, I2}, TII ¼ {P1, P2} and thus • I2: With probability 23 the state is c and with
T ¼ {(I1, II1), (I1, II2), (I2, P1), (I2, P2)}. The probability 13 the state is d.
probability p on types is given by • II1: With probability 37 the state is a and with
II1 II2 probability 47 the state is c.
• II2: With probability 35 the state is b and with
I1 1 1 probability 25 the state is d.
4 4
Second-level beliefs
(using shorthand
notation
I2 1 1
1
3 6 for the above beliefs: 2a þ 12 b , etc.):
Denote the corresponding states of nature by • I1: With probability 12 , player II believes
a ¼ s(I1II2), b ¼ s(I1II2), c ¼ s(I2II1), and 3
a þ 4
probability 12 , player II
7 7 , and with
c
d ¼ s(I2II2). These are the states of nature about
believes 35 b þ 25 d .
which there is incomplete information.
• I2: With probability 23 , player II believes
The game in extensive form:
3 4
a þ 7 , and with
c probability 13 , player
7
Chance believes 35 b þ 25 d .
• II1: With probability 37 , player I believes
1 1
a þ 2 , and with
b probability 47 , player
1 1 1 1 2
4 4 3 6 I believes 23 c þ 13 d .
II1 • II2: With probability 35 , player I believes
1 1
I1 I2 a þ 2 , and with
b probability 25 , player
II2
2
I believes 23 c þ 13 d .
Third-level beliefs:
b a c d
• I1: With probability 12, player II believes that:
Assume that the state of nature is a. What are “With probability 37 , player I believes
the belief hierarchies of the players? 1 1
probability 47 , player
2 a þ 2 b and with
II1 II2 I believes 23 c þ 13 d .” And with probability
1 3
a b 2 , player II believes that: “With probability 5,
1 1
I1
4 4 player I believes 12 a þ 12 b and with proba-
bility 25, player I believes 23 c þ 13 d .”
c d
1 1
I2 3 6
The idea is very simple and powerful; since
each player of a given type has a probability
First-level beliefs are obtained by each player distribution (beliefs) both about the types of the
from p, by conditioning on his type: other players and about the set S of states of
nature, the hierarchies of beliefs are constructed
inductively: If the kth level beliefs (about S) are
• I1: With probability 12 the state is a and with defined for each type, then the beliefs about types
probability 12 the state is b. generate the (k + 1)th level of beliefs.
124 Bayesian Games: Games with Incomplete Information
Thus, the compact model of Harsanyi does topology to the probability F if and only if
ð ð
capture the whole hierarchies of beliefs and it is
lim g ð o ÞdF n ¼ gðoÞdF for all bounded
rather tractable. The natural question is whether n!1
O O
this model can be used for all hierarchies of and continuous functions g : O ! ℝ.
beliefs. In other words, given any hierarchy of It follows from the compactness of S that all
mutual beliefs of a set of players I about a set spaces defined by Eqs. 1 and 2 are compact in the
S of states of nature, can it be represented by a weak topology. However, for k > 1, not every
Harsanyi game? This was answered by Mertens element of Xk represents a coherent hierarchy of
and Zamir (1985), who constructed the universal beliefs of level k. For example, if (m1, m2) X2
belief space; that is, given a set S of states of where m1 D(S) ¼ X1 and m2 D(S X1n 1),
nature and a finite set I of players, they looked then for this to describe meaningful beliefs of a
for the space O of all possible hierarchies of player, the marginal distribution of m2 on S must
mutual beliefs about S among the players in I. coincide with m1. More generally, any event A in the
This construction is outlined in the next section. space of k-level beliefs has to have the same
(marginal) probability in any higher-level beliefs.
Furthermore, not only are each player’s beliefs
coherent, but he also considers only coherent beliefs
The Universal Belief Space
of the other players (only those that are in support of
his beliefs). Expressing formally this coherency
Given a finite set of players I ¼ {1,. . .,n} and a
condition yields a selection Tk Xk such that
set S of states of nature, which are assumed to be
T1 ¼ X1 ¼ D(S). It is proved that the projection of
compact, we first identify the mathematical spaces
Tk+1 on Xk is Tk (i.e., any coherent k-level hierarchy
in which lie the hierarchies of beliefs. Recall that
can be extended to a coherent k + 1-level hierarchy)
D(A) denotes the set of probability distributions
and that all the sets Tk are compact. Therefore, the
on A and defines inductively the sequence of
projective limit, T ¼ lim1 kTk, is well defined and
spaces (Xk)k ¼ 11 by
nonempty such that mk + 1 ¼ (mk, nk)). (The projec-
tive limit (also known as the inverse limit) of the
X 1 ¼ DðS Þ ð1Þ
sequence (Tk)k ¼ 11 is the space T of all sequences
(m1, m2,. . .) k ¼ 11Tk which satisfy: For any
X kþ1 ¼ X k D S X n1 , for k ¼ 1, 2, ::::
k k ℕ, there is a probability distribution
ð2Þ nk D(S Tkn1).
Any probability distribution on S can be a first- Definition 5 The universal type space T is the
level belief and is thus in X1. A second-level belief projective limit of the spaces (Tk)k ¼ 11.
is a joint probability distribution on S and the first- That is, T is the set of all coherent infinite hier-
level beliefs of the other (n 1) players. This is archies of beliefs regarding S, of a player in I. It does
an element in D(S X1n1), and therefore, a two- not depend on i since by construction it contains all
level hierarchy is an element of the product space possible hierarchies of beliefs regarding S, and it is
X1 D(S X1n1) and so on for any level. Note therefore the same for all players. It is determined
that at each level belief is a joint probability dis- only by S and the number of players n.
tribution on S and the previous level beliefs, allo-
wing for correlation between the two. In dealing Proposition 6 The universal type space T is com-
with these probability spaces, we need to have pact and satisfies
some mathematical structure. More specifically,
we make use of the weak topology. T D S T n1 : ð3Þ
can be identified with a joint probability distribution of i), this set must also be measurable. Mertens
on the state of nature and the types of the other and Zamir used the weak topology which is the
players. The implicit Eq. 3 reflects the self-reference minimal topology with which the event Bip(E) is
and circularity of the notion of type: The type of a (Borel) measurable for any (Borel) measurable
player is his beliefs about the state of nature and event E. In this topology, if A is a compact set,
about all the beliefs of the other players, in particu- then D(A), the space of all probability distribu-
lar, their beliefs about his own beliefs. tions on A, is also compact. However, the hierar-
chic construction can also be made with stronger
Definition 7 The universal belief space (UBS) is topologies on D(A) (see Brandenburger and Dekel
the space O defined by 1993; Heifetz 1993; Mertens et al. 1994). Heifetz
and Samet (1998) worked out the construction of
O ¼ S Tn ð4Þ the universal belief space without topology, using
only a measurable structure (which is implied by
An element of O is called a state of the world. the assumption that the beliefs of the players are
Thus, a state of the world is o ¼ (s(o);t1(o), measurable). All these explicit constructions of
t2(o),. . .,tn(o)) with s(o) S and ti(o) T for all the belief space are within what is called the
i in I. This is the specification of the states of nature semantic approach. Aumann (1999b) provided
and the types of all players. The universal belief another construction of a belief system using the
space O is what we looked for: the set of all syntactic approach based on sentences and logical
incomplete information and mutual belief configu- formulas specifying explicitly what each player
rations of a set of n players regarding the state of believes about the state of nature, about the
nature. In particular, as we will see later, all Har- beliefs of the other players about the state of
sanyi and Aumann models are embedded in O, but nature, and so on. For a detailed construction,
it includes also belief configurations that cannot be see Aumann (1999b), Heifetz and Mongin
modeled as Harsanyi games. As we noted before, (2001), and Meier (2001). For a comparison
the UBS is determined only by the set of states of of the syntactic and semantic approaches, see
nature S and the set of players I, so it should be Aumann and Heifetz (2002).
denoted as O(S, I). For the sake of simplicity, we
shall omit the arguments and write O, unless we
wish to emphasize the underlying sets S and I.
The execution of the construction of the UBS Belief Subspaces
according to the outline above involves some non-
trivial mathematics, as can be seen in Mertens and In constructing the universal belief space, we
Zamir (1985). The reason is that even with a finite implicitly assumed that each player knows his
number of states of nature, the space of first-level own type since we specified only his beliefs
beliefs is a continuum, the second level is the space about the state of nature and about the beliefs
of probability distributions on a continuum, and the of the other players. In view of that, and since
third level is the space of probability distributions by Eq. 3 a type of player i is a probability
on the space of probability distributions on a con- distribution on S TI\{i}, we can view a type
tinuum. This requires some structure for these ti also as a probability distribution on O ¼ S
spaces: For a (Borel) measurable event E, let TI in which the marginal distribution on Ti
Bip(E) be the event “player i of type ti believes is a degenerate delta function at ti; that is, if
that the probability of E is at least p, that is, o ¼ (s(o);t1(o),t2(o),. . ., tn(o)), then for all i in I,
Since this is the object of beliefs of players In particular, it follows that if Supp(ti) denotes
other than i (beliefs of j 6¼ i about the beliefs the support of ti, then
126 Bayesian Games: Games with Incomplete Information
o0 Suppðt i ðoÞÞ ) t i ðo0 Þ ¼ t i ðoÞ: ð6Þ would be Ye ðoÞ is the minimal BL-subspace
containing Pi(o) for all i in I. However, if for
Let Pi(o) ¼ Supp(ti(o)) O. This defines a every player the state o is not in Pi(o), then
possibility correspondence; at state of the world o=2Ye ðoÞ. Yet, even if it is not in the belief closure
o, player i does not consider as possible any point of the players, the real state o is still relevant
not in Pi(o). By Eq. 6, (at least for the analyst) because it determines
the true state of nature; that is, it determines the
Pi ðoÞ \ Pi ðo0 Þ 6¼ f ) Pi ðoÞ ¼ Pi ðo0 Þ: true payoffs of the game. This is the reason for
adding the true state of the world o, even though
However, unlike in Aumann’s model, Pi does
“it may not be in the mind of the players.”
not define a partition of O since it is possible that
It follows from Eqs. 5, 6, and 7 that a
o=2Pi(o), and hence the union [o OPi(o) may
BL-subspace Y has the following structure:
be strictly smaller than O (see Example 7). If
o Pi(o) Y holds for all o in some subspace
Proposition 10 A closed subset Yof the universal
Y O, then (Pi(o))o Y is a partition of Y.
belief space O is a BL-subspace if and only if it
As we said, the universal belief space includes
satisfies the following conditions:
all possible beliefs and mutual belief structures
over the state of nature. However, in a specific
1. For any o ¼ (s(o);t1(o), t2(o),. . ., tn(o)) Y,
situation of incomplete information, it may well
and for all i, the type ti(o) is a probability
be that only part of O is relevant for describing the
distribution on Y.
situation. If the state of the world is o, then clearly
2. For any o and o0 in Y,
all states of the world in [i IPi(o) are relevant,
but this is not all, because if o0 Pi(o), then all
states in Pj(o0), for j 6¼ i, are also relevant in the o0 Suppðt i ðoÞÞ ) t i ðo0 Þ ¼ t i ðoÞ:
considerations of player i. This observation moti-
vates the following definition: In fact condition 1 follows directly from Defi-
nition 8, while condition 2 follows from the gen-
Definition 8 A belief subspace (BL-subspace) is eral property of the UBS expressed in Eq. 6.
a closed subset Y of O which satisfies Given a BL-subspace Y in O(S, I), we denote by
Ti the type set of player i:
Pi ðoÞ Y 8i I and 8o Y : ð7Þ
T i ¼ ft i ðoÞjo Y g,
A belief subspace is minimal if it has no proper
subset which is also a belief subspace. Given and note that unlike in the UBS, in a specific
o O, the belief subspace at o, denoted by model Y, the type sets are typically not the same
Y(o), is the minimal subspace containing o. for all i and the analogue of Eq. 4 is
Since O is a BL-subspace, Y(o) is well defined
for all o O. A BL-subspace is a closed subset of Y S T 1 . . . T n:
O which is also closed under beliefs of the players.
In any o Y, it contains all states of the world A BL-subspace is a model of incomplete infor-
which are relevant to the situation: If o02= Y, then mation about the state of nature. As we saw in
no player believes that o0 is possible, no player Harsanyi’s model, in any model of incomplete
believes that any other player believes that o0 is information about a fixed set S of states of nature,
possible, no player believes that any player involving the same set of players I, a state of the
believes that any player believes, etc. world o defines (encapsulates) an infinite hierar-
chy of mutual beliefs of the players I on S. By the
Remark 9 The subspace Y(o) is meant to be the universality of the belief space O(S, I), there is
minimal subspace which is belief closed by all o0 O(S, I) with the same hierarchy of beliefs as
players at the state o. Thus, a natural definition that of o. The mapping of each o to its
Bayesian Games: Games with Incomplete Information 127
corresponding o0 in O(S, I) is called a belief There is a single type, [p1o1,. . .,pkok], which
morphism, as it preserves the belief structure. is the same for all players. It should be empha-
Mertens and Zamir (1985) proved that the space sized that the type is a distribution on Y (and not
O(S, I) is universal in the sense that any model Yof just on the states of nature), which implies that the
incomplete information of the set of players beliefs [p1G1,. . .,pkGk] on the state of nature are
I about the state of nature s S can be embedded commonly known by the players.
in O(S, I) via belief morphism j:Y ! O(S, I) so
that j(Y) is a belief subspace in O(S, I). In the Example 3 (Two Players with Incomplete
following examples, we give the BL-subspaces Information on One Side) There are two
representing some known models. players, I ¼ {I, II}, and two possible payoff
matrices, S ¼ {G1, G2}. The payoff matrix is
Examples of Belief Subspaces chosen at random with P(s ¼ G1) ¼ p, known
Example 1 (A Game with Complete Informa- to both players. The outcome of this chance
tion) If the state of nature is s0 S, then in the move is known only to player I. Aumann and
universal belief space O(S, I), the game is Maschler have studied such situations in which
described by a BL-subspace Y consisting of a the chosen matrix is played repeatedly and the
single state of the world: issue is how the informed player strategically
uses his information (see Aumann and Maschler
Y ¼ fog where o ¼ ðs0 ; ½1o
, . . . , ½1o
Þ: (1995) and its references). This situation is pre-
sented in the UBS by the following BL-subspace:
Here [1o] is the only possible probability dis-
• Y ¼ {o1,o2}.
tribution on Y, namely, the trivial distribution
• o1 ¼ (G1;[1o1],[po1,(1 p)o2]).
supported by o. In particular, the state of nature
• o2 ¼ (G2;[1o2],[po1,(1 p)o2]).
s0 (i.e., the data of the game) is commonly known.
• Y ¼ {o1,. . .,ok}.
Example 4 (Incomplete Information About the
• o1 ¼ (G1;[p1o1,. . .,pkok],. . .,[p1o1,. . .,pkok]).
• o2 ¼ (G2;[p1o1,. . .,pkok],. . .,[p1o1,. . .,pkok]). Other Players’ Information) In the next exam-
• . . .. . . ple, taken from Sorin and Zamir (1985), one of
• ok ¼ (Gk;[p1o1,. . .,pkok],. . .,[p1o1,. . .,pkok]). two players always knows the state of nature but
128 Bayesian Games: Games with Incomplete Information
I1 0.3 0.4
G1 G1 G2
one-to-one mapping between the type set T and Example 7 (“Highly Inconsistent” Beliefs) In
the set S of states of nature, the situation is gener- the previous example, even though the beliefs of
ated by a chance move choosing the state of nature the players were inconsistent in all states of the
sij S according to the distribution p (i.e., world, the true state was considered possible by
P(sij) ¼ P(Ii, IIj) for i and j in {1,2}), and then all players (e.g., in the state o12, player I assigns to
player I is informed of i and player II is informed this state probability 4/7 and player II assigns to it
of j. As a matter of fact, all the BL-subspaces in the probability 4/5). As was emphasized before, the
previous examples can also be written as Harsanyi UBS contains all belief configurations, including
games, mostly in a trivial way. highly inconsistent or wrong beliefs, as the fol-
lowing example shows. The belief subspace of the
Example 6 (Inconsistent Beliefs) In the same two players I and II concerning the state of nature
universal belief space, O(S, I) of the previous which can be s1 or s2 is given by
example, consider now another BL-subspace Ye
• Y ¼ {o1,o2}.
which differs from Y only by changing the
type • o1 ¼ s1 ; 12 o1 , 12 o2 ½1o2
:
II1 of player II from 35 o11 , 25 o21 to 1
1 1
• o2 ¼ s2 ; 2 o1 , 12 o2 ½1o2
:
o ,
2 11 2 21o , that is,
Now, the mutual beliefs about each other’s A BL-subspace Y is a semantic belief system pre-
type are senting, via the notion of types, the hierarchies of
belief of a set of players having incomplete infor-
II1 II2 II1 II2 mation about the state of nature. A state of the
I1 /
3 7 /
4 7 I1 /
1 2 /
4 5
world captures the situation at what is called the
interim stage: Each player knows his own type
I2 2/3 1/3 I2 /
1 2 /
1 5
and has beliefs about the state of nature and the
Beliefs of player I Beliefs of player II types of the other players. The question “what is
the real state of the world o?” is not addressed. In
Unlike in the previous example, these beliefs a BL-subspace, there is no chance move with
cannot be derived from a prior distribution p. explicit probability distribution that chooses the
According to Harsanyi, these are inconsistent state of the world, while such a probability distri-
beliefs. A BL-subspace with inconsistent beliefs bution is part of a Harsanyi or an Aumann model.
cannot be described as a Harsanyi or Aumann Yet, in the belief space Y of Example 5 in the
model; it cannot be described as a game in previous section, such a prior distribution
extensive form. p emerged endogenously from the structure of Y.
130 Bayesian Games: Games with Incomplete Information
More specifically, if the state o Y is chosen by a which is part of the data of the model. The role
chance move according to the probability distri- of the prior distribution p in these models is
bution p and each player i is told his type ti(o), actually not that of an additional parameter of
then his beliefs are precisely those described by the model but rather that of an additional
ti(o). This is a property of the BL-subspace that assumption on the belief system, namely, the
we call consistency (which does not hold, for consistency assumption. In fact, if a minimal
instance, for the BL-subspace Ye in Example 6) belief subspace is consistent, then the common
and that we define now: Let Y O be a prior p is uniquely determined by the beliefs, as
BL-subspace. we saw in Example 5; there is no need to specify
p as additional data of the system.
Definition 11
1. A probability distribution p D(Y) is said to be Proposition 13 If o O is a consistent state of
consistent if for any player i I, the world, and if Y(o) is the smallest consistent
BL-subspace containing o, then the consistent
ð
probability distribution p on Y(o) is uniquely
p¼ t i ðoÞdp: ð8Þ
Y determined.
(The formulation of this proposition requires
2. A BL-subspace Y is said to be consistent if there some technical qualification if Y(o) is a
is a consistent probability distribution p with continuum).
Supp(p) ¼ Y. A consistent BL-subspace will The consistency (or the existence of a common
be called a C-subspace. A state of the world prior) is quite a strong assumption. It assumes that
o O is said to be consistent if it is a point in a differences in beliefs (i.e., in probability assess-
C-subspace. ments) are due only to differences in information;
players having precisely the same information
The interpretation of Eq. 8 is that the proba- will have precisely the same beliefs. It is no sur-
bility distribution p is “the average” of the types prise that this assumption has strong conse-
ti(o) of player i (which are also probability quences, the most known of which is due to
distributions on Y), when the average is taken Aumann (1976): Players with consistent beliefs
on Y according to p. This definition is not trans- cannot agree to disagree. That is, if at some state
parent; it is not clear how it captures the con- of the world it is commonly known that one player
sistency property we have just explained, in assigns probability q1 to an event E and another
terms of a chance move choosing o Y player assigns probability q2 to the same event,
according to p. However, it turns out to be then it must be the case that q1 ¼ q2. Variants of
equivalent. this result appear under the title of “No trade
For o Y, denote pi(o) ¼ {o0 Y | ti(o0) theorems” (see, e.g., Milgrom and Stokey 1982):
¼ ti(o)}; then we have Rational players with consistent beliefs cannot
believe that they both can gain from a trade or a
Proposition 12 A probability distribution bet between them.
p D(Y) is consistent if and only if The plausibility and the justification of the
common prior assumption were extensively
t i ðoÞðAÞ ¼ pðAjpi ðoÞÞ ð9Þ discussed in the literature (see, e.g., Aumann
1998; Gul 1998; Harsanyi 1967). It is sometimes
holds for all i I and for any measurable set referred to in the literature as the Harsanyi doc-
A Y. trine. Here we only make the observation that
In particular, a Harsanyi or an Aumann within the set of BL-subspaces in O, the set of
model is represented by a consistent consistent BL-subspaces is a set of measure zero.
BL-subspace since, by construction, the beliefs To see the idea of the proof, consider the follow-
are derived from a common prior distribution ing example:
Bayesian Games: Games with Incomplete Information 131
If the subspace is consistent, these beliefs are As we said, a game with incomplete information
obtained as conditional distributions from some played by Bayesian players, often called a Bayes-
prior probability distribution p on T ¼ TI TII, ian game, is a game in which the players have
say, by p of the following matrix: incomplete information about the data of the
game. Being a Bayesian, each player has beliefs
(probability distribution) about any relevant data
II1 II2
he does not know, including the beliefs of the
I1 p11 p12
other players. So far, we have developed the
belief structure of such a situation which is
a BL-subspace Y in the universal belief space
I2 p21 p22 O(S, I). Now we add the action sets and the payoff
functions. These are actually part of the descrip-
tion of the state of nature: The mapping s:O ! S
Prior distribution p on T
assigns to each state of the world o the game-form
s(o) played at this state. To emphasize this inter-
pretation of s(o) as a game-form, we denote it also
This implies (assuming pij 6¼ 0 for all i and j) as Go:
p11 a1 p a2
¼ ; 21 ¼ Go ¼ ðI, Ai ðt i ðoÞÞi I , ui ðoÞi I ,
p12 1 a1 p22 1 a2
p p a1 1 a2
and hence 11 22 ¼ : where Ai(ti(o)) is the actions set (pure strategies) of
p12 p21 1 a1 a2
player i at o and ui(o) : A(o) ! ℝ is his payoff
Similarly, function and A(o) ¼ i IAi(ti(o)) is the set of
p11 b1 p b2 action profiles at state o. Note that while the
¼ ; 12 ¼
p21 1 b1 p22 1 b2 actions of a player depend only on his type, his
p p b1 1 b2 payoff depends on the actions and types of all the
and hence 11 22 ¼ : players. For a vector of actions a A(o), we write
p12 p21 1 b1 b2
ui(o;a) for ui(o)(a). Given a BL-subspace Y O(-
It follows that the types must satisfy S, I), we define the Bayesian game on Y as follows:
132 Bayesian Games: Games with Incomplete Information
Definition 14 The Bayesian game on Y is a vec- of best reply can be adapted to yield the solution
tor payoff game in which: concept of Bayesian equilibrium (also called
Nash-Bayes equilibrium).
• I ¼ {1,. . .,n} – the players’ set.
• Si – the strategy set of player i is the set of
Definition 15 A vector of strategies s ¼
mappings.
(s1,. . .,sn), in a Bayesian game, is called a Bayesian
si : Y ! Ai which are T i measurable: equilibrium if for all i in I and for all ti in Ti,
• In particular,
uti ðsÞ uti ðsi ; e
si Þ, 8e
si Si , ð12Þ
t i ðo1 Þ ¼ t i ðo2 Þ ) si ðo1 Þ ¼ si ðo2 Þ:
• Let S ¼ i I Si. where, as usual, s i ¼ (sj)j6¼i denotes the vector
• The payoff function ui for player i is a vector- of strategies of players other than i.
valued function ui ¼ ðuti Þti T i , where uti (the Thus, a Bayesian equilibrium specifies a
payoff function of player i of type ti) is a mapping behavior for each player which is a best reply to
uti : S ! ℝ what he believes is the behavior of the other
players, that is, a best reply to the strategies of
• Defined by
ð the other players given his type. In a game with
uti ðsÞ ¼ ui ðo; sðoÞÞdt i ðoÞ: ð11Þ complete information, which corresponds to a
Y BL-subspace with one state of the world
Note that uti is Ti measurable, as it should (Y ¼ {o}), as there is only one type of each
be. When Y is a finite BL-subspace, the above- player, and the beliefs are all probability one on
defined Bayesian game is an n-person “game” in a singleton, the Bayesian equilibrium is just the
which the payoff for player i is a vector with a well-known Nash equilibrium.
payoff for each one of his types (therefore, a
Remark 16 It is readily seen that when Y is finite,
vector of dimension |Ti|). It becomes a regular
any Bayesian equilibrium is a Nash equilibrium of
game-form for a given state of the world o since
the Selten game G* * in which each type is a player
then the payoff to player i is uti(o). However, these
who selects the types of his partners according to
game-forms are not regular games since they are
his beliefs. Similarly, we can transform the Bayes-
interconnected; the players do not know which of
ian game into an ordinary game in strategic form
these “games” they are playing (since they do not
by defining the payoff function to player i to be
know the state of the world o). Thus, just like a X
Harsanyi game, a Bayesian game on a e
ui gt i uti where gti are strictly positive. Again,
ti T i
BL-subspace Y consists of a family of connected
independently of the values of the constants gti, any
game-forms, one for each o Y. However, unlike
Bayesian equilibrium is a Nash equilibrium of this
a Harsanyi game, a Bayesian game has no chance
game and vice versa. In particular, if we choose the
move that chooses the state of the world (or the X
vector of types). A way to transform a Bayesian constants so that gt i ¼ 1, we obtain the game
ti T i
game into a regular game was suggested by
suggested by Aumann and Maschler in 1967 (see
R. Selten and was named by Harsanyi as the
p. 95 in Aumann and Maschler 1995), and again,
Selten game G* * (see p. 496 in (Harsanyi 1967).
the set of Nash equilibria of this game is precisely
This is a game with |T1| |T2|. . .|Tn| players (one
the set of Bayesian equilibria.
for each type) in which each player ti Ti chooses
a strategy and then selects his (n 1) partners,
one from each Tj; j 6¼ i, according to his beliefs ti. The Harsanyi Game Revisited
As we observed in Example 5, the belief structure of
Bayesian Equilibrium a consistent BL-subspace is the same as in a Har-
Although a Bayesian game is not a regular game, sanyi game after the chance move choosing the
the Nash equilibrium concept based on the notion types. That is, the embedding of the Harsanyi
Bayesian Games: Games with Incomplete Information 133
game as a BL-subspace in the universal belief Bayesian equilibrium the natural extension of the
space is only at the interim stage, after the Nash equilibrium concept to games with incom-
moment that each player gets to know his type. plete information for consistent or inconsistent
The Harsanyi game on the other hand is at the ex beliefs, when the Harsanyi ordinary game model
ante stage, before a player knows his type. is unavailable.
Then, what is the relation between the Nash
equilibrium in the Harsanyi game at the ex ante Examples of Bayesian Equilibria
stage and the equilibrium at the interim stage, In Example 6, there are two players of two types
namely, the Bayesian equilibrium of the each and with inconsistent mutual beliefs given by
corresponding BL-subspace? This is an impor-
tant question concerning the embedding of the II1 II2 II1 II2
Harsanyi game in the UBS since, as we said
before, the chance move choosing the types
I1 /
3 7 /
4 7 I1 /
1 2 4 5 /
does not appear explicitly in the UBS. The
I2 2/3 1/3 I2 /
1 2 1 5 /
answer to this question was given by Harsanyi Beliefs of player I Beliefs of player II
(1967–1968) (assuming that each type ti has a
positive probability). Assume that the payoff matrices for the four
types of profiles are
II II
Theorem 17 (Harsanyi) The set of Nash equi-
L R L R
libria of a Harsanyi game is identical to the set of
T 2, 0 0, 1 T 0, 0 0, 0
Bayesian equilibria of the equivalent BL-subspace I I
in the UBS. B 0, 0 1, 0 B 1, 1 1, 0
In other words, this theorem states that any G11: Payoffs when t = (I 1, II 1) G12: Payoffs when t = (I 1, II 2)
equilibrium in the ex ante stage is also an equilib- II II
L R L R
rium at the interim stage and vice versa.
In modeling situations of incomplete informa- T 0, 0 0, 0 T 0, 0 2, 1
I I
tion, the interim stage is the natural one; if a player B 1, 1 0, 0 B 0, 0 0, 2
knows his beliefs (type), then why should he ana- G21: Payoffs when t = (I 2, II 1) G22: Payoffs when t = (I 2, II 2)
lyze the situation, as Harsanyi suggests, from the
ex ante point of view as if his type was not known As the beliefs are inconsistent, they cannot be
to him and he could equally well be of another presented by a Harsanyi game. Yet we can com-
type? Theorem 17 provides a technical answer to pute the Bayesian equilibrium of this Bayesian
this question: The equilibria are the same in both game. Let (x, y) be the strategy of player I,
games and the equilibrium strategy of the ex ante which is:
game specifies for each type precisely his equilib-
rium strategy at the interim stage. In that respect, • Play the mixed strategy [x(T),(1 x)(B)]
for a player who knows his type, the Harsanyi when you are of type I1.
model is just an auxiliary game to compute his • Play the mixed strategy [y(T),(1 y)(B)]
equilibrium behavior. Of course the deeper when you are of type I2.
answer to the question above comes from the
interactive nature of the situation: Even though and let (z, t) be the strategy of player II, which
player i knows he is of type ti, he knows that his is:
partners do not know that and that they may
consider the possibility that he is of type et i , and • Play the mixed strategy [z(L),(1 z)(R)] when
since this affects their behavior, the behavior of you are of type II1.
type et i is also relevant to player i who knows he is • Play the mixed strategy [t(L),(1 t)(R)] when
of type ti. Finally, Theorem 17 makes the you are of type II2.
134 Bayesian Games: Games with Incomplete Information
For 0 < x, y, z, t < 1, each player of each type if he further believes that vj is random with uni-
must be indifferent between his two pure actions; form probability distribution on [0,1], then this is
that yields the values in equilibrium: a Bayesian game in which the type of a player is
his private valuation; that is, the type sets are
3 2 7 2
x¼ , y¼ , z¼ , t¼ : T1 ¼ T2 ¼ [0,1], which is a continuum. This is a
5 5 9 9
consistent Bayesian game (that is, a Harsanyi
There is no “expected payoff” since this is a game) since the beliefs are derived from the uni-
Bayesian game and not a game; the expected form probability distribution on T1 T2
payoffs depend on the actual state of the world, ¼ [0,1]2. A Bayesian equilibrium of this game is
i.e., the actual types of the players and the actual that in which each player bids half of his private
payoff matrix. For example, the state of the world value: bi(vi) ¼ vi/2 (see, e.g., Chap. III in
is o11 ¼ (G11;I1, II1); the expected payoffs are Wolfstetter (1999). Although auction theory was
developed far beyond this simple example, almost
7=9
3 2 46 6 all the models studied so far are Bayesian games
pðo11 Þ ¼ , G11 ¼ , :
5 5 2=9 45 45 with consistent beliefs, that is, Harsanyi games.
The main reason of course is that consistent
Similarly, Bayesian games are more manageable since they
! can be described in terms of an equivalent ordi-
2=9
3 2 18 4 nary game in strategic form. However, inconsis-
pðo12 Þ ¼ , G12 ¼ ,
5 5 7=9 45 45 tent beliefs are rather plausible and exist in the
! market place in general and even more so in
7=9
2 3 21 21
pðo21 Þ ¼ , G21 ¼ , auction situations. An example of that is the case
5 5 2=9 45 45 of collusion of bidders: When a bidding ring is
!
2=9 formed, it may well be the case that some of the
2 3 28 70
pðo22 Þ ¼ , G22 ¼ , : bidders outside the ring are unaware of its exis-
5 5 7=9 45 45
tence and behave under the belief that all bidders
However, these are the objective payoffs as are competitive. The members of the ring may or
viewed by the analyst; they are viewed differently may not know whether the other bidders know
by the players. For player i of type ti, the relevant about the ring, or they may be uncertain about
payoff is his subjective payoff uti(s) defined in it. This rather plausible mutual belief situation is
Eq. 11. For example, at state o11 (or o12), player typically inconsistent and has to be treated as an
I believes that with probability 3/7 the state is o11 inconsistent Bayesian game for which a Bayesian
in which case his payoff is 46/45 and with prob- equilibrium is to be found.
ability 4/7 the state is o12 in which case his payoff
is 18/45. Therefore, his subjective expected payoff
at state o11 is 3/7 46/45 + 4/7 18/45 ¼ 2/3. Bayesian Equilibrium and Correlated
Similar computations show that in states o21 or Equilibrium
o22, player I “expects” a payoff of 7/15, while
player II “expects” 3/10 in state o11 or o21 and Correlated equilibrium was introduced in
86/225 in state o12 or o22. Aumann (1974) as the Nash equilibrium of a
Bayesian equilibrium is widely used in auction game extended by adding to it random events
theory, which constitutes an important and suc- about which the players have partial information.
cessful application of the theory of games with Basically, starting from an ordinary game,
incomplete information. The simplest example is Aumann added a probability space and informa-
that of two buyers bidding in a first-price auction tion structure and obtained a game with incom-
for an indivisible object. If each buyer i has a plete information, the equilibrium of which he
private value vi for the object (which is indepen- called a correlated equilibrium of the original
dent of the private value vj of the other buyer), and game. The fact that the Nash equilibrium of a
Bayesian Games: Games with Incomplete Information 135
game with incomplete information is the Bayes- suggestion of which action to choose, then it is
ian equilibrium suggests that the concept of cor- readily verified that following the suggestion is a
related equilibrium is closely related to that of Nash equilibrium of the extended game yielding a
Bayesian equilibrium. In fact Aumann noticed payoff (5,5). This was called by Aumann a corre-
that and discussed it in a second paper entitled lated equilibrium of the original game G. In our
“Correlated equilibrium as an expression of terminology, the extended game G* is a Bayesian
Bayesian rationality” (Aumann 1987). In this game and its Nash equilibrium is its Bayesian
section, we review briefly, by way of an example, equilibrium. Thus, what we have here is that a
the concept of correlated equilibrium and state correlated equilibrium of a game is just the Bayes-
formally its relation to the concept of Bayesian ian equilibrium of its extension to a game with
equilibrium. incomplete information. We now make this a gen-
eral formal statement. For simplicity, we use the
Example 18 Consider a two-person game with Aumann model of a game with incomplete
actions {T, B} for player 1 and {L, R} for player information.
2 with corresponding payoffs given in the follow- Let G ¼ (I,(Ai)i I,(ui)i I) be a game in strate-
ing matrix: gic form where I is the set of players, Ai is the set
2
of actions (pure strategies) of player i, and ui is his
L R payoff function.
T 6, 6 2, 7
1 Definition 19 Given a game in strategic form G,
B 7, 2 0, 0 an incomplete information extension (the
G: Payoffs of the basic game I-extension) of the game G is the game G* given by
G ¼ I, ðAi Þi I, ðui Þi I , ðY , pÞ , ðpi Þi I Þ,
This game has three Nash equilibria: (T, R) with
payoff (2,7), (B, L) with payoff (7,2), and the mixed where (Y,p) is a finite probability space and pi is a
2 partition of Y (the information partition of player i).
equilibrium ðT Þ, 13 ðBÞ , 23 ðLÞ, 13 ðRÞ with
2 23 This is an Aumann model of incomplete infor-
payoff 4 3 , 4 3 Suppose that we add to the game
mation, and as we noted before, it is also a
a chance move that chooses an element in
Harsanyi-type-based model in which the type of
{T,B} {L,R} according to the following proba-
player i at state o Y is ti(o) ¼ pi(o) and a strat-
bility distribution m:
egy of player i is a mapping from his type set to his
L R mixed actions: si:Ti ! D(Ai).
T 1/3 1/3
We identify a correlated equilibrium in the
game G by the probability distribution m on the
B 1/3 0, 0 vectors of actions A ¼ A1,. . .,An. Thus,
m: Probability distribution on {T, B} × {L, R} m D(A) is a correlated equilibrium of the game
G if when a A is chosen according to m and each
Let us now extend the game G to a game with player i is suggested to play ai, his best reply is in
incomplete information G* in which a chance fact to play the action ai.
move chooses an element in {T, B} {L, R} Given a game with incomplete information G*
according to the probability distribution above. as in definition 19, any vector of strategies of the
Then, player 1 is informed of the first (left) com- players s ¼ (s1,. . .,sn) induces a probability dis-
ponent of the chosen element and player 2 is tribution on the vectors of actions a A. We
informed of the second (right) component. Then, denote this as ms D(A).
each player chooses an action in G and the payoff We can now state the relation between corre-
is made. If we interpret the partial information as a lated and Bayesian equilibria.
136 Bayesian Games: Games with Incomplete Information
Theorem 20 Let s be a Bayesian equilib- Another related point is the fact that if players’
rium in the game of incomplete information beliefs are the data of the situation (in the interim
G* = (I, (Ai)i∈I, (ui)i∈I, (Y, p), (Pi)i∈I); then the induced probability distribution μσ is a correlated equilibrium of the basic game G = (I, (Ai)i∈I, (ui)i∈I).

The other direction is

Theorem 21 Let μ be a correlated equilibrium of the game G = (I, (Ai)i∈I, (ui)i∈I); then G has an extension to a game with incomplete information G* = (I, (Ai)i∈I, (ui)i∈I, (Y, p), (Pi)i∈I) with a Bayesian equilibrium σ for which μσ = μ.

Concluding Remarks and Future Directions

The Consistency Assumption
To the heated discussion of the merits and justification of the consistency assumption in economic and game-theoretical models, we would like to add a couple of remarks. In our opinion, the appropriate way of modeling an incomplete information situation is at the interim stage, that is, when a player knows his own beliefs (type). The Harsanyi ex ante model is just an auxiliary construction for the analysis. Actually, this was also the view of Harsanyi, who justified his model by proving that it provides the same equilibria as the interim stage situation it generates (Theorem 17). The Harsanyi doctrine says roughly that our models "should be consistent," and if we get an inconsistent model, it must be the case that it is not a "correct" model of the situation at hand. This becomes less convincing if we agree that the interim stage is what we are interested in: Not only are most mutual beliefs inconsistent, as we saw in the section entitled "Consistent Beliefs and Common Priors" above, but it is hard to argue convincingly that the model in Example 5 describes an adequate mutual belief situation while the model in Example 6 does not; the only difference between the two is that in one model, a certain type's beliefs are 3/5 ω11, 2/5 ω21, while in the other model his beliefs are 1/2 ω11, 1/2 ω21.

… the stage), then these are typically imprecise and rather hard to measure. Therefore, any meaningful result of our analysis should be robust to small changes in the beliefs. This cannot be achieved within the consistent belief systems, which are a thin set of measure zero in the universal belief space.

Knowledge and Beliefs
Our interest in this article was mostly in the notion of beliefs of players and less in the notion of knowledge. These are two related but different notions. Knowledge is defined through a knowledge operator satisfying some axioms. Beliefs are defined by means of probability distributions. Aumann's model, discussed in the section entitled "Aumann's Model" above, has both elements: The knowledge was generated by the partitions of the players, while the beliefs were generated by the probability P on the space Y (and the partitions). Being interested in the subjective beliefs of the player, we could understand "at state of the world ω ∈ Ω, player i knows the event E ⊆ Ω" to mean "at state of the world ω ∈ Ω, player i assigns to the event E ⊆ Ω probability 1." However, in the universal belief space, "belief with probability 1" does not satisfy a central axiom of the knowledge operator. Namely, if at ω ∈ Ω player i knows the event E ⊆ Ω, then ω ∈ E. That is, if a player knows an event, then this event in fact happened. In the universal belief space, where all coherent beliefs are possible, in a state ω ∈ Ω a player may assign probability 1 to the event {ω′} where ω′ ≠ ω. In fact, if in a BL-subspace Y the condition ω ∈ Pi(ω) is satisfied for all i and all ω ∈ Y, then belief with probability 1 is a knowledge operator on Y. This in fact was the case in Aumann's and in Harsanyi's models where, by construction, the support of the beliefs of a player in the state ω always included ω. For a detailed discussion of the relationship between knowledge and beliefs in the universal belief space, see Vassilakis and Zamir (1993).
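The failure of the truth axiom mentioned above is easy to see on a toy two-state belief system. The following minimal Python sketch (the states, the event E, and the belief map are invented purely for illustration and are not taken from the examples of this article) checks that a player may assign probability 1 to an event that does not contain the true state:

```python
# Toy illustration: "belief with probability 1" need not satisfy the truth axiom
# (if i knows E at w, then w belongs to E).  The belief system below is made up.
states = ["w", "w_prime"]
belief = {"w": {"w_prime": 1.0}, "w_prime": {"w_prime": 1.0}}   # player i's beliefs

def believes_with_prob_1(state, event):
    return sum(belief[state].get(s, 0.0) for s in event) == 1.0

E = {"w_prime"}
print(believes_with_prob_1("w", E), "w" in E)   # -> True False: the axiom fails at w
```

In this toy system the support of the belief at w does not contain w, which is exactly the situation ruled out by construction in the Aumann and Harsanyi models described above.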
On the other hand, if players engage in a repeated Prisoner's Dilemma, if they value sufficiently future payoffs compared to present ones, and if past actions are observable, then (C, C) is a sustainable outcome. Indeed, if each player plays C as long as the other one has always done so in the past and plays D otherwise, both players have an incentive to always play C, since the short-term gain that can be obtained by playing D is more than offset by the long-run losses; cooperation is thus sustainable here, whereas only (D, D) can be sustained in one-shot interactions.

In general, what are the equilibrium payoffs of a repeated game, and how can they be computed from the data of the static game? Is there a significant difference between games repeated a finite number of times and infinitely repeated ones? What is the role played by the degree of impatience of players? Do the conclusions obtained for the Prisoner's Dilemma game and for other games rely crucially on the assumption that each player perfectly observes the other players' past choices, or would imperfect observation be sufficient? The theory of repeated games aims at answering these questions and many more.

Games with Observable Actions

This section focuses on repeated games with perfect monitoring in which, after every period of the repeated game, all strategic choices of all the players are publicly revealed.

… Every degenerate lottery in Si (which puts probability 1 on one particular action in Ai) is associated to the corresponding element in Ai. A choice of action for every player i determines an outcome a ∈ ∏i Ai. The payoff function of the stage game is g : A → ℝ^I. Payoffs are naturally associated to profiles of mixed actions s ∈ S = ∏i Si using the expectation: g(s) = E_s g(a). A strategy in the repeated game specifies the choice of a mixed action at every stage, as a function of the past observed history. More specifically, a behavioral strategy for player i is of the form si : ∪t Ht → Si. When all the strategy choices belong to Ai (si : ∪t Ht → Ai), si is called a pure strategy.

Other Strategy Specifications A behavioral strategy allows the player to randomize his action depending on past history. If, at the start of the repeated game, the player was to randomize over the set of behavioral strategies, the result would be equivalent to a particular behavioral strategy choice. This result is a consequence of Kuhn's theorem (Aumann 1964; Kuhn 1953). Furthermore, behavioral strategies are also equivalent to randomizations over the set of pure strategies.

Induced Plays Every choice of pure strategies s = (si)i by all the players induces a play h = (a1, a2, . . .) ∈ A^∞ in the repeated game, defined inductively by a1 = (si,0(∅))i∈I and at = (si,t−1(a1, . . ., at−1))i∈I.
In infinitely repeated games with no discounting, the players care about their long-run stream of stage payoffs. In particular, the payoff in the repeated game associated to a play h = (a1, a2, . . .) ∈ A^∞ coincides with the limit of the Cesàro means of stage payoffs when this limit exists. When this limit does not exist, the most common evaluation of the stream of payoffs is defined through a Banach limit of the Cesàro means (a Banach limit is a linear form on the set of bounded sequences that lies always between the liminf and the limsup).

In infinitely repeated games with discounting, a discount factor 0 < δ < 1 characterizes the player's degree of impatience. A payoff of 1 at stage t + 1 is equivalent to a payoff of δ at stage t. Player i's payoff in the repeated game for the play h = (a1, a2, . . .) ∈ A^∞ is the normalized sum of discounted payoffs: (1 − δ) Σ_{t≥1} δ^{t−1} gi(at).

In finitely repeated games, the game ends after some stage T. Payoffs induced by the play after stage T are irrelevant (and a strategy need not specify choices after stage T). The payoff for a player is the average of the stage payoffs during stages 1 up to T: (1/T) Σ_{t=1}^{T} gi(at).

Equilibrium Notions What plays can be expected to be observed in repeated interactions of players who observe each other's choices? Noncooperative game theory focuses mainly on the idea of stable convention, i.e., of strategy profiles from which no player has incentives to deviate, knowing the strategies adopted by the other players.

A strategy profile forms a Nash equilibrium (Nash 1951) when no player can improve his payoff by choosing an alternative strategy, as long as other players follow the prescribed strategies. In some cases, the observation of past play may not be consistent with the prescribed strategies. When, for every possible history, each player's strategy maximizes the continuation stream of payoffs, assuming that other players abide by their prescribed strategies at all future stages, the strategy profile forms a subgame perfect equilibrium (Selten 1965). Perfect equilibrium is a more robust and often considered a more satisfactory solution concept than Nash equilibrium. The construction of perfect equilibria is in general also more demanding than the construction of Nash equilibria.

The main objective of the theory of repeated games is to characterize the set of payoff vectors that can be sustained by some Nash or perfect equilibrium of the repeated game.

Necessary Conditions on Equilibrium Payoffs
Some properties are common to all equilibrium payoffs. First, under the common assumption that all players evaluate the payoff associated to a play in the same way, the resulting payoff vector in the repeated game is a convex combination of stage payoffs. That is, the payoff vector in the repeated game is an element of the convex closure of g(A), called the set of feasible payoffs and denoted F.

A notable exception is the work of Lehrer and Pauzner (1999), who study repeated games where players have heterogeneous time preferences. The payoff vector resulting from a play does not necessarily belong to F if players have different evaluations of payoff streams. For instance, in a repetition of the Prisoner's Dilemma, if player 1 cares only about the payoff in stage 1 and player 2 cares only about the payoff in stage 2, it is possible for both players to obtain a payoff of four in the repeated game.

Now consider a strategy profile s, and let ti be a strategy of player i that plays after every history (a1, . . ., at) a best response to the profile of mixed actions chosen by the other players in the next stage. At any stage of the repeated game, the expected payoff for player i using ti is no less than

vi = min_{s−i ∈ S−i} max_{ai ∈ Ai} gi(s−i, ai)   (1)

where s−i = (sj)j≠i (we use similar notations throughout the paper: for a family of sets (Ei)i∈I, e−i denotes an element of E−i = ∏_{j≠i} Ej, and a profile e ∈ ∏_j Ej is denoted e = (ei, e−i) when the ith component is stressed).
The payoff vi is referred to as player i's min max payoff. A payoff vector that provides each player i with at least [resp. strictly more than] vi is called individually rational [resp. strictly individually rational], and IR [resp. IR*] denotes the set of such payoff vectors. Since, for any strategy profile, there exists a strategy of player i that yields a payoff no less than vi, all equilibrium payoffs have to be individually rational.

Also note that players j ≠ i collectively have a strategy profile in the repeated game that forces player i's payoff down to vi: they play repeatedly a mixed strategy profile that achieves the minimum in the definition of vi. Such a strategy profile in the one-shot game is referred to as a punishing strategy or min max strategy against player i.

For the Prisoner's Dilemma game, F is the convex hull of (1, 1), (5, 0), (0, 5), and (4, 4). Both players' min max levels are equal to 1. Figure 1 illustrates the set of feasible and individually rational payoff vectors (hatched area). The set of feasible and individually rational payoffs can be directly computed from the stage game data.

(Repeated Games with Complete Information, Fig. 1: F and IR for the Prisoner's Dilemma; the axes are player 1's payoff and player 2's payoff.)

Infinitely Patient Players
The following result has been part of the folklore of game theory at least since the mid-1960s. Its authorship is obscure (see the introduction of Aumann (1981a)). For this reason, it is commonly referred to as the "Folk Theorem." By extension, characterization of sets of equilibrium payoffs in repeated games is also referred to as "Folk Theorems."

Theorem 1 The set of equilibrium payoffs of the repeated game with no discounting coincides with the set of feasible and individually rational payoffs.

Aumann and Shapley (1976, 1994) and Rubinstein (1977, 1994) show that restricting attention to perfect equilibria does not narrow down the set of equilibrium payoffs. They prove that

Theorem 2 The set of perfect equilibrium payoffs of the repeated game with no discounting coincides with the set of feasible and individually rational payoffs.

We outline a proof of Theorem 2. It is established that any equilibrium payoff is in F ∩ IR. We need only to prove that every element of F ∩ IR is a subgame perfect equilibrium payoff. Let x ∈ F ∩ IR, and let h = a1, . . ., at, . . . be a play inducing x. Consider the strategies that play at in stage t; if player i does not respect this prescription at stage t0, the other players punish player i for t0 stages by repeatedly playing the min max strategy profile against player i. After the punishment phase is over, players revert to the play of h, hence playing a_{2t0+1}, . . ..

Now we explain why these strategies form a subgame perfect equilibrium. Consider a strategy of player i starting after any history. The induced play by this strategy for player i and by the other players' prescribed strategies is, up to a subset of stages of null density, defined by the sequence h with interweaved periods of punishment for player i. Hence the induced long-run payoff for player i is a convex combination of his punishment payoff and of the payoff induced by h. The result follows since the payoff for player i induced by h is no worse than the punishment payoff.

Impatient Players
The strategies constructed in the proof of the Folk Theorem for repeated games with infinitely patient players (Theorem 1) do not necessarily constitute a subgame perfect equilibrium if players are impatient. Indeed, during a punishment phase, the punishing players may be
receiving low stage payoffs, and these stage payoffs matter in the evaluation of their stream of payoffs. When constructing subgame perfect equilibria of discounted games, one must make sure that after a deviation of player i, players j ≠ i have incentives to implement player i's punishment.

Nash Reversion
Friedman (1971) shows that every feasible payoff that Pareto dominates a Nash equilibrium payoff of the static game is a subgame perfect equilibrium payoff of the repeated game provided that players are patient enough. In Friedman's proof, punishments take the simple form of reversion to the repeated play of the static Nash equilibrium forever. In the Prisoner's Dilemma, (D, D) is the only static Nash equilibrium, and thus (4, 4) is a subgame perfect Nash equilibrium payoff of the repeated game if players are patient enough. Note however that in some games, the set of payoffs that Pareto dominates some equilibrium payoff may be empty. Also, Friedman's result constitutes a partial Folk Theorem only in that it does not characterize the full set of equilibrium payoffs.

The Recursive Structure
Repeated games with discounting possess a structure similar to dynamic programming problems. At any stage in time, players choose actions that maximize the sum of the current payoff and the payoff at the subsequent stages. When strategies form a subgame perfect equilibrium, the payoff vector at subsequent stages must be an equilibrium payoff, and players must have incentives to follow the prescribed strategies at the current stage. This implies that subgame perfect equilibrium payoffs have a recursive structure, first studied by Abreu (1988). Subsection "A Recursive Structure" presents the recursive structure in more detail for the more general model of games with public monitoring.

The Folk Theorem for Discounted Games
Relying on Abreu's recursive results, Fudenberg and Maskin (1986) prove the following Folk Theorem for subgame perfect equilibria with discounting:

Theorem 3 If the number of players is 2 or if the set of feasible payoff vectors has a nonempty interior, then any payoff vector that is feasible and strictly individually rational is a subgame perfect equilibrium payoff of the discounted repeated game, provided that players are sufficiently patient.

Forges et al. (1986) provide an example in which a payoff that is individually rational but not strictly individually rational is not an equilibrium payoff of the discounted game.

Abreu et al. (1994) show that the nonempty interior condition of the theorem can be replaced by a weaker condition of "nonequivalent utilities": no pair of players have the same preferences over outcomes. Wen (1994) and Fudenberg et al. (2007) show that a Folk Theorem still holds when the condition of nonequivalent utilities fails if one replaces the min max level defining individually rational payoffs by some "effective min max" payoffs.

An alternative to discounting as a representation of impatience in infinitely repeated games is the overtaking criterion, introduced by Rubinstein (1979): the play (a1, a2, . . .) is strictly preferred by player i to the play (a1′, a2′, . . .) if the inferior limit of the difference of the corresponding streams of payoffs is positive, i.e., if lim inf_T Σ_{t=1}^{T} (gi(at) − gi(at′)) > 0. Rubinstein (1979) proves a Folk Theorem with the overtaking criterion.

Finitely Repeated Games
Strikingly, equilibrium payoffs in finitely repeated games and in infinitely repeated games can be drastically different. This effect is best exemplified in repetitions of the Prisoner's Dilemma.

The Prisoner's Dilemma
Recall that in an infinitely repeated Prisoner's Dilemma, cooperation at all stages is achieved at a subgame perfect equilibrium if players are patient enough. By contrast, at every Nash equilibrium of any finite repetition of the Prisoner's Dilemma, both players play D at every stage with probability 1.
Now we present a short proof of this result. Consider any Nash equilibrium of the Prisoner's Dilemma repeated T times. Let a1, . . ., aT be a sequence of action profiles played with positive probability at the Nash equilibrium. Since each player can play D at the last stage of the repetition, and D is a dominating action, aT = (D, D). We now prove by induction on t that for any such t, (a_{T−t}, . . ., a_T) = ((D, D), . . ., (D, D)). Assume the induction hypothesis valid for t − 1. Consider a strategy for player i that follows the equilibrium strategy up to stage T − t − 1, then plays D from stage T − t on. This strategy obtains the same payoff as the equilibrium strategy at stages 1, . . ., T − t − 1, and at least as much as the equilibrium strategy at stages T − t + 1, . . ., T. Hence, this strategy cannot obtain more than the equilibrium strategy at stage T − t, and therefore, the equilibrium strategy plays D at stage T − t with probability 1 as well.

Sorin (1986) proves the more general result:

Theorem 4 Assume that in every Nash equilibrium of G, all players are receiving their individually rational levels. Then, at every Nash equilibrium of any finitely repeated version of G, all players are receiving their individually rational levels.

The proof of Theorem 4 relies on a backward induction type of argument, but it is striking that the result applies for all Nash equilibria and not only for subgame perfect Nash equilibria. This result shows that, unless some additional assumptions are made on the one-shot game, a Folk Theorem cannot obtain for finitely repeated games.

Games with Unique Nash Payoff
Using a proof by backward induction, Benoît and Krishna (1985) obtain the following result:

… knowledge between players. Neyman (1999) shows that a Folk Theorem obtains for the finitely repeated Prisoner's Dilemma (and for other games) if there is lack of common knowledge on the last stage of repetition.

Folk Theorems for Finitely Repeated Games
A Folk Theorem can be obtained when there are two Nash equilibrium payoffs for each player. The following result is due to Benoît and Krishna (1985) and Gossner (1995):

Theorem 6 Assume that each player has two distinct Nash equilibrium payoffs in G and that the set of feasible payoffs has nonempty interior. Then, the set of subgame perfect equilibrium payoffs of the T times repetition of G converges to the set of feasible and individually rational payoffs as T goes to infinity.

Hence, with at least two equilibrium payoffs per player, the sets of equilibrium payoffs of finitely repeated games and infinitely repeated games are asymptotically the same.

The condition that each player has two distinct Nash equilibrium payoffs in the stage game can be weakened; see Smith (1995). Assume for simplicity that one player has two distinct Nash payoffs. By playing one of the two Nash equilibria in the last stages of the repeated game, it is possible to provide incentives for this player to play actions that are not part of Nash equilibria of the one-shot game in previous stages. If this construction leads to perfect equilibria in which a player j ≠ i has distinct payoffs, we can now provide incentives for both players i and j. If successive iterations of this procedure yield distinct subgame perfect equilibrium payoffs for all players, a Folk Theorem applies.
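The backward induction argument given above for the finitely repeated Prisoner's Dilemma starts from the observation that D dominates C in the stage game. A one-line check with the payoff numbers used in this article (an illustrative sketch, not part of the original proof):

```python
# Row-player stage payoffs from the text: (C,C)=4, (C,D)=0, (D,C)=5, (D,D)=1.
g1 = {("C", "C"): 4, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
dominates = all(g1[("D", b)] > g1[("C", b)] for b in ("C", "D"))
print(dominates)   # -> True: at the last stage every equilibrium puts probability 1 on D
```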
How equilibrium payoffs of the repeated game depend on the quality of players' monitoring of each other's actions is the subject of a very active area of research.

Repeated games with imperfect monitoring, in which players observe imperfectly other players' action choices, were first motivated by economic applications. In Stigler (1964), two firms are repeatedly engaged in price competition over market shares. Each firm observes its own sales, but not the price set by the rival. While it is in the best interest for both firms to set a collusive price, each firm has incentives to secretly undercut the rival's price. Upon observing plunging sales, should a firm deduce that the rival firm is undercutting prices, and retaliate by setting lower prices, or should lower sales be interpreted as a result of an exogenous shock on market demand? Whether collusive behavior is sustainable or not at equilibrium is one of the motivating questions in the theory of repeated games with imperfect monitoring.

It is interesting to compare repeated games with imperfect monitoring with their perfect monitoring counterparts.

The structure of equilibria used to prove the Folk Theorem with perfect monitoring and no discounting is rather simple: if a player deviates from the prescribed strategies, the deviation is detected, the deviating player is identified, and all other players can then punish the deviator. With imperfect monitoring, not all deviations are detectable, and when a deviation is detected, deviators are not necessarily identifiable. The notions of detection and identification allow fairly general Folk Theorems for undiscounted games. We present these results in subsection "Detection and Identification."

With discounting, repeated games with perfect monitoring possess a recursive structure that facilitates their study. Recursive methods can also be successfully applied to discounted games with public monitoring. We review the major results of this branch of the literature in subsection "Public Equilibria."

Almost-perfect monitoring is the natural framework to study the effect of small departures from the perfect or public monitoring assumptions. We review this literature in subsection "Almost-Perfect Monitoring."

Little is known about general discounted games with imperfect private monitoring. We present the main known results in subsection "General Stochastic Signals."

With perfect monitoring, the worst equilibrium payoff for a player is given by the min max of the one-shot game, where punishing (minimizing) players choose an independent profile of mixed strategies. With imperfect monitoring, correlation of past signals for the punishing players may lead to more efficient punishments. We present results on punishment levels in subsection "Punishment Levels."

Model
In this section we define repeated games with imperfect monitoring and describe several classes of monitoring structures of particular interest.

Data of the Game
Recall that the one-shot strategic interaction is described by a finite set I of players, a finite action set Ai for each player i, and a payoff function g : A → ℝ^I. Players' observation of each other's actions is described by a monitoring structure given by a finite set of signals Yi for each player i and by a transition probability Q : A → Δ(Y) (with A = ∏_{i∈I} Ai and Y = ∏_{i∈I} Yi). When the action profile chosen is a = (ai)i∈I, a profile of signals y = (yi)i∈I is drawn with probability Q(y | a), and yi is observed by player i.

Perfect Monitoring Perfect monitoring is the particular case in which each player observes the action profile chosen: for each player i, Yi = A and Q((yi)i∈I | a) = 1_{{∀i, yi = a}}.

Almost-Perfect Monitoring The monitoring structure is ε-perfect (see Mailath and Morris (2002)) when each player can identify the other players' actions with a probability of error less than ε. This is the case if there exist functions fi : Ai × Yi → A−i for all i such that, under every action profile a ∈ A, the realized signals satisfy fi(ai, yi) = a−i for every i with probability at least 1 − ε.
… full support (i.e., under every action profile, all signal profiles have positive probability). The results presented in this survey all hold for sequential equilibria, both for discounted and undiscounted games.

Extensions of the Repeated Game
When players receive correlated inputs or may communicate between stages of the repeated game, the relevant concepts are correlated and communication equilibria.

Correlated Equilibria A correlated equilibrium (Aumann 1974) of the repeated game is an equilibrium of an extended game in which, at a preliminary stage, a mediator chooses a profile of correlated random inputs and informs each player of his own input; then the repeated game is played. A characterization of the set of correlated equilibrium payoffs for two-player games is obtained by Lehrer (1992a).

Correlation arises endogenously in repeated games with imperfect monitoring, as the signals received by the players can serve as correlated inputs that influence players' continuation strategies. This phenomenon is called internal correlation and was studied by Lehrer (1991) and Gossner and Tomala (2006, 2007).

Communication Equilibria An (extensive form) communication equilibrium (Myerson 1982; Forges 1986) of a repeated game is an equilibrium of an extension of the repeated game in which, after every stage, players send messages to a mediator and the mediator sends back private outputs to the players. Characterizations of the set of communication equilibrium payoffs are obtained under weak conditions on the monitoring structure; see, e.g., Kandori and Matsushima (1998), Compte (1998), and Renault and Tomala (2004).

Detection and Identification

Equivalent Actions
A player's deviation is detectable when it induces a different distribution of signals for other players. When two actions induce the same distribution of signals for other players, they are called equivalent (Lehrer 1990, 1991, 1992a, b):

Definition 1 Two actions ai and bi of player i are equivalent, and we note ai ∼ bi, if they induce the same distribution of other players' signals:

Q(y−i | ai, a−i) = Q(y−i | bi, a−i), ∀a−i.

Example 1 Consider the two-player repeated Prisoner's Dilemma where player 2 receives no information about the actions of player 1 (e.g., Y2 is a singleton). The two actions of player 1 are thus equivalent. The actions of player 2 are independent of the actions of player 1: player 1 has no impact on the behavior of player 2. Player 2 has no power to threaten player 1, and in any equilibrium, player 1 defects at every stage. Player 2 also defects at every stage: since player 1 always defects, he also loses his threatening power. The only equilibrium payoff in this repeated game is thus (1, 1).

Example 1 suggests that between two equivalent actions, a player chooses at equilibrium the one that yields the highest stage payoff. This is indeed the case when the information received by a player does not depend on his own action. Lehrer (1990) studies particular monitoring structures satisfying this requirement. Recall from Lehrer (1990) the definition of semi-standard monitoring structures: each action set Ai is endowed with a partition Āi; when player i plays ai, the corresponding partition cell āi is publicly announced. In the semi-standard case, two actions are equivalent if and only if they belong to the same cell: ai ∼ bi ⟺ āi = b̄i, and the information received by a player on other players' actions does not depend on his own action.

If player i deviates from ai to bi, the deviation is undetected if and only if ai ∼ bi. Otherwise it is detected by all other players. A profile of mixed actions is called immune to undetectable deviations if no player can profit by a unilateral deviation that maintains the same distribution of other players' signals. The following result, due to Lehrer (1990), characterizes equilibrium payoffs for undiscounted games with semi-standard signals:
Theorem 7 In an undiscounted repeated game with semi-standard signals, the equilibrium payoffs are the individually rational payoffs that belong to the convex hull of payoffs generated by mixed action profiles that are immune to undetectable deviations.

More Informative Actions
When the information of player i depends on his own action, some deviations may be detected in the course of the repeated game even though they are undetectable in the stage game.

Example 2 Consider the following modification of the Prisoner's Dilemma. The action set of player 1 is A1 = {C1, D1} × {C2, D2}, and the action set of player 2 is {C2, D2}. An action for player 1 is thus a pair a1 = (ã1, ã2). When the action profile (ã1, ã2, a2) is played, the payoff to player i is gi(ã1, a2). We can interpret the component ã1 as a real action (it impacts payoffs) and the component ã2 as a message sent to player 2 (it does not impact payoffs). The monitoring structure is as follows:

• Player 2 only observes the message component ã2 of the action of player 1.
• Player 1 perfectly observes the action of player 2 if he chooses the cooperative real action (ã1 = C1) and gets no information on player 2's action if he defects (ã1 = D1).

Note that the actions (C1, C2) and (D1, C2) of player 1 are equivalent, and so are the actions (C1, D2) and (D1, D2). However, it is possible to construct an equilibrium that implements the cooperative payoff along the following lines:

1. Using his message component, player 1 reports at every stage t > 1 the previous action of player 2. Player 1 is punished in case of a nonmatching report.
2. Player 2 randomizes between both actions, so that player 1 needs to play the cooperative action in order to report player 2's action accurately. The weight on the defective action of player 2 goes to 0 as t goes to infinity to ensure efficiency.

Player 2 has incentives to play C2 most of the time, since player 1 can statistically detect if player 2 uses the action D2 more frequently than prescribed. Player 1 also has incentives to play the real action C1, as this is the only way to observe player 2's action, which needs to be reported later on.

The key point in the example above is that the two real actions C1 and D1 of player 1 are equivalent, but D1 is less informative than C1 for player 1. For general monitoring structures, an action ai is more informative than an action bi if, whenever player i plays ai, he can reconstitute the signal he would have observed, had he played bi. The precise definition of the more informative relation relies on Blackwell's ordering of stochastic experiments (Blackwell 1951):

Definition 2 The action ai of player i is more informative than the action bi if there exists a transition probability f : Yi → Δ(Yi) such that, for every a−i and every profile of signals y,

Σ_{yi} f(yi′ | yi) Q(yi, y−i | ai, a−i) = Q(yi′, y−i | bi, a−i).

We denote ai ≽ bi if ai ∼ bi and ai is more informative than bi.

Assume that prescribed strategies require player i to play bi at stage t, and let ai ≽ bi. Consider the following deviation for player i: play ai at stage t, and reconstruct a signal at stage t that could have arisen from the play of bi. In all subsequent stages, play as if no deviation took place at stage t and as if the reconstructed signal had been observed at stage t. Not only would such a deviation be undetectable at stage t, since ai ∼ bi, but it would also be undetectable at all subsequent stages, as it would induce the same probability distribution over plays as under the prescribed strategy. This argument shows that, if an equilibrium strategy specifies that player i plays ai, there is no bi ≽ ai that yields a higher expected stage payoff than ai.
Definition 3 A distribution of action profiles p ∈ Δ(A) is immune to undetectable deviations if for each player i and pair of actions ai, bi such that bi ≽ ai:

Σ_{a−i} p(ai, a−i) gi(ai, a−i) ≥ Σ_{a−i} p(ai, a−i) gi(bi, a−i).

If p is immune to undetectable deviations and if player i is supposed to play ai, any alternative action bi that yields a greater expected payoff cannot be such that bi ≽ ai.

The following proposition gives a necessary condition on equilibrium payoffs that holds both in the discounted and in the undiscounted cases:

Proposition 1 Every equilibrium payoff of the repeated game is induced by a distribution that is immune to undetectable deviations.

Lehrer (1992b) assumes that payoffs are observable and obtains the following result:

Theorem 9 In a two-player repeated game with no discounting, nontrivial signals, and observable payoffs, the set of equilibrium payoffs is the set of individually rational payoffs induced by distributions that are immune to undetectable deviations.

Finally, Lehrer (1991) shows that, in some cases, one may dispense with the correlation device of Theorem 8, as all necessary correlation can be generated endogenously through the signals of the repeated game:

Proposition 2 In two-player games with nontrivial signals such that either the action profile is publicly announced or a blank signal is publicly announced, the set of equilibrium payoffs coincides with the set of correlated equilibrium payoffs.
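The inequality of Definition 3 is also easy to test mechanically. The sketch below checks it for a two-player game given a list of "undetectable" pairs; applying it to the Prisoner's Dilemma of Example 1 (where the two actions of player 1 are equivalent, so the deviation from C to D is treated as undetectable) confirms that permanent cooperation is not immune. The helper names and the tolerance are assumptions of the sketch:

```python
# Check the Definition 3 inequality for player i in a two-player game.
def immune(p, g_i, undetectable_pairs, A_other):
    for a_i, b_i in undetectable_pairs:          # pairs with b_i at least as good as a_i in the order above
        keep = sum(p.get((a_i, a), 0.0) * g_i[(a_i, a)] for a in A_other)
        dev = sum(p.get((a_i, a), 0.0) * g_i[(b_i, a)] for a in A_other)
        if dev > keep + 1e-12:
            return False
    return True

g1 = {("C", "C"): 4, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
p_coop = {("C", "C"): 1.0}
print(immune(p_coop, g1, [("C", "D")], ["C", "D"]))   # -> False, consistent with Example 1
```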
                W                      M                      E
            L         R            L         R            L         R
  T      1, 1, 1   4, 4, 0      0, 3, 0   0, 3, 0      3, 0, 0   3, 0, 0
  B      4, 4, 0   4, 4, 0      0, 3, 0   0, 3, 0      3, 0, 0   3, 0, 0
Consider the monitoring structure in which actions are not observable and the payoff vector is publicly announced.

The payoff (1, 1, 1) is feasible and individually rational. The associated action profile (T, L, W) is immune to undetectable deviations since any individual deviation from (T, L, W) changes the payoff.

However, (1, 1, 1) is not an equilibrium payoff. The reason is that player 3, who has the power to punish either player 1 or player 2, cannot punish both players simultaneously: punishing player 1 rewards player 2 and vice versa. More precisely, whatever weights player 3 puts on the actions M and E, the sum of player 1 and player 2's payoffs is at least 3. Any equilibrium payoff vector v = (v1, v2, v3) must thus satisfy v1 + v2 ≥ 3. In fact, it is possible to prove that the set of equilibrium payoffs of this repeated game is the set of feasible and individually rational payoffs that satisfy this constraint.

Approachability When the deviating player cannot be identified, it may be necessary to punish a group of suspects altogether. The notion of a payoff that is enforceable under group punishments is captured by the definition of approachable payoffs:

Definition 4 A payoff vector v is approachable if there exists a strategy profile s such that, for every player i and unilateral deviation ti of player i, the average payoff of player i under (ti, s−i) is asymptotically less than or equal to vi.

Theorem 10 For every game with imperfect monitoring, the set of communication equilibrium payoffs of the repeated game with no discounting is the set of approachable payoffs induced by distributions which are immune to undetectable deviations.

Tomala (1998) shows that pure strategy equilibrium payoffs of undiscounted repeated games with public signals are also characterized through identifiability and approachability conditions (the approachability definition then uses pure strategies). Tomala (1999) provides a similar characterization in mixed strategies for a restricted class of public signals.

Identification Through Endogenous Communication A deviation may be identified in the repeated game even though it cannot be identified in the stage game. In a network game, players are located at nodes of a graph, and each player monitors his neighbors' actions. Each player can use his actions as messages that are broadcasted to all the neighbors in the graph. The graph is called 2-connected if no single node deletion disconnects the graph. Renault and Tomala (1998) show that when the graph is 2-connected, there exists a communication protocol among the players that ensures that the identity of any deviating player becomes common knowledge among all players in finite time. In this case, identification takes place through communication over the graph.
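The arithmetic behind the constraint v1 + v2 ≥ 3 in the three-player example above is a one-line computation: taking the table entries as given, any mixture over the punishing matrices M and E gives players 1 and 2 payoffs summing to 3. A small illustrative sketch:

```python
# (player 1, player 2) payoffs under matrices M and E, as in the table above.
payoff_12 = {"M": (0, 3), "E": (3, 0)}

def sum_12(weight_on_M):
    w = weight_on_M
    p1 = w * payoff_12["M"][0] + (1 - w) * payoff_12["E"][0]
    p2 = w * payoff_12["M"][1] + (1 - w) * payoff_12["E"][1]
    return p1 + p2

print(min(round(sum_12(k / 100), 9) for k in range(101)))   # -> 3.0 for every mixture
```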
Signals are public when all sets of signals are identical, i.e., Yi = Ypub for each i, and Q(∀i, j, yi = yj | a) = 1 for every a. A public history of length t is a record of t public signals, i.e., an element of Hpub,t = (Ypub)^t. A strategy si for player i is a public strategy if it depends on the public history only: if hi = (ai,1, y1, . . ., ai,t, yt) and hi′ = (ai,1′, y1′, . . ., ai,t′, yt′) are two histories for player i such that y1 = y1′, . . ., yt = yt′, then si(hi) = si(hi′).

Definition 5 A perfect public equilibrium is a profile of public strategies such that, after every public history, each player's continuation strategy is a best reply to the opponents' continuation strategy profile.

The repetition of a Nash equilibrium of the stage game is a perfect public equilibrium, so that perfect public equilibria exist. Every perfect public equilibrium is a sequential equilibrium: any consistent belief assigns probability 1 to the realized public history and thus correctly forecasts future opponents' choices.

… are the relative weights of present payoffs versus future payoffs in the repeated game.

Definition 6 A payoff vector v ∈ ℝ^I is decomposable with respect to the set W ⊆ ℝ^I if there exists a mapping f : Ypub → W such that v is a Nash equilibrium payoff of G(δ, f). Fδ(W) denotes the set of payoff vectors which are decomposable with respect to W.

Let E(δ) be the set of perfect public equilibrium payoffs of the repeated game discounted at the rate δ. The following result is due to Abreu et al. (1990):

Theorem 11 E(δ) is the largest bounded set W such that W ⊆ Fδ(W).

Fudenberg and Levine (1994) derive an asymptotic characterization of the set of PPE payoffs when the discount factor goes to 1 as follows. Given a vector λ ∈ ℝ^I, define the score in the direction λ as

k(λ) = sup ⟨λ, v⟩
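As a minimal sketch of a score computation in a direction λ, assume the supremum is taken over a payoff polytope such as the feasible set F of the Prisoner's Dilemma; this illustrates only the linear optimization itself, not the exact constraint set used in Fudenberg and Levine's characterization:

```python
# Over a convex polytope, the supremum of <lambda, v> is attained at a vertex.
vertices = [(1, 1), (5, 0), (0, 5), (4, 4)]   # vertices of F for the Prisoner's Dilemma

def score(lam):
    return max(lam[0] * v[0] + lam[1] * v[1] for v in vertices)

print(score((1, 1)))    # -> 8, attained at (4, 4)
print(score((1, -1)))   # -> 5, attained at (5, 0)
```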
Folk Theorems for Public Equilibria
The recursive structure of Theorem 11 and the asymptotic characterization of PPE payoffs given by Theorem 12 are essential tools for finding sufficient conditions under which every feasible and individually rational payoff is an equilibrium payoff, i.e., conditions under which a Folk Theorem holds.

The two conditions under which a Folk Theorem in PPEs holds are (1) a condition of detectability of deviations and (2) a condition of identifiability of deviating players.

Definition 7 A profile of mixed actions s = (si, s−i) has individual full rank if, for each player i, the probability vectors (in the vector space ℝ^{Ypub})

{Q(· | ai, s−i) : ai ∈ Ai}

are linearly independent.

If s has individual full rank, no player can change the distribution of his actions without affecting the distribution of public signals. Individual full rank is thus a condition on detectability of deviations.

Definition 8 A profile of mixed actions s has pairwise full rank if, for every pair of players i ≠ j, the family of probability vectors

{Q(· | ai, s−i) : ai ∈ Ai} ∪ {Q(· | aj, s−j) : aj ∈ Aj}

has rank |Ai| + |Aj| − 1.

Under the condition of pairwise full rank, deviations from two distinct players induce distinct distributions of public signals. Pairwise full rank is therefore a condition of identifiability of deviating players.

Fudenberg et al. (1994) prove the following theorem:

Theorem 13 Assume the set of feasible and individually rational payoff vectors F has nonempty interior. If every pure action profile has individual full rank and if there exists a mixed action profile with pairwise full rank, then every convex and compact subset of the interior of F is a subset of E(δ) for δ large enough.

In particular, under the conditions of the theorem, every feasible and individually rational payoff vector is arbitrarily close to a PPE payoff for large enough discount factors. Variations of this result can be found in Fudenberg et al. (1994) and Fudenberg and Levine (1994).

Extensions
The Public Part of a Signal The definition of perfect public equilibria extends to the case in which each player's signal consists of two components: a public component and a private component. The public components of all players' signals are the same with probability 1. A public strategy is then a strategy that depends only on the public components of past signals, and all the analysis carries through.

Public Communication In the public communication extension of the repeated game, players make public announcements between any two stages of the repeated game. The profile of public announcements then forms a public signal, and recursive methods can be successfully applied. The fact that public communication is a powerful instrument to overcome the difficulties arising from private signals was first observed by Matsushima (1991a, b). Ben-Porath and Kahneman (1996), Kandori and Matsushima (1998), and Compte (1998) prove Folk Theorems in games with private signals and public communication. Kandori (2003) shows that in games with public monitoring, public communication makes it possible to relax the conditions for the Folk Theorem of Fudenberg et al. (1994).

Private Strategies in Games with Public Monitoring PPE payoffs do not cover the full set of sequential equilibrium payoffs, even when signals are public, as some equilibria may rely on players using private strategies, i.e., strategies that depend on past chosen actions and past private signals. See Mailath et al. (2002) and Kandori
and Obara (2006) for examples. In a minority game, there are an odd number of players; each player chooses between actions A and B. Players choosing the least chosen (minority) action get a payoff of 1, and other players get 0. The public signal is the minority action. Renault et al. (2005, 2008) show that, for minority games, a Folk Theorem holds in private strategies but not in public strategies. Only a few results are known concerning the set of sequential equilibrium payoffs in private strategies of games with public monitoring. A monotonicity property is obtained by Kandori (1992), who shows that the set of payoffs associated to sequential equilibria in pure strategies is increasing with respect to the quality of the public signal.

Almost-Public Monitoring Some PPEs are robust to small perturbations of public signals. Considering strategies with finite memory, Mailath and Morris (2002) identify a class of public strategies which are sequential equilibria of the repeated game with imperfect private monitoring, provided that the monitoring structure is close enough to a public one. They derive a Folk Theorem for games with almost-public and almost-perfect monitoring. Hörner and Olszewski (2007) strengthen this result and prove a Folk Theorem for games with almost-public monitoring. Under detectability and identifiability conditions, they prove that feasible and individually rational payoffs can be achieved by sequential equilibria with finite memory.

Almost-Perfect Monitoring
Monitoring is almost perfect when each player can identify the action profile of his opponents with near certainty. Almost-perfect monitoring is the natural framework to study the robustness of the Folk Theorem to small departures from the assumption that actions are perfectly observed.

The first results were obtained for the Prisoner's Dilemma. Sekiguchi (1997) shows that the cooperative outcome can be approximated at equilibrium when players are sufficiently patient and monitoring is almost perfect. Under the same assumptions, Bhaskar and Obara (2002), Piccione (2002), and Ely and Välimäki (2002) show that a Folk Theorem obtains.

Piccione (2002) and Ely and Välimäki (2002) study a particular class of equilibria called belief-free. Strategies form a belief-free equilibrium if, whatever player i's belief on the opponent's private history, the action prescribed by i's strategy is a best response to the opponent's continuation strategy.

Ely et al. (2005) extend the belief-free approach to general games. However, they show that, in general, belief-free strategies are not enough to reconstruct a Folk Theorem, even when monitoring is almost perfect.

For general games and with any number of players, Hörner and Olszewski (2006) prove a Folk Theorem with almost-perfect monitoring. The strategies that implement the equilibrium payoffs are defined on successive blocks of a fixed length and are block belief-free in the sense that, at the beginning of every block, every player is indifferent between several continuation strategies, independently of his belief as to which continuation strategies are used by the opponents. This result closes the almost-perfect monitoring case by showing that equilibrium payoffs in the Folk Theorem are robust to a small amount of imperfect monitoring.

General Stochastic Signals
Besides the case of public (or almost-public) monitoring, little is known about equilibrium payoffs of repeated games with discounting and imperfect signals.

The Prisoner's Dilemma game is particularly important for economic applications. In particular, it captures the essential features of collusion with the possibility of secret price cutting, as in Stigler (1964).

When signals are imperfect, but independent conditionally on the pair of actions chosen (a condition called conditional independence), Matsushima (2004) shows that the efficient outcome of the repeated Prisoner's Dilemma game is an equilibrium outcome if players are sufficiently patient. In the equilibrium construction, every player's action is constant in every block. The conditional independence assumption is
crucial in that it implies that, during every block, a player has no feedback as to what signals the other player has received. The conditional independence assumption is nongeneric: it holds for a set of monitoring structures of empty interior.

Fong et al. (2007) prove that efficiency can be obtained at equilibrium without conditional independence. Their main assumption is that there exists a sufficiently informative signal, but this signal need not be almost perfectly informative. Their result holds for a family of monitoring structures of nonempty interior. It is the first result that establishes cooperation in the Prisoner's Dilemma with impatient players for truly imperfect, private, and correlated signals.

Punishment Levels
Individual rationality is a key concept for Folk Theorems and equilibrium payoff characterizations. Given a repeated game, define the individually rational (IR) level of player i as the lowest payoff down to which this player may be punished in the repeated game.

Definition 9 The individually rational level of player i is

lim_{δ→1} min_{s−i} max_{si} E_{si, s−i} [ (1 − δ) Σ_t δ^{t−1} g_{i,t} ]

where the min runs over profiles of behavior strategies for the players other than i and the max over behavior strategies of player i.

That is, the individually rational level is the limit (as the discount factor goes to one) of the min max value of the discounted game (other approaches, through undiscounted games or limits of finitely repeated games, yield equivalent definitions; see Gossner and Tomala (2006)).

Comparison of the IR Level with the Min Max
With perfect monitoring, the IR level of player i is player i's min max in the one-shot game, as defined by Eq. 1. With imperfect monitoring, the IR level for player i is never larger than vi since player i's opponents can force down player i to vi by repeatedly playing the min max strategy against player i.

With two players, it is a consequence of von Neumann's min max theorem (von Neumann 1928) that vi is the IR level for player i.

For any number of players, Gossner and Hörner (2006) show that the IR level of player i is equal to vi, the min max in the one-shot game, whenever there exists a garbling of player i's signal such that, conditionally on i's garbled signal, the signals of i's opponents are independent. Furthermore, the condition in Gossner and Hörner (2006) is also a necessary condition in normal form games extended by correlation devices (as in Aumann (1974)).

A continuity result on the IR level also applies for monitoring structures close to those that satisfy the conditional independence condition.

The following example shows that, in general, the IR level can be lower than vi:

Example 4 Consider the following three-player game. Player 1 chooses the row, player 2 the column, and player 3 the matrix. Players 1 and 2 perfectly observe the action profile, while player 3 observes player 2's action only. As we deal with the IR level of player 3, we specify the payoff for this player only.

            W                E
         L      R         L      R
  T      0      0        −1      0
  B      0     −1         0      0

A simple computation shows that v3 = −1/4 and that the min max strategies of players 1 and 2 are uniform. Consider the following strategies of players 1 and 2 in the repeated game: randomize uniformly at odd stages, and play (T, L) or (B, R) depending on player 1's previous action at even stages. Against these strategies, player 3 cannot obtain better than −1/4 at odd stages and −1/2 at even stages, resulting in an average payoff of −3/8.

Entropy Characterizations
The exact computation of the IR level in games with imperfect monitoring requires analyzing the
optimal trade-off for punishing players between the production of correlated and private signals and the use of these signals for effective punishment. Gossner and Vieille (2002) and Gossner and Tomala (2006) develop tools based on information theory to analyze this trade-off. At any stage, the amount of correlation generated (or spent) by the punishing players is measured using the entropy function. Gossner and Tomala (2007) derive a characterization of the IR level for some classes of monitoring structures. Gossner et al. (2009) provide methods for explicit computation of the IR level. In particular, for the above example, the IR level is computed and is about −0.401. Explicit computations of IR levels for other games are derived by Goldberg (2007).

Acknowledgments The authors are grateful to Johannes Hörner for insightful comments.

Bibliography

Primary Literature
Abreu D (1988) On the theory of infinitely repeated games with discounting. Econometrica 56:383–396
Abreu D, Pearce D, Stacchetti E (1990) Toward a theory of discounted repeated games with imperfect monitoring. Econometrica 58:1041–1063
Abreu D, Dutta P, Smith L (1994) The folk theorem for repeated games: a NEU condition. Econometrica 62:939–948
Aumann RJ (1960) Acceptable points in games of perfect information. Pac J Math 10:381–417
Aumann RJ (1964) Mixed and behavior strategies in infinite extensive games. In: Dresher M, Shapley LS, Tucker AW (eds) Advances in game theory. Princeton University Press, Princeton, pp 627–650
Aumann RJ (1974) Subjectivity and correlation in randomized strategies. J Math Econ 1:67–95
Aumann RJ (1981) Survey of repeated games. In: Aumann RJ (ed) Essays in game theory and mathematical economics in honor of Oskar Morgenstern. Wissenschaftsverlag, Bibliographisches Institut, Mannheim, pp 11–42
Aumann RJ, Shapley LS (1976) Long-term competition – a game theoretic analysis. Re-edited in 1994. See Aumann and Shapley (1994)
Aumann RJ, Shapley LS (1994) Long-term competition – a game theoretic analysis. In: Megiddo N (ed) Essays on game theory. Springer, New York, pp 1–15
Ben-Porath E, Kahneman M (1996) Communication in repeated games with private monitoring. J Econ Theory 70(2):281–297
Benoît JP, Krishna V (1985) Finitely repeated games. Econometrica 53(4):905–922
Bhaskar V, Obara I (2002) Belief-based equilibria in the repeated prisoners' dilemma with private monitoring. J Econ Theory 102:40–70
Blackwell D (1951) Comparison of experiments. In: Proceedings of the second Berkeley symposium on mathematical statistics and probability. University of California Press, Berkeley, pp 93–102
Blackwell D (1956) An analog of the minimax theorem for vector payoffs. Pac J Math 6:1–8
Compte O (1998) Communication in repeated games with imperfect private monitoring. Econometrica 66:597–626
Ely JC, Välimäki J (2002) A robust folk theorem for the prisoner's dilemma. J Econ Theory 102:84–106
Ely JC, Hörner J, Olszewski W (2005) Belief-free equilibria in repeated games. Econometrica 73:377–415
Fong K, Gossner O, Hörner J, Sannikov Y (2007) Efficiency in a repeated prisoner's dilemma with imperfect private monitoring. Mimeo
Forges F (1986) An approach to communication equilibria. Econometrica 54:1375–1385
Forges F, Mertens J-F, Neyman A (1986) A counterexample to the folk theorem with discounting. Econ Lett 20:7
Friedman J (1971) A noncooperative equilibrium for supergames. Rev Econ Stud 38:1–12
Fudenberg D, Levine DK (1994) Efficiency and observability with long-run and short-run players. J Econ Theory 62:103–135
Fudenberg D, Maskin E (1986) The folk theorem in repeated games with discounting or with incomplete information. Econometrica 54:533–554
Fudenberg D, Levine DK, Maskin E (1994) The folk theorem with imperfect public information. Econometrica 62(5):997–1039
Fudenberg D, Levine DK, Takahashi S (2007) Perfect public equilibrium when players are patient. Game Econ Behav 61:27–49
Goldberg Y (2007) Secret correlation in repeated games with imperfect monitoring: the need for nonstationary strategies. Math Oper Res 32:425–435
Gossner O (1995) The folk theorem for finitely repeated games with mixed strategies. Int J Game Theory 24:95–107
Gossner O, Hörner J (2006) When is the individually rational payoff in a repeated game equal to the minmax payoff? DP 1440, CMS-EMS
Gossner O, Tomala T (2006) Empirical distributions of beliefs under imperfect observation. Math Oper Res 31(1):13–30
Gossner O, Tomala T (2007) Secret correlation in repeated games with signals. Math Oper Res 32:413–424
Gossner O, Vieille N (2002) How to play with a biased coin? Game Econ Behav 41:206–226
Gossner O, Laraki R, Tomala T (2009) Informationally optimal correlation. Math Program B 116:147–172
Green EJ, Porter RH (1984) Noncooperative collusion under imperfect price information. Econometrica 52:87–100
Hörner J, Olszewski W (2006) The folk theorem with private almost-perfect monitoring. Econometrica 74(6):1499–1544
Hörner J, Olszewski W (2007) How robust is the folk theorem with imperfect public monitoring? Mimeo
Kandori M (1992) The use of information in repeated games with imperfect monitoring. Rev Econ Stud 59:581–593
Kandori M (2003) Randomization, communication, and efficiency in repeated games with imperfect public monitoring. Econometrica 71:345–353
Kandori M, Matsushima H (1998) Private observation, communication and collusion. Econometrica 66:627–652
Kandori M, Obara I (2006) Efficiency in repeated games revisited: the role of private strategies. Econometrica 74:499–519
Kohlberg E (1975) Optimal strategies in repeated games with incomplete information. Int J Game Theory 4:7–24
Kreps DM, Wilson RB (1982) Sequential equilibria. Econometrica 50:863–894
Kuhn HW (1953) Extensive games and the problem of information. In: Kuhn HW, Tucker AW (eds) Contributions to the theory of games, vol II. Annals of Mathematics Studies, vol 28. Princeton University Press, Princeton, pp 193–216
Lehrer E (1990) Nash equilibria of n-player repeated games with semi-standard information. Int J Game Theory 19:191–217
Lehrer E (1991) Internal correlation in repeated games. Int J Game Theory 19:431–456
Lehrer E (1992a) Correlated equilibria in two-player repeated games with nonobservable actions. Math Oper Res 17:175–199
Lehrer E (1992b) Two-player repeated games with nonobservable actions and observable payoffs. Math Oper Res 17:200–224
Lehrer E, Pauzner A (1999) Repeated games with differential time preferences. Econometrica 67:393–412
Luce RD, Raiffa H (1957) Games and decisions: introduction and critical survey. Wiley, New York
Mailath G, Morris S (2002) Repeated games with almost-public monitoring. J Econ Theory 102:189–229
Mailath GJ, Matthews SA, Sekiguchi T (2002) Private strategies in finitely repeated games with imperfect public monitoring. Contrib Theor Econ 2(1), Article 2
Matsushima H (1991a) On the theory of repeated games with private information: part I: anti-folk theorem without communication. Econ Lett 35:253–256
Matsushima H (1991b) On the theory of repeated games with private information: part II: revelation through communication. Econ Lett 35:257–261
Matsushima H (2004) Repeated games with private monitoring: two players. Econometrica 72:823–852
Myerson RB (1982) Optimal coordination mechanisms in generalized principal-agent problems. J Math Econ 10:67–81
Nash JF (1951) Noncooperative games. Ann Math 54:289–295
Neyman A (1999) Cooperation in repeated games when the number of stages is not commonly known. Econometrica 67:45–64
Piccione M (2002) The repeated prisoner's dilemma with imperfect private monitoring. J Econ Theory 102:70–84
Renault J, Tomala T (1998) Repeated proximity games. Int J Game Theory 27:539–559
Renault J, Tomala T (2004) Communication equilibrium payoffs of repeated games with imperfect monitoring. Game Econ Behav 49:313–344
Renault J, Scarlatti S, Scarsini M (2005) A folk theorem for minority games. Game Econ Behav 53:208–230
Renault J, Scarlatti S, Scarsini M (2008) Discounted and finitely repeated minority games with public signals. Math Soc Sci 56:44–74
Rubinstein A (1977) Equilibrium in supergames. Research Memorandum 25, Center for Research in Mathematical Economics and Game Theory, The Hebrew University, Jerusalem
Rubinstein A (1979) Equilibrium in supergames with the overtaking criterion. J Econ Theory 21:1–9
Rubinstein A (1994) Equilibrium in supergames. In: Megiddo N (ed) Essays on game theory. Springer, New York, pp 17–28
Sekiguchi T (1997) Efficiency in repeated prisoner's dilemma with private monitoring. J Econ Theory 76:345–361
Selten R (1965) Spieltheoretische Behandlung eines Oligopolmodells mit Nachfrageträgheit. Z Gesamte Staatswiss 121:301–324
Smith L (1995) Necessary and sufficient conditions for the perfect finite horizon folk theorem. Econometrica 63:425–430
Sorin S (1986) On repeated games with complete information. Math Oper Res 11:147–160
Stigler G (1964) A theory of oligopoly. J Polit Econ 72:44–61
Tomala T (1998) Pure equilibria of repeated games with public observation. Int J Game Theory 27:93–109
Tomala T (1999) Nash equilibria of repeated games with observable payoff vector. Game Econ Behav 28:310–324
von Neumann J (1928) Zur Theorie der Gesellschaftsspiele. Math Ann 100:295–320
Wen Q (1994) The "folk theorem" for repeated games with complete information. Econometrica 62:949–954

Books and Reviews
Mailath GJ, Samuelson L (2006) Repeated games and reputations: long-run relationships. Oxford University Press, Oxford
Mertens J-F (1986) Repeated games. In: Proceedings of the International Congress of Mathematicians, Berkeley, pp 1528–1577
Mertens J-F, Sorin S, Zamir S (1994) Repeated games. CORE discussion papers 9420–9422. Université Catholique de Louvain, Louvain-la-Neuve
Repeated Games with Incomplete Information

Jérôme Renault
Toulouse School of Economics, Université Toulouse 1 Capitole, Toulouse, France

Article Outline
Glossary and Notation
Definition of the Subject and Its Importance
Strategies, Payoffs, Value, and Equilibria
The Standard Model of Aumann and Maschler
Vector Payoffs and Approachability
Zero-Sum Games with Lack of Information on Both Sides
Nonzero-sum Games with Lack of Information on One Side
Nonobservable Actions
Advances
Future Directions
Bibliography

Glossary and Notation

Repeated game with incomplete information A situation where several players repeat the same stage game, the players having different knowledge of the stage game which is repeated.
Strategy of a player A rule, or program, describing the action taken by the player in any possible case which may happen.
Strategy profile A vector containing a strategy for each player.
Lack of information on one side Particular case where all the players but one perfectly know the stage game which is repeated.
Zero-sum games 2-player games where the players have opposite payoffs.
Value Solution (or price) of a zero-sum game, in the sense of the fair amount that player 1 should give to player 2 to be entitled to play the game.
Equilibrium Strategy profile where each player's strategy is a best reply to the strategy of the other players.
Completely revealing strategy Strategy of a player which eventually reveals to the other players everything known by this player on the selected state.
Non revealing strategy Strategy of a player which reveals nothing on the selected state.
The simplex of probabilities over a finite set For a finite set S, we denote by Δ(S) the set of probabilities over S, and we identify Δ(S) with {p = (p_s)_{s∈S} ∈ ℝ^S, ∀s∈S p_s ≥ 0 and Σ_{s∈S} p_s = 1}. Given s in S, the Dirac measure on s will be denoted by δ_s. For p = (p_s)_{s∈S} and q = (q_s)_{s∈S} in ℝ^S, we will use, unless otherwise specified, ‖p − q‖ = Σ_{s∈S} |p_s − q_s|.

Definition of the Subject and Its Importance

Introduction
In a repeated game with incomplete information, there is a basic interaction called stage game which is repeated over and over by several participants called players. The point is that the players do not perfectly know the stage game which is repeated, but rather have different knowledge about it. As illustrative examples, one may think of the following situations: an oligopolistic competition where firms do not know the production costs of their opponents, a financial market where traders bargain over units of an asset whose terminal value is imperfectly known, a cryptographic model where some participants want to transmit some information (e.g., a credit card number) without being understood by other participants, a conflict where a particular side may be able to understand the communications inside the opponent side (or might have a particular type of weapons), and so on.
Natural questions arising in this context are as follows. What is the optimal behavior of a player with a perfect knowledge of the stage game? Can we determine which part of the information such a player should use? Can we price the value of possessing a particular piece of information? How should a player behave while having only partial information? Foundations of games with incomplete information have been studied in Harsanyi (1967) and Mertens and Zamir (1985). Repeated games with incomplete information have been introduced in the sixties by Aumann and Maschler (1995), and we present here the basic and fundamental results of the domain. Let us start with a few well-known elementary examples (Aumann and Maschler 1995; Zamir 1992).

Basic Examples In each example, there are two players, and the game is zero-sum, i.e., player 2's payoff always is the opposite of player 1's payoff. There are two states a and b, and the possible stage games are given by two real matrices G^a and G^b with identical size. Initially a true state of nature k ∈ {a, b} is selected with even probability between a and b, and k is announced to player 1 only. Then the matrix game G^k is repeated over and over: at every stage, simultaneously player 1 chooses a row i, whereas player 2 chooses a column j; the stage payoff for player 1 is then G^k(i, j), but only i and j are publicly announced before proceeding to the next stage. Players are patient and want to maximize their long-run average expected payoffs.

Example 1
    G^a = [ 0   0 ]        G^b = [ −1   0 ]
          [ 0  −1 ]               [  0   0 ]
This example is trivial. In order to maximize his payoff, player 1 just has to play, at any stage, the Top row if the state is a and the Bottom row if the state is b.

Example 2
    G^a = [ 1  0 ]         G^b = [ 0  0 ]
          [ 0  0 ]                [ 0  1 ]
A naive strategy for player 1 would be to play at stage 1: Top if the state is a, and Bottom if the state is b. Such a strategy is called completely revealing, or CR, because it allows player 2 to deduce the selected state from the observation of the actions played by player 1. This strategy of player 1 would be optimal here if a single stage was to be played, but it is a very weak strategy in the long run and does not guarantee more than zero at each stage t ≥ 2 (because player 2 can play Left or Right depending on player 1's first action). On the opposite, player 1 may not use his information and play a nonrevealing, or NR, strategy, i.e., a strategy which is independent of the selected state. He can consider the average matrix ½G^a + ½G^b = ( 1/2 0 ; 0 1/2 ) and play independently at each stage an optimal mixed action in this matrix, i.e., here the unique mixed action ½ Top + ½ Bottom. It will turn out that this is here the optimal behavior for player 1, and the value of the repeated game is the value of the average matrix, i.e., 1/4.

Example 3
    G^a = [ 4  0   2 ]     G^b = [ 0  4  −2 ]
          [ 4  0  −2 ]            [ 0  4   2 ]
Playing a CR strategy for player 1 does not guarantee more than zero in the long run, because player 2 will eventually be able to play Middle if the state is a, and Left if the state is b. But a NR strategy will not do better, because the average matrix ½G^a + ½G^b is ( 2 2 0 ; 2 2 0 ), hence has value 0. We will see later that an optimal strategy for player 1 in this game is to play as follows. Initially, player 1 chooses an element s in {T, B} as follows: if k = a, then s = T with probability 3/4, and thus s = B with probability 1/4; and if k = b, then s = T with probability 1/4, and s = B with probability 3/4. Then at each stage player 1 plays row s, independently of the actions taken by player 2. The conditional probabilities satisfy: P(k = a | s = T) = 3/4, and P(k = a | s = B) = 1/4. At the end of stage 1, player 2 will have learnt, from the action played by his opponent, something about the selected state: his belief on the state will move from ½a + ½b to ¾a + ¼b or to ¼a + ¾b. But player 2 still does not know perfectly the selected state. Such a strategy of player 1 is called partially revealing.
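To make the belief dynamics of Example 3 concrete, here is a short Python sketch. It is an illustration written for this presentation (not code from the original chapter): it computes the probability of each signal and the a posteriori of player 2 induced by the partially revealing strategy described above, and checks that the posteriors average back to the prior.

```python
# Minimal sketch (illustration only): posterior beliefs of player 2 after stage 1
# of Example 3, when player 1 plays "s = T w.p. 3/4 if k = a, s = T w.p. 1/4 if k = b".

prior = {"a": 0.5, "b": 0.5}                      # even prior on the state
signal_law = {"a": {"T": 0.75, "B": 0.25},        # x^k(s): law of the signal given the state
              "b": {"T": 0.25, "B": 0.75}}

def split(prior, signal_law):
    """For each signal s, return its total probability lambda(x, s)
    and the conditional distribution p_hat(x, s) on the states."""
    result = {}
    for s in ["T", "B"]:
        lam = sum(prior[k] * signal_law[k][s] for k in prior)
        posterior = {k: prior[k] * signal_law[k][s] / lam for k in prior}
        result[s] = (lam, posterior)
    return result

if __name__ == "__main__":
    splits = split(prior, signal_law)
    for s, (lam, post) in splits.items():
        print(f"signal {s}: probability {lam:.2f}, posterior {post}")
    # Sanity check of the splitting identity: the posteriors average back to the prior.
    for k in prior:
        assert abs(sum(lam * post[k] for lam, post in splits.values()) - prior[k]) < 1e-12
```

Running it prints the posteriors ¾a + ¼b and ¼a + ¾b, each reached with probability 1/2, exactly as announced in the text.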
General Definition
Formally, a repeated game with incomplete information is given by the following data. There is a set of players N and a set of states K. Each player i in N has a set of actions A^i and a set of signals U^i, and we denote by A = ∏_{i∈N} A^i the set of action profiles and by U = ∏_{i∈N} U^i the set of signal profiles. Every player i has a payoff function g^i: K × A → ℝ. There is a signaling function q: K × A → Δ(U), and an initial probability p ∈ Δ(K × U). In what follows, we will always assume the sets of players, states, actions, and signals to be nonempty and finite.
A repeated game with incomplete information can thus be denoted by G = (N, K, (A^i)_{i∈N}, (U^i)_{i∈N}, (g^i)_{i∈N}, q, p). The progress of the game is the following.

• Initially, an element (k, (u^i_0)_i) is selected according to p: k is the realized state of nature and will remain fixed, and each player i learns u^i_0 (and nothing more than u^i_0).
• At each integer stage t ≥ 1, simultaneously every player i chooses an action a^i_t in A^i, and we denote by a_t = (a^i_t)_i the action profile played at stage t. The stage payoff of a player i is then given by g^i(k, a_t). A signal profile (u^i_t)_i is selected according to q(k, a_t), and each player i learns u^i_t (and nothing more than u^i_t) before proceeding to the next stage.

Remarks
1. The players do not necessarily know their stage payoff after each stage (as an illustration, imagine the players bargaining over units of an asset whose terminal value will only be known "at the end" of the game). This is without loss of generality, because it is possible to add hypotheses on q so that each player will be able to deduce his stage payoff from his realized stage signal.
2. Repeated games with complete information are a particular case, corresponding to the situation where each initial signal u^i_0 reveals the selected state. Such games are studied in the chapter ▶ "Repeated Games with Complete Information".
3. Games where the state variable k evolves from stage to stage, according to the actions played, are called stochastic games. These games are not covered here, but in a specific chapter entitled ▶ "Stochastic Games".
4. The most standard case of signaling function is when each player exactly learns, at the end of each stage t, the whole action profile a_t. Such games are usually called games with "perfect monitoring," "full monitoring," "perfect observation," or with "observable actions."

Strategies, Payoffs, Value, and Equilibria

Strategies
A (behavior) strategy for player i is a rule, or program, describing the action taken by this player in any possible case which may happen. These actions may be chosen at random, so a strategy for player i is an element σ^i = (σ^i_t)_{t≥1}, where for each t, σ^i_t is a mapping from U^i × (U^i × A^i)^{t−1} to Δ(A^i) giving the lottery played by player i at stage t as a function of the past signals and actions of player i. The set of strategies for player i is denoted by Σ^i.
A history of length t in G is a sequence (k, u_0, a_1, u_1, . . ., a_t, u_t), and the set of such histories is the finite set K × U × (A × U)^t. An infinite history is called a play, and the set of plays is denoted by Ω = K × U × (A × U)^∞ and is endowed with the product σ-algebra. A strategy profile σ = (σ^i)_i naturally induces, together with the initial probability p, a probability distribution over the set of histories of length t. This probability uniquely extends to a probability over plays and is denoted by ℙ_{p,σ}.

Payoffs
Given a time horizon T, the average expected payoff of player i, up to stage T, if the strategy profile σ is played, is denoted by:

    g^i_T(σ) = E_{ℙ_{p,σ}} ( (1/T) Σ_{t=1}^{T} g^i(k, a_t) ).

The T-stage game is the game G_T where simultaneously each player i chooses a strategy σ^i in Σ^i, then receives the payoff g^i_T((σ^j)_{j∈N}).
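As a small illustration of these definitions (written for this presentation, not part of the original text), the sketch below simulates T stages of the two-state game of Example 2 under perfect monitoring and estimates the T-stage average payoff g^1_T by Monte Carlo, with player 1 playing the nonrevealing mixed action ½Top + ½Bottom at every stage and player 2 playing an arbitrary (here uniform) mixed action.

```python
# Minimal sketch (illustration only): Monte Carlo estimate of the T-stage average
# payoff in Example 2 against a nonrevealing strategy of the informed player.

import random

G = {"a": [[1, 0], [0, 0]],   # payoff matrices of Example 2
     "b": [[0, 0], [0, 1]]}

def average_payoff(T: int, runs: int = 10_000) -> float:
    total = 0.0
    for _ in range(runs):
        k = random.choice(["a", "b"])        # state selected with even probability
        payoff = 0.0
        for _ in range(T):
            i = random.randint(0, 1)         # nonrevealing: independent of k
            j = random.randint(0, 1)         # player 2's (arbitrary) mixed action
            payoff += G[k][i][j]
        total += payoff / T
    return total / runs

if __name__ == "__main__":
    print(average_payoff(T=50))              # approximately 1/4, the value announced in Example 2
```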
Given a discount factor λ in (0, 1], the λ-discounted payoff of player i is denoted by:

    g^i_λ(σ) = E_{ℙ_{p,σ}} ( Σ_{t=1}^{∞} λ (1 − λ)^{t−1} g^i(k, a_t) ).

The λ-discounted game is the game G_λ where simultaneously each player i chooses a strategy σ^i in Σ^i, then receives the payoff g^i_λ((σ^j)_{j∈N}).

Remark A strategy for player i is called pure if it always plays in a deterministic way. A mixed strategy for player i is defined as a probability distribution over the set of pure strategies (endowed with the product σ-algebra). Kuhn's theorem (see Aumann (1964), Kuhn (1953) or Sorin (2002) for a modern presentation) states that mixed strategies and behavior strategies are equivalent, in the following sense: for each behavior strategy σ^i, there exists a mixed strategy τ^i of the same player such that ℙ_{p,σ^i,σ^{−i}} = ℙ_{p,τ^i,σ^{−i}} for any strategy profile σ^{−i} of the other players, and vice versa if we exchange the words "behavior" and "mixed." Unless otherwise specified, the word strategy will refer here to a behavior strategy, but we will also sometimes equivalently use mixed strategies, or even mixtures of behavior strategies.

Value of Zero-Sum Games
By definition the game is zero-sum if there are two players, say player 1 and player 2, with opposite payoffs. The T-stage game G_T can then be seen as a matrix game; hence, by the minmax theorem it has a value v_T = sup_{σ^1} inf_{σ^2} g^1_T(σ^1, σ^2) = inf_{σ^2} sup_{σ^1} g^1_T(σ^1, σ^2). Similarly, one can use Sion's theorem (1958) to show that the λ-discounted game has a value v_λ = sup_{σ^1} inf_{σ^2} g^1_λ(σ^1, σ^2) = inf_{σ^2} sup_{σ^1} g^1_λ(σ^1, σ^2).
To study long term strategic aspects, it is also important to consider the following notion of uniform value. Players are asked to play well uniformly in the time horizon, i.e., simultaneously in all games G_T with T sufficiently large (or similarly uniformly in the discount factor, i.e., simultaneously in all games G_λ with λ sufficiently low).

Definitions 1 Player 1 can guarantee the real number v in the repeated game G if: ∀ε > 0, ∃σ^1 ∈ Σ^1, ∃T_0, ∀T ≥ T_0, ∀σ^2 ∈ Σ^2, g^1_T(σ^1, σ^2) ≥ v − ε. Similarly, player 2 can guarantee v in G if ∀ε > 0, ∃σ^2 ∈ Σ^2, ∃T_0, ∀T ≥ T_0, ∀σ^1 ∈ Σ^1, g^1_T(σ^1, σ^2) ≤ v + ε. If both player 1 and player 2 can guarantee v, then v is called the uniform value of the repeated game. A strategy σ^1 of player 1 satisfying ∃T_0, ∀T ≥ T_0, ∀σ^2 ∈ Σ^2, g^1_T(σ^1, σ^2) ≥ v is then called an optimal strategy of player 1 (optimal strategies of player 2 are defined similarly).

The uniform value, whenever it exists, is necessarily unique. Its existence is a strong property, which implies that both v_T, as T goes to infinity, and v_λ, as λ goes to zero, converge to the uniform value.

Equilibria of General-Sum Games
In the general case, the T-stage game G_T can be seen as the mixed extension of a finite game and consequently possesses a Nash equilibrium. Similarly, the discounted game G_λ always has, by the Nash–Glicksberg theorem, a Nash equilibrium. Concerning uniform notions, couples of optimal strategies are generalized as follows.

Definitions 2 A strategy profile σ = (σ^i)_{i∈N} is a uniform Nash equilibrium of G if: (1) ∀ε > 0, σ is an ε-Nash equilibrium in every finitely repeated game sufficiently long, that is, ∃T_0, ∀T ≥ T_0, ∀i ∈ N, ∀τ^i ∈ Σ^i, g^i_T(τ^i, σ^{−i}) ≤ g^i_T(σ) + ε; and (2) the sequence of payoffs (g^i_T(σ))_{i∈N} converges to a limit payoff (g^i(σ))_{i∈N} in ℝ^N.

Remark The initial probability p will play a great role in the following analyses, so we will often write g^{i,p}_T(σ) for g^i_T(σ), v_T(p) for the value v_T, etc.

The Standard Model of Aumann and Maschler

This famous model has been introduced in the sixties by Aumann and Maschler (see the reedition Aumann and Maschler (1995)). It deals with zero-sum games with lack of information on one side and observable actions, as in the basic examples previously presented. There is a finite set of states K, an initial probability p = (p^k)_{k∈K} on K, and a family of matrix games G^k with identical size I × J. We denote by M a bound on the absolute values of the payoffs, M = max_{k,i,j} |G^k(i, j)|.
Basic Tools: Splitting, Martingale, Concavification, and the Recursive Formula
The following aspects are simple but fundamental. The initial probability p = (p^k)_{k∈K} represents the initial belief, or a priori, of player 2 on the selected state of nature. Assume that player 1 chooses his first action (or more generally a message or signal s from a finite set S) according to a probability distribution depending on the state, i.e., according to a transition probability x = (x^k)_{k∈K} ∈ Δ(S)^K. For each signal s, the probability that s is chosen is denoted λ(x, s) = Σ_k p^k x^k(s), and given s such that λ(x, s) > 0 the conditional probability on K, or a posteriori of player 2, is p̂(x, s) = ( p^k x^k(s) / λ(x, s) )_{k∈K}. We clearly have:

    p = Σ_{s∈S} λ(x, s) p̂(x, s).   (1)

We are here in a repeated context, and for every strategy profile σ one can define the process (p_t(σ))_{t≥0} of the a posteriori of player 2. We have p_0 = p, and p_t(σ) is the random variable of player 2's belief on the state after the first t stages. More precisely, we define for any t ≥ 0, h_t = (i_1, j_1, . . ., i_t, j_t) ∈ (I × J)^t and k in K:

    p^k_t(σ, h_t) = ℙ_{p,σ}(k | h_t) = p^k ℙ_{δ_k,σ}(h_t) / ℙ_{p,σ}(h_t).

p_t(σ, h_t) = (p^k_t(σ, h_t))_{k∈K} ∈ Δ(K) (arbitrarily defined if ℙ_{p,σ}(h_t) = 0) is the conditional probability on the state of nature given that σ is played and h_t has occurred in the first t stages. It is easy to see that as soon as ℙ_{p,σ}(h_t) > 0, p_t(σ, h_t) does not depend on player 2's strategy σ^2, nor on player 2's last action j_t. It is fundamental to notice that the process (p_t(σ))_{t≥0} is a ℙ_{p,σ}-martingale with values in Δ(K): this is the martingale of the a posteriori of player 2.
Given a continuous function f from Δ(K) to ℝ, we denote by cav f the smallest concave function above f on Δ(K); it satisfies:

    cav f(p) = max { Σ_{s∈S} λ_s f(p_s) :  S finite, ∀s λ_s ≥ 0, p_s ∈ Δ(K), Σ_{s∈S} λ_s = 1, Σ_{s∈S} λ_s p_s = p }.

Concavification Lemma 3 If for any initial probability p, the informed player can guarantee f(p) in the game G(p), then for any p this player can also guarantee cav f(p) in G(p).

Nonrevealing Games
As soon as player 1 uses a strategy which depends on the selected state, the martingale of a posteriori will move and player 2 will have learnt something on the state. This is the dilemma of the informed player: he cannot use the information on the state without revealing information. Imagine now that player 1 decides to reveal no information on the selected state and plays independently of it. Since payoffs are defined via expectations, it is as if the players were repeating the average matrix game G(p) = Σ_{k∈K} p^k G^k. Its value is:

    u(p) = max_{x∈Δ(I)} min_{y∈Δ(J)} Σ_{i,j} x(i) y(j) G(p)(i, j) = min_{y∈Δ(J)} max_{x∈Δ(I)} Σ_{i,j} x(i) y(j) G(p)(i, j).

u is a Lipschitz function, with constant M, from Δ(K) to ℝ. Clearly, player 1 can guarantee u(p) in the game G(p) by playing i.i.d. at each stage an optimal strategy in G(p). By the concavification lemma, we obtain that player 1 can guarantee cav u(p) in the game G(p).

Repeated Games with Incomplete Information, Fig. 2: u and cav u

Let us consider again the partially revealing strategy previously described. With probability 1/2, the a posteriori will be ¾a + ¼b, and player 1 will play Top, which is optimal in ¾G^a + ¼G^b = ( 3 1 1 ; 3 1 −1 ). Similarly, with probability 1/2, the a posteriori will be ¼a + ¾b and player 1 will play an optimal strategy in ¼G^a + ¾G^b. Consequently, this strategy guarantees 1/2 u(3/4) + 1/2 u(1/4) = cav u(1/2) = 1 to player 1.

Player 2 Can Guarantee the Limit Value
In the infinitely repeated game with initial probability p, player 2 can play as follows: T being fixed, he can play an optimal strategy in the T-stage game G_T(p), then forget everything and play again an optimal strategy in the T-stage game G_T(p), etc. By doing so, he guarantees v_T(p) in the game G(p). So he can guarantee inf_T v_T(p) in this game, and this implies that lim sup_T v_T(p) ≤ inf_T v_T(p). As a consequence, we obtain:

Proposition 2 The sequence (v_T(p))_T converges to inf_T v_T(p), and this limit can be guaranteed by player 2 in the game G(p).
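The objects u and cav u appearing above (and in Fig. 2) can be computed numerically. The following Python sketch is an illustration written for this presentation (not from the original chapter) and assumes numpy and scipy are available: it computes u(p), the value of the average matrix game G(p) of Example 3, by linear programming, and then the concavification cav u on a grid, recovering u(1/2) = 0 and cav u(1/2) = 1, the guarantee of the partially revealing strategy.

```python
# Minimal numerical sketch: nonrevealing value u(p) and its concavification cav u
# for Example 3, with p = probability of state a.

import numpy as np
from scipy.optimize import linprog

Ga = np.array([[4.0, 0.0, 2.0], [4.0, 0.0, -2.0]])
Gb = np.array([[0.0, 4.0, -2.0], [0.0, 4.0, 2.0]])

def matrix_game_value(A: np.ndarray) -> float:
    """Value of the zero-sum matrix game A for the (maximizing) row player, by LP."""
    m, n = A.shape
    c = np.concatenate([np.zeros(m), [-1.0]])          # variables: (x, v); maximize v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])          # v - sum_i x_i A[i, j] <= 0 for all j
    A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return -res.fun

def u(p: float) -> float:
    return matrix_game_value(p * Ga + (1 - p) * Gb)

def cav_on_grid(values: np.ndarray, grid: np.ndarray) -> np.ndarray:
    """Upper concave envelope of (grid, values), evaluated on the grid itself."""
    cav = values.copy()
    n = len(grid)
    for m in range(n):
        for i in range(m + 1):
            for j in range(m, n):
                if i == j:
                    continue
                t = (grid[m] - grid[i]) / (grid[j] - grid[i])
                cav[m] = max(cav[m], (1 - t) * values[i] + t * values[j])
    return cav

if __name__ == "__main__":
    grid = np.linspace(0.0, 1.0, 41)
    vals = np.array([u(p) for p in grid])
    cav = cav_on_grid(vals, grid)
    print("u(1/2)    =", round(vals[20], 3))   # 0.0: nonrevealing play gains nothing
    print("cav u(1/2)=", round(cav[20], 3))    # 1.0: the splitting guarantee of Example 3
```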
The intuition is that player 2, by playing at every stage an optimal strategy in the average game G(p_t) associated with his current a posteriori p_t, holds player 1's long-run average payoff down to approximately cav u(p). Let us now proceed to the formal proof. Fix a strategy σ^1 of player 1, and define the strategy σ^2 of player 2 as follows: play at each stage an optimal strategy in the matrix game G(p_t), where p_t is the current a posteriori in Δ(K). Assume that σ = (σ^1, σ^2) is played in the repeated game G(p). To simplify notations, we write ℙ for ℙ_{p,σ}, p_t(h_t) for p_t(σ, h_t), etc. We use everywhere the norm ‖·‖_1. To avoid confusion between variables and random variables in the following computations, we will use tildes to denote random variables; e.g., k̃ will denote the random variable of the selected state.

Lemma 4 For each k in K,

    E( (1/T) Σ_{t=0}^{T−1} |p^k_{t+1} − p^k_t| ) ≤ √( p^k (1 − p^k) / T ).

Proof Let H_t denote the σ-algebra on plays generated by the first t action profiles. Since (p^k_t)_t is a martingale, E((p^k_{t+1} − p^k_t)^2) = E( E( (p^k_{t+1})^2 + (p^k_t)^2 − 2 p^k_{t+1} p^k_t | H_t ) ) = E((p^k_{t+1})^2) − E((p^k_t)^2). So E( Σ_{t=0}^{T−1} (p^k_{t+1} − p^k_t)^2 ) = E((p^k_T)^2) − (p^k)^2 ≤ p^k(1 − p^k). By the Cauchy–Schwarz inequality, we also have for each k,

    E( (1/T) Σ_{t=0}^{T−1} |p^k_{t+1} − p^k_t| ) ≤ √( (1/T) E( Σ_{t=0}^{T−1} (p^k_{t+1} − p^k_t)^2 ) ),

and the result follows.

Given k and a history h_t which has previously occurred, we write σ̄^1_{t+1}(h_t) for the law of the action of player 1 at stage t + 1 after h_t: σ̄^1_{t+1}(h_t) = Σ_{k∈K} p^k_t(h_t) σ^1_{t+1}(k, h_t) ∈ Δ(I). σ̄^1_{t+1}(h_t) can be seen as the average action played by player 1 after h_t and will be used as a nonrevealing approximation for σ^1_{t+1}(k, h_t). The next lemma precisely links the variation of the martingale (p_t(σ))_{t≥0}, i.e., the information revealed by player 1, and the dependence of player 1's action on the selected state, i.e., the information used by player 1.

Lemma 5 For every t ≥ 0 and h_t with ℙ(h_t) > 0,

    E( ‖p_{t+1} − p_t‖ | h_t ) = Σ_{k∈K} p^k_t(h_t) ‖σ^1_{t+1}(k, h_t) − σ̄^1_{t+1}(h_t)‖ = E( ‖σ^1_{t+1}(k̃, h_t) − σ̄^1_{t+1}(h_t)‖ | h_t ).

Proof Fix t ≥ 0 and h_t in (I × J)^t s.t. ℙ_{p,σ}(h_t) > 0. For (i_{t+1}, j_{t+1}) in I × J, one has:

    p^k_{t+1}(h_t, i_{t+1}, j_{t+1}) = ℙ( k̃ = k | h_t, i_{t+1} ) = ℙ( k̃ = k | h_t ) ℙ( i_{t+1} | k, h_t ) / ℙ( i_{t+1} | h_t ) = p^k_t(h_t) σ^1_{t+1}(k, h_t)(i_{t+1}) / σ̄^1_{t+1}(h_t)(i_{t+1}).

Consequently,

    E( ‖p_{t+1} − p_t‖ | h_t ) = Σ_{i_{t+1}∈I} σ̄^1_{t+1}(h_t)(i_{t+1}) Σ_{k∈K} | p^k_{t+1}(h_t, i_{t+1}) − p^k_t(h_t) | = Σ_{k∈K} p^k_t(h_t) ‖σ^1_{t+1}(k, h_t) − σ̄^1_{t+1}(h_t)‖.

We can now control payoffs. For t ≥ 0 and h_t in (I × J)^t:

    E( G^{k̃}(ĩ_{t+1}, j̃_{t+1}) | h_t ) = Σ_{k∈K} p^k_t(h_t) G^k( σ^1_{t+1}(k, h_t), σ^2_{t+1}(h_t) )
        ≤ Σ_{k∈K} p^k_t(h_t) G^k( σ̄^1_{t+1}(h_t), σ^2_{t+1}(h_t) ) + M Σ_{k∈K} p^k_t(h_t) ‖σ^1_{t+1}(k, h_t) − σ̄^1_{t+1}(h_t)‖
        ≤ u(p_t(h_t)) + M Σ_{k∈K} p^k_t(h_t) ‖σ^1_{t+1}(k, h_t) − σ̄^1_{t+1}(h_t)‖,

where u(p_t(h_t)) comes from the definition of σ^2. By Lemma 5, we get:

    E( G^{k̃}(ĩ_{t+1}, j̃_{t+1}) | h_t ) ≤ u(p_t(h_t)) + M E( ‖p_{t+1} − p_t‖ | h_t ).

Averaging over the first T stages, taking expectations, and using the concavity of cav u together with Lemma 4 then yields the following bound.
Proposition 3 For p in Δ(K) and T ≥ 1,

    v_T(p) ≤ cav u(p) + M Σ_{k∈K} √( p^k (1 − p^k) ) / √T.

It remains to conclude about the existence of the uniform value. We have seen that player 1 can guarantee cav u(p) and that player 2 can guarantee lim_T v_T(p), and we obtain from Proposition 3 that lim_T v_T(p) ≤ cav u(p). This is enough to deduce Aumann and Maschler's celebrated "cav u" theorem.

Theorem 1 Aumann and Maschler (1995) The game G(p) has a uniform value which is cav u(p).

T-stage Values and the Recursive Formula
As the T-stage game is a zero-sum game with incomplete information where player 1 is informed, we can write:

    v_T(p) = inf_{σ^2∈Σ^2} sup_{σ^1∈Σ^1} g^{1,p}_T(σ) = inf_{σ^2∈Σ^2} sup_{σ^1∈Σ^1} Σ_{k∈K} p^k g^{1,δ_k}_T(σ) = inf_{σ^2∈Σ^2} ( Σ_{k∈K} p^k sup_{σ^1∈Σ^1} g^{1,δ_k}_T(σ) ).

This shows that v_T is the infimum of a family of affine functions of p, hence is a concave function of p. The T-stage values also satisfy the following recursive formula:

    v_{T+1}(p) = (1/(T+1)) min_{y∈Δ(J)} max_{x∈Δ(I)^K} ( G(p, x, y) + T Σ_{i∈I} x̄(p)(i) v_T( p̂(x, i) ) ),

where x = (x^k(i))_{i∈I, k∈K}, with x^k the mixed action used at stage 1 by player 1 if the state is k, G(p, x, y) = Σ_{k∈K} p^k G^k(x^k, y) is the expected payoff of stage 1, x̄(p)(i) = Σ_{k∈K} p^k x^k(i) is the probability that action i is played at stage 1, and p̂(x, i) is the conditional probability on K given i.
The next property interprets easily: the advantage of the informed player can only decrease as the number of stages increases (for a proof, one can show that v_{T+1} ≤ v_T by induction on T, using the concavity of v_T).

Lemma 6 The T-stage value v_T(p) is nonincreasing in T.

Vector Payoffs and Approachability

The following model has been introduced by D. Blackwell (1956) and is, strictly speaking, not part of the general definition given in section "Definition of the Subject and Its Importance." We still have a family of I × J matrices (G^k)_{k∈K}, where K is a finite set of parameters. At each stage t, simultaneously player 1 chooses i_t ∈ I and player 2
chooses j_t ∈ J, and the stage "payoff" is the full vector G(i_t, j_t) = (G^k(i_t, j_t))_{k∈K} in ℝ^K. Notice that there is no initial probability or true state of nature here, and both players have a symmetric role. We assume here that after each stage both players observe exactly the stage vector payoff (but one can check that assuming that the action profiles are observed would not change the results).
A natural question is then to determine the sets C in ℝ^K such that player 1 (for example) can force the average long-term payoff to belong to C. Such sets will be called approachable by player 1.
In section "Vector Payoffs and Approachability," we use Euclidean distances and norms. Denote by F = {(G^k(i, j))_{k∈K}, i ∈ I, j ∈ J} the finite set of possible stage payoffs and by M a constant such that ‖u‖ ≤ M for each u in F. A strategy for player 1, resp. player 2, is an element σ = (σ_t)_{t≥1}, where σ_t maps F^{t−1} into Δ(I), resp. Δ(J). Strategy spaces for player 1 and 2 are, respectively, denoted by S and T. A strategy profile (σ, τ) naturally induces a unique probability on (I × J × F)^∞ denoted by ℙ_{σ,τ}. Let C be a "target" set that will always be assumed, without loss of generality, a closed subset of ℝ^K. We denote by g_t the random variable, with value in F, of the payoff of stage t, and we use ḡ_t = (1/t) Σ_{t'=1}^{t} g_{t'} ∈ conv(F), and finally d_t = d(ḡ_t, C) for the distance from ḡ_t to C.

Definition 3 C is approachable by player 1 if: ∀ε > 0, ∃σ ∈ S, ∃T, ∀t ≥ T, ∀τ ∈ T, E_{σ,τ}(d_t) ≤ ε. C is excludable by player 1 if there exists δ > 0 such that {z ∈ ℝ^K, d(z, C) ≥ δ} is approachable by player 1.

Necessary and Sufficient Conditions for Approachability
Given a mixed action x in Δ(I), we write xG for the set of possible vector payoffs when player 1 uses x, i.e., xG = {G(x, y), y ∈ Δ(J)} = conv{Σ_{i∈I} x_i G(i, j), j ∈ J}. Similarly, we write Gy = {G(x, y), x ∈ Δ(I)} for y in Δ(J).

Definition 4 The set C is a B(lackwell)-set for player 1 if for every z ∉ C, there exists z' ∈ C and x ∈ Δ(I) such that: (i) ‖z' − z‖ = d(z, C) and (ii) the hyperplane containing z' and orthogonal to [z, z'] separates z from xG (Fig. 3).

For example, any set xG, with x in Δ(I), is a B-set for player 1. Given a B-set for player 1, we now construct a strategy σ adapted to C as follows. At each positive stage t + 1, player 1 considers the current average payoff ḡ_t. If ḡ_t ∈ C, or if t = 0, σ plays arbitrarily at stage t + 1. Otherwise, σ plays at stage t + 1 a mixed action x satisfying the previous definition for z = ḡ_t.

Theorem 2 If C is a B-set for player 1, a strategy σ adapted to C satisfies:

    ∀τ ∈ T, ∀t ≥ 1, E_{σ,τ}(d_t) ≤ 2M/√t   and   d_t →_{t→∞} 0  ℙ_{σ,τ} a.s.

As an illustration, in dimension 1 and for C = {0}, this theorem implies that a bounded sequence (x_t)_t of reals, such that the product x_{T+1} · ( (1/T) Σ_{t=1}^{T} x_t ) is nonpositive for each T, Cesàro converges to zero.
Proof Assume that player 1 plays σ adapted to C, whereas player 2 plays some strategy τ. Fix t ≥ 1, and assume that ḡ_t ∉ C. Consider z' ∈ C and x ∈ Δ(I) satisfying (i) and (ii) of Definition 4 for z = ḡ_t. We have:

    d^2_{t+1} = d(ḡ_{t+1}, C)^2 ≤ ‖ḡ_{t+1} − z'‖^2 = ‖ (1/(t+1)) Σ_{l=1}^{t+1} g_l − z' ‖^2
              = ‖ (1/(t+1)) (g_{t+1} − z') + (t/(t+1)) (ḡ_t − z') ‖^2
              = (1/(t+1))^2 ‖g_{t+1} − z'‖^2 + (t/(t+1))^2 d^2_t + (2t/(t+1)^2) ⟨ g_{t+1} − z', ḡ_t − z' ⟩.

By hypothesis, the expectation, given the first t action profiles h_t ∈ (I × J)^t, of the above scalar product is nonpositive, so E(d^2_{t+1} | h_t) ≤ (t/(t+1))^2 d^2_t + (1/(t+1))^2 E(‖g_{t+1} − z'‖^2 | h_t). Since E(‖g_{t+1} − z'‖^2 | h_t) ≤ (2M)^2, we have:

    E(d^2_{t+1} | h_t) ≤ (t/(t+1))^2 d^2_t + (1/(t+1))^2 4M^2.   (2)

Taking the expectation, we get, whether ḡ_t ∉ C or not: ∀t ≥ 1, E(d^2_{t+1}) ≤ (t/(t+1))^2 E(d^2_t) + (1/(t+1))^2 4M^2. By induction, we obtain that for each t ≥ 1, E(d^2_t) ≤ 4M^2/t, and E(d_t) ≤ 2M/√t.
Put now, as in Sorin (2002), e_t = d^2_t + Σ_{t'>t} 4M^2/t'^2. Inequality (2) gives E(e_{t+1} | h_t) ≤ e_t, so (e_t) is a nonnegative supermartingale whose expectation goes to zero. By a standard probability result, we obtain e_t →_{t→∞} 0 ℙ_{σ,τ} a.s., and finally d_t →_{t→∞} 0 ℙ_{σ,τ} a.s. □

This theorem implies that any B-set for player 1 is approachable by this player. The converse is true for convex sets.

Theorem 3 Let C be a closed convex subset of ℝ^K. The following assertions are equivalent:
(i) C is a B-set for player 1;
(ii) ∀y ∈ Δ(J), Gy ∩ C ≠ ∅;
(iii) C is approachable by player 1;
(iv) ∀q ∈ ℝ^K, max_{x∈Δ(I)} min_{y∈Δ(J)} Σ_{k∈K} q_k G^k(x, y) ≥ inf_{c∈C} ⟨q, c⟩.

Proof The implication (i) ⇒ (iii) comes from Theorem 2. Proof of (iii) ⇒ (ii): assume there exists y ∈ Δ(J) such that Gy ∩ C = ∅. Since Gy is approachable by player 2, then C is excludable by player 2 and thus C is not approachable by player 1. Proof of (ii) ⇒ (i): Assume that Gy ∩ C ≠ ∅ ∀y ∈ Δ(J). Consider z ∉ C and define z' as its projection onto C. Define the matrix game where payoffs are projected onto the direction z' − z, i.e., the matrix game Σ_{k∈K} (z'_k − z_k) G^k. By assumption, one has: ∀y ∈ Δ(J), ∃x ∈ Δ(I) such that G(x, y) ∈ C, hence such that:

    ⟨ z' − z, G(x, y) ⟩ ≥ min_{c∈C} ⟨ z' − z, c ⟩ = ⟨ z' − z, z' ⟩.

So min_{y∈Δ(J)} max_{x∈Δ(I)} ⟨ z' − z, G(x, y) ⟩ ≥ ⟨ z' − z, z' ⟩. By the minmax theorem, there exists x in Δ(I) such that ∀y ∈ Δ(J), ⟨ z' − z, G(x, y) ⟩ ≥ ⟨ z' − z, z' ⟩, that is ⟨ z' − z, G(x, y) − z' ⟩ ≥ 0: the hyperplane containing z' and orthogonal to [z, z'] separates z from xG, so C is a B-set for player 1.
(iv) means that any half-space containing C is approachable by player 1. (iii) ⇒ (iv) is thus clear. (iv) ⇒ (i) is similar to (ii) ⇒ (i). ▢

Up to minor formulation differences, Theorems 2 and 3 are due to Blackwell (1956). Later on, X. Spinat (2002) proved the following characterization.

Theorem 4 A closed set is approachable for player 1 if and only if it contains a B-set for player 1.

As a consequence, it shows that adding the condition d_t →_{t→∞} 0 ℙ_{σ,τ} a.s. in the definition of approachability does not modify the notion.

Approachability for Player 1 Versus Excludability for Player 2
As a corollary of Theorem 3, we obtain that: A closed convex set in ℝ^K is either approachable by player 1, or excludable by player 2.
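As a concrete illustration of Theorem 2 and of the adapted strategy described above, here is a Python sketch written for this presentation (not code from the original chapter; it assumes numpy and scipy are available). In Example 2, viewed as a game with vector payoffs (G^a(i, j), G^b(i, j)), the uninformed player approaches the orthant C = {z : z_a ≤ 1/4, z_b ≤ 1/4}: one can check that for every direction q ≥ 0 the projected game q_a G^a + q_b G^b has value at most ⟨q, (1/4, 1/4)⟩, so C satisfies the B-set condition for player 2. At each stage the code projects the current average vector payoff onto C and plays optimally in the game projected on the outward direction.

```python
# Sketch of Blackwell's adapted strategy (illustration only): player 2 approaches
# the orthant C = {z : z_a <= 1/4, z_b <= 1/4} in the vector-payoff version of Example 2.

import numpy as np
from scipy.optimize import linprog

Ga = np.array([[1.0, 0.0], [0.0, 0.0]])
Gb = np.array([[0.0, 0.0], [0.0, 1.0]])
PHI = np.array([0.25, 0.25])              # corner of the target orthant C

def optimal_row_strategy(A):
    """Optimal mixed action of the maximizing row player in the matrix game A (LP)."""
    m, n = A.shape
    c = np.concatenate([np.zeros(m), [-1.0]])
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    x = np.clip(res.x[:m], 0.0, None)
    return x / x.sum()

def adapted_action_for_player2(avg):
    """Mixed action of player 2 prescribed by the strategy adapted to C,
    given the current average vector payoff avg (uniform if avg is already in C)."""
    q = np.maximum(avg - PHI, 0.0)        # outward direction avg - proj_C(avg)
    if q.sum() == 0.0:
        return np.array([0.5, 0.5])
    A = q[0] * Ga + q[1] * Gb             # game projected on the direction q
    # Player 2 minimizes in A; his optimal action is the row-optimal action of -A^T.
    return optimal_row_strategy(-A.T)

rng = np.random.default_rng(0)
avg = np.zeros(2)
for t in range(1, 2001):
    y = adapted_action_for_player2(avg)
    i = rng.integers(2)                   # player 1: arbitrary (here uniform) behavior
    j = rng.choice(2, p=y)
    g = np.array([Ga[i, j], Gb[i, j]])
    avg += (g - avg) / t                  # running average of vector payoffs
    if t % 500 == 0:
        dist = np.linalg.norm(np.maximum(avg - PHI, 0.0))
        print(f"t={t:5d}  average={avg.round(3)}  distance to C={dist:.3f}")
```

The printed distance to C goes to 0 whatever player 1 does, which is the vector-payoff counterpart of the fact that the uninformed player can hold the informed player down to cav u(1/2) = 1/4 in Example 2.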
One can show that when K is a singleton, then Fix q = (qk)k in ℝK. If there exists k with
any set is either approachable by player 1, or q > 0, we clearly have infc C < q, c > =
k
excludable by player 2. A simple example of a 1 maxy D(J) minx D(I) k K qkGk(x, y).
set which is neither approachable for player 1 nor Assume now that qk 0 for each k, with q 6¼ 0.
excludable
by player 2 isgiven in dimension 2 by: Write s = k(qk).
ð0,0Þ ð0,0Þ X
G¼ , and C = {(1/2, v),
ð1,0Þ ð1,1Þ inf < q,c > ¼ qk l k
cC
0 v 1/4} [ {(1, v), 1/4 v 1} (see Sorin k K
q
2002). ¼ s < l, >
qs
s u
s
Weak Approachability X qk
On can weaken the definition of approachability s max min Gk ðx,yÞ
x Dð I Þ y Dð J Þ s
by giving up time uniformity. X k K
¼ max min qk Gk ðx,yÞ
y DðJ Þ x DðI Þ
k K
Definition 5 C is weakly approachable by player
1 if: 8e > 0, ∃T, 8t T, ∃s S, 8t T , E s,t This is condition (iv) of Theorem 3, adapted to
(dt) e. C is weakly excludable by player 1 if player 2. So C is a B-set for player 2, and a strategy
there exists d > 0 such that {z ℝK, d(z, C) d} t adapted to C satisfies by Theorem 2: 8s S, 8k
is weakly approachable by player 1. K,
!
N. Vieille (1992) has proved, via the consider- 1 XT
ation of certain differential games: E s ,t Gk ~i t , ~j t l k
T t¼1
Theorem 5 A subset of ℝK is either weakly !!
1 XT 2M
approachable by player 1 or weakly excludable E s ,t d Gk ~i t , ~j t ,C pffiffiffiffi ,
T t¼1 T
by player 2.
the decision-maker knows a priori nothing about played action 18 l at each stage where he actually
the way the sequence (jn)n is chosen. There is a played action i. For n 1, i and l in I, let us
given payoff function g: I J ! ℝ, known by the introduce the random variable:
decision-maker, and at the end of each stage n the
decision-maker observes jn and receives the pay-
1 X
off g(in, jn). Rn ði,l Þ ¼ ðgðl, jt Þ g ðit , jt ÞÞ:
n t f1,::,ng,it ¼i
Zero-Sum Games with Lack of Maxmin and Minmax of the Repeated Game
Information on Both Sides Theorem 1 generalizes as follows.
The following model has also been introduced by Theorem 7 Aumann and Maschler (1995) In the
Aumann and Maschler (1995). We are still in the repeated game G(p, q), the greatest quantity which
context of zero-sum repeated games with observ- can be guaranteed by player 1 is cavI vexII u(p, q),
able actions, but it is no longer assumed that one and the smallest quantity which can be guaranteed
of the players is fully informed. The set of states is by player 2 is vexII cavI u(p, q).
here a product K L of finite sets, and we have a
family of matrices (Gk, l)(k, l) KL with size I J, Aumann, Maschler, and Stearns also showed
as well as initial probabilities p on K, and q on L. that cavI vexII u(p, q) can be defended by player
In the game G(p, q), a state of nature (k, l) is first 2, uniformly in time, i.e., that 8e > 0,8s1 , ∃T 0 , ∃
p ,q
selected according to the product probability s2 , 8T T 0 , gT ðs1 , s2 Þ cavI vexII uðp, qÞv þ
p q, then k, resp. l, is announced to player 1, e. Similarly, vexII cavI u(p, q) can be defended by
resp. player 2 only. Then the matrix game Gk,l is player 1.
repeated over and over: at every stage, simulta- The proof uses the martingales of a posteriori
neously player 1 chooses a row i in I, whereas of each player, and a useful notion is that of the
player 2 chooses a column j in J, the stage payoff informational content of a strategy: for a strategy
for player 1 is Gk,l(i, j), but only i and j are publicly s1 of the first player, it is defined as: I ðs1 Þ¼ sups2
P P1 k
announced before proceeding to the next stage. p ,q k 1 2
E s 1 ,s 2 k K t¼0 ptþ1 ðs Þ pt ðs Þ
1
,
The average payoff for player 1 in the
1,p,q where pt(s1) is the a posteriori on K of player
T-stage game is written: gT ðs1 ,s2 Þ ¼
P ~ ~ 2 after stage t given that player 1 uses s1. By
E ps,1q,s2 T1 Tt¼1 Gk ,l ~i t , ~j t , and the T -stage linearity of the expectation, the supremum can
value is written vT (p, q). Similarly, the be restricted to strategies of player 2 which are
l-discounted value of the game will be written both pure and independent of l.
vl(p, q). Theorem 7 implies uðp,qÞ ¼
that cavI vexII
The nonrevealing game now corresponds to 1,p,q 1 2
sups1 S1 lim inf T inf s2 S2 gT ðs ,s Þ , and
the case where player 1 plays independently of
cavI vexII u(p, q) is called the maxmin of the
k and player 2 plays independently of l. Its value is
repeated game G(p, q). Similarly, vexII cav I uðp,qÞ
denoted by:
X ¼ inf s2 S2 limsupT sups1 S1 g1T ðs1 ,s2 Þ is called
uðp,qÞ ¼ max min pk ql Gk ,l ðx,yÞ: (3) the minmax of G(p, q). As a corollary, we obtain
x Dð I Þ y Dð J Þ
k ,l
that the repeated game G(p, q) has a uniform value
Given a continuous function f: D(K) D(L) ! if and only if: cavI vexII u(p, q) = vexII cavI
ℝ, we denote by cavI f the concavification of f with u(p, q). This is not always the case, and there
exist counter-examples to the existence of the (Gk,l)k,l. One can easily show that any mapping
uniform value. in C is a uniform limit of elements in U.
Example 4 K = {a, a0 }, and L = {b, b0 }, with
Correlated Initial Information
p and q uniform.
A more general model can be written, where it is
no longer assumed that the initial information of
0 0 0 0
Ga,b ¼ the players is independent. The set of states is now
1 1 1 1
denoted by R (instead of K L), initially a state
0 1 1 1 1
Ga,b ¼ r in R is chosen according to a known probability
0 0 0 0 p = (pr)r R, and each player receives a determin-
0 1 1 1 1 istic signal depending on r. Equivalently, each
G a ,b ¼
0 0 0 0 player i has a partition Ri of R and observes the
element of his partition which contains the
0 0 0 0 0 0
Ga ,b ¼ selected state.
1 1 1 1
After the first stage, player 1 will play an action
x = (xr)r R which is measurable with respect to
Mertens and Zamir (1971) have shown that R1, i.e., (r ! xr) is constant on each atom of R1.
here, cavI vexII uðp,qÞ ¼ 14 < 0 ¼ vexII cavI uðp,qÞ. After having observed player 1’s action at the first
stage, the conditional probability on R necessarily
belongs to the set:
Limit Values
It is easy to see that for each T and l, the value (
X
functions vT and vl are concave in the first variable P I ð pÞ ¼ ðar pr Þr R , 8r ar 0, ar pr ¼ 1 and
and convex in the second variable. They are all r
Lipschitz functions, with the same constant )
M = maxi,j,k,l |Gk,l(i, j)|, and here also, recursive ða Þr is R measurable :
r 1
bimatrix. Formally, we have a finite set of states K, Definition 8 A joint plan is a triple (S, l, g),
an initial probability p on K, and families of where:
I J-payoff matrices (Ak)k K and (Bk)k K. Ini- – S is a finite non empty set (of messages),
tially, a state k in K is selected according to p, and – l = (lk)k K (signaling strategy) with for each
announced to player 1 only. Then the bimatrix k, lk D(S) and for each s, ls
game (Ak, Bk) is repeated over and over: at every P
¼def k K pk lks > 0,
stage, simultaneously player 1 chooses a row i in
– g = (gs)s S (contract) with for each s, gs
I, whereas player 2 chooses a column j in J, the
D(I J).
stage payoff for player 1 is then Ak(i, j), the stage
The idea is due to Aumann, Maschler, and
payoff for player 2 is Bk(i, j), but only i and j are
Stearns. Player 1 observes k, then chooses s
publicly announced before proceeding to the next
S according to lk and announces s to player
stage. Without loss of generality, we assume that
2. Then the players play pure actions
pk > 0 for each k and that each player has at least
corresponding to the frequencies gs(i, j), for i in
2 actions.
I and j in J. Given a joint plan (S, l, g), we define:
Given a strategy pair (s1, s2), it is here convenient
to denote the expected payoffs up to stage T by: pk lk
– 8s S, ps ¼ pks k K DðK Þ, with pks ¼ ls s
!
1 X T
~ for each k. ps is the a posteriori on K given s.
apT s1 ,s2 ¼ E p,s1 ,s2 Ak ~i t , ~j t – ’ = (’k)k K ℝK, with for each k, ’k =
T t¼1
maxs S Ak(gs).
X P
¼ pk akT s1 ,s2 : – 8s S, cs = B(ps)(gs) and c ¼ k K pk
P P
s S ls B ð g s Þ ¼
k K k k
s S ls cs .
!
1 X T
~ Definition 9 A joint plan (S, l, g) is an equilib-
bpT s1 ,s2 ¼ E p,s1 ,s2 Bk ~i t , ~j t
T t¼1 rium joint plan if:
X
¼ pk bkT s1 ,s2 :
kK (i) 8s S, cs vexv(ps)
(ii) 8k K, 8s S s.t. pks > 0, Ak(gs) = ’k
Given a probability q on K, we write A(q) = k (iii) 8q D(K), < ’, q > u(q)
qkAk, B(q) = k qkBk, u(q) = maxx D(I) miny D(J)
A(q)(x, y) and v(q) = maxy D(J) minx D(I) Condition (ii) can be seen as an incentive con-
B(q)(x, y). If g = (g(i, j))(i, j) IJ D(I J), dition for player 1 to choose s according to lk.
we put A(q)(g) = (i, j) IJ g(i, j)A(q)(i, j) and Given an equilibrium joint plan (S, l, g), one define
similarly B(q)(g) = (i, j) IJ g(i, j)B(q)(i, j). a strategy pair (s1, s2) adapted to it. For each
message s, first fix a sequence ist , jst t1 of elements
in I J such that for each (i, j), the empirical
Existence of Equilibria frequencies converge to the corresponding
proba-
The question of existence of an equilibrium has bility: T1 j t, 1 t T , ist , jst ¼ ði, jÞ j !T !1
remained unsolved for long. Sorin (1983) proved gs ði, jÞ. We also fix an injective mapping f from
the existence of an equilibrium for two states of S to I l, where l is large enough, corresponding to a
nature, and the general case has been solved by code between the players to announce an element
Simon et al. (1995). in S. s1 is precisely defined as follows. Player
Exactly as in the zero-sum case, a strategy pair 1 observes the selected state k, then chooses
s induces a sequence of a posteriori (pt(s))t0 s according to lk, and announces s to player 2 by
which is a ℙ p,s martingale with values in D(K). playing f(s) at the first l stages. Finally, s1 plays ist
We will concentrate on the cases where this mar- at each stage t > l as long as player 2 plays jst. If at
tingale moves only once. some stage t > l player 2 does not play jst , then
player 1 punishes his opponent by playing an posteriori. This leads to the consideration of the
optimal strategy in the zero-sum game with initial following correspondence (for each r, F(r) is a
probability ps and payoffs for player 1 given by subset of ℝK):
(Bk)k K. We now define s2. Player 2 arbitrarily
plays at the beginning of the game, then compute F : DðK Þ⇉ℝ K
at the end of stage l the message s sent by player
1. Next he plays at each stage t > l the action jst as r 7! {(Ak(g))k K, where g D(I J) satisfies
long as player 1 plays ist . If at some stage t > l, B(r)(g) vex v(r)}. It is easy to see that the graph
player 1 does not play ist, or if the first l actions of of F, i.e., the set {(r, ’) D(K) ℝK, ’ F(r)},
player 1 correspond to no message, then player is compact that F has nonempty convex values
2 plays a punishing strategy s2 such that: 8e > 0, and satisfies: 8r D(K), 8q D(K), ∃’ F(r),
∃T0, 8T T0, 8s1 S1, 8k K, ak ðs1 ,s2 Þ < ’, q > u(q).
’k + e. Such a strategy s2 exists because of Assume now that one can find a finite family
condition (iii): it is an approachability strategy for (ps)s S of probabilities on K, as well as vectors ’
player 2 of the orthant {x ℝK, 8k K xk ’k} and, for each s, ’s in ℝK such that: (1) p conv
(see section “Back to the Standard Model”). {ps, s S}, (2) < ’, q > u(q) 8q D(K),
(3) 8s S, ’s F(ps), and (4) 8s S, 8k K,
’ks ’k with equality if pks > 0. It is then easy to
Lemma 7 Sorin (1983) A strategy pair adapted construct an equilibrium joint plan. Thus, we get
to an equilibrium joint plan is a uniform equilib- interested in proving the following result.
rium of the repeated game.
Proof The payoffs induced by (s1, s2) can be Proposition 4 Let p be in D(K), u: D(K) ! ℝ be a
easily computed: continuous mapping, and F: D(K) ⇉ ℝK be a
P
8k, akT ðs1 ,s2 Þ!T !1 s S lks Ak ðgs Þ ¼ ’k correspondence with compact graph and non-
P
because of (ii), and bpT ðs1 ,s2 Þ!T !1 k K pk empty convex values such that: 8r D(K), 8q
P
s S ls B ðgs Þ ¼ c:
k k D(K), ∃’ F(r), < ’, q > u(q). Then there
Assume that player 2 plays s2. The existence exists a finite family (ps)s S of elements of D(K), as
of s̄ 2 implies that no detectable deviation of well as vectors ’ and, for each s, ’s in ℝK such that:
player 1 is profitable, so if the state is k, player
1 will gain no more than maxs0 S Ak ðgs0 Þ. But this – p conv {ps, s S},
is just ’k. The proof can be made uniform in – < ’, q > u(q) 8q D(K),
s1 and we obtain: 8e > 0 ∃T0 8T T0, 8k K, – 8s S, ’s F(ps),
8s1 S1, akT ðs1 ,s2 Þ ’k þ . Finally assume – 8s S, 8k K, ’ks ’k with equality if pks
that player 1 plays s1. Condition (i) implies that if > 0.
player 2 uses s2, the payoff of this player will be
at least vex v(ps) if the message is s. Since vex The proof of Proposition 4 relies, as explained
v(ps) (= cav(v(ps))) is the value, from the in Renault (2000) or Simon (2002), on a fixed
point of view of player 2 with payoffs (Bk)k, of point theorem of Borsuk-Ulam type proved by
the zero-sum game with initial probability ps, Simon et al. (1995) via tools from algebraic geom-
player 2 fears the punition by player 1, and etry. A simplified version of this fixed point theo-
8e > 0, ∃T0, 8T T0, 8s2 S2, bpT ðs1 ,s2 Þ rem can be written as follows:
P
s S ls cs þ ¼ c þ e. □
Theorem 9 Simon et al. (1995): Let C be a
To prove the existence of equilibria, we then compact subset of an n-dimensional Euclidean
look for equilibrium joint plans. The first idea is to space, x C and Y be a finite union of affine
consider, for each probability r on K, the set of subspaces of dimension n 1 of an Euclidean
payoff vectors ’ compatible with r being an a space. Let F be a correspondence from C to Y with
3. ∃p D(I J) s.t. b = kpk i, j pi, j Bk(i, j) “expectation of player 1’s future payoff” (which
and 8k K, ak i, j pi, j Ak(i, j) with equality can be properly defined) remains constant. Hence,
if pk > 0. the heuristic apparition of the bimartingale. And
We need to considerate every possible initial since bounded martingale converge, for large
probability because the main state variable of the stages everything will be fixed and the players
model is, here also, the belief, or a posteriori, of will approximately play a nonrevealing equilib-
player 2 on the state of nature. {(a, b), (a, b, p0) rium at a “limit a posteriori,” so the convergence
G} is the set of payoffs of nonrevealing equilibria of will be towards elements of G.
G(p0). The importance of the following definition Consider now the converse implication (. Let
will appear with Theorem 11 below (which unfor- (a, b) be such that (a, b, p0) G and assume for
tunately has not led to a proof of existence of equi- simplification that the associated bi-martingale
librium payoffs). (an, bn, pn) converges in a fixed number N of
stages: 8n N, (an, bn, pn) = (aN, bN, pN) G.
Definition 12 G is defined as the set of elements One can construct an equilibrium (s1, s2) of
g ¼ ða,b,pÞ ℝKM ℝ M DðK Þ such that there G(p0) with payoff (a, b) along the following
exist a probability space (O, A, Q), an increasing lines. For each index n, (an, bn) will be an equi-
sequence (F n)n 1 of finite sub-s-algebras of A, librium payoff of the repeated game with initial
and a sequence of random variables (gn)n1 = (an, probability pn. Eventually, player 1 will play inde-
bn, pn)n1 defined on (O, A ) with values in ℝ KM pendently of the state, the a posteriori of player 2
ℝM DðK Þ satisfying: (i) g1 = g a.s., (ii) (gn)n1 will be pN, and the players will end up playing a
is a martingale adapted to (F n)n1, (iii) 8n 1, an+1 nonrevealing equilibrium of the repeated game
= an a.s. or pn+1 = pn a.s., and (iv) (gn)n converges G(pN) with payoff (aN, bN). What should be
a.s. to a random variable g1 with values in G. played before? Since we are in an undiscounted
setup, any finite number of stages can be used for
Let us forget for a while the component of communication without influencing payoffs. Let
player 2’s payoff. A process (gn)n satisfying (ii) n < N be such that an + 1 = an. To move from
and (iii) may be called a bi-martingale; it is a (an, bn, pn) to (an, bn+1, pn+1), player 1 can simply
martingale such that at every stage, one of the use the splitting lemma (Lemma 1) in order to
two components remains a.s. constant. So the set signal part of the state to player 2. Let now
G can be seen as the set of starting points of n < N be such that pn + 1 = pn, so that we want
converging bi-martingales with limit points in G. to move from (an, bn, pn) to (an+1, bn+1, pn). Player 1
will play independently of the state, and both
Theorem 11 Hart (1985) Let (a, b) be in ℝK ℝ. players will act so as to convexify their future
payoffs. This convexification is done through pro-
ða,bÞ is an equilibrium payoff of Gðp0 Þ cedures called “jointly controlled lotteries” and
, ða,b, p0 Þ G : introduced in the sixties by Aumann and Maschler
(1995), with the following simple and brilliant
Theorem 11 is too elaborate to be proved here, idea. Imagine that the players have to decide
but let us give a few ideas about the proof. First with even probability whether to play the equilib-
consider the implication ) and fix an equilibrium rium E1 with payoff (a1, b1) or to play the equi-
s = (s1, s2) of G(p0) with payoff (a, b). The librium E2 with payoff (a2, b2). The players may
sequence of a posteriori (pt(s))t 0 is a ℙ p0 ,s not be indifferent between E1 and E2, e.g., player
martingale. Modify now slightly the time struc- 1 may prefer E1, whereas player 2 prefers E2.
ture so that at each stage, player 1 plays first, and They will proceed as follows, with i and i0 , respec-
then player 2 plays without knowing the action tively, j and j0 , denoting two distinct actions of
chosen by player 1. At each half-stage where player 1, resp. player 2. Simultaneously and inde-
player 2 plays, his a posteriori remains constant. pendently, player 1 will select i or i0 with proba-
At each half-stage where player 1 plays, the bility 1/2, whereas player 2 will behave similarly
0
j j Definition 15 Let A be a measurable subset of
with j and j0 . i . Then the equilibrium X Y.
i0
E1 will be played if the diagonal has been reached, A = {z X Y, there exists a bimartingale
i.e., if (i, j) or (i0 , j0 ) has been played, and otherwise (Zn)n1 converging to a limit Z1 such that Z1
the equilibrium E2 will be played. This procedure A a.s. and Z1 = z a.s.}.
is robust to unilateral deviations: none of the One can show that any atomless probability
players can deviate and prevent E1 and E2 to be space (O, F , P), or any product of convex com-
chosen with probability 1/2. In general, jointly pact spaces X Y containing A, induces the same
controlled lotteries are procedures allowing to set A. One can also substitute condition (2) by:
select an alternative among a finite set according 8n 1, (Xn = Xn+1 or Yn = Yn+1) a.s. Notice that
to a given probability (think of binary expansions without condition (2), the set A would just be the
if necessary), in a way which is robust to devia- convex hull of A.
tions by a single player. S. Hart has precisely We always have A A conv (A), and these
shown how to combine steps of signaling and inclusions can be strict. For example, if
jointly controlled lotteries to construct an equilib- X = Y = [0, 1] and A = {(0, 0), (1, 0), (0, 1)}, it
rium of G1(p0) with payoff (a, b). is possible to show that A = {(x, y) [0,
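The robustness of a jointly controlled lottery can be checked directly. The short simulation below is an illustrative sketch written for this presentation (not part of the original chapter): it draws the two players' coins independently, selects E1 when the actions match and E2 otherwise, and shows that a unilateral deviation to a biased coin cannot move the selection probability away from 1/2.

```python
# Sketch (illustration only): a jointly controlled lottery with two actions per player.
# E1 is selected on a match ((i, j) or (i', j')), E2 otherwise; as long as one player
# randomizes fairly, the outcome distribution stays (1/2, 1/2).

import random

def jcl_outcome(p1_bias: float, p2_bias: float) -> str:
    a = random.random() < p1_bias          # player 1 picks i (True) or i' (False)
    b = random.random() < p2_bias          # player 2 picks j (True) or j' (False)
    return "E1" if a == b else "E2"

def frequency_of_E1(p1_bias: float, p2_bias: float, runs: int = 200_000) -> float:
    return sum(jcl_outcome(p1_bias, p2_bias) == "E1" for _ in range(runs)) / runs

if __name__ == "__main__":
    print("both fair:         ", frequency_of_E1(0.5, 0.5))   # about 0.5
    print("player 1 deviates: ", frequency_of_E1(0.9, 0.5))   # still about 0.5
    print("player 2 deviates: ", frequency_of_E1(0.5, 0.1))   # still about 0.5
```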
1] [0, 1], x = 0 or y = 0}. A always is biconvex
and thus contains biconv (A), which is defined as
Biconvexity and Bimartingales
the smallest biconvex set which contains A. The
The previous analysis has led to the introduction
inclusion biconv (A) A can also be strict, as
and study of biconvexity phenomena. The refer-
shown by the following example:
ence here is Aumann and Hart (1986). Let X and
Y be compact convex subsets of Euclidean spaces,
and let (O, F , P) be an atomless probability space. Example 5 Put X = Y = [0, 1], u1 = (1/3, 0),
u2 = (0, 2/3), u3 = (2/3, 1), u4 = (1, 1/3),
Definition 13 A subset B of X Y is biconvex if w1 = (1/3, 1/3), w2 = (1/3, 2/3), w3 = (2/3, 2/3)
for every x in X and y in Y, the sections Bx. = {y0 et w4 = (2/3, 1/3), and A = {v1, v2, v3, v4}
Y, (x, y0 ) B} and B.y = {x0 X, (x0 , y) B} are (Fig. 5).
convex. If B is biconvex, a mapping f: B ! ℝ is
called biconvex if for each (x, y) X Y, f(., y) A is biconvex, so A = biconv (A). Consider
and f(x,.) are convex. now the following Markov process (Zn)n1, with
Z1 = w1. If Zn A, then Zn+1 = Zn. If Zn = wi for
As in the usual convexity case, we have that if some i, then Zn+1 = wi+1(mod 4) with probability
f is biconvex, then for each a in ℝ, the set {(x, y) 1/2, and Zn + 1 = vi with probability 1/2. (Zn)n is a
B, f(x, y) a} is biconvex.
bimartingale converging a.s. to a point in A, hence Theorem 1 has been generalized (Aumann and
w1 A\biconv (A). Maschler 1995) to the general case of signaling
We now present a geometric characterization of function. We keep the notations of section “The
the set A and assume here that A is closed. For Standard Model of Aumann and Maschler.” Given
each biconvex subset B of X Y containing A, we a mixed action x D(I), an action j in J and a state
denote by nsc(B) the set of elements of B which k, we denote by Q(k, x, j) the marginal distribution
cannot be separated from A by a continuous on U 2 of the law i I x(i) q(k, i, j), i.e., Q(k, x, j)
bounded biconvex function on A. More precisely, is the law of the signal received by player 2 if the
nsc(B) = {z B, 8f: B ! ℝ bounded biconvex, state is k, player 1 uses x and player 2 plays j. The
and continuous on A, f (z) sup{f (z0 ), z0 A}}. set of nonrevealing strategies
n of player 1 is then
defined as: NRðpÞ ¼ x ¼ x k k K DðI ÞK , 8k
0
Theorem 12 Aumann and Hart (1986): A is the K,8k 0 K s:t: p k p k > 0,8j J ,Q k, x k , j ¼
0
largest biconvex set B containing A such that Qðk 0 ,x k ,jg. If the initial probability is p and player
nsc(B) = B. 1 plays a strategy x in NR(p) (i.e., plays xk if the
state is k), the a posteriori of player 2 will remain
a.s. constant: player 2 can deduce no information
Let us now come back to repeated games and
on the selected state k. The value of the non-
to the notations of subsection “Characterization of
revealing game becomes:
Equilibrium Payoffs.” To be precise, we need to
add the component of player 2’s payoff and con- X
sequently to slightly modify the definitions. G is uðpÞ ¼ max min pk Gk x k ,y
x NRðpÞ y DðJ Þ
kK
closed in ℝ KM ℝ M DðK Þ. For B ℝ KM ℝ M X
DðK Þ, B is biconvex if for each a in ℝ KM and for ¼ min max pk Gk x k ,y ,
y DðJ Þ x NRðpÞ
each p in D(K), the sections {(b, p0 ), (a, b, p0 ) B} kK
and lack of information on both sides, see (Zamir contrary, the discounted value vl can be quite a
1971, 1973). For state-dependent signaling and lack complex function of p: in Example 2 of section
of information on one side, it was shown by Mertens “Definition of the Subject and Its Importance,”
(1998) that the convergence occurs with worst case Mayberry (1967) has proved that for 2/3 < l < 1,
error (ln n/n)1/3. ul is, at each rational value of p, nondifferentiable.
A particular class of zero-sum repeated games
pffiffiffiffi
with state dependent signaling has been studied 3. limT T ðuT ðpÞ cavuðpÞÞ and the normal
(games with no signals, see (Mertens and Zamir distribution
1976b; Sorin 1989; Waternaux 1983). In these
games, the state k is first selected according to a Convergence of the value functions (uT)T and
known probability and is not announced to the (ul)l has been widely studied. We have already
players; then after each stage both players receive mentioned the speed of convergence in section
the same signal which is either “nothing” or “the “Non-Observable Actions,” but much more can
state is k.” It was shown that the maxmin and the be said.
minmax may differ, although limT uT always exists.
In nonzero-sum repeated games with lack of Example 6 Standard model of lack of information
information on one side, the existence of “joint
plan” equilibria have been generalized to the case on one side and observable actions. K = {a, b}, Ga
of state independent signaling (Renault 2000) and
3 1 2 2
more generally to the case where “player 1 can ¼ and Gb ¼ . One can
3 1 2 2
send non revealing signals to player 2” (Simon
show (Mertens and Zamir 1976a) that for each p
et al. 2002). The existence of a uniform equilib-
[0, 1], viewedpas ffiffiffiffi the initial probability of state a,
rium in the general signaling case is still an open
the sequence T uT (p) converges to ’(p), where
question (see Simon et al. 2008). Ð xp x2 =2
’ðpÞ ¼ p1ffiffiffiffi exp =2 , and xp satisfies p1ffiffiffiffi
2
2p 1
e
2p
pffiffiffiffi
dx ¼ p. So the limit of T uT (p) is the standard
Advances normal density function evaluated at its p-quantile.
P
player 1’s payoff finally is T1 Tt¼1 Gk ðit , jt Þ zk . 6. Markov chain games with lack of information
This player is thus now able to fix the state equal
to k, but has to pay zk for it. It can be shown that In Renault (2006), the standard model of lack of
the T -stage dual game GT ðzÞ has a value wT (z). wT information, as well as the proof of Theorem 1, is
is convex and is linked to the value of the primal generalized to the case where the state is not fixed at
game by the conjugate formula: the beginning of the game but evolves according to
a Markov chain uniquely observed by player 1 (see
wT ðzÞ ¼ max ðuT ðpÞ < p, z >Þ, and also Neyman (2008) for nonobservable actions).
p DðK Þ
The limit value is however difficult to compute, as
uT ðpÞ ¼ inf K ðwT ðzÞþ < p, z >Þ: shown by the following example from Renault
zℝ
K = {a, b}, the payoff matrices are
(2006):
1 0 0 0
And (wT)T satisfies the dual recursive formula: Ga ¼ and Gb ¼ , the initial probabil-
0 0 0 1
ity is (1/2, 1/2), and the state
evolves according
to
wT þ1 ðzÞ ¼ min max
T a 1a
the Markov chain M ¼ with
y Dð J Þ i I T þ 1 1a a
!
T þ1 1 X k parameter a. If a = 1 this is Example 2, and the
wT z y G ði, jÞ k limit value is 1/4 by Theorem 1.
T T jJ j
For a [1/2, 2/3], the limit value is 4a1 a
7. Extension to zero-sum dynamic games with state In this setup, one can prove (Renault 2012) the
process controlled and observed by player 1 existence of the uniform value u(p), satisfying:
It is known since (Sorin 1984a) that the uniform u ðpÞ ¼ inf sup um,n ðpÞ ¼ sup inf um,n ðpÞ:
n1 m0 m0 n1
value may not exist in general for stochastic games
with lack of information on one side on the payoff
where vm,n P(p) is the value of the game with payoff
matrices (where the payoff matrices of the stochastic
E p,s,t 1n mþn
t¼mþ1 g t , gt being the payoff of
game to be played are first randomly selected and
announced to player 1 only). Rosenberg et al. (2004) stage t.
studied stochastic games with a single controller and Moreover, one can prove for such games the
lack of information on one side on the payoff matri- existence of the stronger notion of “general uni-
ces, showing the existence of the uniform value if form value.” Let us first define the values vy (p) of
the dynamic game with payoff gy ðp,s,tÞ ¼ E p,s,t
the informed player controls the transition, and pro- P
viding a counter-example if the uninformed player t1 yt g t , where y is an evaluation (yt)t1 with
controls the transitions. One can also consider the nonnegative weights satisfying t1yt = 1, and
model of general repeated games with an informed total variation denoted by TV(y) = t j yt+1 ytj.
controller (Renault 2012), generalizing the model of And u(p) is the general uniform value of the game
Markov chain games with lack of information on with initial probability p if for each e > 0 one can
one side), i.e., dynamic games with finitely many find a > 0 and a couple of strategies s and t such
states, actions and signals, and state processes con- that for all evaluations y with TV (y) a:
trolled and observed by player 1.
A general repeated game is given by: 5 non empty 8t,gy ðp,s ,tÞ v ðpÞ e and 8s,gy ðp,s,t Þ
finite sets: a set of states or parameters K, a set I of v ðpÞ þ e:
actions for player 1, a set J of actions for player 2, a
set C of signals for player 1, and a set D of signals for Considering only Cesaro-evaluations (i.e., of
player 2, an initial distribution p D(K C D), a the type yt = 1/n for t n, =0 for t > n for some n)
payoff function g: K I J to [0, 1] for player 1, and recovers our Definition 1. Renault and Venel
a transition function q: K I J to D(K C D). (2017) introduce a new distance (compatible
The progress of the game is the following: Initially, with the weak topology) on the belief space
(k1, c1, d1) is selected according to p, player 1 learns D(D(K)) of Borel probabilities over the simplex
c1 and player 2 learns d1. Then simultaneously player X = D(K) and prove the existence of the general
1 chooses i1 in I and player 2 chooses j1 in J, and the uniform value in general repeated games with an
payoff for player 1 at stage 1 is g(k1, i1, j1), etc. At any informed controller. Clearly, the values only
stage t 2, (kt, ct, dt) is selected according to q(kt1, depend on player 2’s belief p on the initial state,
it1, jt1), player 1 learns ct and player 2 learns dt. and the limit value u can be characterized as:
Simultaneously, player 1 chooses it in I and player
2 chooses jt in J. The stage payoffs are g(kt, it, jt) for 8p X ,v ðpÞ ¼ inf wðpÞ,w : DðX Þ ! ½0,1
affine C 0 s:t:
player 1 and the opposite for player 2, and the play
proceeds to stage t + 1. 1. 8p0 X ,wðp0 Þ sup wðqðp0 ,aÞÞ
In repeated games with an informed controller, a DðI ÞK
associated payoffs. In the standard model of tools (Cardaliaguet et al. 2012). A generalization
Aumann and Maschler, (1) is equivalent to of the cavu theorem (Theorem 1) to infinite action
w being a concave function on D(K) and (2) is spaces and partial information can be found in
equivalent to w being not lower than the non- Gensbittel (2015), using a probabilistic method
revealing function u: so v is the smallest concave based on martingales and a functional method
function above u, and we recover the cavu theo- based on approximation schemes for viscosity
rem (Theorem 1). solutions of Hamilton Jacobi equations.
Finally, the existence of the uniform value has
been generalized to the case where Player 1 con- 10. The operator approach for zero-sum games
trols the transitions and is more informed than
player 2 (but player 1 does not necessarily observe Repeated games with incomplete information,
the current state) in Gensbittel et al. (2014). as well as stochastic games, can also be studied in
a functional analysis setup called the operator
8. Symmetric information approach. This general approach is based on the
study of the recursive formula (Laraki 2001b;
Another model deals with the symmetric case, Rosenberg and Sorin 2001; Sorin 2002).
where the players have an incomplete, but identi-
cal, knowledge of the selected state. After each 11. Uncertain duration
stage, they receive the same signal, which may
depend on the state. A. Neyman and S. Sorin have One can consider zero-sum repeated games
proved the existence of equilibrium payoffs in the with incomplete information on both sides and
case of two players (see Neyman and Sorin 1998, uncertain duration. In these games, the payoff to
the zero-sum case being solved in Forges 1982; the players is the sum of their stage payoffs, up to
Kohlberg and Zamir 1974). some stopping time y which may depend on plays,
This result does not extend to the case where divided by the expectation of y. Theorem 8 here
the stage evolves from stage to stage, i.e., to generalizes to the case of public uncertain dura-
stochastic games with incomplete information. tion process
p(as E ðyÞ ! 1, with a convergence
ffiffiffi
In the zero-sum symmetric information case in O 1=E y , see Neyman and Sorin (2010).
where at the end of each stage, the players observe
The situation is different if one allows for private
both actions but receive no further information on
uncertain duration processes: any number
the current state (hidden stochastic games), B.
between the maxmin cavIvexIIu(p, q) and the
Ziliotto provided in his PhD thesis an example
minmax vexIIcavIu(p, q) is the value of a long
where limTvT and limlvl may fail to exist
finitely repeated game GT where players’ informa-
(Ziliotto 2016).
tion about the uncertain number of repetitions T is
One can also consider zero-sum general
asymmetric (Neyman 2012).
repeated games with payoffs defined by the
expectation of a Borel function over plays. In the 12. Frequent actions
public case where he players have the same infor-
mation at the end of every stage, the value exists One can consider a repeated game with incom-
(Gimbert et al. 2016). plete information and fixed discount factor, where
the time span between two consecutive stages is
9. Continuous-time approach 1/n. In the context of zero-sum Markov chain
games of lack of information on one side,
A continuous time approach can also be used Cardaliaguet et al. (2016) show the existence of
to prove convergence results in general zero-sum a limit value when n goes to +1; this value is
repeated games, and in particular Theorem 7, characterized through an auxiliary stochastic opti-
embedding the discrete repeated game into a con- mization problem and, independently, as the solu-
tinuous time game and using viscosity solution tion of an Hamilton-Jacobi equation.
Repeated Games with Incomplete Information 181
13. Repeated market games with incomplete the unpublished manuscript Koren 1992), but uni-
information form equilibria may fail to exist even though both
players known their own payoffs.
De Meyer and Moussa Saley studied the
modelization via Brownian motions in financial 16. More than 2 players
models (de Meyer and Moussa Saley 2003). They
introduced a marked game based on a repeated Few papers study the case of more than
game with lack of information on one side and 2 players. The existence of uniform equilibrium
showed the endogenous apparition of a Brownian has been studied for 3 players and lack of infor-
motion (see also de Meyer and Marino 2004 for mation on one side (Renault 2001a), and in the
incomplete information on both sides, and de case of two states of nature it appears that a
Meyer 2010). completely revealing equilibria, or a joint plan
equilibria by one of the informed players, always
14. Cheap-talk and communication exists. Concerning n-player repeated games with
incomplete information and signals, several
In the nonzero-sum setup of section “Non papers study how the initial information can be
Zero-Sum Games with Lack of Information on strategically transmitted, independently of the
One Side,” it is interesting to study the number payoffs (Renault 2001b; Renault et al. 2014;
of communication stages which is needed to con- Renault and Tomala 2004, 2008), with crypto-
struct the different equilibria. This number is graphic considerations. As an application, the
linked with the convergence of the associated existence of completely revealing equilibria, i.e.,
bimartingales (see Aumann and Hart 1986; equilibria where each player eventually learns the
Forges 1984, 1990; Aumann and Maschler state with probability one, is obtained in particular
1995). Let us mention also that F. Forges (1988) cases (see also Ḧorner et al. 2011 for the related
gave a similar characterization of equilibrium notion of “belief-free” equilibria).
payoffs, for a larger notion of equilibria called
communication equilibria (see also Forges 1985 17. Perturbations of repeated games with com-
for correlated equilibria). Amitai (1996b) studied plete information
the set of equilibrium payoffs in case of lack of
information on both sides. Aumann and Hart Repeated games with incomplete information
(2003) characterized the equilibrium payoffs in have been used to study perturbations of repeated
two player games with lack of information on games with complete information (see Cripps and
one side when long, payoff-irrelevant, preplay Thomas 2003; Fudenberg and Maskin 1986) for
communication is allowed (see Amitai 1996a for Folk theorem-like results (Aumann and Sorin
incomplete information on both sides). 1989), for enforcing cooperation in games with a
Paretodominant outcome, and (Israeli 2010) for a
15. Known own payoffs perturbation with known own payoffs). The case
where the players have different discount factors
The particular nonzero-sum case where each has also been investigated (Cripps and Thomas
player knows his own payoffs is particularly 2003; Lehrer and Yariv 1999).
worthwhile studying. In the two-player case with
lack of information on one side, this amounts to
say that player 2’s payoffs do not depend on the Future Directions
selected state. In this case, Shalev (1994) showed
that any equilibrium payoff can be obtained as the Several open problems are well formulated and
payoff of an equilibrium which is completely deserve attention. Does a uniform equilibrium
revealing. This result generalizes to the nonzero- always exist in two-player repeated games with
sum case of lack of information of both sides (see lack of information on one side and general
182 Repeated Games with Incomplete Information
signaling or in n-player repeated games with lack de Meyer B (1996b) Repeated games, duality and the
of information on one side? Does the limit value central limit theorem. Math Oper Res 21:237–251
de Meyer B (1998) The maximal variation of a bounded
always exist in zero-sum repeated games with martingale and the central limit theorem. Ann Inst
incomplete information and signals? More concep- Henri Poincaŕe Probab Stat 34:49–59
tually, one should look for classes of n-player de Meyer B (1999) From repeated games to Brownian
repeated games with incomplete information games. Annales de l’Institut Henri Poincaŕe, Pro-
babilit́es et statistiques 35:1–48
which allow for the existence of equilibria, and/or de Meyer B (2010) Price dynamics on a stock market with
for a tractable description of equilibrium payoffs asymmetric information. Games Econom Behav 69:42–71
(or at least of some of these payoffs). Regarding de Meyer B, Marino A (2004) Repeated market games with
applications, there is certainly a lot of room in the lack of information on both sides. DP 2004.66, MSE
Universit́e Paris I
vast fields of financial markets, cryptology, learn- de Meyer B, Marino A (2005) Duality and optimal strate-
ing, and sequential decision problems. gies in the finitely repeated zero-sum games with
incomplete information on both sides. DP 2005.27,
MSE Universit́e Paris I
de Meyer B, Moussa Saley H (2003) On the strategic origin
Bibliography of Brownian motion in finance. Int J Game Theory
31:285–319
Primary Literature de Meyer B, Rosenberg D (1999) “Cavu” and the dual
Amitai M (1996a) Cheap-talk with incomplete information game. Math Oper Res 24:619–626
on both sides. PhD thesis, The Hebrew University of Forges F (1982) Infinitely repeated games of incomplete
Jerusalem. http://ratio.huji.ac.il/dp/dp90.pdf information: symmetric case with random signals. Int
Amitai M (1996b) Repeated games with incomplete infor- J Game Theory 11:203–213
mation on both sides. PhD thesis, The Hebrew Univer- Forges F (1984) A note on Nash equilibria in repeated
sity of Jerusalem. http://ratio.huji.ac.il/dp/dp105.pdf games with incomplete information. Int J Game Theory
Aumann RJ (1964) Mixed and behaviour strategies in 13:179–187
infinite extensive games. In: Dresher M, Shapley LS, Forges F (1985) Correlated equilibria in a class of repeated
Tucker AW (eds) Advances in game theory. Annals of games with incomplete information. Int J Game Theory
Mathematics Study 52. Princeton University Press, 14:129–149
pp 627–650 Forges F (1988) Communication equilibria in repeated
Aumann RJ, Hart S (1986) Bi-convexity and games with incomplete information. Math Oper Res
bi-martingales. Israel J Math 54:159–180 13:191–231
Aumann RJ, Hart S (2003) Long cheap talk. Econometrica Forges F (1990) Equilibria with communication in a job
71:1619–1660 market example. Q J Econ 105:375–398
Aumann RJ, Sorin S (1989) Cooperation and bounded Foster D (1999) A proof of calibration via Blackwell’s
recall. Games Econom Behav 1:5–39 approachability theorem. Games Econom Behav 29:73–78
Blackwell D (1956) An analog of the minmax theorem for Foster D, Vohra R (1999) Regret in the on-line decision
vector payoffs. Pac J Math 65:1–8 problem. Games Econom Behav 29:7–35
Bressaud X, Quas A (2006) Dynamical analysis of a Fudenberg D, Maskin E (1986) The folk theorem in
repeated game with incomplete information. Math repeated games with discounting or with incomplete
Oper Res 31:562–580 information. Econometrica 54:533–554
Cardaliaguet P, Rainer C, Rosenberg D, Vieille N (2016) Gensbittel F (2015) Extensions of the Cav(u) theorem for
Markov games with frequent actions and incomplete repeated games with one-sided information. Math Oper
information? The limit case. Math Oper Res 41:49–71 Res 40(1):80–104
Cardaliaguet P, Laraki R, Sorin S (2012) A continuous time Gensbittel F, Renault J (2015) The value of Markov Chain
approach for the asymptotic value in two-person zero- Games with incomplete information on both sides.
sum repeated games. SIAM J Control Optim Math Oper Res 40(4):820–841
50:1573–1596 Gensbittel F, Oliu-Barton M, Venel X (2014) Existence of
Cesa-Bianchi N, Lugosi G (2006) Prediction, learning and the uniform value in repeated games with a more
games. Cambridge University Press, Cambridge informed controller. J Dynam Games 1(3):411–445
Cesa-Bianchi N, Lugosi G, Stoltz G (2006) Regret mini- Gimbert H, Renault J, Sorin S, Zielonka W (2016) On
mization under partial monitoring. Math Oper Res values of repeated games with signals. Ann Appl Pro-
31:562–580 bab 26:402–424
Cripps MW, Thomas JP (2003) Some asymptotic results in Harsanyi J (1967-68) Games with incomplete information
discounted repeated games of one-sided incomplete played by ‘Bayesian’ players, parts I-III. Manag Sci
information. Math Oper Res 28:433–462 8:159–182, 320–334, 486–502
de Meyer B (1996a) Repeated games and partial differen- Hart S (1985) Nonzero-sum two-person repeated games with
tial equations. Math Oper Res 21:209–236 incomplete information. Math Oper Res 10:117–153
Repeated Games with Incomplete Information 183
Hart S (2005) Adaptative Heuristics. Econometrica Mertens J-F (1998) The speed of convergence in repeated
73:1401–1430 games with incomplete information on one side. Int
Hart S, Mas-Colell A (2000) A simple adaptative proce- J Game Theory 27:343–357
dure leading to correlated equilibrium. Econometrica Mertens J-F, Zamir S (1971) The value of two-person zero-
68:1127–1150 sum repeated games with lack of information on both
Heuer M (1992) Optimal strategies for the uninformed sides. Int J Game Theory 1:39–64
player. Int J Game Theory 20:33–51 Mertens J-F, Zamir S (1976a) The normal distribution and
Ḧorner J, Lovo S, Tomala T (2011) Belief-free equilibria in repeated games. Int J Game Theory 5:187–197
games with incomplete information: characterization Mertens J-F, Zamir S (1976b) On a repeated game without
and existence. J Econ Theory (5):1770–1795 a recursive structure. Int J Game Theory 5:173–182
Ḧorner J, Rosenberg D, Solan E, Vieille N (2010) On a Mertens J-F, Zamir S (1977) A duality theorem on a pair of
Markov game with one-sided incomplete information. simultaneous functional equations. J Math Anal Appl
Oper Res 58:1107–1115 60:550–558
Israeli E (2010) Sowing doubt optimally in two-person Mertens J-F, Zamir S (1985) Formulation of Bayesian
repeated games. Games Econom Behav 28:203–216. analysis for games with incomplete information. Int
1999 J Game Theory 14:1–29
Kohlberg E (1975) Optimal strategies in repeated games Neyman A (2008) Existence of optimal strategies in Mar-
with incomplete information. Int J Game Theory 4:7–24 kov games with incomplete information. Int J Game
Kohlberg E, Zamir S (1974) Repeated games of incomplete Theory 37:581–596
information: the symmetric case. Ann Stat 2:40–41 Neyman A (2012) The value of two-person zero-sum
Koren G (1992) Two-person repeated games where players repeated games with incomplete information and
know their own payoffs, master thesis, Tel-Aviv Uni- uncertain duration. Int J Game Theory 41:95–207
versity. http://www.ma.huji.ac.il/hart/papers/koren.pdf Neyman A, Sorin S (1998) Equilibria in repeated games
Kuhn HW (1953) Extensive games and the problem of with incomplete information: the general symmetric
information. In: Kuhn and Tucker (eds) Contributions case. Int J Game Theory 27:201–210
to the theory of games, vol II. Annals of Mathematical Neyman A, Sorin S (2010) Repeated games with public
Studies 28. Princeton University Press, pp 193–216 uncertain duration processes. Int J Game Theory 39:29–52
Laraki R (2001a) Variational inequalities, system of func- Ponssard JP, Sorin S (1980) The LP formulation of finite
tional equations and incomplete information repeated zero-sum games with incomplete information. Int
games. SIAM J Control Optim 40:516–524 J Game Theory 9:99–105
Laraki R (2001b) The splitting game and applications. Int Renault J (2000) 2-player repeated games with lack of
J Game Theory 30:359–376 information on one side and state independent signal-
Laraki R (2002) Repeated games with lack of information ling. Math Oper Res 4:552–572
on one side: the dual differential approach. Math Oper Renault J (2001a) 3-player repeated games with lack of
Res 27:419–440 information on one side. Int J Game Theory 30:221–246
Lehrer E (2001) Any inspection is manipulable. Renault J (2001b) Learning sets in state dependent signal-
Econometrica 69:1333–1347 ling game forms: a characterization. Math Oper Res
Lehrer E (2003a) Approachability in infinite dimensional 26:832–850
spaces. Int J Game Theory 31:253–268 Renault J (2006) The value of Markov chain games with lack
Lehrer E (2003b) A wide range no-regret theorem. Games of information on one side. Math Oper Res 31:490–512
Econom Behav 42:101–115 Renault J (2012) The value of repeated games with an
Lehrer E, Solan E (2003) No regret with bounded compu- informed controller. Math Oper Res 37:154–179
tational capacity, DP 1373, Center for Mathematical Renault J, Tomala T (2004) Learning the state of nature in
Studies in Economics and Management Science, repeated games with incomplete information and sig-
Northwestern University nals. Games Econom Behav 47:124–156
Lehrer E, Solan E (2006) Excludability and bounded com- Renault J, Tomala T (2008) Probabilistic reliability and
putational capacity. Math Oper Res 31:637–648 privacy of communication using multicast in general
Lehrer E, Yariv L (1999) Repeated games with lack of neighbor networks. J Cryptol 21(2):250–279
information on one side: the case of different discount Renault J, Venel X (2017) A distance for probability spaces,
factors. Math Oper Res 24:204–218 and long-term values in Markov decision processes and
Marino A (2005) The value of a particular Markov chain repeated games. Math Oper Res 42(2):349–376
game. Chapters 5 and 6, PhD thesis, Universit́e Paris I, Renault J, Solan E, Vieille N (2013) Dynamic sender-
2005. http://alexandre.marino.free.fr/theseMarino.pdf receiver games. J Econ Theory 148:502–534
Mayberry J-P (1967) Discounted repeated games with Renault J, Renou L, Tomala T (2014) Secure message transmis-
incomplete information, Report of the U.S. Arms con- sion on directed networks. Games Econom Behav 85:1–18
trol and disarmament agency, ST116, chapter V, Rosenberg D (1998) Duality and Markovian strategies. Int
Mathematica, Princeton, pp 435–461 J Game Theory 27:577–597
Mertens J-F (1972) The value of two-person zero-sum repeated Rosenberg D, Sorin S (2001) An operator approach to
games: the extensive case. Int J Game Theory 1:217–227 zero- sum repeated games. Israel J Math 121:221–246
184 Repeated Games with Incomplete Information
Rosenberg D, Solan E, Vieille N (2004) Stochastic games Waternaux C (1983) Solution for a class of repeated games
with a single controller and incomplete information. without recursive structure. Int J Game Theory
SIAM J Control Optim 43:86–110 12:129–160
Rustichini A (1999) Minimizing regret: the general case. Zamir S (1971) On the relation between finitely and infi-
Games Econom Behav 29:224–243 nitely repeated games with incomplete information. Int
Shalev J (1994) Nonzero-sum two-person repeated games J Game Theory 1:179–198
with incomplete information and known-own payoffs. Zamir S (1973) On repeated games with general informa-
Games Econom Behav 7:246–259 tion function. Int J Game Theory 21:215–229
Simon RS (2002) Separation of joint plan equilibrium Ziliotto B (2016) Zero-sum repeated games: counterexam-
payoffs from the min-max functions. Games Econom ples to the existence of the asymptotic value and the
Behav 1:79–102 conjecture maxmin=limvn. Ann Probab 44:1107–1133
Simon RS, Spież S, Toruńczyk H (1995) The existence of
equilibria in certain games, separation for families of
convex functions and a theorem of Borsuk- Ulam type. Books and Reviews
Israel J Math 92:1–21 Aumann RJ, Maschler M (1995) Repeated games with
Simon RS, Spież S, Toruńczyk H (2002) Equilibrium incomplete information, with the collaboration of
existence and topology in some repeated games with R.E. Stearns. M.I.T. Press, 1995 (contains a reedition of
incomplete information. Trans AMS 354:5005–5026 chapters of Reports to the U.S. Arms Control and Disar-
Simon RS, Spież S, Toruńczyk H (2008) Equilibria in a mament Agency ST-80, 116 and 143, Mathematica,
class of games and topological results implying their 1966-1967-1968)
existence. Rev R Acad Cien Serie A Mat 102:161–179 Forges F (1992) Repeated games of incomplete informa-
Sion M (1958) On general minimax theorems. Pac J Math tion: non-zero sum. In: Aumann RJ, Hart S (eds)
8:171–176 Handbook of game theory, vol I. Elsevier, North-
Sorin S (1983) Some results on the existence of Nash Holland, pp 155–177
equilibria for non- zero sum games with incomplete Laraki R, Sorin S (2014) Chapter 2: Advances in zero-sum
information. Int J Game Theory 12:193–205 dynamic games. In: Zamir S, Young P (eds) Handbook of
Sorin S (1984a) Big match with lack of information on one game theory, vol IV. Elsevier, North-Holland, pp 27–93
side (Part I). Int J Game Theory 13:201–255 Laraki R, Renault J, Tomala T (2006) Th́eorie des Jeux,
Sorin S (1984b) On a pair of simultaneous functional Introduction à la th́eorie des jeux ŕeṕet́es. Editions de
equations. J Math Anal Appl 98:296–303 l’Ecole Polytechnique, jourńees X-UPS 2006. ISBN:
Sorin S (1989) On recursive games without a recursive 978-2-7302-1366-0, in French (Chapter 3 deals with
structure: existence of limvn. Int J Game Theory repeated games with incomplete information)
18:45–55 Mertens J-F (1987) Repeated games. In: Proceedings of the
Sorin S (1997) Merging, reputation, and repeated games international congress of mathematicians, Berkeley 1986.
with incomplete information. Games Econom Behav American Mathematical Society, Dordrecht, pp 1528–1577
29:274–308 Mertens J-F, Sorin S, Zamir S (1994) Repeated games.
Sorin S, Zamir S (1985) A 2-person game with lack of CORE discussion paper 9420-9422
information on 1 and 1/2 sides. Math Oper Res Sorin S (2002) A first course on zero-sum repeated games.
10:17–23 Math́ematiques et Applications. Springer-Verlag Berlin
Spinat X (2002) A necessary and sufficient condition for Heidelberg
approachability. Math Oper Res 27:31–44 Zamir S (1992) Repeated games of incomplete information:
Vieille N (1992) Weak approachability. Math Oper Res zero-sum. In: Aumann RJ, Hart S (eds) Handbook of
17:781–791 game theory, vol I. Elsevier, North-Holland, pp 109–154
Perfect monitoring Past actions of all players
Reputation Effects are public information.
Repeated game The finite or infinite repetition
George J. Mailath of a stage game.
Department of Economics, University of Reputation bound The lower bound on equilib-
Pennsylvania, Philadelphia, USA rium payoffs of a player that the other player
(s) believe may be a simple action type
(typically the Stackelberg type).
Article Outline Short-lived player Player not subject to
intertemporal incentives, having a one-period
Glossary horizon and so is myopically optimizing.
Definition of the Subject Simple action type An action who plays the
Introduction same (pure or mixed) stage-game action in
A Canonical Model every period, regardless of history.
Two Long-Lived Players Stackelberg action In a stage game, the action a
Future Directions player would commit to, if that player had the
Bibliography chance to do so, i. e., the optimal commitment
action.
Glossary Stackelberg type A simple action type that plays
the Stackelberg action.
Action type A type of player who is committed Stage game A game played in one period.
to playing a particular action, also called a Subgame perfect equilibrium A strategy pro-
commitment type or behavioral type. file that induces a Nash equilibrium on every
Complete information Characteristics of all subgame of the original game.
players are common knowledge. Subgame In a repeated game with perfect mon-
Flow payoff Stage game payoff. itoring, the game following any history.
Imperfect monitoring Past actions of all players Type The characteristic of a player that is not
are not public information. common knowledge.
Incomplete information Characteristics of
some player are not common knowledge.
Long-lived player Player subject to Definition of the Subject
intertemporal incentives, typically has the
same horizon as length of the game. Repeated games have many equilibria, including
Myopic optimum An action maximizing stage the repetition of stage game Nash equilibria. At
game payoffs. the same time, particularly when monitoring is
Nash equilibrium A strategy profile from which imperfect, certain plausible outcomes are not con-
no player has a profitable unilateral deviation sistent with equilibrium. Reputation effects is the
(i. e., it is self-enforcing). term used for the impact upon the set of equilibria
Nash reversion In a repeated game, permanent (typically of a repeated game) of perturbing the
play of a stage game Nash equilibrium. game by introducing incomplete information of a
Normalized discounted value The discounted particular kind. Specifically, the characteristics of
sum of an infinite sequence {at}t 0, calculated a player are not public information, and the other
as (1 d) t 0dtat, where d (0, 1) is the players believe it is possible that the distinguished
discount value. player is a type that necessarily plays some action
(typically the Stackelberg action). Reputation sufficiently large (though finite) number of
effects fall into two classes: “Plausible” phenom- times, the two players would find a way to play
ena that are not equilibria of the original repeated cooperatively (C) at least in the initial stages. In
game are equilibrium phenomena in the presence response, (Kreps et al. 1982) argued that intuition
of incomplete information, and “implausible” can be rescued in the finitely repeated prisoners’
equilibria of the original game are not equilibria dilemma by introducing incomplete information.
of the incomplete information game. As such, In particular, suppose each player assigns some
reputation effects provide an important qualifica- probability to their opponent being a behavioral
tion to the general indeterminacy of equilibria. type who mechanistically plays tit-for-tat (i. e.,
plays C in the first period or if the opponent had
played C in the previous period, and plays D if the
opponent had played D in the previous period)
Introduction
rather than being a rational player. No matter how
small the probability, if the number of repetitions
Repeating play of a stage game often allows for
is large enough, the rational players will play C in
equilibrium behavior inconsistent with equilib-
early periods, and the fraction of periods in which
rium of that stage game. If the stage game has
CC is played is close to one.
multiple Nash equilibrium payoffs, a large finite
This is the first example of a reputation effect: a
number of repetitions provide sufficient
small degree of incomplete information (of the
intertemporal incentives for behavior inconsistent
right kind) both rescues the intuitive CC for
with stage-game Nash equilibria to arise in some
many periods as an equilibrium outcome, and
subgame perfect equilibria. However, many clas-
eliminates the unintuitive always DD as one. In
sic games do not have multiple Nash equilibria.
the same issue of the Journal of Economic Theory
For example, mutual defection DD is the unique
containing (Kreps et al. 1982), Kreps and
Nash equilibrium of the prisoners’ dilemma, illus-
Wilson(1982) and Milgrom and Roberts (1982)
trated in Fig. 1.
explored reputation effects in the finite chain store
A standard argument shows that the finitely
of Selten (1978), showing that intuition is again
repeated prisoner’s dilemma has a unique sub-
rescued, this time by introducing the possibility
game perfect equilibrium, and in this equilibrium,
that the chain store is a “tough” type who always
DD is played in every period: In any subgame
fights entry.
perfect equilibrium, in the last period, DD must
Reputation effects describe the impact upon the
be played independently of history, since the stage
set of equilibria of the introduction of small
game has a unique Nash equilibrium. Then, since
amounts of incomplete information of a particular
play in the last period is independent of history,
form into repeated games (and other dynamic
there are no intertemporal incentives in the penul-
games). Reputation effects fall into two classes:
timate period, and so DD must again be played
“Plausible”phenomena that are not equilibria of
independently of history. Proceeding recursively,
the complete information game are equilibrium
DD must be played in every period independently
phenomena in the presence of incomplete infor-
of history. (In fact, the finitely repeated prisoners’
mation, and “implausible” equilibria of the com-
dilemma has a unique Nash equilibrium outcome,
plete information game are not equilibria of the
given by DD in every period.)
incomplete information game.
This contrasts with intuition, which suggests
Reputation effects are distinct from the equi-
that if the prisoners’ dilemma were repeated a
librium phenomenon in complete information
repeated games that are sometimes described as
Reputation Effects, capturing reputations. In this latter use, an equi-
Fig. 1 The prisoners’ C D
dilemma. The cooperative C 2, 2 –1, 3 librium of the complete information repeated
action is labeled C, while D 3, –1 0, 0 game is selected, involving actions along the equi-
defect is labeled D librium path that are not Nash equilibria of the
Reputation Effects 187
payoffs in the infinite horizon game are compara- the normal type (since the action type plays H in
ble to flow payoffs). However, there are many every period). In such an equilibrium, customers
other equilibria, including one in which low effort play h before t (since both types of firm are choos-
is exerted and low price purchased in every ing H). After observing H in period t, customers
period, leading to a payoff of 1 for the long-lived conclude the firm is the H-action type. Conse-
player. Indeed, for d 1 / 2, the set of pure- quently, as long as H is always chosen thereafter,
strategy subgame-perfect-equilibrium player customers subsequently play h (since they con-
1 payoffs is given by the entire interval (Abreu tinue to believe the firm is the H-action type, and
and Gul 2000; Benabou and Laroque 1992). so necessarily plays H). An easy lower bound on
Reputation effects effectively rule out any pay- the normal firm’s equilibrium payoff is then
off less than 2 as an equilibrium payoff for player obtained by observing that the normal firm’s pay-
1. Suppose customers are not entirely certain of off must be at least the payoff from mimicking the
the characteristics of the firm. More specifically, action type in every period. The payoff from such
suppose they attach high probability to the firm’s behavior is at least as large as
being “normal,” that is, having the payoffs given
above, but they also entertain some (possibly very X
t1
small) probability that they face a firm who fortu- ð 1 dÞ dt 2
itously has a technology or some other character- t¼0
|fflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflffl}
istic that ensures high effort. Refer to the latter as payoff in t<t from pooling
the “H-action” type of firm. Since such a type with Haction type
t
necessarily plays H in every period, it is a type þ ð1 dÞd 0
described by behavior (not payoffs), and such a |fflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflffl}
payoff in t from playing H when
type is often called a behavioral or L may be myopically optimal
commitment type.
X
1
This is now a game of incomplete information, þ ð1 dÞ dt 2
with the customers uncertain of the firm’s type. t¼tþ1
|fflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflffl}
Since the customers assign high probability to the
payoff in t>t from playing like
firm being “normal,” the game is in some sense and being treated as the Haction type
close to the game of complete information. None
the less, reputation effects are present: For a suf- ¼ ð1 dt Þ2 þ dtþ1 2 ¼ 2 2dt ð1 dÞ
ficiently patient firm, in any Nash equilibrium of 2 2ð1 dÞ ¼ 2d:
the repeated game, the firm’s payoff cannot be
significantly less than 2. This result holds no mat- The outcome in which the stage game Nash
ter how unlikely customers think the H-action equilibrium L‘ is played in every period is thus
type to be, though increasing patience is required eliminated.
from the normal firm as the action type becomes Since reputation effects are motivated by the
less likely. hypothesis that the short-lived players are uncer-
The intuition behind this result is most easily tain about some aspect of the long-lived player’s
seen by considering pure strategy Nash equilibria characteristics, it is important that the results are
of the incomplete information game where the not sensitive to the precise nature of that uncer-
customers believe the firm is either the normal or tainty. In particular, the lower bound on payoffs
the H-action type. In that case, there is no pure should not require that the short-lived players only
strategy Nash equilibrium with a payoff less than assign positive probability to the normal and the H-
2d (which is clearly close to 2 for d close to 1). In action type (as in the game just analyzed). And it
the pure strategy Nash equilibrium, either the firm does not: The customers in the example may assign
always plays H, (in which case, the customers positive probability to the firm being an action type
always play h and the firm’s payoff is 2), or that plays H on even periods, and L on odd periods,
there is a first period (say t) in which the firm as well as to an action type that plays H in every
plays L, revealing to future customers that he is period before some period t 0(that can depend on
Reputation Effects 189
history), and then always plays L. Yet, as long as when always observing H, the action of the H-
the customers assign positive probability to the H- action type. In contrast, imperfect monitoring
action type, for a sufficiently patient firm, in any requires consideration of belief evolution on all
Nash equilibrium of the repeated game, the firm’s histories that arise with positive probability.
payoff cannot be significantly less than 2. None the less, the intuition is the same: Con-
Reputation effects are more powerful in the sider a putative equilibrium in which the normal
presence of imperfect monitoring. Suppose that firm receives a payoff less than 2 e. Then the
the firm’s choice of H or L is not observed by the normal and action types must be making different
customers. Instead, the customers observe a pub- choices over the course of the repeated game, since
lic signal y {y, y} at the end of each period, an equilibrium in which they behave identically
where the signal y is realized with probability p would induce customers to choose h and would
(0, 1) if the firm chose H, and with the smaller yield a payoff of 2. As in the perfect monitoring
probability q (0, p) if the firm chose L. Interpret case, the normal firm has the option of mimicking
y as a good meal: while customers do not observe the behavior of the H-action type. Suppose the
effort, they do observe a noisy signal (the quality normal firm does so. Since the customers expect
of the meal) of that effort, with high effort leading the normal type of firm to behave differently from
to a good meal with higher probability. In the the H-action type, they will more often see signals
game with complete information, the largest equi- indicative of the H-action type (rather than the
librium payoff to the firm is now given by normal type), and so must eventually become con-
vinced that the firm is the H-action type. Hence, in
1p response to this deviation, the customers will even-
v1 2 , (1)
pq tually play their best response to H of h. While
“eventually” may take a while, that time is inde-
reflecting the imperfect monitoring of the firm’s pendent of the equilibrium (indeed of the discount
actions (the firm is said to be subject to binding factor), depending only on the imperfection in the
moral hazard, see Sect. 7.6 in Mailath and monitoring and the prior probability assigned to the
Samuelson (2006)). Since deviations from H-action type. Then, if the firm is sufficiently
H cannot be detected for sure, there are no equi- patient, the payoff from mimicking the H-action
libria with the deterministic outcome path of Hh in type is arbitrarily close to 2, contradicting the exis-
every period. In some periods after some histories, tence of an equilibrium in which the firm’s payoff
L‘ must be played in order to provide the appro- fell short of 2 e.
priate intertemporal incentives to the firm. At the same time, because monitoring is imper-
As under perfect monitoring, as long as cus- fect, as discussed in section “Temporary Reputa-
tomers assign positive probability to the H-action tion Effects,” the reputation effects are necessarily
type in the incomplete information game with transient. Under general conditions in imperfect-
imperfect monitoring, for a sufficiently patient monitoring games, the incomplete information
firm, in any Nash equilibrium of the repeated that is at the core of reputation effects is a short-
game, the firm’s payoff cannot be significantly run phenomenon. Player 2 must eventually come
less than 2 (in particular, this lower bound exceeds to learn player 1’s type and continuation play must
vˉ 1). Thus, in this case, reputation effects provide converge to an equilibrium of the complete
an intuitive lower bound on equilibrium payoffs information game.
that both rules out “bad” equilibrium payoffs, as Reputation effects arise for very general spec-
well as rescues outcomes in which Hh occurs in ifications of the incomplete information as long as
most periods. the customers assign strictly positive probability
Proving that a reputation bound holds in the to the H-action type. It is critical, however, that the
imperfect monitoring case is considerably more customers do assign strictly positive probability to
involved than in the perfect monitoring case. In the H-action type. For example, in the product-
perfect-monitoring games, it is only necessary to choice game, the set of Nash equilibria of the
analyze the evolution of the customers’ beliefs repeated game is not significantly impacted by
190 Reputation Effects
the possibility that the firm is either normal or the particular, it is not necessary that the actions of
L-action type only. While reputation effects per se player 2 be public. If these are also imperfectly
do not arise from the L-action type, it is still of monitored, then the ex post payoff for player 1 is
interest to investigate the impact of such uncer- independent of player 2 actions. Since player 2 is
tainty on behavior using stronger equilibrium short-lived, when player 2’s actions are not public,
notions, such as Markov perfection (see Mailath it is then natural to also assume that the period
and Samuelson (2001)). t player 2 does not know earlier player 2’s actions.
denoted by y is drawn from a finite set Y, with ðY AÞt and the set of public histories (which
the probability that y is realized under the pure coincides with the set of player 2’s histories) is H
t
action profile a A A1 A2 denoted by r(y j U1 t¼0 ðY A2 Þ . If the game has perfect monitor-
a). Player 1’s ex post payoff from the action pro- ing, histories h = (y0, a0; y1, a1; . . .; yt1, at 1) in
file a and signal realization y is r1(y, a), and so the which y 6¼ at1 for some t t 1 arise with zero
ex ante (or expected) flow payoff is u1(a) yr1(y, probability, independently of behavior, and so can
a)r(y j a). Player 2’s ex post payoff from the be ignored. A strategy s1 for player 1 specifies
action profile a and signal realization y is r2(y, a probability distribution over 1’s pure action
a2), and so the ex ante (or expected) flow payoff set for each possible private history, i. e., s1 : H1
is u2(a) yr2(y, a2)r(y j a). Since player 2’s ex ! D(A1). A strategy s2 for player 2 specifies a
post payoff is independent of player 1’s actions, probability distribution over 2’s pure action set for
player 1’s actions only affect player 2’s payoffs each possible public history, i. e., s2: H ! D(A2)..
through the impact on the distribution of the sig-
nals and so on ex ante payoffs. While the ex post Definition 1 The strategy profile ( s1 , s2 ) is a
payoffs ri play no explicit role in the analysis, they Nash equilibrium if
justify the informational assumptions to be made.
In particular, the model requires that histories of 1. there does not exist a strategy s1 yielding a
signals and past actions are the only information strictly higher payoff for player 1 when player
players receive, and so it is important that stage 2 plays s2 , and
game payoffs ui are not informative about the 2. in all periods t, after any history ht H arising
t
action choice (and this is the critical feature deliv- with positive probability
t under (st 1 , s2 ), s2 (h )
ered by the assumptions that ex ante payoffs are maximizes E u2 s1 h1 , a1 j h , where the
not observable and that payer 2’s ex post payoffs expectation is taken over the period t-private
do not depend on a 1). histories that player 1 may have observed.
Perfect monitoring is the special case where The Incomplete Information Repeated Game
Y = A1 and r(y j a) = 1 if y = a1, and 0 otherwise. In the incomplete information game, the type of
The results in this section hold under signifi- player 1 is unknown to player 2. A possible type
cantly weaker monitoring assumptions. In of player 1 is denoted by x X, where X is a finite
Reputation Effects 191
or countable set (see Fudenberg and Levine Definition 2 The strategy profile ( s1 , s2 ) is a
(1992) for the uncountable case). Player 2’s prior Nash equilibrium of the incomplete information
belief about 1’s type is given by the distribution m, game if
with support X. The set of types is partitioned into
a set of payoff types X1, and a set of action types 1. for all x X1, there does not exist a repeated
X2 X∖X1. Payoff types maximize the average game strategy s1 yielding a strictly higher
discounted value of payoffs, which depend on payoff for payoff type x of player 1 when
their type and which may be nonstationary, player 2 plays s2 , and
2. in all periods t, after any history ht H arising
u1 : A1 A2 X1 N0 ! R: with positive probability under (s1 , s2 ) and m,
s2 (ht) maximizes E u2 s1 ht1 , x , a1 j ht ,
Type x0 X1 is the normal type of player 1, where the expectation is taken over both the
who happens to have a stationary payoff function, period t-private histories that player 1 may
given by the stage game in the benchmark game of have observed and player 1’s type.
complete information,
Example 1 Consider the product-choice game
u1 ða, x0 , tÞ ¼ u1 ðaÞ 8a A, 8t N0 : (Fig. 2) under perfect monitoring. The firm is will-
ing to commit to H to induce h from customers. This
It is standard to think of the prior probability incentive to commit is best illustrated by consider-
m(x0) as being relatively large, so the games of ing a sequential version of the product-choice game:
incomplete information are a seemingly small The firm first publicly commits to an effort, and then
departure from the underlying game of complete the customer chooses between h and ‘, knowing the
information, though there is no requirement that firm’s choice. In this sequential game, the firm
this be the case. chooses H in the unique subgame perfect equilib-
Action types (also called commitment or behav- rium. Since Stackelberg (1934) was the first inves-
ioral types) do not have payoffs, and simply play a tigation of such leader-follower interactions, it is
specified repeated game strategy. For any repeated- traditional to call H the Stackelberg action, and the
game strategy from the complete information game, H-action type of player 1 the Stackelberg type, with
^ 1 : H1 ! DðA1 Þ, denote by x(^
s s 1 ) the action type associated Stackelberg payoff 2. Suppose X = {x0,
committed to the strategy s ^ 1. In general, a commit- x(H), x(L)}. For d 1 / 2, the grim trigger strategy
ment type of player 1 can be committed to any profile of always playing Hh, with deviations
strategy in the repeated game. If the strategy in punished by Nash reversion, is a subgame perfect
question plays the same (pure or mixed) stage- equilibrium of the complete information game. Con-
game action in every period, regardless of history, sider the following adaptation of this profile in the
that type is called a simple action type. For example, incomplete information game:
the H-action type in the product-choice game is a
simple action type. The (simple action) type that ht , xÞ
s1 ð8
plays the pure action a 1 in every period is denoted < H, if x ¼ xðH Þ, or x ¼ x0 and
by x(a1) and similarly the simple action type com- ¼ at ¼ Hh for all t < t,
mitted to a1 D(A1) is denoted by x(a1). As will be :
L, otherwise,
seen soon, allowing for mixed action types is an t h, if at ¼ Hh for all t < t,
and s2 ðh Þ ¼
important generalization from simple pure types. ‘, otherwise:
A strategy for player 1, also denoted by s1 :
H1 X ! D(A1), specifies for each type x X a In other words, player 2 and the normal type of
repeated game strategy such that for all x(^ s1 ) player 1 follow the strategies from the Nash-rever-
X2, the strategy s ^ 1 is specified. A strategy s2 for sion equilibrium in the complete information
player 2 is as in the complete information game, game, and the action types x(H) and x(L) play
i. e., s2: H ! D(A2). their actions.
192 Reputation Effects
This is a Nash equilibrium for d 1 / 2 and game, for example, a commitment by player 1 to
m(x(L)) < 1 / 2. The restriction on m(x(L)) ensures mixing between H and L, with slightly larger
that player 2 finds h optimal in period 0. Should probability on H, still induces player 2 to choose
player 2 ever observe L, then Bayes’ rule causes h and gives player 1 a larger payoff than a com-
her to place probability 1 on type x(L) (if L is mitment to H. Define the mixed-action
observed in the first period) or the normal type Stackelberg payoff as
(if L is first played in a subsequent period), making
her participation in Nash reversion optimal. The
v
1 sup min u1 ða1 , a2 Þ, (3)
restriction on d ensures that Nash reversion pro- a1 DðA1 Þ a2 Bða1 Þ
vides sufficient incentive to make H optimal for the
normal player 1. After observing a01 ¼ H in period
0, player 2 assigns zero probability to x = x(L). where Bða1 Þ ¼ arg maxa2 u2 ða1 , a2 Þ is the set of
However, the posterior probability that 2 assigns to player 2’s best responses to a. In the product
the Stackelberg type does not converge to 1. In choice game, v1 ¼ 2, while v 1 ¼ 5=2. Typically,
period 0, the prior probability is m(x(H)). After the supremum is not achieved by any mixed
one observation of H, the posterior increases to action, and so there is no mixed-action
m(x ) / [m(x ) + m(x0)], after which it is constant. Stackelberg type. However, there are mixed
By stipulating that an observation of H in a history action types that, if player 2 is convinced she is
in which L has previously been observed causes facing such a type, will yield payoffs arbitrarily
player 2 to place probability one on the normal type close to the mixed-action Stackelberg payoff.
of player 1, a specification of player 2’s beliefs that As with imperfect monitoring, simple mixed
is consistent with sequentiality is obtained. action types under perfect monitoring raise issues
As seen in the introduction, for d close to 1, of monitoring, since a deviation by the normal type
s1(ht, x0) = L for all ht is not part of any Nash from the distribution a1 of a mixed action type
equilibrium. x(a1), to some action in the support cannot be
detected. However, when monitoring of the pure
The Reputation Bound actions is perfect, it is possible to statistically detect
Which type would the normal type most like to be deviations, and this will be enough to imply the
treated as? Player 1’s pure-action Stackelberg appropriate reputation lower bound.
payoff is defined as When monitoring is imperfect, the public sig-
nals are statistically informative about the actions
v1 ¼ sup min u1 ða1 , a2 Þ, (2) of the long-lived player under the next assumption
a1 A a2 Bða1 Þ
(Lemma 1).
where Bða1 Þ ¼ arg maxa2 u2 ða1 , a2 Þ is the set of
player 2 myopic best replies to a. If the supremum Assumption 1 For all a2 A2, the collection of
is achieved by some action a1 , that action is an probability distributions {r(y j (a1, a2): a1 A1}
associated Stackelberg action, is linearly independent.
This assumption is trivially satisfied in the
a1 arg max min u1 ða1 , a2 Þ: perfect monitoring case. Reputation effects still
a1 A1 a2 Bða1 Þ
exist when this assumption fails, but the bounds
This is a pure action to which player 1 would are more complicated to calculate (see Fudenberg
commit, if player 1 had the chance to do so (and and Levine 1992 or Sect. 15.4.1 in Mailath and
hence the name “Stackelberg” action, see the dis- Samuelson 2006).
cussion in Example 1), given that such a commit- Fixing an action for player 2, a2, the mixed action
P
ment induces a best response from player 2. If a1 implies the signal distribution a1 rðy j ða1 , a2 ÞÞ
there is more than one such action for player 1, a1 ða1 Þ.
the action can be chosen arbitrarily.
However, player 1 would typically prefer to Lemma 1 Suppose r satisfies Assumption 1.
commit to a mixed action. In the product-choice Then, if for some a2,
Reputation Effects 193
X
rðy j ða1 , a2 ÞÞa1 ða1 Þ that plays the Stackelberg action a1 with proba-
a1 bility 1, Eq. 5 becomes
X
¼ rðy j ða1 , a2 ÞÞa1 ða1 Þ, 8y, (4)
a1 v1 ðx0 , m, dÞ ð1 ÞdK v1
þ 1 ð1 ÞdK min u1 ðaÞ
then a1 = a0 1. aA
K
v1 1 ð1 Þd 2M,
Proof Suppose 4 holds for some a 2. Let R denote
the | Y | | A1 | matrix whose y-a 1 element is given where M maxa | u1 (a)|. This last expression is at
by r(y j (a1, a2)) (so that the a 1-column is the least as large as v1 e when < e / (2 M) and d is
probability distribution on Y implied by the action sufficiently close to 1. The mixed action
profile a a). Then, 4 can be written as Ra1 = Ra0 1, Stackelberg reputation bound is also covered:
or more simply as R(a1 a01) = 0. By Assump-
tion 1, R has full column rank, and so x = 0 is the Corollary 1 Suppose r satisfies Assumption 1
only vector x RjA1 j solving Rx = 0. and m assigns positive probability to some
1
Consequently, if player 2 believes that the sequence of simple types x ak1 k¼1 with each
long-lived player’s behavior implies a distribution ak1 in D(A) satisfying
over the signals close to the distribution implied
by some particular action a0 1, then player 2 must v
1 ¼ lim min u ak1 , a :
k!1 a2 Bðak Þ
believe that the long-lived player’s action is also 1
can be viewed as a one-step ahead prediction of the Proof of Proposition 1 Fix > 0. From Lemma
signal y that will be realized conditional on the 1, by choosing c sufficiently small in Lemma 2,
history ht, P(y j ht). Let m ^ t(ht) = P(^x j ht) denote ^
with P-probability at least 1 , there are at most
the posterior probability after observing ht that the K periods in which the short-lived players are not
short-lived player assigns to the long-lived player best responding to ^a 1 .
having type ^ x . Note also that if the long-lived Since a deviation by the long-lived player to
player is the action type ^x, then the true probability the simple strategy of always playing ^a 1 induces
^ (y j ht) = r (y j (H, s2(ht))). Then,
of the signal y is P ^ the
the same distribution on public histories as P,
long-lived player’s expected payoff from such a
^ ðy j ht Þ
^ t ð ht Þ P
Pð y j h t Þ ¼ m deviation is bounded below by the right side of 5.
~ ðy j ht Þ:
^ t ð ht Þ Þ P
þ ð1 m
The key step in the proof of Proposition 1 is a Temporary Reputation Effects
statistical result on merging . The following lemma Under perfect monitoring, there are often pooling
essentially says that the short-lived players cannot equilibria in which the normal and some action
be surprised too many times. Note first that an type of player 1 behave identically on the equilib-
infinite public history h1 can be thought of as a rium path (as in Example 1). Deviations on the
sequence of ever longer finite public histories ht . part of the normal player 1 are deterred by the
Consider the collection of infinite public histories prospect of the resulting punishment. Under
with the property that player 2 often sees histories imperfect monitoring, such pooling equilibria do
ht that lead to very different one-step ahead pre- not exist. The normal and action types may play
dictions about the signals under P ~ and under P
^ and identically for a long period of time, but the nor-
have a “low” posterior that the long-lived player is mal type always eventually has an incentive to
^x. The lemma asserts that if the long-lived player cheat at least a little on the commitment strategy,
contradicting player 2’s belief that player 1 will
is in fact the action type ^x , this collection of
exhibit commitment behavior. Player 2 must then
infinite public histories has low probability. See-
eventually learn player 1’s type.
ing the signals more likely under ^x leads the short-
In addition to Assumption 1, disappearing rep-
lived players to increase the posterior probability
utation effects require full support monitoring.
on ^
x. The posterior probability fails to converge to
1 under P ^ only if the play of the types different
^ Assumption 2 For all a A, y Y, r(y j a) > 0.
from x leads, on average, to a signal distribution
This assumption implies that Bayes’ rule deter-
similar to that implied by ^x. For the purely statis-
mines the beliefs of player 2 about the type of
tical statement and its proof, see Section 15.4.2 in
player 1 after all histories.
(Mailath and Samuelson 2006).
Suppose there are only two types of player 1,
Lemma 2 For all , c > 0 and m † (0, 1], there the normal type x0 and a simple action type ^x ,
where ^x = x( ^a 1 ) for some ^a 1 D(A1). The
exists a positive integer K such that for all m(^x)
analysis is extended to many commitment types in
[m †, 1), for every strategy s1 : H1 X ! D(A1)
Section 6.1 in Cripps et al. (2004). It is convenient
and s2 : H ! D(A2),
to denote a strategy for player 1 as a pair of
functions e s 1 and e s 1 (so e s 1 (h1) = ^a 1 for all
^ 1
P h : j t 1 : ð1 m ~ ðy j ht Þ
^ t ðht ÞÞ max j P h1 H1 ), the former for the normal type and
y
the latter for the action type.
^ ðy j ht Þj cgj KÞ :
P (6) Recall that P D(O) is the unconditional
probability measure induced by the prior m, and
Note that the bound K holds for all strategy pro- the strategy profile ( s ^1 , e
s 1 , s2), while P^ is the
files (s1, s2) and all prior probabilities m(^x) [m†, measure induced by conditioning on x . Since ^
1). This allows us to bound equilibrium payoffs. {x0} = X ∖ { ^x }, P ~ is the measure induced by
Reputation Effects 195
conditioning on x0. That is, P ^ is induced by the probability to H and L. This indifference can be
strategy profile s ^ ¼ ðs^ 1 , s2 Þ and P ~ by exploited to construct an equilibrium in which
e
s ¼ ðes 1 , s2 Þ, describing how play evolves when (the normal) player 1 plays a0 1 after every history
player 1 is the commitment and normal type, (Section 7.6.2 in (Mailath and Samuelson 2006)).
respectively. This will still be an equilibrium in the game of
The action of the commitment type satisfies the incomplete information in which the commitment
following assumption. type plays a0, with the identical play of the normal
and commitment types ensuring that player
Assumption 3 Player 2 has a unique stage-game 2 never learns player 1’s type. In contrast, player
best response to ^a 1 (denoted by ^a 2 ) and 2 has a unique best response to any other mixture
a1, ^
^a ð^ a2Þ is not a stage-game Nash on the part of player 1. Therefore, if the commit-
equilibrium. ment type is committed to any mixed action other
^ 2 denote the strategy of playing the unique
Let s than a0 1, player 2 will eventually learn player
best response ^
a 2 to ^a 1 in each period independently 1’s type.
of history. Since ^a is not a stage-game Nash As in Proposition 1, a key step in the proof of
equilibrium, ðs^ 1, s
^ 2 Þis not a Nash equilibrium of Proposition 2 is a purely statistical result on
the complete information infinite horizon game. updating. Either player 2’s expectation (given
her history) of the strategy played by the normal
t
Proposition 2 ((Cripps et al. 2004)) Suppose the type E~ e
s 1 j ht , where E~ denotes expectation with
monitoring distribution r satisfies Assumptions 1 ~ is in the limit identical to the strategy
respect to P)
and 2, and the commitment action ^a 1 satisfies played by the action type ( ^a 1 ), or player 2’s
Assumption 3. In any Nash equilibrium of the posterior probability that player 1 is the action
game with incomplete information, the posterior type (^m (ht)) converges to zero (given that player
probability assigned by player 2 to the commit- 1 is indeed normal). This is a merging argument
~ i. e.,
^ t , converges to zero under P,
ment type, m and closely related to Lemma 2. If the distribu-
tions generating player 2’s signals are different for
^ t ðht Þ ! 0,
m ~ a:s:
P the normal and action type, then these signals
provide information that player 2 will use in
The intuition is straightforward: Suppose there is updating her posterior beliefs about the type she
a Nash equilibrium of the incomplete information faces. This (converging, since beliefs are a mar-
game in which both the normal and the action type tingale) belief can converge to an interior proba-
receive positive probability in the limit (on a positive bility only if the distributions generating the
probability set of histories). On this set of histories, signals are asymptotically uninformative, which
player 2 cannot distinguish between signals gener- requires that they be asymptotically identical.
ated by the two types (otherwise player 2 could
ascertain which type she is facing), and hence Lemma 3 Suppose the monitoring distribution r
must believe that the normal and action types are satisfies Assumptions 1 and 2. Then in any Nash
playing the same strategies on average. But then equilibrium,
player 2 must play a best response to this strategy,
t
and hence to the action type. Since the action type’s ^ t max
^a ðaÞ E~ e
lim m s 1 ð a1 Þ j ht
x!1 a1
behavior is not a best response for the normal type
(to this player 2 behavior), player 1 must eventually ~ a:s:
¼ 0, P (7)
find it optimal to not play the action-type strategy,
contradicting player 2’s beliefs. Given Proposition 2, it should be expected that
Assumption 3 requires a unique best response continuation play converges to an equilibrium of
to ^
a 1 . For example, in the product-choice game, the complete information game, and this is indeed
every action for player 2 is a best response to the case. See Theorem 2 (Cripps et al. 2004) for
player 1’s mixture a0 1 that assigns equal the formal statement.
196 Reputation Effects
Proposition 2 leaves open the possibility that game. Eventually, however, such behavior must
for any period T, there may be equilibria in which give way to a regime in which player 2 is
uncertainty about player 1’s type survives beyond (correctly) convinced of player 1’s type.
T, even though such uncertainty asymptotically For any prior probability m^ that the long-lived
disappears in any equilibrium. This possibility player is the commitment type and for any e > 0,
cannot arise. The existence of a sequence of there is a discount factor d sufficiently large that
Nash equilibria with uncertainty about player 1’s player 1’s expected payoff is close to the commit-
type persisting beyond period T ! 1 would ment-type payoff. This holds no matter how small
imply the (contradictory) existence of a limiting ^ . However, for any fixed d and in any equilib-
m
Nash equilibrium in which uncertainty about rium, there is a time at which the posterior prob-
player 1’s type persists. ability attached to the commitment type has
dropped below the corresponding critical value
Proposition 3 ((Cripps et al. 2007)) Suppose the of m^ , becoming too small (relative to d) for repu-
monitoring distribution r satisfies Assumptions 1 tation effects to operate.
and 2, and the commitment action ^a 1 satisfies A reasonable response to the results on
Assumption 3. For all e > 0, there exists T such disappearing reputation effects is that a model of
that for any Nash equilibrium of the game with long-run reputations should incorporate some mech-
incomplete information, anism by which the uncertainty about types is con-
tinually replenished. For example, Holmström
~ ðm
P ^ t < e, 8t > T Þ > 1 e: (1982), (Cole et al. 1995), Mailath and Samuelson
(2001), and Phelan (2006) assume that the type of the
Example 2 Recall that in the product-choice long-lived player is governed by a stochastic process
game, the unique player 2 best response to H is to rather than being determined once and for all at the
play h, and Hh is not a stage-game Nash equilib- beginning of the game. In such a situation, reputation
rium. Proposition 1 ensures that the normal player effects can indeed have long-run implications.
1’s expected value in the repeated game of incom-
plete information with the H-action type is arbi-
trarily close to 2, when player 1 is very patient. In Reputation as a State
particular, if the normal player 1 plays H in every The posterior probability that short-lived players
period, then player 2 will at least eventually play assign to player 1 being ^x is sometimes interpreted
her best response of h. If the normal player as player 1’s reputation, particularly if ^x is the
1 persisted in mimicking the action type by playing Stackelberg type. When X contains only the nor-
H in each period, this behavior would persist indef- mal type and ^x , the posterior belief m ^ t is a state
initely. It is the feasibility of such a strategy that lies variable of the game, and attention is sometimes
at the heart of the reputation bounds on expected restricted to Markov strategies (i. e., strategies that
payoffs. However, this strategy is not optimal. only depend on histories through their impact on
Instead, player 1 does even better by attaching the posterior beliefs of the short-lived players). An
some probability to L, occasionally reaping the informative example is Benabou and Laroque
rewards of his reputation by earning a stage-game (1992), who study the Markov perfect equilibria
payoff even larger than 2. The result of such equi- of a game in which the uninformed players
librium behavior, however, is that player 2 must respond continuously to their beliefs. They show
eventually learn player 1’s type. The continuation that the informed player eventually reveals his
payoff is then bounded below 2 (recall (1)). type in any Markov perfect equilibrium. On the
Reputation effects arise when player 2 is uncer- other hand, Markov equilibria need not exist in
tain about player 1’s type, and there may well be a finitely repeated reputation games (Section 17.3 in
long period of time during which player 2 is suffi- (Mailath and Samuelson 2006)).
ciently uncertain of player 1’s type (relative to the The literature on reputation effects has typi-
discount factor), and in which play does not resem- cally not restricted attention to Markov strategies,
ble an equilibrium of the complete information since the results do not require the restriction.
Reputation Effects 197
Indeed, player 1 can often be assured of an even Chatterjee K, Samuelson L (1987) Bargaining with two-sided
higher payoff, in the presence of commitment types incomplete information: an infinite horizon model with
alternating offers. Rev Econ Stud 54(2):175–192
who play nonstationary strategies (Celentani et al. Chatterjee K, Samuelson L (1988) Bargaining with two-
1996). At the same time, these reputation effects sided incomplete information: the unrestricted offers
are temporary (Theorem 2 in (Cripps et al. 2007)). case. Oper Res 36(4):605–638
Finally, there is a literature on reputation Cole HL, Dow J, English WB (1995) Default, settlement,
and signalling: lending resumption in a reputational
effects in bargaining games (see (Abreu and Gul model of sovereign debt. Int Econ Rev 36(2):365–385
2000; Chatterjee and Samuelson 1987, 1988; Cripps MW, Thomas JP (1997) Reputation and perfection
Schmidt 1993a)), where the issues described in repeated common interest games. Games Econ
above are further complicated by the need to Behav 18(2):141–158
Cripps MW, Mailath GJ, Samuelson L (2004) Imperfect
deal with the bargaining model itself. monitoring and impermanent reputations. Econometrica
72(2):407–432
Cripps MW, Mailath GJ, Samuelson L (2007)
Future Directions Disappearing private reputations in long-run relation-
ships. J Econ Theory 134(1):287–316
Evans R, Thomas JP (1997) Reputation and experimenta-
The detailed structure of equilibria of the incom- tion in repeated games with two long-run players.
plete information game is not well understood, Econometrica 65(5):1153–1173
even for the canonical game of section “A Canon- Fudenberg D, Levine DK (1989) Reputation and equilib-
rium selection in games with a patient player.
ical Model.” A more complete description of the Econometrica 57(4):759–778
structure of equilibria is needed. Fudenberg D, Levine DK (1992) Maintaining a reputation
While much of the discussion was phrased in when strategies are imperfectly observed. Rev Econ
terms of the Stackelberg type, Proposition 1 pro- Stud 59(3):561–579
Holmström B (1982) Managerial incentive problems: a
vides a reputation bound for any action type. dynamic perspective. In: Essays in economics and
While in some settings, it is natural that the management in honour of Lars Wahlbeck. Swedish
uninformed players assign strictly positive proba- School of Economics and Business Administration,
bility to the Stackelberg type, it is not natural in Helsinki, pp 209–230. Published in: Rev Econ Stud
66(1):169–182
other settings. A model endogenizing the nature Kreps D, Wilson R (1982) Reputation and imperfect infor-
of action types would be an important addition to mation. J Econ Theory 27:253–279
the reputation literature. Kreps D, Milgrom PR, Roberts DJ, Wilson R (1982) Ratio-
Finally, while the results on reputation effects nal cooperation in the finitely repeated prisoner’s
dilemma. J Econ Theory 27:245–252
with two long-lived players are discouraging, Mailath GJ, Samuelson L (2001) Who wants a good repu-
there is still the possibility that some modification tation? Rev Econ Stud 68(2):415–441
of the model will rescue reputation effects in this Mailath GJ, Samuelson L (2006) Repeated games and
important setting. reputations: long-run relationships. Oxford University
Press, New York
Acknowledgments I thank Eduardo Faingold, Milgrom PR, Roberts DJ (1982) Limit pricing and entry
KyungMin Kim, Antonio Penta, and Larry Sam- under incomplete information: an equilibrium analysis.
uelson for helpful comments. Econometrica 50:443–459
Phelan C (2006) Public trust and government betrayal.
J Econ Theory 130(1):27–43
Schmidt KM (1993a) Commitment through incomplete
Bibliography information in a simple repeated bargaining game.
J Econ Theory 60(1):114–139
Abreu D, Gul F (2000) Bargaining and reputation. Schmidt KM (1993b) Reputation and equilibrium charac-
Econometrica 68(1):85–117 terization in repeated games of conflicting interests.
Benabou R, Laroque G (1992) Using privileged informa- Econometrica 61(2):325–351
tion to manipulate markets: insiders, gurus, and credi- Selten R (1978) Chain-store paradox. Theory Decis
bility. Q J Econ 107(3):921–958 9:127–159
Celentani M, Fudenberg D, Levine DK, Pesendorfer Stackelberg HV (1934) Marktform und Gleichgewicht.
W (1996) Maintaining a reputation against a long- Springer, Vienna
lived opponent. Econometrica 64(3):691–704
Introduction
Zero-Sum Two Person Games
Conflicts are an inevitable part of human existence.
T. E. S. Raghavan This is a consequence of the competitive stances of
Department of Mathematics, Statistics and greed and the scarcity of resources, which are
Computer Science, University of Illinois, rarely balanced without open conflict. Epic poems
Chicago, IL, USA of the Greek, Roman, and Indian civilizations
which document wars between nation-states or
clans reinforce the historical legitimacy of this
Article Outline statement. It can be deduced that domination is
the recurring theme in human conflicts. In a prim-
Introduction itive sense this is historically observed in the dom-
Games with Perfect Information ination of men over women across cultures while
The Game of Hex on a more refined level it can be observed in the
Approximate Fixed Points imperialistic ambitions of nation-state actors. In
An Application of the Algorithm modern times, a new source of conflict has
Extensive Games and Normal Form Reduction emerged on an international scale in the form of
Saddle Point economic competition between multinational
Mixed Strategy and Minimax Theorem corporations.
Historical Remarks While conflicts will continue to be a perennial
Solving for Value and Optimal Strategies via part of human existence, the real question at hand
Linear Programming is how to formalize mathematically such conflicts
Simplex Algorithm in order to have a grip on potential solutions. We
Fictitious Play can use mock conflicts in the form of parlor games
Search Games to understand and evaluate solutions for real con-
Search Games on Trees flicts. Conflicts are unresolvable when the partici-
Umbrella Folding Algorithm pants have no say in the course of action. For
Completely Mixed Games and Perron’s Theorem example one can lose interest in a parlor game
on Positive Matrices whose entire course of action is dictated by chance.
Behavior Strategies in Games with Perfect Recall Examples of such games are Chutes and Ladders,
Efficient Computation of Behavior Strategies Trade, Trouble etc. Quite a few parlor games com-
General Minimax Theorems bine tactical decisions with chance moves. The
Geometric Consequences game Le Her and the game of Parcheesi are typical
Ky Fan-Sion Minimax Theorems examples. An outstanding example in this category
Applications of Infinite Games is the game of backgammon, a remarkably deep
General Minimax Theorem and Statistical game. In chess, the player who moves first is usu-
Estimation ally determined by a coin toss, but the rest of the
Borel’s Poker Model game is determined entirely by the decisions of the
War Duels and Discontinuous Payoffs on the Unit two players. In such games, players make strategic
Square decisions and attempt to gain an advantage over
Bibliography their opponents.
A game played by two rational players is called removing 3 out of 14. Player I could have
zero-sum if one player’s gain is the other player’s exploited this. But he did not! Though he made a
loss. Chess, Checkers, Gin Rummy, Two-finger good second move, he reverted back to his naive
Morra, and Tic-Tac-Toe are all examples of zero- strategy and made a bad third move. The question
sum two-person games. Business competition is: Can Player II ensure victory for himself by
between two major airlines, two major publishers, intelligently choosing a suitable strategy? Indeed
or two major automobile manufacturers can be Player II can win the game with any strategy
modeled as a zero- sum two-person games (even if satisfying the conditions of g where
the outcome is not precisely zero-sum). Zero-sum
8
games can be used to construct Nash equilibria in >
> 1 if x1 is a multiple of 5
<
many dynamic non-zero-sum games (Thuijsman 2 if x2 is a multiple of 5
g ðxÞ ¼
and Raghavan 1997). >
> 3 if x3 is a multiple of 5
:
4 if x4 is a multiple of 5
Games with Perfect Information Since the game starts with 15 pebbles, Player
I must leave either 14 or 13 or 12, or 11 pebbles.
Emptying a Box Then Player II can in his turn remove 1 or 2 or 3 or
Example 1 A box contains 15 pebbles. Players I and 4 pebbles so that the number of pebbles Player
II remove between one and four pebbles from the box I finds is a multiple of 5 at the beginning of his
in alternating turns. Player I goes first, and the game turn. Thus Player II can leave the box empty in the
ends when all pebbles have been removed. The last round and win the game.
player who empties the box on his turn is the winner, Many other combinatorial games could be
and he receives $1 from his opponent. studied for optimal strategic behavior. We give
The players can decide in advance how many one more example of a combinatorial game, called
pebbles to remove in each of their turn. Suppose a the game of Nim (Bouton 1902).
player finds x pebbles in the box when it is his
turn. He can decide to remove 1, 2, 3 or at most
Nim Game
4 pebbles. Thus a strategy for a player is any
Example 2 Three baskets contain 10, 11, and
function f whose domain is X = {1, 2. . ., 15}
16 oranges respectively. In alternating turns,
and range is R {1, 2, 3, 4} such that
Players I and II choose a non-empty basket and
f(x) min (x, 4). Given strategies f, g for players
remove at least one orange from it. The player
I and II respectively, the game evolves by execut-
may remove as many oranges as he wishes from
ing the strategies decided in advance. For example
the chosen basket, up to the number the basket
if, say
contains. The game ends when the last orange is
2 if x is even removed from the last non-empty basket. The
f ðx Þ ¼
1 if x is odd, player who takes the last orange is the winner.
3 if x 3 In this game as in the previous example at any
gðxÞ ¼ stage the players are fully aware of what has hap-
x otherwise:
pened so far and what moves have been made. The
The alternate depletions lead to the following full history and the state of the game at any instance
scenario are known to both players. Such a game is called a
game with perfect information. How to plan for
move by I II I II I II I II future moves to one’s advantage is not at all clear in
removes 1 3 1 3 1 3 1 2 this case. Bouton (1902) proposed an ingenious
leaving 14 11 10 7 6 3 2 0 solution to this problem which predates the devel-
opment of formal game theory.
In this case the winner is Player II. Actually in His solution hingces on the binary representa-
his first move Player II made a bad move by tion of any number and the inequality that 1 + 2 +
Zero-Sum Two Person Games 201
4 + . . . + 2n < 2n + 1. The numbers 10, 11, 16 have For the game of Nim we found a constructive
the binary representation and explicit strategy for the winner regardless of
any action by the opponent. Sometimes one may
Number Binary representation be able to assert who should be the winner without
10 ¼ 1010 knowing any winning strategy for the player!
11 ¼ 1011
16 ¼ 10000 Definition 3 A zero-sum two person game has
Column totals perfect information if, at each move, both players
ðin base 10 digitsÞ ¼ 12021: know the complete history so far.
There are many variations of nim games and
Bouton made the following key observations: other combinatorial games like Chess and Go that
exploit the combinatorial structure of the game or
1. If at least one column total is an odd number, the end games to develop winning strategies. The
then the player who is about to make a move can classic monographs on combinatorial game the-
choose one basket and by removing a suitable ory is by Berlekamp et al. (1982) on Winning
number of oranges leave all column totals even. Ways for your Mathematical Plays, whose math-
2. If at least one basket is nonempty and if all ematical foundations were provided by Conway’s
column totals are even, then the player who has earlier book On Numbers and Games. These are
to make a move will end up leaving an odd often characterized by sequential moves by two
column total. players and the outcome is either a win or lose
kind. Since the entire history of past moves is
By looking for the first odd column total from common knowledge, the main thrust is in devel-
the left, we notice that the basket with 16 oranges oping winning strategies for such games.
is the right choice for Player I. He can remove the
left most 1 in the binary expansion of 16 and Definition 4 A zero-sum two person game is
change all the other binary digits to the right by called a win-lose game if there are no chance
0 or 1. The key observation is that the new number moves and the final outcome is either Player
is strictly less than the original number. In Player I wins or loses (Player II wins) the game.
I’s move, at least one orange will be removed from (In other words, there is no way for the game to
a basket. Furthermore, the new column totals can end in a tie.)
all be made even. If an original column total is The following is a fundamental theorem of
even we leave it as it is. If an original column total Zermelo (1913).
is odd, we make it even by making any 1 a 0 and
any 0 a 1 in those cases which correspond to the Theorem 5 Any zero-sum two person perfect
basket with 16 oranges. For example the new information win-lose game G with finitely many
binary expansion corresponds to removing all moves and finitely many choices in each move has
but 1 orange from basket 3. We have a winner with an optimal winning strategy.
game as follows. Suppose the subgames are G1, iff either x1 x2, y1 y2 or x1 x2, y1 y2 and max
G2, . . ., Gk. Now among these subgames, let Gs be (| x1 x2 |, | y1 y2 |) = 1. For example the lattice
a game where Player I can ensure a victory for points (4, 10) and (5, 11) are adjacent while (4, 10)
himself, no matter what Player II does in the and (5, 9) are not adjacent. Six vertices are adjacent
subgame. In this case, Player I can determine at to any interior lattice point of the Hex board B n
the very beginning, the right choice of action while lattice points on the frame will have fewer
which leads to the subgame Gs. A good strategy than six adjacent vertices.
for Player I is simply the choice s in the first move The game is played as follows: Players I and II,
followed by his good strategy in the subgame Gs. in alternate turns, choose a vertex from the available
Player II’s strategy for the original game is simply set of unoccupied vertices. The aim of Player I is to
a k-tuple of strategies, one for each subgame. occupy a bridge of adjacent vertices that links a
Player II must be ready to use an optimal strategy vertex on the west boundary with a vertex on the
for the subgame Gr in case the first move of Player east boundary. Player II has a similar objective to
I leads to playing Gr, which is favorable to Player connect the north and south boundary with a bridge.
II. Suppose no subgame Gs has a winning strategy
for Player I. Then Player II will be the winner in
Theorem 6 The game of Hex can never end in a
each subgame. To achieve this, Player II must use
draw. For any T Bn occupied by Player I and the
his winning strategy in each subgame they are
complement T c occupied by Player II, either
lead to. Such a k-tuple of winning strategies, one
T contains a winning bridge for Player I or T c
for each subgame, is a winning strategy for Player
contains a winning bridge for Player II. Further
II for the original game G.
only one can have a winning bridge.
which shares the common side BC which has first player can win, but we don’t know how (for
vertex labels 1 and 2. The mate triangle D1 has a sufficiently large board).
vertices (1, 0), (1, 1), and (1, 0) (0, 0) +
(1, 1) = (2, 1). Suppose D = (2, 1) is labeled 2,
then we exit via the side CD to the mate triangle
with vertices C, D, and E = (1, 1) (1, 0) + Approximate Fixed Points
(2, 1) = (2, 2). Each time we find the player of the
new vertex with his label, we drop out the other Let I 2 be the unit square 0 x, y 1. Given any
vertex of the same player from the current trian- continuous function: f = ( f1, f2):I2 ! I2,
gle and move into the new mate triangle. In each Brouwer’s fixed point theorem asserts the existence
iteration there is exactly one new mate triangle to of a point (x, y) such that f(x, y) = (x, y). Our
move into. Since in the initial step we had a Hex path building algorithm due to Gale (1979)
unique mate triangle to move into from D0, gives a constructive approach to locating an
there is no way for the algorithm to reenter a approximate fixed point.
mate triangle visited earlier. This process must Given ϵ > 0, by uniform continuity we can find
terminate at a vertex on the North or East bound- a d > 1n > 0 such that if (i, j) and (i0 , j0 ) are adjacent
ary. One side of these triangles will all have the vertices of a Hex board Bn, then
same label forming a bridge which joins the
appropriate boundaries and forms a winning 0
j0
path. The winning player’s bridge will obstruct f i , j i
f1 , ϵ,
1 n n n n
the bridge the losing player attempted to com-
0 (1)
plete. The game of Hex and its winning strategy j0
f i , j i
f2 , ϵ:
is a powerful tool in developing algorithms for 2 n n n n
computing approximate fixed points. Hex is an
example of a game where we do know that the Consider the the 4 sets:
204 Zero-Sum Two Person Games
þ i j i union of sets H +, H , V +, V . Hence we reach an
H ¼ ði, jÞϵ Bn : f 1 , > ϵ , (2) approximate fixed point while building the bridge.
n n n
i j i
H ¼ ði, jÞϵ Bn : f 1 , < ϵ , (3)
n n n
An Application of the Algorithm
i j j
Vþ ¼ ði, jÞϵ Bn : f 2
, > ϵ , (4) Consider the continuous map of the unit square
n n n into itself given by:
i j j
V ¼ ði, jÞϵ Bn : f 2 , > ϵ : (5) x þ maxð2 þ 2x þ 6y 6xy,0Þ
n n n f 1 ðx,yÞ ¼
1 þ maxð2 þ 2x þ 6y 6xy,0Þ þ maxð2x 6xy,0Þ
Intuitively the points in H + under f are moved y þ maxð2 6x 2y þ 6xy,0Þ
f 2 ðx,yÞ ¼ :
1 þ maxð2 6x 2y þ 6xy,0Þ þ maxð2y 6xy,0Þ
further to the right (with increased x coordinate) by
more than ϵ. Points in V under f are moved further
With ϵ = .05, we can start with a grid of d = .1
down (with decreased y coordinate) by more than
(hopefully adequate) and find an approximate
ϵ. We claim that these sets cannot cover all the
fixed point. In fact for a spacing of .1 units we
vertices of the Hex board. If it were so, then we
have the following iterations. The iterations
will have a winner, say Player I with a winning
according to Hex rule passed through the follow-
path, linking the East and West boundary frames.
ing points with | f1(x, y) x | and | f2(x, y) y |
Since points of the East boundary have the highest
given by Table 1. Thus the approximate fixed
x coordinate, they cannot be moved further to the
point is x = .4, y = .3.
right. Thus vertices in H + are disjoint with the East
boundary and similarly vertices in H are disjoint
with the West boundary. The path must therefore Extensive Games and Normal Form
contain vertices from both H + and H . However Reduction
for any (i, j) ϵ H +, (i0, j0) ϵ H we have
Any game as it evolves can be represented by a
f1 ,
i j i
> ϵ, rooted tree G where the root vertex corresponds to
n n n the initial move. Each vertex of the tree represents
0 0
i j i0
f1 , > ϵ: Zero-Sum Two Person Games, Table 1 Table giving
n n n the Hex building path
(x, y) | f1 x | | f2 y | L
Summing the above two inequalities and using
(.0, .0) 0 .6667 1
(1) we get (.1, 0) .0167 .5833 2
(.1, .1) .01228 .4667 2
i0 i (0, .1) .0 .53333 1
> 2ϵ:
n n (.1, .2) .007 .35 2
(0, .2) 0 .4 1
Thus the points (i, j) and (i0, j0) cannot be (.1, .3) .002 .233 2
adjacent and this contradicts that they are part of (0, .3) 0 .26667 1
a connected path. We have a contradiction. (.1, .4) .238 .116 1
(.2, .4) .194 .088 1
Remark 7 The algorithm attempts to build a win- (.2, .3) .007 .177 2
(.3, .4) .153 .033 1
ning path and advances by entering mate trian-
(.3, .3) .017 .067 2
gles. Since the algorithm will not be able to cover
(.4, .4) .116 0 1
the Hex board, partial bridge building should fail
(.4, .3) .0296 0 *
at some point, giving a vertex that is outside the
Zero-Sum Two Person Games 205
a particular move of a particular player. The alter- on the information given. If he is told that the
natives available in any given move are identified game has reached a move in information set V1,
with the edges emanating from the vertex that it simply means that the outcome of the toss is
represents the move. If a vertex is assigned to 1. He has two alternatives for each move of this
chance, then the game associates a probability information set. One corresponds to guessing the
distribution with the the descending edges. The die is fake and the other corresponds to guessing it
terminal vertices are called plays and they are is genuine. The same applies to the information
labeled with the payoff to Player I. In zero-sum set V2. If the outcome is in V3, . . ., V6, the clear
games Player II’s payoff is simply the negative of choice is to guess the die as genuine. Thus a pure
the payoff to Player I. The vertices for a player are strategy (master plan) for Player II is to choose a
further partitioned into information sets. Informa- 2-tuple with coordinates taking the values F or G.
tion sets must satisfy the following requirements: Here there are 4 pure strategies for Player II. They
are: (F1, F2), (F1, G), (G, F2), (G, G). For exam-
• The number of edges descending from any two ple, the first coordinate of the strategy indicates
moves within an information set are same. what to guess when the outcome is 1 and the
• No information set intersects the unique second coordinate indicates what to guess for
unicursal path from the root to any end vertex the outcome 2. For all other outcomes II guesses
of the tree in more than one move. the die is genuine (unbiased). The payoff to Player
• Any information set which contains a chance I when Player I uses a pure strategy i and Player II
move is a singleton. uses a pure strategy j is simply the expected
We will use the following example to illus- income to Player I when the two players choose
trate the extensive form representation: i and j simultaneously. This can as well be
represented by a matrix A = (aij) whose rows and
Example 8 Player I has 3 dice in his pocket. Die columns are pure strategies and the corresponding
1 is a fake die with all sides numbered one. Die entries are the expected payoffs. The payoff matrix
2 is a fake die with all sides numbered two. A given by
Die 3 is a genuine unbiased die. He chooses one
ðF1 , F2 Þ ðF1 , GÞ ðG, F2 Þ ðG, GÞ
of the 3 dice secretly, tosses the die once, and 0 1
announces the outcome to Player II. Knowing F1 0 0 1 1
B C
the outcome but not knowing the chosen die, A ¼ F2 B
@1
0 1 0 1 C:
A
Player II tries to guess the die that was tossed. 1 1
He pays $1 to Player I if his guess is wrong. If he G 3 6 6 0
guesses correctly, he pays nothing to Player I.
The game is represented by the above tree with is called the normal form reduction of the original
the root vertex assigned to Player I. The 3 alterna- extensive game.
tives at this move are to choose the die with all
sides 1 or to choose the die with all sides 2 or to
choose the unbiased die. The end vertices of these Saddle Point
edges descending from the root vertex are moves
for chance. The certain outcomes are 1 and 2 if the The normal form of a zero sum two person game
die is fake. The outcome is one of the numbers 1, has a saddle point when there is a row r and
. . ., 6 if the die chosen is genuine. The other ends column c such that the entry arc is the smallest in
of these edges are moves for Player II. These row r and the largest in column c. By choosing the
moves are partitioned into information sets V1 pure strategy corresponding to row r Player
(corresponding to outcome 1), V2 (corresponding I guarantees a payoff arc = minjarj. By choosing
to outcome 2), and singleton information sets V3, column c, Player II guarantees a loss no more than
V4, V5, V6 corresponding to outcomes 3, 4, 5 and maxiaic = arc. Thus row r and column c are good
6 respectively. Player II must guess the die based pure strategies for the two players. In a payoff
206 Zero-Sum Two Person Games
Blue: Player 1 I
Red: Player 2
F1 F2
G
1 2
1 1
6 1 1 1 1 6
1 6 6 6 6 2
3 4 5 6
II II
F G F G G G G G G F G F
0 1 1 0 0 0 0 0 0 1 1 0
Zero-Sum Two Person Games, Fig. 2 Game tree for a single throw with fake or genuine dice
matrix A = (aij) row r is said to strictly dominate saddle point for this game. In fact we have the
row t if arj > atj for all j. Player I, the maximizer, following:
will avoid row t when it is dominated. If rows
r and t are not identical and if arj atj, then we Theorem 10 The normal form of any zero sum
say that row r weakly dominates row t. two person game with perfect information admits
a saddle point. A saddle point can be arrived at by
Example 9 Player I chooses either 1 or 2. Know- a sequence of row or column deletions. A row that
ing player I’s choice Player II chooses either 3 or is weakly dominated by another row can be
4. If the total T is odd, Player I wins $T from deleted. A column that weakly dominates another
Player II. Otherwise Player I pays $T to Player II. column can be deleted. In each iteration we can
The pure strategies for Player I are simply always find a weakly or strictly dominated row or
s1 = choose 1, s2 = choose 2. For Player II a weakly or strictly dominating column to be
there are four pure strategies given by: t1: choose deleted from the current submatrix.
3 no matter what I chooses. t2: choose 4 no matter
what I chooses. t3: choose 3 if I chooses 1 and
choose 4 if I chooses 2. t4: choose 4 if I chooses
Mixed Strategy and Minimax Theorem
3 1 and choose 3 if I chooses 2. This results in a
normal form with payoff matrix A for Player
Zero sum two person games do not always have
I given by:
saddle points in pure strategies. For example, in
the game of guessing the die (Example 8) the
ð3, 3Þ ð4, 4Þ ð3, 4Þ ð4, 3Þ
normal form has no saddle point. Therefore it
1 4 5 4 5 makes sense for players to choose the pure strat-
A¼ :
2 5 6 6 5 egies via a random mechanism. Any probability
distribution on the set of all pure strategies for a
Here we can delete column 4 which dominates player is called a mixed strategy. In Example 8 a
column 3. We don’t have row domination yet. We mixed strategy for Player I is a 3-tuple x = (x1, x2,
can delete column 2 as it weakly dominates col- x3) and a mixed strategy for Player II is a 4-tuple
umn 3. Still we have no row domination after y = (y1, y2, y3, y4). Here x i is the probability that
these deletions. We can delete column 1 as it player I chooses pure strategy i and yj is the
weakly dominates column 3. Now we have strict probability that player II chooses pure strategy j.
row domination of row 2 by row 1 and we are left Since the players play independently and make
with the row 1, column 3 entry = 4. This is a their choices simultaneously, the expected payoff
Zero-Sum Two Person Games 207
to Player I from Player II is K(x, y) = i jaijx iyj probability vectors x = (x1, x2, . . ., xm) and
where aij are elements in the payoff matrix A. y = (y1, y2, . . ., yn) such that for a unique constant v
We call K(x, y) the mixed payoff where players
choose mixed strategies x and y instead of
X
m
aij xi v j ¼ 1, 2, . . . , n,
pure strategies i and j. Suppose x ¼ 18 , 18 , 34
i¼1
and y ¼ 34 , 0, 0, 14 . Here x guarantees Player I
an expected payoff of 14 against any pure strategy X
n
j of Player II. By the affine linearity of K(x*, y) in aij yj v i ¼ 1, 2, . . . , m:
y it follows that Player I has a guaranteed expecta- j¼1
1 Xn
1 C 1 þ 2 C 2 þ þ n C n þ s 1 e1 þ s 2 e2 þ s m em ¼ 1
max ¼ max j (10)
v1 j¼1
1 , 2 , . . . , n , s1 , s2 , . . . , sm 0:
such that Here Cj, j = 1..., n are the columns of the matrix
A and ei are the columns of the m m identity
X
n
matrix. The vector 1 is the vector with all coordi-
aij j 1 for all j, (11)
j¼1
nates unity. With any extreme point (, s) = (1, 2,
..., n, s1, ..., sm) of the convex polyhedron of
j 0 for all j: (12) feasible solutions one can associate with it a set of
m linearly independent columns, which form a basis
With A > 0, the Z j’s are bounded. The maximum for the column span of the matrix (A, I). Here the
of the linear function jj is attained at some coefficients Z j and s i are equal to zero for coordi-
extreme point of the convex set of constraints nates other than for the specific m linearly indepen-
(11) and (12). By introducing nonnegative slack dent columns. By slightly perturbing the entries we
variables s1, s2, . . ., sm we can replace the inequal- can assume that any extreme point of feasible solu-
ities (11) by equalities (13). The problem reduces to tions has exactly m positive coordinates. Two
X
n extreme feasible solutions are called adjacent if the
max j (13) associated bases differ in exactly one column.
j¼1 The key idea behind the simplex algorithm is
subject to that an extreme point P = (, s) is an optimal
solution if and only if there is no adjacent extreme
X
n point Q for which the objective function has a
aij j þ si ¼ 1, i ¼ 1, 2, . . . , m, (14) higher value. Thus when the algorithm is initiated
j¼1 at an extreme point which is not optimal, there
must be an adjacent extreme point that strictly
yj 0, j ¼ 1, 2, . . . , n, (15) improves the objective function. In each iteration,
a column from outside the basis replaces a column
si 0, i ¼ 1, 2, . . . , m: (16) in the current basis corresponding to an adjacent
extreme point. Since there are m + n columns in
Of the various algorithms to solve a linear pro- all for the matrix (A, I), and in each iteration we
gramming problem, the simplex algorithm is among have strict improvement by our non-degeneracy
the most efficient. It was first investigated by Fourier assumption on the extreme points, the procedure
(1890). But no other work was done for more than a must terminate in a finite number of steps
century. The need for its industrial application moti- resulting in an optimal solution.
vated active research and lead to the pioneering
contributions of Kantarowich (1939) (see a transla- Example 12 Players I and II simultaneously show
tion in Management Science (Kantorowich 1960)) either 1 or 2 fingers. If T is the total number of
and Dantzig (1951). It was Dantzig who brought out fingers shown then Player I receives from Player
the earlier investigations of Fourier to the forefront II $T when T, is odd and loses $T to Player II when
of modern applied mathematics. T is even.
The payoff matrix is given by
Simplex Algorithm 2 3
A¼ :
3 4
Consider our linear programming problem above.
Any solution = (y1, . . ., yn), s = (s1, . . ., sm) to Add 5 to each entry to get a new payoff matrix
the above system of equations is called a feasible with all entries strictly positive. The new game is
solution. We could also rewrite the system as strategically same as A.
Zero-Sum Two Person Games 209
3 8 for Player II is obtained by normalizing the opti-
: mal solution of the linear program, it is 1 ¼ 127
,
8 1
2 ¼ 12 . Similarly, from the dual linear program
5
The linear programming problem is given by we can see that the strategy x1 ¼ 12 7
, x2 ¼ 12
5
is
max 1 . y1 + 1 . y2 + 0 . s1 + 0 . s2 such that optimal for Player I.
2 3
y1
3 8 1 0 6 7
6 y2 7 ¼ 1 : Fictitious Play
8 1 0 1 4 s1 5 1
s2 Though optimal strategies are not easily found,
even naive players can learn to steer their average
We can start with the trivial solution (0, 0, 1, 1)T. payoff towards the value of the game from past
This corresponds to the basis e1, e2 with plays by certain iterative procedures. This learn-
s1 = s2 = 1. The value of the objective function is ing procedure is known as fictitious play. The two
0. If we make y2 > 0, then the value of the objective players make their next choice under the assump-
function can be increased. Thus we look for a tion that the opponent will continue to choose
solution to pure strategies at the same frequencies as what
he/she did in the past. If x(n), y(n) are the empirical
s1 ¼ 0, s2 > 0, y2 > 0 mixed strategies used by the two players in the
first n rounds, then in round n + 1 Player I pretends
satisfying the constraints
that Player II will continue to use y (n) in the future
and selects any row i such that
8y2 þ 0s2 ¼ 1
y2 þ s2 ¼ 1 X ðnÞ
X ðnÞ
ai j y j ¼ max aij yj :
i
j j
or to
The new empirical mixed strategy is given by
s2 ¼ 0, s1 > 0, y2 > 0
1 n ðnÞ
satisfying the constraints xðnþ1Þ ¼ I i þ x :
nþ1 nþ1
player II’s guess as “High”, “Low” or “Correct” unique Nash equilibrium. It extends only to some
as the case may be. The game continues till player very special classes like 2 2 bimatrix games and
II guesses player I’s choice correctly. Player II to the so called potential games.
pays to Player I $N where N is the number of (See Miyasawa (1961), Shapley (1964),
guesses he made. Monderer and Shapley (1996), Krishna and
The payoff matrix is given by Sjoestrom (1998), Berger (2007)).
We treat this as a game between the cobra (Player the survival of the eggs is directly proportional to the
I) and the shop keeper (Player II). Let the probability distance the snake travels to locate the nest.
of survival be the payoff to the cobra. Thus While birds and snakes work out their strate-
gies based on instinct and evolutionary behavior,
1 if y < x, or y > x þ t
K ðx, yÞ ¼ we can surely approximate the problem by the
0 otherwise:
following zero sum two person search game. Let
The pure strategy spaces are 0 x 1 t for T = (X, E) be a finite tree with vertex set X and
the snake and 0 y 1 for the shop keeper. It can edge set e. Let O ϵ X be the root vertex. A hider
be shown that the game has no saddle point and hides an object at a vertex x of the tree. A searcher
has optimal mixed strategies. The value function starts at the root and travels along the edges of the
v(t) is a discontinuous function of t. In case 1t is an tree such that the path traced covers all the termi-
integer n then, a good strategy for the snake is to nal vertices. The search ends as soon as the
hide along [0, t], or [t, 2t] or [(n 1)t, 1] chosen searcher crosses the hidden location and the pay-
with equal chance. In this case the optimal strat- off to the hider is the distance traveled so far.
egy for the shop keeper is to choose a random By a simple domination argument we can as
well assume that the optimal hiding locations are
1 value is 1 n . In case t is a
1 1
point in [0, 1]. The
fraction, let n ¼ t then the optimal strategy for simply the terminal vertices.
the snake is to hide along [0, t], or [t, 2t], . . . or
[(n 1)t, nt]. An optimal strategy for the shop Theorem 19 The search game has value and opti-
keeper is to shoot at one of the points nþ1 1 2
, nþ1 , mal strategies. The value coincides with the sum
. . . , nþ1 chosen at random.
n of all edge lengths. Any optimal strategy for the
hider will necessarily restrict to hide at one of the
Example 17 While mowing the lawn a lady sud- terminal vertices. Let the least distance traveled to
denly realizes that she has lost her diamond exhaust all end vertices one by one correspond
engagement ring some where in her lawn. She to a permutation s of the end vertices in the order
has maximum speed s and will be able to locate w1, w2, . . ., wk. Let s1 be its reverse permutation.
the diamond ring from its glitter if she is suffi- Then an optimal strategy for the searcher is to
ciently close to, say within a distance ϵ from the choose one of these two permutations by the toss
ring. What is an optimal search strategy that min- of a coin. The hider has a unique optimal mixed
imizes her search time. strategy that chooses each end vertex with posi-
If we treat Nature as a player against her, she is tive probability.
playing a zero sum two person game where Nature f
would find pleasure in her delayed success in 5
g
finding the ring. e y
2
c 3
b 4
s
Search Games on Trees 4 2
x d 8
The following is an elegant search game on a tree.
For many other search games the readers can refer a 5 7
7 h,
to the monographs by Gal (1980) and Alpern and 6
t u
Gal (2003). Also see (Reijnierse and Potters 1993).
5 j
3 9
Example 18 A bird has to look for a suitable loca-
tion to build its nest for hatching eggs and protecting O
them against predator snakes. Having identified a
Zero-Sum Two Person Games, Fig. 3 Bird trying to
large tree with a single predator snake in the neigh-
hide at a leaf and snake chasing to reach the appropriate
borhood, the bird has to further decide where to leaf via optimal Chinese postman route starting at root
build its nest on the chosen tree. The chance for O and ending at O
212 Zero-Sum Two Person Games
Suppose the tree is a path with root O and a The average distance is the same for every other
single terminal vertex x. Since the search begins leaf when P and P1 are used. The optimal Chi-
at O, the longest trip is possible only when hider nese postman route can allow all permutations
hides at x and the theorem holds trivially. In subject to permuting any leaf of any subtree only
case the tree has just two terminal vertices among themselves. Thus the subtree rooted at
besides the root vertex, the possible hiding loca- t has leaves a, b, c, d and the subtree rooted at
tions are say, O, x1, x2 with edge lengths a1, a2. u has leaves e, f, g, h, j. For example while per-
The possible searches are via paths: O ! x1 ! muting a, b, c, d only among themselves we have
O ! x2 abbreviated Ox1Ox2 or O ! x2 ! O ! the further restriction that between b, c we cannot
x1, abbreviated Ox2Ox1. The payoff matrix can allow insertion of a or d. For example a, b, c, d and
be written as a, d, c, b are acceptable permutations, but not a, b,
d, c. It can never be the optimal permuting choice.
Ox1 Ox2 Ox2 Ox1 The same way it applies to the tree rooted at u. For
example h, j, e, g, f is part of the optimal Chinese
x1 a1 2a2 þ a1
: postman route, but h, g, j, e, f is not. We can think
x2 2a1 þ a2 a2
of the snake and bird playing the game as follows:
The value of this game is a1 + a2 = sum of the The bird chooses to hide in a leaf of the subgame
edge lengths. We can use an induction on the G t rooted at t or at a leaf of the subgame G u rooted
number of subtrees to establish the value as the at u. These leaves exhaust all leaves of the original
sum of edge lengths. We will use an example to game. The snake can restrict to only the optimal
just provide the intuition behind the formal proof. route of each subgame. This can be thought of as a
Given any permutation t of the end vertices 2 2 game where the strategies for the two
(leaves), of the above tree let P be the shortest path players (bird) and snake are:
from the root vertex that travels along the leaves in Bird:
that order and returns to the root. Let the reverse
path be P1. Observe that it will cover all edges Strategy 1: Hide optimally in a leaf of Gt
twice. Thus if the two paths P and P1 are chosen Strategy 2: Hide optimally in a leaf of Gu.
with equal chance by the snake, the average dis-
tance traveled by the snake when it locates the Snake:
bird’s nest at an end vertex will be independent of
the particular end vertex. For example along the Strategy 1: Search first the leaves of Gt along the
closed path O ! t ! d ! t ! a ! t ! x ! b ! optimal Chinese postman route of Gt and then
x ! c ! x ! t the distance traveled by the snake search along the leaves of Gu.
to reach leaf e is (3 + 7 + 7 + + 8 + 3) = 74. If Strategy 2: Search first the leaves of Gu along the
the snake travels along the reverse path to reach optimal Chinese postman route and then search
e the distance traveled is (9 + 5 + 5 + + 5 + the leaves of Gt along the optimal postman
4 + 3) = 66. For example if it is to reach the vertex route. The expected outcome can be written
d then via path P it is (3 + 7). Via P1 it is to make as the following 2 2 game. (Here v(Gt),
travel to e and travel from e to d by the reverse v(Gu) are the values of the subgames rooted at
path. This is (66 + 3 + + 6 + 6 + 7) = 130. Thus t, u respectively.)
in both cases the average distance traveled is 70.
Gt Gu Gu Gt
Gt ½3 þ vðG t Þ 2 9 þ v Gu þ ½3 þ vðGt Þ
Gu 2 3 þ v Gt þ ½9 þ vðGu Þ ½9 þ vðGu Þ
Zero-Sum Two Person Games 213
Observe that the 2 2 game has no saddle Theorem 20 A matrix game A with value v is
point and hence has value 3 + 9 + v(Gt) + v(Gu). completely mixed if and only if
By induction we can assume v(Gt) = 24,
v(Gu) = 34. Thus the value of this game is 70. 1. The matrix is square.
This is also the sum of the edge lengths of the 2. The optimal strategies are unique for the two
game tree. An optimal strategy for the bird can be players.
recursively determined as follows. 3. If v 6¼ 0, then the matrix is nonsingular.
4. If v = 0, then the matrix has rank n 1 where
n is the order of the matrix.
Umbrella Folding Algorithm
The theory of completely mixed games is a
useful tool in linear algebra and numerical analy-
Ladies, when storing umbrellas inside their hand-
sis (Bapat and Raghavan 1997). The following is a
bag shrink the central stem of the umbrella and
sample application of this theorem.
then the stems around all in one stroke. We can
mimic a somewhat similar procedure also for our
above game tree. We simultaneously shrink the Theorem 21 (Perron 1909) Let A be any n
edges [xc] and [xb] to x. In the next round {a, x, d} n matrix with positive entries. Then A has a pos-
edges [a, t], [x, t], [d, t] can be simultaneously itive eigenvalue with a positive eigenvector which
shrunk to t and so on till the entire tree is shrunk to is also a simple root of the characteristic equation.
the root vertex O. We do know that the optimal
strategy for the bird when the tree is simply the Proof Let I be the identity matrix. For any
subtree with root x and with leaves b, c is given by l > 0, the maximizing player prefers to play the
pðbÞ ¼ ð4þ2 game A rather than the game A lI. The payoff
Þ , pðcÞ ¼ ð4þ2Þ . Now for the subtree
4 2
gets worse when the diagonal entries are reached.
with vertex t and leaves {a, b, c, d}, we can treat
The value function v(l) of A lI is a non-
this as collapsing the previous subtree to x and
increasing continuous function. Since v(0) >
treat stem length of the new subtree with vertices
0 and v(l) < 0 for large l we have for some
{t, a, x, d} as though the three stems [ta], [tx], [td]
l0 > 0 the value of A l0I is 0. Let y be optimal
have lengths 6, 5 + (4 + 2), 7. We can check that
for player II, then (A l0I)y 0 implies 0 <
for this subtree game the leaves a, x, d are chosen
Ay l0y. That is 0. Since the optimal y is
with probabilities pðaÞ ¼ ð6þ9þ7 6
Þ, pðxÞ ¼ ð6þ9þ7Þ,
9
completely mixed, for any optimal x of player I,
pðdÞ ¼ ð6þ9þ77
Þ . Thus the optimal mixed strategy we have (A l0I)x = 0. Thus x > 0 and the game
for the bird for choosing leaf b for our original tree is completely mixed. By (2) and (4) if (A l0I)
game is to pass through vertices t, x, b and is given u = 0 then u is a scalar multiple of y and so the
by the product p(t)p(x)p(b). We can inductively eigenvector y is geometrically simple. If B = A
calculate these probabilities. l0I, then B is singular and of rank n 1. If (Bij) is
the cofactor matrix of the singular matrix B then
jbijBk j = 0, i = 1, . . ., n. Thus row k of the
cofactor matrix is a scalar multiple of y. Similarly
Completely Mixed Games and Perron’s each column of B is a scalar multiple of x. Thus all
Theorem on Positive Matrices cofactors are of the same sign and are different
from 0. That is
A mixed strategy x for player I is called
d X
completely mixed if it is strictly positive (x > 0). detðA lI Þjl0 ¼ Bii 6¼ 0:
A matrix game A is completely mixed if and only dl i
all optimal mixed strategies for Player I and Player
II are completely mixed. The following elegant Thus l0 is also algebraically simple. See
theorem was proved by Kaplanski (1945). (Bapat and Raghavan 1997) for the most general
214 Zero-Sum Two Person Games
extensions of this theorem to the theorems of given that the game has progressed to a move in
Perrron and Frobenius and to the theory of U, namely
M-matrices and power positive and polynomially
matrices). b1 ð8 nÞ
U,P
> q
< Pp1 ϵ Sn p1 if U is relevant for m1 ,
¼ p1 ϵ S q p1
Behavior Strategies in Games with >
:P
p1 ϵ T q p1 if U is not relevant for m1 :
Perfect Recall
Consider any extensive game G where the unique The following theorem of Kuhn (1953) is a
unicursal path from an end vertex w to the root x 0 consequence of the assumption of perfect recall.
intersects two moves x and y of say, Player I. We
say x ≺ y if the the unique path from y to x 0 is via Theorem 22 Let m1, m2 be mixed strategies for
move x. Let U 3 x and V 3 y be the respective players I and II respectively in a zero sum two
information sets. If the game has reached a move person finite game G of perfect recall. Let b1, b2
y ϵ V, Player I will know that it is his turn and the be the induced behavior strategies for the two
game has progressed to some move in V. The game players. Then the probability of reaching any
is said to have perfect recall if each player can end vertex w using m1, m2 coincides with the
remember all his past moves and the choices made probability of reaching w using the induced
in those moves. For example if the game has behavior strategy b1, b2. Thus in zero-sum two
progressed to a move of Player I in the informa- person games with perfect recall, players can play
tion set V he will remember the specific alternative optimally by restricting their strategy choices just
chosen in any earlier move. A move x is possible to behavior strategies.
for Player I with his pure strategy p1, if for some The following analogy may help us understand
suitable pure strategy p2 of Player II, the move the advantages of behavior strategies over mixed
x can be reached with positive probability using strategies. A book has 10 pages with 3 lines per
p1, p2. An information set U is relevant for a pure page. Someone wants to glance through the book
strategy p1, for Player I, if some move x ϵ U is reading just 1 line from each page. A master plan
possible with p1. Let P1, P2 be pure strategy (pure strategy) for scanning the book consists of
spaces for players I and II. choosing one line number for each page. Since
Let m1 ¼ qp1 , p1 ϵ P1 be any mixed strategy for each page has 3 lines, the number of possible
Player I. The information set U for Player I is plans is 310. Thus the set of mixed strategies is a
relevant for the mixed strategy m1 if for some qp1 set of dimension 310 1. There is another ran-
> 0, U is relevant for p1. We say that the infor- domized approach for scanning the book. When
mation set U for Player I is not relevant for the page i is about to be scanned choose line 1 with
mixed strategy m1 if for all qp1 > 0 , U is not probability xi1, line 2 with probability xi2 and line
relevant for p1. Let 3 with probability xi3. Since for each i we have
xi1 + xi2 + xi3 = 1 the dimension of such a strategy
Sn ¼ fp1 : U is relevant for p1 and p1 ðU Þ ¼ ng, space is just 20. Behavior strategies are easier to
work with. Further Kuhn’s theorem guarantees
S ¼ fp1 : U is relevant for p1 g, that we can restrict to behavior strategies in
games with perfect recall.
T ¼ fp1 : U is not relevant for p1 and p1 ðU Þ ¼ ng: In general if there are k alternatives at each
information set for a player and if there are
The behavior strategy induced by a mixed n information sets for the player, the dimension
strategy pair (m1, m2) at an information set U for of the mixed strategy space is k n 1. On the other
Player I is simply the conditional probability of hand the dimension of the behavior strategy space
choosing alternative n in the information set U, is simply n(k 1). Thus while the dimension of
Zero-Sum Two Person Games 215
mixed strategy space grows exponentially the mixed strategy for Player I, the payoff to Player I is
dimension of behavior strategy space grows line- independent of Player II’s actions. Similarly, we can
arly. The following example will illustrate the rewrite K(p, q) as a function of pi’s for i = 1, ... ,
advantages of using behavior strategies. 6 where
" #
Example 23 Player I has 7 dice. All but one are X 1X 6
p6 ¼ 16 p0 . Thus p ¼ 12 , 12 , 12 , 12 , 12 , 12 , 12 : For this that induce the realization probabilities grow only
216 Zero-Sum Two Person Games
linearly in the size of the terminal vertex set. minal payoff to player I at terminal vertex o is
Another major advantage is that the sequence h(o), by defining h(a) = 0 for all nodes a that are
form induces a sparse matrix. It has at most as not terminal vertices, we can easily check that the
many non-zero entries as the number of terminal behavioral payoff
vertices or plays.
X 2
H ðb1 , b2 Þ ¼ hðsÞ ∏ r i ðsi Þ:
Sequence Form i¼0
sϵ S
When the game moves to an information set U1 of
say, player I, the perfect recall condition implies
that wherever the true move lies in U1, the player When we work with realization functions
knows the actual alternative chosen in any of the ri, i = 1, 2 we can associate with these
past moves. Let su1 denote the sequence of alter- functions the sequence form of payoff matrix
natives chosen by Player I in his past moves. If no whose rows correspond to sequence s1 ϵ S1
past moves of player I occurs we take su1 ¼ ∅. forPlayer I and columns correspond to
Suppose in U1 player I selects an action “c” with sequence s2 ϵ S2 for Player II and with payoff
behavioral probability b1(c) and if the outcome is matrix
c the new sequence is su1 [ c. Thus any sequence
s1 for player I is the string of choices in his moves X
along the partial path from the initial vertex to any K ðs1 , s2 Þ ¼ hðs0 , s1 , s2 Þ:
s0 ϵ S0
other vertex of the tree. Let S0, S1, S2 be the set of
all sequences for Nature (via chance moves),
Player I and Player II respectively. Given behavior Unlike the mixed strategies we have more
strategies b0, b1, b2 Let constraints on the sequences r1, r2 for each
player given by the linear constraints above. It
may be convenient to denote the sequence func-
r i ðsi Þ ¼ ∏ bi ðcÞ, i ¼ 0, 1, 2:
cϵ si tions r1, r2 by vectors x, y respectively. The
vector x has |S1| coordinates and vector y has
|S2| coordinates. The constraints on x and y are
The functions: ri : Si :! R:i = 0, 1, 2 satisfy the linear given by Ex = e, Fy = f where the first row
following conditions is the unit vector (1, 0, . . ., 0) of appropriate size
in both E and F. If u1 is the collection of infor-
r i ð øÞ ¼ 1 (17) mation sets for player I then the number of rows
X in E is 1 + |u1|. Similarly the number of rows in
r i ðsui Þ ¼ r i ðsui , cÞ, i ¼ 0, 1, 2 (18) F is 1 + |u2|. Except for the first row, each row
cϵ AðU i Þ
has the starting entry as 1 and some 1’s and
0’s. Consider the following extensive game with
r i ðsi Þ 0 for all si , i ¼ 0, 1, 2: (19) perfect recall.
The set of sequences for player I is given by
Conversely given any such realization func- S1 = {ø, l, r, L, R}. The set of sequences for player
tions r1, r2 we can define behavior strategies, b1 II is given by S2 = {ø, c, d}. The sequence form
say for player I, by payoff matrix is given by
r 1 ðsu1 [ cÞ 2 3
b1 ð U 1 , c Þ ¼ for cϵ AðU 1 Þ,
r 1 ðsu1 Þ
and r 1 ðsu1 Þ > 0: 60 0 0 7
6 7
K ðs1 , s2 Þ ¼ A ¼ 6
60 1 1 7
7:
When r 1 ðsu1 Þ ¼ 0 we define b1(U1, c) arbi- 40 2 4 5
P
trarily so that cϵ AðU1 Þ b1 ðU 1 , cÞ ¼ 1. If the ter- 1 0 0
Zero-Sum Two Person Games 217
3 −3 −3 6
Proof Let T be the convex hull of S. Here T is also
Zero-Sum Two Person Games, Fig. 4 compact. Let
218 Zero-Sum Two Person Games
X
v ¼ min max ti ¼ max ti : xi dðSi , xÞ v for all xϵ S (20)
tϵ T i i
i
set G = {s : maxisi < v} are disjoint. By the weak mj d Si , x j v for all i ¼ 1 . . . , m: (21)
j
separation theorem for convex sets there exists a
x 6¼ 0 and constant c such that
Since d(Si, x) are convex functions, the second
inequality (1) implies
for all sϵ G, ðx, sÞ c and
!
for all tϵ T, ðx, tÞ c: X
d Si , mj x j
v: (22)
Using the property that v = (v, v, . . ., v) ϵ G and j
can assume x is a probability vector, in which case for Player II. We are also given
\ Si 6¼ ø, j ¼ 1, 2 . . . , m:
ðx, vÞ ¼ v c ¼ ðx, t Þ max ti ¼ v: i 6¼j
n P
from P2. The payoffs to Player I (Nature) when T 1 ¼ x : U ðxÞ ¼ ðm1 m2 ÞT 1 x
the observation is chosen from P1, P2 is given by )
the risks (expected costs): 1 X
1
ðm1 m2 ÞT ðm1 þ m2 Þ k
2
Ð
r ð1, ðT 1 , T 2 ÞÞ ¼ cð2=1Þ T 2 f 1 ðxÞdx,
Ð
r ð2, ðT 1 , T 2 ÞÞ ¼ cð1=2Þ T 1 f 2 ðxÞdx: and
(
The following theorem, based on an extension X
1
1
of Lyapunov’s theorem for non-atomic vector mea- T 2 ¼ U ð x Þ ¼ ð m1 m 2 Þ T x ð m1 m2 Þ T
2
sures (Lindenstrauss 1966), is due to Blackwell )
(1951), Dvoretsky-Wald and Wolfowitz (1951). X
1
ðm1 þ m2 Þ < k
Theorem 32 Let
for some suitable k. Let a = (m1 m2)T
ð 1
(m1 m2).
S¼ ðs1 , s2 Þ : s1 ¼ cð2=1Þ f 1 ðxÞdx; The random variable U is univariate normal
T2
ð with mean a2, and variance a if x ϵ P1. The random
þ m1 x2 2y y2 þ y2 :
information about nature. We can think of this as an
ordinary zero sum two person game where the pure
For fixed values of m2, m1 the minimum value
strategy for nature is to choose any y ϵ [0, 1] and a
of m2( 2x + 1 + 2y) + m1(x 2 2y y2) + y2 must
pure strategy for Player II (statistician) is any point
satisfy the first order conditions
in the unit square I = {(x, y) :0 x 1, 0 y 1}
with the payoff given above. Expanding the payoff m2 m2 m1
function we get ¼ x , and ¼ y :
m1 m1 1
independent random variables uniformly distrib- for if u x
Player I : gðuÞ ¼
raise if u > x,
uted on [0, 1].
The game begins with Player I. After seeing his for if v y
Player II : hðvÞ ¼
hand he either folds losing the pot to Player II, or raise if v > y:
raises by adding $1 to the pot. When Player
I raises, Player II after seeing his hand can either The computational details of the expected pay-
fold, losing the pot to Player I, or call by adding $1 off K(x, y) based on the above partitions of the u,
to the pot. If Player II calls, the game ends and the v space is given below in Tables 3 and 4.
player with the better hand wins the entire pot. The payoff K(x, y) is simply the sum of each
Suppose players I and II restrict themselves to area times the local payoff given by
using only strategies g, h respectively where
v
The payoff K(x, y) is simply the sum of each
(1-x) u=v area times the local payoff given by
−1 −2
2
(1-y)
A B
C
2x2 þ xy þ x y for x > y
x y-x (1-y) K ðx, yÞ ¼
1
2y2 3xy þ x y for x < y:
E
D 1 Also K(x, y) is continuous on the diagonal and
y
concave in x and convex in y. By the general
G
minimax theorem of Ky Fan or Sion there exist
x, y pure optimal strategies for K(x, y). We will
F find the value of the game by explicitly computing
−1 minyK(x, y) for each x and then taking the maxi-
u
x mum over x.
Zero-Sum Two Person Games, Fig. 5 Poker partition Let 0 < x < 1. (Intuitively we see that in our
when x < y search for saddle point, x = 0 or x = 1 are quite
unlikely.)
v
min0y1 K ðx, yÞ ¼ min inf y<x K ðx, yÞ, minyx
u=v K ðx, yÞg:Observe that inf y<x K ðx, yÞ ¼ minð2x2
(1-x)
−1 −2
þxy þ x yÞ ¼ yðx 1Þ terms not involving y.
Clearly the infimum is attained at y = x giving a
L
value x2. Now for y > x we have K(x, y) = 2y2
H 3xy + x y. This being a convex function in y, it
(1-y)
Thus the value of the game is 1 /9. A good same speed. Each player carries a noisy gun which
pure strategy for I is to raise when x > 1 /9 and a has been loaded with just one bullet. Player I’s
good pure strategy for II is to call when y > 1 /3. accuracy at x is p(x) and Player II’s accuracy at x is
For other poker models see von Neumann and q(x) where x is the distance traveled from the
Morgenstern (1947), Blackwell and Bellman starting point. Because the guns can be heard,
(1949), Binmore (1992) and Ferguson and each player knows whether or not the other player
Ferguson (2003) who discuss other discrete and has fired. The player who fires and hits the balloon
continuous poker models. first wins the game.
Some natural assumptions are
8
< ð1ÞpðxÞ þ ð1Þ 1 pðxÞ ¼ 2pðxÞ 1 when x < y
K ðx, yÞ ¼ pðxÞ qðxÞ when x ¼ y,
:
ð1ÞqðyÞ þ 1 qðyÞð1Þ ¼ 1 2qðyÞ when x > y:
224 Zero-Sum Two Person Games
We claim that the players have optimal pure payoff is 1 whether player I wins or player II
strategies and there is a saddle point. Consider wins. This is crucial for the existence of optimal
minyK(x, y) = min {2p(x) 1, p(x) q(x), pure strategies for the two players. Suppose the
infy < x(1 2q(y))}. We can replace infy < x(1 payoff has the following structure:
2q(y)) by 1 2q(x). Thus we get
a if I wins
min K ðx, yÞ ¼ min f2pðxÞ 1, pðxÞ b if II wins
0y1
g if neither wins
qðxÞ, ð1 2qðxÞÞg: 0 if both shoot accurately and ends in a draw:
Since the middle function p(x) q(x) is the Depending on these values while the value
average of the two functions 2p(x) 1 and 1 exists only one player may have optimal pure
2q(x), the minimum of the three functions is sim- strategy while the other may have only an epsilon
ply the min {2p(x) 1, (1 2q(x))}. While the optimal pure strategy, close to but not the same as
first function is increasing in x the second one is the solution to the equation p(x) + q(x) = 1.
decreasing in x. Thus the maxx minyK(x, y) = the
solution to the eq. 2 p(x) 1 = 1 2q(x). There is Example 36 (silent duel) Players I and II have the
a unique solution to the equation p(x) + q(x) = 1 as same accuracy p(x) = q(x) = x. However, in this
both functions are strictly increasing. Let x be duel, the players are both deaf so they do not know
that solution point. We get p(x) + q(x) = 1. The whether or not the opponent has fired. The payoff
value v satisfies function is given by
v ¼ pðx Þ qðx Þ
¼ miny maxx K ðx, yÞ
¼ maxx miny K ðx, yÞ:
8
This game has no saddle point. In this case, the < ð1Þx ð1 xÞ x<
K ðx, Þ ¼ 0 x¼
value of the game if it exists must be zero. One can :
ð1Þ þ ð1 Þð1Þ x > :
directly verify that the density
( pffiffiffi
0 0 t < 1=3 Let a ¼ 6 2. Then the game has value v ¼
f ðt Þ ¼ 1 3 pffiffiffi
t 1=3 t 1: 1 2a ¼ 5 2 6 . An optimal strategy for
4
Player I is given by the density
is optimal for both players with value zero.
0 for 0 x < a
Remark 37 In the above game suppose exactly f ðxÞ ¼ pffiffiffi 2
3
2a x þ 2x 1 2 for a x 1:
one player, say Player II is deaf. Treating winners
symmetrically, we can represent this game with
the payoff For Player II it is given by
Zero-Sum Two Person Games 225
(
0 for 0 x < a point. Even more than this extension it is this
gð Þ ¼ pffiffiffi a 2
pffiffiffi
2 2: þ 2 1 62 1 concept which is simply the most seminal solution
2þa
concept for non-cooperative game theory. This
Nash equilibrium solution is extendable to any
with an additional mass of 2þa a
at = 1. The
non-cooperative N person game in both extensive
deaf player has to maintain a sizeable suspicion
and normal form. Many of the local properties of
until the very end!
optimal strategies and the proof techniques of
The study of war duels is intertwined with the
zero sum two person games do play a significant
study of positive solutions to integral equations.
role in understanding Nash equilibria and their
The theorems of Krein and Rutman (1950) on
structure (Bubelis 1979; Jansen 1981; Kreps
positive operators and their positive eigenfunctions
1974; Raghavan 1970, 1973).
are central to this analysis. For further details see
When a non-zero sum game is played repeat-
(Dresher 1962; Karlin 1959; Radzik 1988).
edly, the players can peg their future actions on
Dynamic versions of zero sum two person
past history. This leads to a rich theory of equilib-
games where the players move among several
ria for repeated games (Sorin 1992). Some of
games according to some Markovian transition
them, like the repeated game model of Prisoner’s
law leads to the theory of stochastic games
dilemma impose tacitly, actual cooperation at
(Shapley 1953). The study of value and the com-
equilibrium among otherwise non-cooperative
plex structure of optimal strategies of zero-sum
players. This was first recognized by Schelling
two person stochastic games is an active area of
(1960), and was later formalized by the so called
research with applications to many dynamic
folk theorem for repeated games (Aumann and
models of competition (Filar and Vrieze 1996).
Shapley 1986; Axelrod and Hamilton 1981).
While stochastic games study movement in dis-
Indeed Nash equilibrium per se is a weak solution
crete time, the parallel development of dynamic
concept is also one of the main messages of Folk
games in continuous time was initiated by Isaacs
theorem for repeated games. With a plethora of
in several Rand reports culminating in his mono-
equilibria, it fails to have an appeal without further
graph on Differential games (Isaacs 1965). Many
refinements. It was Selten who initiated the need
military problems of pursuit and evasion, attrition
for refining equilibria and came up with the notion
and attack lead to games where the trajectory of a
of subgame perfect equilibria (Selten 1975). It
moving object is being steered continuously by the
turns out that subgame perfect equilibria are the
actions of two players. Introducing several natural
natural solutions for many problems in sequential
examples Isaacs explicitly tried to solve many of
bargaining (Rubinstein 1982). Often one searches
these games via some heuristic principles. The
for specific types of equilibria like symmetric
minimax version of Bellman’s optimality principle
equilibria, or Bayesian equilibria for games with
lead to the so called Isaacs Bellman equations. This
incomplete information (Harsanyi 1967). Auc-
highly non-linear partial differential equation on
tions as games exhibit this diversity of equilibria
the value function plays a key role in this study.
and Harsanyi’s Bayesian equilibria turn out to be
the most appropriate solution concept for this
Epilogue class of games (Myerson 1991).
The rich theory of zero sum two person games that Zero sum two person games and their solutions
we have discussed so far hinges on the fundamen- will continue to inspire researchers in all aspects
tal notions of value and optimal strategies. When non-cooperative game theory.
either zero sum or two person assumption is
dropped, the games cease to have such well Acknowledgment The author wishes to acknowledge the
defined notions with independent standing. In try- unknown referee’s detailed comments in the revision of the
first draft. More importantly he drew the author’s attention
ing to extend the notion of optimal strategies and to the topic of search games and other combinatorial
the minimax theorem for bimatrix games Nash games. The author would like to thank Ms. Patricia Collins
(1950) introduced the concept of an equilibrium for her assistance in her detailed editing of the first draft of
226 Zero-Sum Two Person Games
this manuscript. The author owes special thanks to Fan K (1953) Minimax theorems. Proc Natl Acad Sci Wash
Mr. Ramanujan Raghavan and Dr. A.V. Lakshmi 39:42–47
Narayanan for their help in incorporating the graphics Ferguson TS (1967) Mathematical stat, a decision theoretic
drawings into the latex file. approach. Academic, New York
Ferguson C, Ferguson TS (2003) On the Borel and von
Neumann poker models. Game Theory Appl 9:17–32
Filar JA, Vrieze OJ (1996) Competitive Markov decision
Bibliography processes. Springer, Berlin
Fisher RA (1936) The use of multiple measurements in
Alpern S, Gal S (2003) The theory of search games and taxonomic problems. Ann Eugenics 7:179–188
rendezvous. Springer Fourier JB (1890) Second extrait. In: Darboux GO
Aumann RJ (1981) Survey of repeated games, essays in (ed) Gauthiers Villars, Paris, pp 325–328; English
game theory and mathematical economics. In: Honor of Translation: Kohler DA (1973)
Oscar Morgenstern. Bibliographsches Institut, Mann- Gal S (1980) Search games. Academic, New York
heim, pp 11–42 Gale D (1979) The game of Hex and the Brouwer fixed-
Axelrod R, Hamilton WD (1981) The evolution of coop- point theorem. Am Math Mon 86:818–827
eration. Science 211:1390–1396 Gale D, Kuhn HW, Tucker AW (1951) Linear programming
Bapat RB, Raghavan TES (1997) Nonnegative matrices and the theory of games. In: Activity analysis of production
and applications. In: Encyclopedia in mathematics. and allocation. Wiley, New York, pp 317–329
Cambridge University Press, Cambridge Harsanyi JC (1967) Games with incomplete information
Bellman R, Blackwell D (1949) Some two person games played by Bayesian players, parts I, II, and III. Sci
involving bluffing. Proc Natl Acad Sci U S A 35:600–605 Manag 14:159–182; 32–334; 486–502
Berge C (1963) Topological spaces. Oliver Boyd, Isaacs R (1965) Differential games: mathematical a theory
Edinburgh with applications to warfare and pursuit. Control and
Berger U (2007) Brown’s original fictitious play. J Econ optimization. Wiley, New York; Dover Paperback Edi-
Theory 135:572–578 tion, 1999
Berlekamp ER, Conway JH, Guy RK (1982) Winning Jansen MJM (1981) Regularity and stability of equilibrium
ways for your mathematical plays, vol 1, 2. Academic, points of bimatrix games. Math Oper Res 6:530–550
New York Johnson RA, Wichern DW (2007) Applied multivariate
Binmore K (1992) Fun and game theory a text on game statistical analysis, 6th edn. Prentice Hall, New York
theory. Lexington, DC Heath Kakutani S (1941) A generalization of Brouwer’s fixed
Blackwell D (1951) On a theorem of Lyapunov. Ann Math point theorem. Duke Math J 8:457–459
Stat 22:112–114 Kantorowich LV (1960) Mathematical methods of orga-
Blackwell D (1961) Minimax and irreducible matrices. nizing and planning production. Manag Sci 7:366–422
Math J Anal Appl 3:37–39 Kaplansky I (1945) A contribution to von Neumann’s
Blackwell D, Girshick GA (1954) Theory of games and theory of games. Ann Math 46:474–479
statistical decisions. Wiley, New York Karlin S (1959) Mathematical methods and theory in
Bouton CL (1902) Nim-a game with a complete mathe- games, programming and econs, vol 1, 2. Addison
matical theory. Ann Math 3(2):35–39 Wesley, New York
Brown GW (1951) Iterative solution of games by fictitious Kohler DA (1973) Translation of a report by Fourier on his
play. In: Koopmans TC (ed) Activity analysis of pro- work on linear inequalities. Opsearch 10:38–42
duction and allocation. Wiley, New York, pp 374–376 Krein MG, Rutmann MA (1950) Linear operators leaving
Bubelis V (1979) On equilibria in finite games. Int J Game invariant a cone in a Banach space. Amer Math Soc
Theory 8:65–79 Transl 26:1–128
Chin H, Parthasarathy T, Raghavan TES (1973) Structure Kreps VL (1974) Bimatrix games with unique equilibrium
of equilibria in N-person noncooperative games. Int points. Int Game J Theory 3:115–118
Game J Theory 3:1–19 Krishna V, Sjostrom T (1998) On the convergence of
Conway JH (1982) On numbers and games, monograph fictitious play. Math Oper Res 23:479–511
16. London Mathematical Society, London Kuhn HW (1953) Extensive games and the problem of
Dantzig GB (1951) A proof of the equivalence of the information. Contributions to the theory of games.
programming problem and the game problem. In: Ann Math Stud 28:193–216
Koopman’s actvity analysis of production and allo- Lindenstrauss J (1966) A short proof of Liapounoff’s con-
cationation, Cowles Conumesion monograph 13. vexity theorem. Math J Mech 15:971–972
Wiley, New York, pp 333–335 Loomis IH (1946) On a theorem of von Neumann. Proc Nat
Dresher M (1962) Games of strategy. Prentice Hall, Engle- Acad Sci Wash 32:213–215
wood Cliffs Miyazawa K (1961) On the convergence of the learning
Dvoretzky A, Wald A, Wolfowitz J (1951) Elimination of process in a 2 2 non-zero-sum two-person game.
randomization in certain statistical decision problems Econometric research program, research memorandum
and zero-sum two-person games. Ann Math Stat no. 33. Princeton University, Princeton
22:1–21
Zero-Sum Two Person Games 227
Monderer D, Shapley LS (1996) Potential LS games. Sion M (1958) On general minimax theorem. Pac J Math
Games Econ Behav 14:124–143 8:171–176
Myerson R (1991) Game theory. Analysis of conflict. Sorin S (1992) Repeated games with complete informa-
Harvard University Press, Cambridge, MA tion, ster 4. In: Aumann RJ, Hart S (eds) Handbook
Nash JF (1950) Equilibrium points in n-person games. of game theory, vol 1. North Holland, Amsterdam,
Proc Natl Acad Sci Wash 88:48–49 pp 71–103
Owen G (1985) Game theory, 2nd edn. Academic, Thuijsman F, Raghavan TES (1997) Stochastic games with
New York switching control or ARAT structure. Int Game
Parthasarathy T, Raghavan TES (1971) Some topics in two J Theory 26:403–408
person games. Elsevier, New York Ville J (1938) Note sur la theorie generale des jeux ou
Radzik T (1988) Games of timing related to distribution of intervient l’habilite des joueurs. In: Borel E, Ville J
resources. Optim J Theory Appl 58(443-471):473–500 (eds) Applications aux jeux de hasard, tome IV, fasci-
Raghavan TES (1970) Completely mixed strategies in cule II of the Traite du calcul des probabilites et de ses
bimatrix games. Lond J Math Soc 2:709–712 applications, by Emile Borel
Raghavan TES (1973) Some geometric consequences of a von Neumann J (1928) Zur Theorie der Gesellschaftspiele.
game theoretic result. Math J Anal Appl 43:26–30 Math Ann 100:295–320
Rao CR (1952) Advanced statistical methods in biometric von Neumann J, Morgenstern O (1947) Theory of games
research. Wiley, New York and economic behavior, 2nd edn. Princeton University
Reijnierse JH, Potters JAM (1993) Search games with Press, Princeton
immobile hider. Int J Game Theory 21:385-394 von Stengel B (1996) Efficient computation of behavior
Robinson J (1951) An iterative method of solving a game. strategies. Games Econ Behav 14:220–246
Ann Math 54:296–301 Wald A (1950) Statistical decision functions. Wiley,
Rubinstein A (1982) Perfect equilibrium in a bargaining New York
model. Econometrica 50:97–109 Weyl H (1950) Elementary proof of a minimax theorem
Schelling TC (1960) The strategy of conflict. Harvard due to von Neumann. Ann Math Stud 24:19–25
University Press, Cambridge, MA Zermelo E (1913) Uber eine Anwendung der Mengenlehre
Selten R (1975) Reexamination of the perfectness concept auf die Theorie des Schachspiels. In: Proceedings of the
for equilibrium points in extensive games. Int Game fifth congress mathematicians. Cambridge University
J Theory 4:25–55 Press, Cambridge, MA, pp 501–504
Shapley LS (1953) Stochastic games. Proc Natl Acad of
Sci USA 39:1095–1100
stochastically, and it depends on the decisions
Stochastic Games of the participants.
A strategy A rule that dictates how a participant in
Yehuda John Levy1 and Eilon Solan2 an interaction makes his decisions as a function
1
Adam Smith Business School, University of of the observed behavior of the other participants
Glasgow, Glasgow, UK and of the evolution of the environment.
2
The School of Mathematical Sciences, Tel Aviv An equilibrium A collection of strategies, one
University, Tel Aviv, Israel for each player, such that each player maxi-
mizes (or minimizes, in case of stage costs)
his evaluation of stage payoffs given the strat-
Article Outline egies of the other players.
Evaluation of stage payoffs The way that a par-
Glossary ticipant in an ongoing interaction evaluates the
Definition of the Subject and Its Importance stream of stage payoffs that he receives (or stage
Strategies, Evaluations, and Equilibria costs that he pays) along the interaction.
Zero-Sum Games
Multiplayer Games
Definition of the Subject and Its
Correlated Equilibrium
Importance
Imperfect Monitoring
Folk Theorems
Stochastic games, first introduced by Shapley
Algorithms
(1953), model dynamic interactions in which the
Continuous-Time Games
environment changes in response to the behavior
Additional and Future Directions
of the players.
D Formally, a stochastic game is a
Bibliography
tuple G ¼ N, S, ðAi , Ai , ui Þi N, q where:
Glossary
• N is a set of players.
A correlated equilibrium An equilibrium in an • S is a state space. If S is uncountable, it is
extended game in which either at the outset of supplemented with a s-algebra of
the game or at various points during the play of measurable sets.
the game each player receives a private signal, • For every player i N, Ai is a set of actions for
and the vector of private signals is chosen that player, and Ai : S ! Ai is a set-valued
according to a known joint probability distri- (measurable) function that assigns to each
bution. In the extended game, a strategy of a state s S the set of actions Ai(s) that are
player depends, in addition to past play, on the available to player i in state s. If Ai is
signals he received. uncountable, it is supplemented with an
A stochastic game A repeated interaction s-algebra of measurable sets. Denote
between several participants in which the SA = {(s, a): s S, a = (ai)i N, ai Ai(s)
underlying state of the environment changes 8i N}. This is the set of all possible action
profiles.
We thank Eitan Altman, János Flesch, Yuval Heller, Jean- • For every player i N, ui: SA ! R is a
Jacques Herings, Ayala Mashiach-Yakovi, Andrzej (measurable) stage payoff function for player i.
Nowak, Ronald Peeters, T.E.S. Raghavan, Jérôme Renault,
• q: SA ! D(S) is a (measurable) transition
Nahum Shimkin, Robert Simon, Sylvain Sorin, William
Sudderth, and Frank Thuijmsman, for their comments on function, where D(S) is the space of probability
an earlier version of the entry. distributions over S.
© Springer Science+Business Media, LLC, part of Springer Nature 2020 229
M. Sotomayor et al. (eds.), Complex Social and Behavioral Systems,
https://doi.org/10.1007/978-1-0716-0368-0_522
Originally published in
R. A. Meyers (ed.), Encyclopedia of Complexity and Systems Science, © Springer Science+Business Media LLC 2017
https://doi.org/10.1007/978-3-642-27737-5_522-2
230 Stochastic Games
The game starts at an initial state s1 S and is payoffs. These two sometimes contradicting
played as follows. At each stage t N, each effects make the optimization problem of the
player i N chooses an action ati Ai ðst Þ and players quite intricate and the analysis of the
t t
t the stage payoff ui(s , a ), where
receives game challenging.
a ¼ ai i N , and the game moves to a new state
t 10. The players receive a stage payoff at each
st+1 that is chosen according to the probability stage. So far we did not mention how the
distribution q(| st, at). players evaluate the infinite stream of stage
A few comments are in order: payoffs that they receive, nor did we say what
is their information at each stage: Do they
1. A stochastic game lasts infinitely many observe the current state? Do they observe
stages. However, the model also captures the actions of the other players? These issues
finite interactions (of length t), by assuming will be discussed later.
the play moves, at stage t, to an absorbing 11. The actions that are available to the players at
state with payoff 0 to all players. each stage, the payoff functions, and the tran-
2. In particular, by setting t = 1, we see that sition function all depend on the current state
stochastic games are a generalization of and not on past play (i.e., past states that the
matrix games (games in normal form), game visited and past actions that the players
which are played only once. chose). This assumption is without loss of
3. Stochastic games are also a generalization of generality. Indeed, suppose that the actions
repeated games, in which the players play the available to the players at each stage, the
same matrix game over and over again. payoff functions, and the transition function
Indeed, a repeated game is equivalent to a all depend on past play, as well as on the
stochastic game with a single state. current state. For every t N, let Ht be the
4. Stopping games are also a special case of sto- set of all possible histories of length t, that is,
chastic games. In these games every player has all sequences of the form (s1, a1, s2, a2,. . ., st),
two actions in all states, continue and stop. As where sk S for every k = 1, 2, 3,. . ., t, ak
long as all players choose continue, the stage
¼ aki i N and aki is an available action to
payoff is 0; once at least one player chooses
player i at stage k, for every k = 1, 2,,. . .,
stop, the game moves to an absorbing state.
t 1. Then the game is equivalent to a game
5. Markov decision problems (see, e.g., Puterman
with state space H : = [t NHt, in which the
1994) are stochastic games with a single player.
state variable captures past play and the state
6. The transition function q governs the evolution
at stage t lies in Ht. In the new game, the sets
of the game. It depends on the actions of all
of available actions, the payoff function, and
players and on the current state, so that all the
the transition function depend on the current
players may influence the evolution of the game.
state rather than on all past play.
7. The payoff function ui of player i depends on
the current state as well as on the actions
chosen by all players. Thus, a player’s payoff The interested reader is referred to Filar and
depends not only on that player’s choice but Vrieze (1996), Sorin (2002), Neyman and Sorin
also on the behavior of the other players. (2003), and Başar and Zaccour (2017) for further
8. Though we refer to the functions (ui)i N as reading on stochastic games. We now provide a
“stage payoffs,” with the implicit assumption few applications.
that each player tries to maximize his payoff,
in some applications these functions describe Example 1 Capital Accumulation (Levhari and
a stage cost, and then the implicit assumption Mirman 1980; Dutta and Sundaram 1992,
is that each player tries to minimize his cost. 1993; Amir 1996; Nowak 2003c) Two
9. The action of a player at a given stage affects (or more) agents jointly own a natural resource
both his stage payoff and the evolution of the or a productive asset; at every period they have to
state variable, thereby affecting his future decide on the amount of the resource to consume.
Stochastic Games 231
The amount that is not consumed grows by a The transition depends on the actions chosen by
known (or an unknown) fraction. Such a situation the players, but it has a stochastic component,
occurs, e.g., in fishery: fishermen from various which captures the number of new packets that
countries fish in the same area, and each country arrive at the various transmitters during every
sets a quota for its fishermen. Here the state time slot.
variable is the current amount of resource, the
action set is the amount of resource to be exploited Example 4 Queues (Altman 2005) Individuals
in the current period, and the transition is that require service have to choose whether to be
influenced by the decisions of all the players, as served by a private slow service provider or by a
well as possibly by the random growth of the powerful public service provider. This situation
resource. arises, e.g., when jobs can be executed on either
a slow personal computer or a fast mainframe.
Example 2 Taxation (Chari and Kehoe 1990; Here a state lists the current load of the public and
Phelan and Stacchetti 2001) A government sets private service providers, and the cost is the time
a tax rate at every period. Each citizen decides at to be served.
every period how much to work and, from the total The importance of stochastic games stems
amount of money he or she has, how much to from the wide range of applications they encom-
consume; the rest is saved for the next period pass. Many repeated interactions can be recast as
and grows by a known interest rate. Here the stochastic games; the vast array of theoretical
state is the amount of savings each citizen has; results that have been obtained provide insights
the stage payoff of a citizen depends on the that can help in analyzing specific situations and
amount of money that he consumed, on the suggesting proper behavior to the participants. In
amount of free time he has, and on the total certain classes of games, algorithms that have
amount of tax that the government collected. The been developed may be used to calculate such
stage payoff of the government may be the aver- behavior.
age stage payoff of the citizens, the amount of tax
collected, or a mixture of the two.
Strategies, Evaluations, and Equilibria
Example 3 Communication Network (Sagduyu
and Ephremides 2003) A single-cell system with So far we have not described the information that
one receiver and multiple uplink transmitters the players have at each stage. In most of the
shares a single, slotted, synchronous classical chapter, we assume that the players have complete
collision channel. Assume that all transmitted information of past play; that is, at each stage t,
packets have the same length and require one they know the sequence s1, a1, s2, a2,. . ., st of
time unit, which is equal to one time slot, for states that were visited in the past (including the
transmission. Whenever a collision occurs, the current state) and the actions that were chosen by
users attempt to retransmit their packets in sub- all players. This assumption is too strong for most
sequent slots to resolve collision for reliable applications, and in the sequel we will mention the
communication. consequences of its relaxation.
Here a state lists all relevant data for a given Since the players observe past play, a pure
stage, e.g., the number of packets waiting at each strategy for player i is a (measurable) function si
transmitter, or the length of time each has been that assigns to every finite history (s1, a1, s2,
waiting to be transmitted. The players are the a2,. . ., st) an action si(s1, a1, s2, a2,. . ., st)
transmitters, and the action of each transmitter Ai(st), with the interpretation that, at stage t, if
is which packet to transmit, if any. The stage cost the finite history (s1, a1, s2, a2,. . ., st) occurred,
may depend on the number of time slots that the player i plays the action si(s1, a1, s2, a2,. . ., st). If
transmitted packet waited, on the number of the player does not know the complete history,
packets that have not been transmitted at that then a strategy for player i is a function that
period, and possibly on additional variables. assigns to every possible information set, an
232 Stochastic Games
action that is available to the player when the The limsup payoff under s for player i is
player has this information. A mixed strategy for
player i is a probability distribution over the set of P
T
l sufficiently small, and a limsup e-equilibrium is • The action spaces of the two players are A1(s)
called a uniform e-equilibrium. and A2(s), respectively.
• The payoff function (that Player 2 pays Player 1) is
Definition 7 Let e > 0. A strategy profile s is a
X
uniform e-equilibrium if it is a limsup lu1 ðs, aÞ þ ð1 lÞ qðs0 j s, aÞvðs0 Þ:
e-equilibrium and there are T0 N and l0 s0 S
(0, 1) such that for every T T0 the strategy
profile s is a T-stage e-equilibrium and for every The game Gls ðvÞ captures the situation in
l (0, l0) it is a l-discounted e-equilibrium. which, after the first stage, the game terminates
If for every e > 0 the game has a T-stage (resp. with a terminal payoff v(s0 ), where s0 is the state
l-discounted, limsup, uniform) e-equilibrium reached after stage 1. Define an operator ’: V ! V,
with corresponding payoff ge, then any accumu- termed the Shapley operator, as follows:
lation point of (ge)e>0 as e goes to 0 is termed a ’s ðvÞ ¼ val Gls ðvÞ ,
T-stage (resp. l-discounted, limsup, uniform)
where val Gls ðvÞ is the value of the matrix
equilibrium payoff.
game Gls ðvÞ . Since the value operator is non-
expansive, it follows that the operator ’ is
Zero-Sum Games contracting: ||’(v) ’(w)||1 (1 l)||v
w||1, so that this operator has a unique fixed
A two-player stochastic game is zero-sum if point ^v l . One can show that the fixed point is the
u1(s, a) + u2(s, a) = 0 for every (s, a) SA. As value of the stochastic game, and every strategy si
in matrix games, every two-player zero-sum sto- of player i in which he plays, after each finite
chastic game admits at most one equilibrium pay- history (s1, a1, s2, a2,. . ., st), an optimal
mixed
off at every initial state s1, which is termed the action in the matrix game Glst ^v l is a
value of the game at the initial state s1. Each l-discounted 0-optimal strategy in the
player’s strategy which is part of an e-equilibrium stochastic game.
is termed e-optimal. The definition of e-equilibrium
implies that an e-optimal strategy guarantees the
Example 9 Consider the two-player zero-sum
value up to e; for example, in the T-stage evalua-
game with three states: S = {s0, s1, s2} that
tion, if s1 is an e-optimal strategy of Player 1, then
appears in Fig. 1, each entry of the matrix indi-
for every strategy of Player 2 we have
cates the payoff that Player 2 (the column player)
gT1 ðs1 , s1 , s2 Þ vT ðs1 Þ e,
pays Player 1 (the row player, the payoff is in the
where vT (s1) is the T-stage value at s1.
middle), and the transitions (which are determin-
In his seminal work, Shapley (1953) presented
istic and are denoted at the top-right corner).
the model of two-player zero-sum stochastic
The states s0 and s1 are absorbing: once the
games with finite state and action spaces and
play reaches one of these states, it never leaves
proved the following.
it. State s2 is nonabsorbing. Stochastic games with
a single nonabsorbing state are called absorbing
Theorem 8 (Shapley 1953) For every two- games. For every v = (v0, v1, v2) V = R3, the
player zero-sum stochastic game, the
games Gls ðvÞ s S are depicted in Fig. 2.
l-discounted value at every initial state exists.
The unique fixed point^v of the operator val(Gl)
Moreover, both players have l-discounted
must satisfy
0-optimal stationary strategies.
The l-discounted value of the game at the
initial state s1 is denoted by vl(s1).
• ^v 0 ¼ val Gls0 ð^v Þ , so that vl ðs0 Þ ¼ ^v 0 ¼ 0;
Proof Let V be the space of all functions v:
• ^v 1 ¼ val Gls1 ð^v Þ , so that vl ðs1 Þ ¼ ^v 1 ¼ 1;
S ! R. For every v V and s S, define a
zero-sum matrix game Gls ðvÞ as follows: • ^v 2 ¼ val Gls1 ð^v Þ :
234 Stochastic Games
Stochastic Games, L R
Fig. 1 The game in
T 0 s
2
1 s
1 L L
Example 9
1 0 1 0
B 1 s 0 s T 1 s T 0 s
State s2 State s1 State s0
Stochastic Games, L R
Fig. 2 The games
l T (1 − λ)v2 λ + (1 − λ)v1 L L
Gs ðvÞ s S in Example 9
B λ + (1 − λ)v1 (1 − λ)v0 T λ + (1 − λ)v1 T (1 − λ)v0
The game Gλs2 The game Gλs1 The game Gλs0
Stochastic Games, L R
Fig. 3 The Big Match
T 0 s
2
1 s
2 L L
game in Example 10
1 0 1 0
B 1 s 0 s T 1 s T 0 s
State s2 State s1 State s0
initial state s2 is 12 and a l-discounted stationary rate is sufficiently low. In other words, Player 1 can-
0-optimal strategy (i.e., at each stage Player 2 plays not guarantee 12, but he may guarantee an amount as
L with probability 1
R with probability 12 ) for close to 12 as he wishes by choosing M to be
1 2 and
Player 2 is 2 ðLÞ, 12 ðRÞ . Indeed, if Player 1 plays T, sufficiently large. The strategy sM 1 is defined as
then the expected stage payoff is 12 and play remains follows: at stage t, play B with probability
1
at s2, while if Player 1 plays B, then the game moves ðMþl r Þ2
, where lt is the number of stages up to
t t
to an absorbing state, and the expected stage payoff stage t in which Player 2 played L and rt is the
from that stage onwards is 12 . In particular, this number of stages up to stage t in which Player
strategy guarantees 12 for Player 2 both in the limsup 2 played R.
evaluation and uniformly. h A l-discountedi0-optimal Since rt + lt = t – 1, one has rt lt = 2rt (t 1).
1 1
strategy for Player 1 is 1þl ðT Þ, 1þl ð BÞ . The quantity rt is the total payoff that Player
What can Player 1 guarantee in the limsup eval- 1 received in the first t 1 stages if Player
uation and uniformly? If Player 1 plays the station- 1 played T in those stages (and the game was
ary strategy [x(T), (1 x)(B)] that plays at each not absorbed). Thus, this total payoff is a linear
stage the action T with probability x and the action function of the difference rt lt. When presented
B with probability 1 x, then Player 2 has a reply this way, the strategy sM 1 depends on the total
that ensures that the limsup payoff is 0: if x = 1 and payoff up to the current stage. Observe that as rt
Player 2 always plays L, the payoff is 0 at each stage; increases, rt lt increases as well and the prob-
if x < 1 and Player 2 always plays R, the payoff is ability to play B decreases.
1 until the play moves to s0, and then it is 0 forever. Mertens and Neyman (1981) generalized the
Since Player 1 plays the action B with probability idea presented at the end of Example 10 to sto-
1 x > 0 at each stage, the distribution of the stage chastic games with finite state and action spaces.
in which play moves to s0 is geometric. Therefore, the (Mertens and Neyman’s (1981) result actually
limsup payoff is 0, and if l is sufficiently small, the holds in every stochastic game that satisfies
discounted payoff is close to 0. proper conditions, which are always satisfied
One can verify that if Player 1 uses a bounded- when the state and action spaces are finite.)
recall strategy, that is, a strategy that uses only the
last k actions that were played, Player 2 has a Theorem 11 If the state and action spaces of a
reply that guarantees that the limsup payoff is two-player zero-sum stochastic game are finite,
0 and the discounted payoff is close to 0, provided the game has a uniform value v0(s) at every
l is close to 0. Thus, in the limsup payoff, and in initial state s S. Moreover, v0(s) = liml!0
the uniform game, finite memory cannot guaran- vl(s) = limT!1 vT (s).
tee more than 0 in this game (see also Fortnow In their proof, Mertens and Neyman describe a
and Kimmel (1998)). uniform e-optimal strategy. In this strategy the
To get a limsup payoff higher than 0, Player player keeps a parameter, lt, which is a fictitious
1 would like to condition the probability of playing discount rate to use at stage t. This parameter
T on the past behavior of Player 2: if in the past changes at each stage as a function of the stage
Player 2 played the action L more often than the payoff; if the stage payoff at stage t is high, then lt+1
action R, he would have liked to play T with higher < lt, whereas if the stage payoff at stage t is low,
probability; if in the past Player 2 played the action then lt+1 > lt. The intuition is as follows. As
R more often than the action L, he would have liked mentioned before, in stochastic games there are
to play B with higher probability. Blackwell and two forces that influence the player’s behavior: he
Ferguson (1968) constructed a family of good strat- tries to get high stage payoffs, while keeping
egies sM 1 ,MN for Player 1. The parameter future prospects high (by playing in such a way
M determines the amount that the strategy guaran- that the next stage that is reached is favorable).
tees: the strategy sM1 guarantees a limsup payoff and When considering the l-discounted payoff, there
a discounted payoff of M1 2M , provided the discount is a clear comparison between the importance of
236 Stochastic Games
q(s2 | s0 , x1 (s0 ))
q(s1 | s0 , x1 (s0 )) q(s3 | s2 , x2 (s2 ))
Denote by Q1l the inertia rate of the left-hand e.g., h(x) = 2 +pffiffisin(ln(
ffi ln(x))). One
can
side part of the game in Fig. 5 and by Q2l the compute x2, l plffiffiffi. Since h0 ðxÞ ¼ o 1x , we
inertia rate of the right-hand side part of the game have h x2, l plffiffi; and substituting this
in Fig. 5. Using Shapley’s result, one can show back gives Q2l 2hðllÞ.
that the l-discounted value in states s0 and s2
satisfies In light of the example by Vigeral (2013), it is
of interest to find classes of stochastic games with
vl ðs0 Þ ¼ Q1l ð1Þ þ 1 Q1l vl ðs2 Þ,
infinite action spaces in which the asymptotic
vl ðs2 Þ ¼ Q1l ðþ1Þ þ 1 Q2l vl ðs0 Þ,
value or even the uniform value does exist.
and therefore
A first step in this direction using algebraic tools
is presented in Bolte et al. (2015). Further
Q2l Q1l Q1l Q2l Q2 Q1l Q1l Q2l
v l ðs0 Þ ¼ 2 1 1 2
, vl ðs2 Þ ¼ l2 : advances in zero-sum stochastic games, and
Ql þ Ql Ql Ql Ql þ Q1l Q1l Q2l
zero-sum dynamic games in general, can be
In particular, if for each i = 1, 2 there is a found in Laraki and Sorin (2015).
continuous, positive, and bounded (the notation
g(l) ~ h(l) as l ! 0 means liml!0 ghððllÞÞ ¼ 1: )
Multiplayer Games
function f i: (0, 1] ! R, bounded away from
0 and satisfying Qil lr f i ðlÞ as l ! 0, for Takahashi (1964) and Fink (1964) extended
some r > 0, then
f 1 ð lÞ Shapley’s (1953) result to discounted equilibria
1 in nonzero-sum games.
f 2 ð lÞ
vl ðs0 Þ vl ðs2 Þ :
f 1 ð lÞ
1þ Theorem 13 Every stochastic game with finite
f 2 ð lÞ
state and action spaces has a l-discounted equi-
Consequently, to find a stochastic game with
librium in stationary strategies.
no asymptotic value, it becomes a matter to find
functions f 1 and f 2 such that Q1l and Q2l each has Proof The proof utilizes Kakutani’s fixed point the-
the form above with the same exponent r, one in orem (Kakutani, 1941). Let M = maxi,s,a |ui(s, a)| be
which liml!1 f i(l) exists for i = 1 but not for a bound on the absolute values of the payoffs. Set
i = 2. One pair of functions (f 1, f 2) that has this
property is: X = i N,s S (D(Ai(s)) [M, M]). A point x
¼ xAi, s , xVi, s X is a collection of one
• Assume q(s2 | s0, x) = x and q(s1 | s0, x) = x2. i N, s S
Minimizing (1), which is now a function of a mixed action and one payoff to each player at
qffiffiffiffiffiffi every state. For every v = (vi)i N [M,
single variable, gives x1, l ¼ 1l l
, and there- M]NS and every s S, define a matrix game Gls
1
p ffiffi
ffi
fore Ql 2 l. ðvÞ as follows:
• Assume q(s3 | s2, y) = y2 and q(s2 | s2,
y) = y = h(y), where h is bounded away from • The action spaces of each player i is Ai(s).
0, h0 ðxÞ ¼ o 1x , and limx!0 h(x) does not exist, • The payoff to player i is
238 Stochastic Games
X
lui ðs, aÞ þ ð1 lÞ qðs0 j s, aÞvi ðs0 Þ: these works assume some continuity conditions
s0 S on the transition function. Recently Levy (2013a)
and Levy and McLennan (2015) have demon-
We define a set-valued function ’: X ! X as strated that general discounted stochastic games
follows: may not possess stationary equilibria, presenting
examples both in the framework of games with a
• For every player i N and every state s S, deterministic transition function, and in the case
’Ai, s is the set of all best responses of player i to of absolutely continuous transition function (the
the strategy vector xi , s : = (xj , s)j 6¼ i in the latter assumption being undertaken, e.g., in
game Gls ðvÞ. That is, Nowak and Raghavan (1992) and Nowak
(2003b)).
’Ai, s ðx, vÞ :¼ argmaxyi, s DðAi ðsÞÞ lr i s, yi, s , xi, s As in the case of zero-sum games, a dynamic
!
X programming argument coupled with Nash’s equi-
0
þð1 lÞ q s j s, yi, s, xi, s vi, s0 : librium theorem shows that under a strong conti-
s0 S
nuity assumption on the payoff function or on the
transition function, a T-stage equilibrium exists.
• For every player i N and every state s S, the Little is known regarding the existence of the
quantity ’Vi, s ðxs , vÞ is the payoff for player i in the limsup equilibrium and uniform equilibrium, even
game Gls ðvÞ , when the player plays the mixed when the sets of states and actions are finite. The
action profile xs in the game Gls ðvÞ. That is, following classical example, due to Sorin (1986)
and coined the “Paris Match,” demonstrates a sort
X
’Vs ðx, vÞ :¼ lr ðs, xs Þ þ ð1 lÞ qðs0 j s, xs Þvi, s0 : of discontinuity between the sets of discounted
s0 S and finite-stage equilibria on the one hand and
the sets of limsup and uniform equilibria on the
The set-valued function ’ has convex and non- other hand.
empty values, and its graph is closed, so that by
Kakutani’s fixed point theorem, it has a fixed
point. It turns out that if (x , v ) is a fixed point Example 14 Paris Match Consider the two-
of ’, then the stationary strategy profile x is a player nonzero-sum stochastic game with two
stationary l-discounted equilibrium with absorbing states and one nonabsorbing state,
corresponding payoff v . which appears in Fig. 6 and is quite similar to
This result readily extends to games with dis- the Big Match.
crete countable state spaces, but attempts at prov- Suppose the initial state is s2. Like in the Big
ing existence of stationary equilibrium when the Match, as long as Player 1 plays action T, the play
state space is generally proved elusive. Some remains at state s2; once he plays action B, the play
deduced the existence of such equilibria in spe- moves to either state s0 or s1 and is effectively
cific classes of games, e.g., Amir (1996) and Horst terminated. Unlike in zero-sum games, in which
(2005). Other works establish existence of equi- the discounted values and finite stage values con-
libria in more complex (history-dependent) verge to the uniform value, Sorin (1986) demon-
strategies – see Mertens and Parthasarathy strates that for any discount rate, the only
(1987) and Solan (1998) – or in correlated equi- equilibrium payoff is V :¼ 12 , 23 . A similar conclu-
librium, e.g., Nowak and Raghavan (1992); all sion holds for the finitely repeated game. However,
Stochastic Games, L R
Fig. 6 The game in
T (1, 0) s 2
(0, 1) s2 L L
Example 14
1 0 1
B (0, 2) s (1, 0) s T (0, 2) s T (1, 0) s0
State s2 State s1 State s0
Stochastic Games 239
Playing the stationary strategy 13 ðLÞ, 23 ðRÞ guar- payoff, then W1 V1, W2 V2. We next argue that
antees Player 2 at least V 2 ¼ 23 . Since Player 1’s the probability of absorption under any limsup and
payoffs are identical to those in the Big Match, he uniform e-equilibrium must be close to one.
can guarantee V 1 ¼ 12. Hence, if W = (W1, W2) is a Indeed, to quote Sorin (1986), “If the probability
l-discounted equilibrium payoff under equilib- of getting an absorbing payoff on the equilibrium
rium strategies (s1, s2), then it satisfies path is less than 1, then after some time Player 1 is
W1 V1, W2 V2. Standard continuity arguments essentially playing T; the corresponding feasible
show that the set of l-discounted equilibrium pay- rewards from this stage on are not individually
offs is closed. Assume then that W is a rational [as the payoffs from that stage onward
l-discounted equilibrium payoff with the largest would sum to 1], hence a contradiction.”
payoff W2 for Player 2 among all l-discounted Conversely, for a payoff W = (p, 2(1 p)) with
1
equilibrium payoff. We will show W2 V2. A sim- 2 p 23 , let s2 be the stationary strategy that
ilar argument shows that if W1 is the largest plays the mixed action [(1 p)(L), p(R)] at each
payoff for Player 1 in any equilibrium, then stage, and let se1 be an e-optimal strategy in the
W1 V1. Together, these implications show that auxiliary zero-sum game that appears in Fig. 8.
V is the only l-discounted equilibrium payoff. The auxiliary game is similar to the Big Match
It is easy to check that in equilibrium the strat- (Example 10), and an e-optimal strategy for
egies of the two players play at the first stage a Player 1 can be constructed similarly to the con-
fully mixed action. Let WL (resp. WR) denote the struction in that game or using the general con-
expected payoff for Player 2 from the second stage struction of Mertens and Neyman (1981). Either
onward (re-normalized) conditional on the action of these explicit constructions shows that play
L (resp. R) being played at the first stage. Then WL under se1 , s2 absorbs with probability 1 and
240 Stochastic Games
Stochastic Games, L R
Fig. 8 The auxiliary game
T p s 2
p − 1 s2 L L
B −p s1 1 − p s0 T −p s1 T 1 − p s0
State s2 State s1 State s0
gives the desired payoff. One thenverifies using existence of e-subgame-perfect equilibrium with
Big Match-type arguments that se1 , s2 is a upper-semicontinuous payoffs. Flesch and Pre-
limsup e-equilibrium and a uniform dtetchinski (2017) generalized this result to
e-equilibrium. games in which payoff continuity fails on a
The most significant result in the study of the sigma-discrete set (a countable union of discrete
uniform equilibrium so far has been Vieille sets). Flesch et al. (2014) present an example of a
(2000a,b), who proved that every two-player sto- perfect information game in which no e-subgame-
chastic game with finite state and action spaces perfect equilibrium exists. (Payoffs are Borel and
has a uniform e-equilibrium for every e > 0. This described explicitly.) For a recent survey on
result has been proved for other classes of sto- nonzero-sum stochastic games, the reader is
chastic games; see, e.g., Thuijsman and Raghavan referred to Jaśkiewicz and Nowak (2017b).
(1997), Solan (1999), Solan and Vieille (2001),
Simon (2003, 2007, 2012), Altman et al. (2008),
Flesch et al. (2007), and Flesch et al. (2008, Correlated Equilibrium
2009). Several influential works in this area are
Kohlberg (1974), Vrieze and Thuijsman (1989), The notion of correlated equilibrium was intro-
and Flesch et al. (1997). Most of the papers men- duced by Aumann (1974, 1987); see also Forges
tioned above rely on the vanishing discount rate (2007). A correlated equilibrium of a static game
approach, which constructs a uniform is an equilibrium of an extended game, in which
e-equilibrium by studying a sequence of each player receives at the outset of the game a
l-discounted equilibria as the discount rate goes private signal such that the vector of signals is
to 0. chosen according to a known joint probability
For games with general state and action spaces, distribution. In repeated interactions, such as in
a limsup equilibrium exists under an ergodicity stochastic games, there are three natural notions of
assumption on the transitions; see, e.g., Nowak correlated equilibria: (a) each player receives one
(2003b, Remark 4) and Jaśkiewicz and Nowak signal at the outset of the game (normal-form
(2005, 2006). correlated equilibrium); (b) each player receives
A particular class of games of interest is those a signal at each stage (extensive-form correlated
with perfect information, i.e., those games in equilibrium); and (c) at every stage the players
which there are no simultaneous moves, and observe a public signal that is, without loss of
both players observe past play. Existence of equi- generality, taken to be uniformly distributed in
librium in this class was proven by Mertens [0, 1] (extensive-form public correlated equilib-
(1987) in a very general setup of perfect informa- rium). It follows from Forges (1990) that when the
tion games. However, it leaves the existence of state and action sets are finite, the set of all corre-
e-subgame-perfect equilibrium, that is, a strategy lated T-stage equilibrium payoffs (either normal
profile inducing e-equilibria in every subgame. form or extensive form) is a polytope.
Flesch et al. (2010) show that e-subgame-perfect Nowak and Raghavan (1992) proved the exis-
equilibrium exists when the payoffs depend in a tence of stationary discounted extensive-form
lower-semicontinuous way on the history of play. public correlated equilibrium under weak condi-
(The sequence of plays is endowed with the tions on the state and action spaces. Roughly, their
Tychonoff topology.) Purves and Sudderth approach is to apply Kakutani’s fixed point theo-
(2011) complemented this by proving the rem to the set-valued function that assigns to each
Stochastic Games 241
game Gls ðvÞ the set of all correlated equilibrium the T-stage equilibrium, or the limsup equilib-
payoffs in this game, which is convex and com- rium: An alternative description of the game is to
pact. This approach has since been generalized by view the strategies as being chosen at the onset of
Duggan (2012) to equilibrium in games in which the game, hence reducing the stochastic game to a
the state component contains enough “noise,” one-shot game in which the actions are the strat-
which is used to replace the public signal. egies in the stochastic games. This space has a
Solan and Vieille (2002) proved the existence natural topological structure in which two strate-
of an extensive-form correlated uniform equilib- gies are close if they approximately agree for a
rium payoff when the state and action spaces are long initial period. This reduction is feasible
finite. Their approach is to let each player play his regardless of the information structure. The
uniform optimal strategy in a zero-sum game in discounted and T-stage payoffs are continuous
which all other players try to minimize his payoff. with respect to this structure, and hence Nash’s
Existence of a normal-form correlated equilib- equilibrium theorem applies and allows us to
rium was proved for the class of absorbing deduce the existence of equilibrium (see, e.g.,
games (Solan and Vohra, 2002). Altman and Solan, 2009). This approach may be
Solan (2001) characterized the set of also used for the limsup equilibrium under a
extensive-form correlated equilibrium payoffs proper ergodicity condition.
for general state and action spaces and a general Whenever there exists an equilibrium in sta-
evaluation on the stage payoffs and provided a tionary strategies (e.g., a discounted equilibrium
sufficient condition that ensures that the set of in games with finitely many states and actions),
normal-form correlated equilibrium payoffs coin- the only information that players need in order to
cides with the set of extensive-form correlated follow the equilibrium strategies is the current
equilibrium payoffs. Using these techniques, state. In particular, they need not observe past
Mashiach-Yaakovi (2015) showed the existence actions of the other players. As we now show, in
of extensive-form correlated equilibrium in sto- the Big Match game the limsup value and the
chastic games when the payoffs are given by uniform value may fail to exist when each player
general bounded Borel functions. does not observe the past actions of the other
player.
players know the past play. There are cases in playing the stationary strategy 12 ðLÞ, 12 ðRÞ . One
which this assumption is too strong; in some can show that for every strategy of Player 2, Player
cases players do not know the complete descrip- 1 has a reply such that the limsup payoff is at least 12.
tion of the current state (Examples 3 and 4), and in In other words, infs2 sups1 g1 ðs2 , s1 , s2 Þ ¼ 12.
others players do not fully observe the actions of We now argue that sups1 infs2 g1(s2, s1,
all other players (Examples 2, 3, and 4). For a s2) = 0. Indeed, fix a strategy s1 for Player 1
most general description of stochastic games, see and e > 0. Let y N be sufficiently large such
Mertens et al. (2015, Chapter IV) and Coulomb that the probability that under s1 Player 1 plays
(2003b). In this model, at every stage each player action B for the first time after stage y is at most e.
observes a private signal, which depends on the If no such y exists, then absorption occurs a.s., so
current state and on the actions taken by the the best response of Player 2 is to always play
players at the current stage. action R. Observe that as t increases, the proba-
The following observation can be used to show bility that Player 1 plays action B for the first time
the existence of equilibrium in some classes of after stage t decreases to 0, so that such y exists.
games, in particular for discounted equilibrium, Consider the following strategy s2 of Player 2:
242 Stochastic Games
play action R up to stage y and play L from stage station depends on the power attenuations of all
t + 1 and on. By the definition of y, Player 1 plays the mobiles. Finally, the stage payoff is the stage
action B either before or at stage y, and then the power consumption.
game moves to state s0, and the payoff is 0 at each Rosenberg et al. (2009) studied the extreme
stage thereafter; or Player 1 plays action T at case of two-player zero-sum games in which the
each stage, and then the stage payoff after stage players observe neither the current state nor the
y is 0, or, with probability less than e, Player action of the other player and proved that the
1 plays action B for the first time after stage y, uniform value does exist in two classes of
the play moves to state s1, and the payoff is games, which capture the behavior of certain com-
1 thereafter. Thus, the limsup payoff is at most e. munication protocols. Classes of games in which
A similar analysis shows that for every the actions are observed but the state is not
0 < l < 1, observed were studied, e.g., by Sorin (1984,
inf s2 sups1 gl ðs2 , s1 , s2 Þ ¼ 12 , 1985), Sorin and Zamir (1991), Krausz and Rieder
(1997), Flesch et al. (2003), Neyman (2008),
however
Rosenberg et al. (2004), Renault (2006, 2012),
sups2 inf s1 liml!0 gl ðs2 , s1 , s2 Þ ¼ 0, and Gensbittel and Renault (2015). For additional
so that the uniform value does not exist as well. results, see Rosenberg et al. (2003), Coulomb
This example shows that in general the limsup (2003a,c), and Sorin (2003).
value and the uniform value need not exist when Until recently, several conjectures raised by
the players do not observe past play. Though in Mertens (1987) had remained open in the general
general the value (and therefore also an equilib- model of zero-sum repeated games that allows for
rium) need not exist, in many classes of stochastic imperfect observation of states and actions. Pri-
games, the value or an equilibrium does exist, marily, it was conjectured that when the number of
even in the presence of imperfect monitoring. states, actions, and signals is finite, the limits
Rosenberg et al. (2002) and Renault (2011) liml!0 vl and limn!1 vn exist and are equal.
showed that the uniform value exists in the one This conjecture was recently shown to be false
player setup (Markov decision problem), in which by Ziliotto (2016). Ziliotto (2016) presents an
the player receives partial information regarding example of a game with symmetric information
the current state. Thus, a single decision-maker (i.e., actions are observed, and consequently at
who faces a dynamic situation and does not fully every stage, both players have the same beliefs
observe the state of the environment can play in over the set of states) in which liml!0 vl does not
such a way that guarantees high payoff, provided exist. In fact, in that example, in each state only
the interaction is sufficiently long or the discount one of the players controls the payoffs and transi-
rate is sufficiently low. tions, and although states are not observed
Altman et al. (2005, 2008) studied stochastic directly, payoffs are known. Ziliotto (2016) then
games in which each player has a “private” state, shows how the example can be modified so that
which only he can observe, and the state of the neither liml!0 vl nor limn!1 vn exist. Lastly,
world is composed of the vector of private states; Ziliotto (2016) presents examples of a state-blind
each player also does not observe the action of the stochastic game (players observe actions but get
other players. Such games arise naturally in wire- no information about the state) and an example of
less communication (see Altman et al., 2005); a game with one state-blind player (i.e., one player
take, for example, several mobiles which period- knows the states, but the other receives no infor-
ically send information to a base station. The mation about it) in which the asymptotic value
private state of a mobile may depend, e.g., on its does not exist. See also Sorin and Vigeral (2015).
exact physical environment, and it determines the Another type of incomplete information is the
power attenuation between the mobile and the presence of information lag, that is, a delay in
base station. The throughput (the amount of bits learning about the actions of an opponent or of
per second) that a mobile can send to the base the state evolution; see Levy (2012) and the
Stochastic Games 243
references therein. Yet another interesting direc- has nonempty interior (ruling out, e.g., the case
tion that introduces uncertainty of a different kind when two agents have identical payoffs),
is to observe games of finite length but with Fudenberg and Maskin (1986) show that all such
uncertainty present as to how long the game will payoffs can arise in subgame-perfect equilibria as
proceed. Neyman and Sorin (2010) studies such well for small enough discount rate. This result
game where the probabilistic information about was extended to the case of public imperfect mon-
the duration is public (i.e., known to both players), itoring of actions (i.e., players do not observe
deduces a recursive formula for the value of such a others’ actions, but rather the action profile deter-
game analogous to that of the value for a game of mines a distribution over public signals), assum-
fixed length, and establishes convergence of the ing that the public signals about the action profiles
value as the expectation of the duration goes to satisfy conditions that effectively allow, in the
infinity. See reference therein for previous works long-run, detection of deviations and the identity
on asymmetric-information uncertain duration of the deviator, known as “full rank conditions”;
processes in repeated prisoners’ dilemma and see Fudenberg et al. (1994).
repeated games of incomplete information. Dutta (1995) extended the classical folk theo-
rem to stochastic games, both for the discounted
evaluation and for the T-stage evaluation. The
l-discounted feasible set of payoffs, which
Folk Theorems
depends on the initial state s, includes the payoffs
that can be achieved in the entire play of the game
Another direction of study is the topic of Folk
and not just in the one-shot game:
Theorems. These results, dating back to Aumann
Fl(s) : = co
and Shapley (1976) and Rubinstein (1979), orig-
({gl(s, s)| s is any strategy
profile}),
inated in the study of ordinary repeated games
where gl ðs, sÞ :¼ ggi ðs, sÞ i N is the vector
(i.e., a single state played repeatedly) and attempt
of payoffs. It is shown there that it suffices to take
to characterize the set of possible equilibrium
the convex hull of payoffs of pure stationary pro-
payoffs as the players become more and more
file payoffs. Dutta (1995) similarly defines the
patient (i.e., as the discount rate l goes to 0). If
feasible set for the long-run average stage game
the payoff in the repeated game is given by a
as the convex hull of limits of payoffs in the
function u(), assigning to each action profile
T-stage games as T goes to infinity:
a in the set of action profiles A = ∏i NAi a
FðsÞ :¼ coðflimT!1 gT ðs, sÞj s is any
payoff vector u(a), then the set of feasible payoff
strategy profilegÞ,
vectors is given by
where limT!1 gT (s, s) refers to the set of
F = co{u(a)| a A},
accumulation points. In particular, a single strat-
where co() denotes the convex hull. A payoff
egy profile may yield multiple points in F(s).
vector v = (v1, . . ., vn) is said to be (strictly)
Individual rationality as well refers to being
individually rational if for every player i,
higher than the minmax of the entire game; i.e.,
vi > mi :¼ min max ui ðai , xi Þ; the vector v RN is (strictly) individually ratio-
xi ∏j6¼i DðAi Þ ai Ai
nal in the l-discounted game if
vi > mi ðs, lÞ :¼ inf si supsi gli ðs, sÞ
i.e., if it is higher than the minmax value in and is (strictly) individually rational in the
mixed strategies for all players. The folk theorem long-run average game if
for repeated games (with perfect monitoring)
states that for every feasible and individually vi > mi ðsÞ :¼ inf si supsi lim inf T!1 gli ðs, sÞ:
rational payoff vector v, there is a discount rate To derive folk theorems, Dutta (1995) assumes
l0 such that if l l0, v is an equilibrium payoff of asymptotic state independence; that is, the limits
the l-discounted game. In fact, assuming the set of liml!0 mi(s, l) for i N and liml!0 Fl(s) are
feasible and individually rational payoff vectors independent of the state. This assumption holds,
244 Stochastic Games
for example, in the case of irreducible stochastic value in zero-sum stochastic game or equilibria in
games with finite state space, i.e., games in which nonzero-sum games. Moreover, in Example 9 the
any state will be reached eventually starting at any discounted value may be irrational for rational
other state, regardless of the strategies used by the discount rates, even though the data of the game
players. Assuming the full dimensionality of the (payoffs and transitions) are rational, so it is not
set of feasible payoffs, any long-run average fea- clear whether linear programming methods can be
sible and individually rational payoff vector cor- used to calculate the value of a stochastic game.
responds to a subgame-perfect equilibrium when Nevertheless, linear programming methods were
the players are sufficiently patient. used to calculate the discounted and uniform
More recently, there have been attempts to value of several classes of stochastic games; see
generalize the imperfect public monitoring frame- Filar and Vrieze (1996) and Raghavan and Syed
work of Fudenberg et al. (1994) to stochastic (2002). Other methods that were used to calculate
games. This has been achieved by Fudenberg the value or equilibria in discounted stochastic
and Yamamoto (2011) and Hörner et al. (2011), games include fictitious play (Vrieze and Tijs
under the assumptions of asymptotic state inde- 1982), value iterates, policy improvement, and
pendence, full dimensionality of payoffs, and the general methods to find the maximum of a function
“full rank conditions” of the public signals; the (see Filar and Vrieze (1996) and Raghavan and
latter work also discusses algorithms for comput- Syed (2003)), a homotopy method (Herings and
ing such strategies. Peski and Wiseman (2015) Peeters, 2004), and algorithms to solve sentences
derive similar results, but rather than assuming in formal logic (Chatterjee et al. (2008) and Solan
players become more and more patient, they and Vieille (2010)).
assume that the duration of each stage of play There are two related questions one can ask
becomes shorter. Aiba (2014) establishes a folk from a computational point of view: The com-
theorem for irreducible stochastic games with pri- plexity of finding the value and the complexity
vate almost-perfect monitoring of actions, of finding e-optimal strategies. The earliest study
extending the results Hörner and Olszewski in this venue is Condon (1992), who concentrates
(2006) have established for repeated games. on a subclass of zero-sum games termed simple
stochastic games. In such games, there are absorb-
ing vertices (sinks) with payoffs 0 or 1, and at all
Algorithms other vertices, reward is 0, and either Player 1, or
Player 2, or Nature chooses which of possible
There has been extensive work on algorithms for neighbors to continue on to. Simple stochastic
computing the value and optimal strategies (and, games are games with perfect information; hence
in some cases, equilibria) of stochastic games, both players have pure and stationary limsup opti-
particularly in light of their important applications mal strategies. Condon (1992) studies the com-
and the increased interest of computer scientists in plexity of finding whether the value of the game is
game-theoretic questions (see Nisan et al. (2007) larger than some constant a and shows that this
or Papadimitriou (2007)). problem lies in the intersection of the complexity
It is well known that the value of a two-player class NP, those solvable in polynomial time using
zero-sum matrix game and optimal strategies for randomized algorithms, and coNP, those whose
the two players can be calculated efficiently using a negation lie in NP. It is also shown that if transi-
linear program. Equilibria in two-player nonzero- tions are controlled by only one player and
sum games can be calculated by the Lemke- Nature, or by the players but not by Nature, then
Howson algorithm, which is usually efficient; the problem is solvable in polynomial time.
however, its worst running time is exponential in Andersson and Miltersen (2009) and the refer-
the number of pure strategies of the players (Savani ences therein show that solving simple stochastic
and von Stengel 2004). Unfortunately, to date there games (either for the value or for e-optimal strat-
are no efficient algorithms to calculate either the egies) is as computationally complex (i.e.,
Stochastic Games 245
polynomial-time reducible to/form) solving either of phenomena, from the study of occurrence of
the discounted or limiting average games with financial crises, in which the state of the economy
perfect information in general. can change drastically in a very short time, to
Chatterjee et al. (2008) show that the uniform sports matches like soccer, in which the score
value of a finite-state two-player zero-sum sto- can change in a split second.
chastic game with limit-average payoff can be The earliest model of continuous-time stochas-
approximated to within e in time exponential in a tic games (called there Markov games) goes back
polynomial in the size of the game times a poly- to Zachrisson (1964), who studied zero-sum
nomial in the logarithm of 1/e. Hansen et al. games played on a fixed interval of time [0, T].
(2011) give algorithms for solving stochastic In such games, the payoffs are given by integra-
games with either discounted or limsup evalua- tion over the play of the game of the instantaneous
tions. The run-times are in general quite long; payoffs: i.e., given the running payoff which
however, when the number of states is fixed, the assigns to every state s, every action profile a,
algorithms run in polynomial time. Hansen et al. and every player i an instantaneous payoff ui(s, a),
(2016) give algorithms for computing e-optimal the total payoff is given by
strategies which use O(loglogT) space, T being RT
0 ui ðs , a Þdt,
t t
the stage, improving on the previous O(logT)
space required by the strategies demonstrated ear- where st and at are the state and action profile at
lier in this paper. In a somewhat different strain, time t. The current state and action profile deter-
Bertrand et al. (2009) study stochastic games with mine the rate of transition. That is, the transitions
imperfect monitoring of actions and public sig- occur in “jumps,” and when the state at time t is st
nals. Instead of studying the discounted or limsup and action profile at is played from time t to t + d,
evaluations, they set winning criteria such as the probability of moving to state s0 between time
reachability/safety (i.e., reaching a specific subset t and t + d is approximately dq(s0 | st, at).
of states eventually/never) or Büchi/co-Büchi Zachrisson (1964) concentrates on Markovian
(i.e., reaching a specific subset of states infinitely strategies, which are a function of only state and
many times/at most finitely many times). Giving time, and shows in particular that the value of the
such an objective for, say, Player 1, the problem zero-sum games exists, as well as do optimal
is to determine whether Player 1 has a strategy Markovian strategies, when the action spaces are
that wins almost surely. In general, they show compact and convex and the transitions and pay-
that solving reachability games is 2EXP offs are multilinear in the actions of the players.
T IME-complete (2EXPTIME is the class of prob- This assumption holds in particular when the
lems solvable in time O(22p(n)) for some polyno- action ati of player i at time t is in fact a mixed
mial p().), as is solving stochastic games with the action over a finite set of actions.
Büchi winning criterion. Surprisingly, games with Until recently, this model has received very
the co-Büchi winning criterion are in general little attention, until being revived by Neyman
undecidable, i.e., cannot be solved by a Turing (2012), who considers multiplayer stochastic
machine. games with infinite-time horizon and continuous
time. As had been well known from other models
of continuous-time games (e.g., differential
Continuous-Time Games games; see Friedman (1971)), there is some diffi-
culty in defining strategies in continuous time.
The model we have presented is played in discrete Allowing a player’s strategy at time t to depend
time. A natural variation is to model the game in on everything that has transpired up to precisely
continuous time. In such a model, the states and time t is known to not induce a well-defined
the actions of the players may change at any time. probability distribution over possible plays of
As argued by, e.g., Neyman (2012), this could be the game. A technique commonly used to get
the natural framework for modeling a wide range around this problem in the differential game
246 Stochastic Games
game possesses e-equilibria. Laraki et al. (2005) Altman E, Solan E (2009) Constrained games: the impact
showed that three-player games need not possess of the attitude to Adversary’s constraints. IEEE Trans
Autom Control 54:2435–2440
e-equilibria, even when the processes (Xi, Yi, Altman E, Avrachenkov K, Marquez R, Miller G (2005)
Zi)i=1,2 and the functions (xi)i=1,2 are all constant. Zero-sum Constrained stochastic games with indepen-
See Kifer (2013) for a survey of Dynkin games dent state processes. Math Meth Oper Res 62:375–386
and applications in finance. Altman E, Avrachenkov K, Bonneau N, Debbah M,
El-Azouzi R, Sadoc MD (2008) Constrained cost-
coupled stochastic games with independent state pro-
cesses. Oper Res Lett 36:160–164
Additional and Future Directions Amir R (1996) Continuous stochastic games of capital
accumulation with convex transitions. Games Econ
Behav 15:111–131
The research on stochastic games extends to addi- Andersson D, Miltersen PB (2009) The complexity of
tional directions than those mentioned in earlier sec- solving stochastic games on graphs. In: Dong Y, Du
tions. Approximation of games with infinite state and DZ, Ibarra O (eds) Algorithms and computation.
action spaces by finite games was discussed by Whitt ISAAC 2009, Lecture notes in computer science,
vol 5878. Springer, Berlin/Heidelberg
(1980) and further developed by Nowak (1985). For Aumann RJ (1974) Subjectivity and correlation in random-
hybrid models that include both discrete-time and ized strategies. J Math Econ 1:67–96
continuous-time aspects, see, e.g., Başar and Olsder Aumann RJ (1987) Correlated equilibrium as an expres-
(1995) and Altman and Gaitsgory (1995). sion of Bayesian rationality. Econometrica 55:1–18
Aumann RJ, Shapley L (1976) Long term competition: a
Among the many directions of future research game theoretical analysis. Mimeo, Hebrew
in this area, we will mention here but a few. One University
challenging question is the existence of a uniform Bertrand N, Genest B, Gimbert H (2009) Qualitative deter-
equilibrium and a limsup equilibrium in multi- minacy and decidability of stochastic games with sig-
nals. In: 24th annual IEEE symposium on logic in
player stochastic games with finite state and computer science (LICS’09), pp 319–328
action spaces. Another active area is to establish Bewley T, Kohlberg E (1976) The asymptotic theory of
the existence of the uniform value or uniform stochastic games. Math Oper Res 1:197–208
maxmin in classes of stochastic games with Blackwell D (1956) An analog of the minimax theorem for
vector payoffs. Pacific J Math 6:1–8
imperfect monitoring. A third direction concerns Blackwell D, Ferguson TS (1968) The big match. Ann
the identification of applications that can be recast Math Stat 39:159–163
in the framework of stochastic games and that can Bolte J, Gaubert S, Vigeral G (2015) Definable zero-sum
be successfully analyzed using the theoretical stochastic games. Math Oper Res 40:171–191
Chari V, Kehoe P (1990) Sustainable plans. J Polit Econ
tools that the literature developed. Another prob- 98:783–802
lem that is of interest is the characterization of Chatterjee K, Majumdar R, Henzinger TA (2008) Stochas-
approachable and excludable sets in stochastic tic limit-average games are in EXPTIME. Int J Game
games with vector payoffs (see Blackwell (1956) Theory 37:219–234
Condon A (1992) The complexity of stochastic games. Inf
for the presentation of matrix games with vector Comput 96:203–224
payoffs and Milman (2006) and Flesch et al. Coulomb JM (2003a) Absorbing games with a signalling
(2016) for partial results regarding this problem). structure. In: Neyman A, Sorin S (eds) Stochastic
games and applications, NATO science series. Kluwer,
Dordrecht, pp 335–355
Coulomb JM (2003b) Games with a recursive structure. In:
Neyman A, Sorin S (eds) Stochastic games and applications,
Bibliography
NATO science series. Kluwer, Dordrecht, pp 427–442
Coulomb JM (2003c) Stochastic games without perfect
Primary Literature monitoring. Int J Game Theory 32:73–96
Aiba K (2014) A folk theorem for stochastic games with Duggan J (2012) Noisy stochastic games. Econometrica
private almost-perfect monitoring. Games Econ Behav 80:2017–2045
86:58–66 Dutta P (1995) A folk theorem for stochastic games. J Econ
Altman E (2005) Applications of dynamic games in Theory 66:1–32
queues. Adv Dyn Games 7:309–342. Birkhauser Dutta P, Sundaram RK (1992) Markovian equilibrium in a
Altman E, Gaitsgory VA (1995) A hybrid (differential- class of stochastic games: existence theorems for
stochastic) zero-sum game with fast stochastic part. discounted and undiscounted models. Economic The-
Ann Int Soc Dyn Games 3:47–59 ory 2:197–214
248 Stochastic Games
Dutta P, Sundaram RK (1993) The tragedy of the com- Guo X, Hernandez-Lerma O (2005) Nonzero-sum games
mons? Economic Theory 3:413–426 for continuous-time Markov chains with unbounded
Dynkin EB (1967) Game variant of a problem on optimal discounted payoffs. J Appl Probab 42:303320
stopping. Soviet Math Dokl 10:270–274 Hansen KA, Kouck M, Lauritzen N, Miltersen PB,
Filar JA, Vrieze K (1996) Competitive markov decision Tsigaridas E (2011) Exact algorithms for solving sto-
processes. Springer-Verlag, New York chastic games. In: Proceedings of the 43rd annual ACM
Fink AM (1964) Equilibrium in a stochastic n-person symposium on theory of computing (STOC’11). Asso-
game. J Sci Hiroshima Univ 28:89–93 ciation for Computing Machinery, New York,
Fleming WH, Souganidis PE (1989) On the existence of pp 205–214
value functions of two-player zero-sum stochastic dif- Hansen KA, Ibsen-Jenses R, Kouck M (2016) Algorithmic
ferential games. Indiana Univ Math J 38:293–314 Game Theory - 9th International Symposium, SAGT
Flesch J, Predtetchinski A (2017, forthcoming) 2016, Proceedings, held in Liverpool, UK; Springer-
A characterization of subgame-perfect equilibrium plays Verlag, Berlin Heidelberg
in borel games of perfect information. Math Oper Res Herings JJP, Peeters RJAP (2004) Stationary equilibria in
Flesch J, Thuijsman F, Vrieze K (1997) Cyclic Markov stochastic games: structure, selection, and computa-
equilibria in stochastic games. Int J Game Theory tion. J Econ Theory 118:32–60
26:303–314 Hörner J, Olszewski W (2006) The folk theorem for games
Flesch J, Thuijsman F, Vrieze OJ (2003) Stochastic games with private almost-perfect monitoring. Econometrica
with non-observable actions. Math Meth Oper Res 74:1499–1544
58:459–475 Hörner J, Sugaya T, Takahashi S, Vieille N (2011) Recur-
Flesch J, Thuijsman F, Vrieze OJ (2007) Stochastic games sive methods in discounted stochastic games: an
with additive transitions. Eur J Oper Res 179:483–497 alogirthm for d ! 1; and a folk theorem. Econometrica
Flesch J, Schoenmakers G, Vrieze K (2008) Stochastic games 79:1277–1318
on a product state space. Math Oper Res 33:403–420 Horst U (2005) Stationary equilibria in discounted stochas-
Flesch J, Schoenmakers G, Vrieze K (2009) Stochastic tic games with weakly interacting players. Games Econ
games on a product state space: the periodic case. Int Behav 51:83–108
J Game Theory 38:263–289 Jaśkiewicz A, Nowak AS (2005) Nonzero-sum semi-
Flesch J, Kuipers J, Mashiah-Yaakovi A, Schoenmakers G, Markov games with the expected average payoffs.
Solan E, Vrieze K (2010) Perfect-information games Math Meth Oper Res 62:23–40
with lower-semi-continuous payoffs. Math Oper Res Jaśkiewicz A, Nowak AS (2006) Zero-sum ergodic sto-
35:742–755 chastic games with feller transition probabilities. SIAM
Flesch J, Kuipers J, Mashiah-Yaakovi A, Schoenmakers G, J Control Optim 45:773–789
Shmaya E, Solan E, Vrieze K (2014) Non-existence of Kakutani S (1941) A generalization of Brouwers fixed
subgame-perfect e-equilibrium in perfect information point theorem. Duke Math J 8:457–459
games with infinite horizon. Int J Game Theory Kohlberg E (1974) Repeated games with absorbing states.
43:945–951 Ann Stat 2:724–738
Flesch J, Laraki R, Perchet V (2016) Online learning and Krausz A, Rieder U (1997) Markov games with incomplete
Blackwell approachability in quitting games. J Mach information. Math Meth Oper Res 46:263–279
Learn Res 49:941–942. Proceedings of the 29th con- Laraki R, Solan E (2013) Equilibrium in two-player non-
ference on learning theory zero-sum Dynkin games in continuous time. Stochas-
Forges F (1990) Universal mechanisms. Econometrica tics 85:997–1014
58:1341–1364 Laraki R, Solan E (2005) SIAM Journal on Control and
Fortnow L, Kimmel P (1998) Beating a finite automaton in Optimization 43:1913–1922
the big match. In: Proceedings of the 7th conference on Laraki R, Solan E, Vieille N (2005) Continuous time games
theoretical aspects of rationality and knowledge. Mor- of timing. J Econ Theory 120:206–238
gan Kaufmann, San Francisco, pp 225–234 Levhari D, Mirman L (1980) The great fish war: an exam-
Fudenberg D, Maskin E (1986) The folk theorem in ple using a dynamic Cournot-Nash solution. Bell
repeated games with discounting or with incomplete J Econ 11(1):322–334
information. Econometrica 54:533–554 Levy Y (2012) Stochastic games with information lag.
Fudenberg D, Yamamoto Y (2011) The folk theorem for Games Econ Behav 74:243–256
irreducible stochastic games with imperfect public Levy, Y. (2013a) Discounted stochastic games with no sta-
monitoring. J Econ Theory 146:1664–1683 tionary Nash equilibrium: two examples, Econometrica,
Fudenberg D, Levine D, Maskin E (1994) The folk theo- 81, 1973–2007.
rem with imperfect public information. Econometrica Levy Y (2013b) Continuous-time stochastic games of fixed
62:997–1039 duration. Dyn Games Appl 3:279–312
Gensbittel F, Renault J (2015) The value of Markov chain Levy Y, McLennan A (2015) Corrigendum to “discounted
games with Lack of information on both sides. Math stochastic games with no stationary Nash equilibrium:
Oper Res 40:820–841 two examples”. Econometrica 83:1237–1252
Gillette D (1957) Stochastic games with zero stop proba- Lovo S, Tomala T (2015) Markov perfect equilibria in
bilities, Contributions to the theory of games, stochastic revision games, HEC Paris research paper
vol 3. Princeton University Press, Princeton no. ECO/SCD-2015-1093
Stochastic Games 249
Maitra A, Sudderth W (1998) Finitely additive stochastic Papadimitriou C (2007) The complexity of computing
games with Borel measurable payoffs. Int J Game The- equilibria. In: Nisan N, Roughgarden T, Tardos E,
ory 27:257–267 Vazirani VV (eds) Algorithmic game theory. Cam-
Mamer JW (1987) Monotone stopping games. J Appl Pro- bridge University Press, Cambridge
bab 24:386–401 Peski M, Wiseman T (2015) A folk theorem for stochastic
Martin DA (1998) The determinacy of Blackwell games. games with infrequent state changes. Theor Econ
J Symb Log 63:1565–1581 10:131–173
Mashiach-Yaakovi A (2015) Correlated equilibria in sto- Phelan C, Stacchetti E (2001) Sequential equilibria in a
chastic games with Borel measurable payoffs. Dyn Ramsey tax model. Econometrica 69:1491–1518
Games Appl 5:120–135 Purves RA, Sudderth WD (2011) Perfect information
Mertens JF (1987) Repeated games. In: Proceedings of the games with upper Semicontinuous payoffs. Math
international congress of mathematicians, Berkeley, Oper Res 36:468473
pp 1528–1577 Raghavan TES, Syed Z (2002) Computing stationary Nash
Mertens JF, Neyman A (1981) Stochastic games. Int equilibria of undiscounted single-Controller stochastic
J Game Theory 10:53–66 games. Math Oper Res 27:384–400
Mertens JF, Parthasarathy T (1987) Equilibria for Raghavan TES, Syed Z (2003) A policy improvement type
discounted stochastic games, CORE discussion paper algorithm for solving zero-sum two-Person stochastic games
no. 8750. Also published. In: Neyman A, Sorin S (eds) of perfect information. Math Program Ser A 95:513–532
stochastic games and applications, NATO science Renault J (2006) The value of Markov chain games with
series. Kluwer, Dordrecht, pp 131–172 lack of information on one side. Math Oper Res
Milman E (2006) Approachable sets of vector payoffs in 31:490–512
stochastic games. Games Econ Behav 56:135–147 Renault J (2011) Uniform value in dynamic programming.
Morimoto H (1986) Non zero-sum discrete parameter sto- J Eur Math Soc 13:309–330
chastic games with stopping times. Probab Theory Renault J (2012) The value of repeated games with an
Relat Fields 72:155–160 informed controller. Math Oper Res 37:154–179
Neyman, Sorin (2010) Int J Game Theory 39:29–52 Rosenberg D, Solan E, Vieille N (2001) Stopping games
Neyman A (2008) Existence of optimal strategies in Mar- with randomized strategies. Probab Theory Relat Fields
kov games with incomplete information. Int J Game 119:433–451
Theory 37:581–596 Rosenberg D, Solan E, Vieille N (2002) Blackwell opti-
Neyman A (2012) Continuous-time stochastic games. Dis- mality in Markov decision processes with partial obser-
cussion paper, Center for the Study of Rationality, vation. Ann Stat 30:1178–1193
Jerusalem, #616 Rosenberg D, Solan E, Vieille N (2003) The maxmin of
Neyman A (2013) Stochastic games with short-stage dura- stochastic games with imperfect monitoring. Int
tion. Dyn Games Appl 3:236–278 J Game Theory 32:133–150
Neyman A, Sorin S (2010) Repeated games with public Rosenberg D, Solan E, Vieille N (2004) Stochastic games
uncertain duration process. Int J Game Theory with a single controller and incomplete information.
39:29–52 SIAM J Control Optim 43:86–110
Nowak AS (1985) Existence of equilibrium stationary Rosenberg D, Solan E, Vieille N (2009) Protocol with no
strategies in discounted Noncooperative stochastic acknowledgement. Oper Res 57:905–915
games with uncountable state space. J Optim Theory Rubinstein A (1979) Equilibrium in supergames with the
Appl 45:591–620 overtaking criterion. J Econ Theory 47:153–177
Nowak AS (2003a) Zero-sum stochastic games with Borel Sagduyu YE, Ephremides A (2003) Power control and rate
state spaces. In: Neyman A, Sorin S (eds) Stochastic adaptation as stochastic games for random access. In:
games and applications, NATO science series. Kluwer, Proceedings of the 42nd IEEE conference on decision
Dordrecht, pp 77–91 and control, vol 4, pp 4202–4207
Nowak AS (2003b) N-person stochastic games: extensions of Savani R, von Stengel B (2004) Exponentially many steps
the finite state space case and correlation. In: Neyman A, for finding a Nash equilibrium in a bimatrix game. In:
Sorin S (eds) Stochastic games and applications, NATO Proceedings of the 45th annual IEEE symposium on
science series. Kluwer, Dordrecht, pp 93–106 foundations of computer science, 2004, pp 258–267
Nowak AS (2003c) On a new class of nonzero-sum Shapley LS (1953) Stochastic games. Proc Natl Acad Sci
discounted stochastic games having stationary Nash U S A 39:1095–1100
equilibrium points. Int J Game Theory 32:121–132 Simon RS (2003) The structure of non-zero-sum stochastic
Nowak AS, Raghavan TES (1992) Existence of stationary games. Adv Appl Math 38:1–26
correlated equilibria with symmetric information for Simon RS (2007) The structure of non-zero-sum stochastic
discounted stochastic games. Math Oper Res games. Adv Appl Math 38:1–26
17:519–526 Simon RS (2012) A topological approach to Quitting
Ohtsubo Y (1987) A non zero-sum extension of Dynkin’s games. Math Oper Res 37:180–195
stopping problem. Math Oper Res 12:277–296 Solan E (1998) Discounted stochastic games. Math Oper
Ohtsubo Y (1991) On a discrete-time non-zero-sum Res 23:1010–1021
Dynkin problem with monotonicity. J Appl Probab Solan E (1999) Three-Person absorbing games. Math Oper
28:466–472 Res 24:669–698
250 Stochastic Games
Standard signaling game A signaling game in information you need to find him – or to evaluate
which strategy sets and payoff functions satisfy whether you will meet to discuss old times or to be
monotonicity properties. asked a favor. While the examples all involve sig-
Type In an incomplete information game, a var- naling, the nature of the signaling is different. The
iable that summarizes private information. doctor faces large penalties for misrepresenting her
Verifiable information game A signaling game credentials. She is not required to display all of her
with the property that each type has a signal diplomas, but it is reasonable to assume that degrees
that can only be sent by that type. are not forged. The celebrity endorsement is costly –
certainly to the manufacturer who pays for the
celebrity’s services and possibly to the celebrity
Definition of the Subject himself, whose reputation may suffer if the product
works badly. It is reasonable to assume that it is
Signaling games refer narrowly to a class of two- easier to obtain an endorsement of a good product,
player games of incomplete information in which but there are also good reasons to be skeptical about
one player is informed and the other is not. The the claims. In contrast, although a dishonest or mis-
informed player’s strategy set consists of signals leading message may lead to a bad outcome, leaving
contingent on information, and the uninformed a message is not expensive and the content of the
player’s strategy set consists of actions contingent message is not constrained by your friend’s infor-
on signals. More generally, a signaling game mation. The theory of signaling games is a useful
includes any strategic setting in which players can way to describe the essential features of all three
use the actions of their opponents to make infer- examples.
ences about hidden information. The earliest work Opportunities to send and evaluate signals
on signaling games was Spence’s (1974) model of arise in many common natural and economic set-
educational signaling and Zahari’s (1975) model of tings. In the canonical example (due to Spence
signaling by animals. During the 1980s researchers 1974), a high-ability worker invests in education
developed the formal model and identified condi- to distinguish herself from less skilled workers.
tions that permitted the selection of unique equilib- The potential employer observes educational
rium outcomes in leading models. attainment, but not innate skill, and infers that a
better educated worker is more highly skilled and
pays a higher wage. To make this story work,
there must be a reason that low-ability workers
Introduction do not get the education expected of a more highly
skilled worker and hence obtain a higher wage.
The framed degree in your doctor’s office, the celeb- This property follows from an assumption that the
rity endorsement of a popular cosmetic, and the higher the ability of the worker, the easier it is for
telephone message from an old friend are all signals. her to produce a higher signal.
The signals are potentially valuable because they The same argument appears in many applica-
allow you to infer useful information. These signals tions. For example, a low-risk driver will purchase
are indirect and require interpretation. They may be a lower cost, partial insurance contract, leaving
subject to manipulation. The doctor’s diploma tells the riskier driver to pay a higher rate for full
you something about the doctor’s qualifications, but insurance (Rothschild and Stiglitz 1976 or Wilson
knowing where and when the doctor studied does 1977). A firm that is able to produce high-quality
not prove that she is a good doctor. The endorsement goods signals this by offering a warranty for the
identifies the product with a particular lifestyle, but goods sold (Grossman 1981) or advertising exten-
what works for the celebrity may not work for you. sively. A strong deer grows extra large antlers to
Besides, the celebrity was probably paid to endorse show that it can survive with this handicap and to
the product and may not even use it. The phone signal its fitness to potential mates (Zahavi 1975).
message may tell you how to get in touch with your Game theory provides a formal language to
friend, but is unlikely to contain all of the study how one should send and interpret signals
Signaling Games 253
in strategic environments. This entry reviews the communication about ability is not possible, but
basic theory of signaling and discusses some the worker can acquire education. The employer
applications. It does not discuss related models can observe the worker’s level of education and
of screening. Kreps and Sobel (1994) and Riley use this to form a judgment about the worker’s
(2001) review both signaling and screening. true level of ability. In this application, S is a worker;
Section “The Model” describes the basic model. R represents a potential employer (or a competitive
Section “Equilibrium” defines equilibrium for the labor market); t is the worker’s productivity; s is her
basic model. Section “The Basic Model” limits level of education; and a is her wage.
attention to a special class of signaling game. I give
conditions sufficient for the existence of equilibria in
which the informed agent’s signal fully reveals her
Equilibrium
private information and argue that one equilibrium
of this kind is prominent. The next three sections
Defining Nash equilibrium for the basic signaling
study different signaling games. Section “Cheap
game is completely straightforward when T, M,
Talk” discusses models of costless communication.
and A are finite sets. In this case a behavior strat-
Section “Verifiable Information” discusses the impli-
cations of the assumptions that some information is X for S is a function m: T M ! ½0, 1 such that
egy
mðt, sÞ ¼ 1 for all t. m(t, s) is the probability
verifiable. Section “Communication About Inten- sM
tions” briefly discusses the possibility of signaling that sender-type t sends the signal s. A behavior
intentions rather than private information. strategyXfor R is a function a: M A ! ½0, 1
Section “Applications” describes some applications where aðs, aÞ ¼ 1 for all s. a(s, a) is the prob-
and extensions of the basic model. Section “Future aA
Directions” speculates on directions for future ability that R takes action a following the signal s.
research.
Proposition 1 Behavior strategies (a*, m*) form
a Nash equilibrium if and only if for all t T
The Model
X
mðt, sÞ > 0 implies U S ðt, s, aÞaðs, aÞ
This section describes the basic signaling game. aA
There are two players, called S (for sender) and X
R (for receiver). S knows the value of some ran- ¼ max
0
U S ðt, s0 , aÞaðs0 , aÞ ð1Þ
s S
aA
dom variable t whose support is a given set T. t is
called the type of S. The prior beliefs of R are X
and for each s M such that mðt, sÞpðtÞ > 0,
given by a probability distribution pðÞ over T; tT
these beliefs are common knowledge. When T is
finite, p(t) is the prior probability that the sender’s X
aðs, aÞ > 0 implies U R ðt, s, aÞbðt, aÞ
type is t. When T is uncountably infinite, pðÞ is a tT
density function. Player S learns t and sends to R a X
signal s drawn from some set M. Player R receives ¼ max
0
U R ðt, s, a0 Þbðt, a0 Þ; ð2Þ
a A
tT
this signal and then takes an action a drawn from a
set A. (It is possible to allow A to depend on s and
where
M to depend on t.) This ends the game: The payoff
to i is given by a function ui : T M A ! ℝ. mðt, sÞpðtÞ
bðt, sÞ ¼ X : ð3Þ
This canonical game captures the essential fea-
t0 T
mðt0 , sÞpðt0 Þ
tures of the classic applications of market signaling.
In the labor market signaling story due to Spence
(1974), a worker wishes to signal his ability to a Condition (1) states that the S places positive
potential employer. The worker has information probability only on signals that maximize
about ability that the employer lacks. Direct expected utility. This condition guarantees that
254 Signaling Games
S responds optimally to R’s strategy. Condition (2) difficult to construct pooling equilibria for the
states that R places positive probability only on basic signaling game. Take the labor market model
actions that maximize expected utility, where the and assume S sends the message s* with probability
expectation is taken with respect to the distribution one and that the receiver responds to s* with his best
bð, sÞ following the signal s. Condition (3) states response to the prior distribution and to all other
that bð, sÞ accurately reflects the pattern of play. It messages with the best response to the belief that t is
requires that R’s beliefs be determined using S’s the least skilled agent. Provided that the least skilled
strategy and the prior distribution whenever possi- agent prefers to send s* to sending the cheapest
ble. Equilibrium refinements also require that R has alternative signal, this is a Nash equilibrium.
beliefs following signals s that satisfy
X
mðt, sÞpðtÞ ¼ 0; ð4Þ The Basic Model
tT
The separating equilibrium is a benchmark out-
that is, those signals that are sent with probability come for signaling games. When a separating equi-
zero in equilibrium. Specifically, sequential equi- librium exists, then it is possible for the sender to
librium permits bð, sÞ to be an arbitrary distribu- share her information fully with the receiver in
tion when Eq. 4 holds, but requires that Eq. 2 spite of having a potential conflict of interest.
holds even for these values of s. This restriction Existence of separating equilibria typically
rules out equilibria in which certain signals are not requires a systematic relationship between types
sent because the receiver responds to the signal and signals. An appropriate condition, commonly
with an action that is dominated. referred to as the single-crossing condition, plays
The ability to signal creates the possibility that a prominent role in signaling games and in models
R will be able to draw inferences about S’s type of asymmetric information more generally.
from the signal. Whether he is able to do so is a In this section I limit attention to a special class
property of the equilibrium. It is useful to define of signaling game in which there is a monotonic
two extreme cases. relationship between types and signals. In these
models, separating equilibria typically exist.
Definition 1 An equilibrium (a*, m*) is called a I begin by stating the assumption in the environ-
separating equilibrium if each type t sends differ- ment most commonly seen in applications. Assume
ent signals. That is, M can that the sets T, M, and A are all real intervals.
Xbe partitioned into sets
Mt such that for each t, mðt, sÞ ¼ 1. An equi-
s Mt Definition 2 US ðÞ satisfies the single-crossing con-
librium (a*, m*) is called a pooling equilibrium if dition if US ðt, s, aÞ U S ðt, s0 , a0 Þ for s0 > s implies
there is a single signal s * that is sent by all types that U S ðt0 , s, aÞ < U S ðt0 , s0 , a0 Þ for all t0 > t.
with probability one.
In a typical application, U S ðÞ is strictly
In a separating equilibrium, R can infer S’s pri- decreasing in its second argument (the signal)
vate information completely. In a pooling equilib- and increasing in its third argument (R’s response)
rium, R learns nothing from the sender’s signal. This for all types. Consequently indifference curves are
definition excludes other possible situations. For well defined in M A for all t. The single-
example, all sender types can randomize uniformly crossing condition states that indifference curves
over a set of two or more signals. In this case, the of different sender types cross once. If a lower
receiver will be able to draw no inference beyond type is indifferent between two signal-action
the prior from a signal received in equilibrium. More pairs, then a higher type strictly prefers to send
interesting is the possibility that the equilibrium will the higher signal. In this way, the single-crossing
be partially revealing, with some, but not all of the condition guarantees that higher types send
sender types sending common signa. It is not weakly higher signals in equilibrium.
Signaling Games 255
Note two generalizations of Definition 2. First, 4. U S ðÞ is strictly increasing in action and strictly
the assumption that the domain of US ðÞ is the decreasing in signal.
product of intervals can be replaced by the 5. The single-crossing property holds.
assumption that these sets are partially ordered. 6. The receiver’s best-response function is uniquely
In this case, weak and strict order replaces the defined, independent of the signal, and strictly
weak and strict inequalities comparing types and increasing in t so that it can be written BR(t).
actions in the statement of the definition. Second, 7. There exists sM such that
it is sometimes necessary to extend the definition U S ðK, s, BRðK ÞÞ < U S K, s0 , BRð0Þ .
to mixed strategies. In this case, the ordering of
A induces a partial ordering of distributions of Conditions (1) and (2) simplify exposition, but
A through first-order stochastic dominance. otherwise are not necessary. It is important that T,
When one thinks of the single-crossing condi- M, and A be partially ordered so that some kind of
tion geometrically, it is apparent that it implies a single-crossing condition applies. Conditions (4–
ranking of the slopes of the indifference curves of 6) impose a monotone structure on the problem so
the sender. Suppose that U S ðÞ is smooth, strictly that higher types are more able to send high signals
increasing in actions and strictly decreasing in and that higher types induce higher (and uniformly
signals so that indifference curves are well defined more attractive) actions. These conditions imply
for each t. For fixed t, an indifference
curve is a set that in equilibrium, higher types will necessarily
of the form ðs, aÞ: US ðt, s, aÞ c for some con- send weakly higher signals. Condition (7) is a
stant c. Denote the function by ā(s; t), so that boundary condition that makes sending high sig-
U S ðt, s, aðs; tÞÞc. It follows that the slope of the nals unattractive. It states that the highest type of
indifference curve of a type t sender is sender would prefer to be treated like the lowest
type rather than use the signal s. These properties
U S2 ðt, s, aÞ hold in many standard applications. Condition (6)
a1 ðs; tÞ ¼ ; ð5Þ
U S3 ðt, s, aÞ would be satisfied if U R ðt, s, aÞ ¼ ða tÞ2 .
inductively produces a signaling strategy for the are many pooling equilibrium outcomes. One can
sender
and a response rule for the receiver defined construct a potential pooling outcome by assum-
on s0 , . . . , sK . When BRðÞ is strictly increas- ing that all sender types send the same signal and
ing, the single-crossing condition implies that the that the receiver best responds to this common
signaling strategy is strictly increasing. To com- signal and responds to all other signals with the
plete the description of strategies, assume that the least attractive action. Under the standard mono-
receiver takes the action BR(k) in response to tonicity assumptions, this strategy profile will be
signals in the interval ½sk , skþ1 Þ, BR(0) for s < s0 , an equilibrium if the lowest sender type prefers
and BR(K) for s > sK . By the definition of the pooling to sending the cheapest available out-of-
best-response function, the receiver is best equilibrium message. Section “Separating Equi-
responding to the sender’s strategy. When the librium” ended with the construction of a separat-
boundary condition fails, a separating equilibrium ing equilibrium. There are also typically many
need not exist, but when M is compact, one can separating equilibrium outcomes. Assume that
follow the construction above to obtain an equi- types t ¼ 0, . . . , r 1 send signals st , type
librium in which the lowest types separate and r sends e st for
sr > sr , and subsequent signals e
higher types pool at the maximum signal in t > r solve
M (see Cho and Sobel 1990 for details).
In the construction, the equilibrium involves max U S ðt, s, BRðtÞÞ, subject to U S ðt 1, s, BRðtÞÞ
s
inefficient levels of signaling. When U S ðÞ is
decreasing in the signal, all but the lowest type U ðt 1, e
st1 , BRðt 1ÞÞ:
of sender must make a wasteful expenditure in the
signal in order to avoid being treated as having a In both of these cases, the multiplicity is typically
lower quality. The result that expenditures on profound, with a continuum of distinct equilib-
signals are greater than the levels optimal in a rium outcomes (when M is an interval). The mul-
full-information model continue to hold when tiplicity of equilibria means that, without
U S ðÞ is not monotonic in the signal. The sender refinement, equilibrium theory provides few
inevitably does no better in a separating equilib- clear predictions beyond the observation that the
rium than she would do if R had full information lowest type of sender receives at least U* (0), the
about t. Indeed, all but the lowest type will do payoff it would receive under complete informa-
strictly worse in standard signaling games. On the tion, and the fact that the equilibrium signaling
other hand, the equilibrium constructed above has function is weakly increasing in the sender’s type.
a constrained efficiency property: Of all separat- The first property is a consequence of the mono-
ing equilibria, it is Pareto dominant from the tonicity of S’s payoff in a and of R’s best response
standpoint of S. To confirm this claim, argue function. The second property is a consequence of
inductively that in any separating equilibrium, if the single-crossing condition.
tj sends the signal sj, then s j sj , with equality This section describes techniques that refine the
only if all types i < j send si with probability one. set of equilibria. Refinement arguments that guar-
Mailath (1987) provides a similar construction antee existence and select unique outcomes for
when T is a real interval. In this case, the Spence- standard signaling games rely on the Kohlberg-
Mirrlees formulation of the single-crossing con- Mertens (1986) notion of strategic stability. The
dition plays an important role, and the equilibrium complete theory of strategic stability is only avail-
is a solution to a differential equation. able for finite games. Consequently the literature
applies weaker versions of strategic stability that
are defined more easily for large games. Banks and
Multiple Equilibria and Selection Sobel (1987), Cho and Kreps (1987), and Cho and
Section “Equilibrium” ended with the construc- Sobel (1990) introduce these arguments.
tion of a pooling equilibrium. A careful reconsid- Multiple equilibria arise in signaling games
eration of the argument reveals that there typically because Nash equilibrium does not constrain the
Signaling Games 257
receiver’s response to signals sent with zero prob- Proposition 3 The standard signaling game has a
ability in equilibrium. Specifying that R’s response unique separating equilibrium outcome that sat-
to these unsent signals is unattractive leads to the isfies condition D1.
largest set of equilibrium outcomes. (In standard
signaling games, S’s preferences over actions do In standard signaling games, the only equilib-
not depend on type, so the least attractive action is rium outcome that satisfies condition D1 is the
well defined). The equilibrium set shrinks if one separating outcome described in the previous sec-
restricts the meaning of unsent signals. An effec- tion. Details of the argument appear in Cho and
tive restriction is condition D1, introduced in Cho Sobel. The argument relies on two insights. First,
and Kreps (1987). This condition is less restrictive types cannot be pooled in equilibrium because
than the notion of universal divinity introduced by slightly higher signals will be interpreted as com-
Banks and Sobel (1987), which in finite games is ing from the highest type in the pool. Second, in
less restrictive than Kohlberg and Mertens’s notion any separating equilibrium in which a sender type
of strategic stability. fails to solve Step 2, deviation to a slightly lower
Given an equilibrium (a*, m*), let U* (t) be the signal will not lower R’s beliefs.
equilibrium expected payoff of a type t sender and The refinement argument is powerful and the
let Dðt, sÞ ¼ fa : uðt, s, aÞ U ðtÞg be the set of separating outcome selected receives prominent
pure-strategy responses to s that lead to payoffs at attention in the literature. It is worth pointing out
least as great as the equilibrium payoff for player t. that the outcome has one unreasonable property.
Given a collection of sets, fXðtÞgt T , XðtÞ is The separating outcome described above depends
maximal if it is not a proper subset of any X(t). only on the support of types and not on the details
of the distribution. Further, all types but the lowest
Definition 3 Behavior strategies (a*, m*) type must make inefficient (compared to the full-
together with beliefs b* satisfy D1 if for any information case) investments in signal in order to
unsent message s, b ð, sÞ is supported on those distinguish themselves from lower types. The effi-
t for which D(t, s) is maximal. cient separating equilibrium for a sequence of
games in which the probability of the lowest type
In standard signaling games, D(t, s) is an inter- converges to zero does not converge to the sepa-
val: all actions greater than or equal to a particular rating equilibrium of the game in which the prob-
action will be attractive relative to the equilibrium. ability of the lowest type is zero. In the special case
Hence, these sets are nested. If D(t, s) is not of only two types, the (efficient) pooling outcome
maximal, then there is another type t0 that is may be a more plausible outcome when the prob-
“more likely to deviate” in the sense that there ability of the lower type shrinks to zero. Grossman
exists out-of-equilibrium responses that are attrac- and Perry (1987) and Mailath et al. (1993) intro-
tive to t0 but not t. Condition D1 requires that the duce equilibrium refinements that select the
receiver place no weight on type t making a devi- pooling equilibrium in this setting. These concepts
ation in this case. Notice if D(t, s) is empty for all t, share many of the same motivations of the refine-
then D1 does not restrict beliefs given s (and any ments introduced by Banks and Sobel and Cho and
choice of action will support the putative equilib- Kreps. They are qualitatively different from the
rium). Condition D1 is strong. One can imagine intuitive criterion, divinity, and condition D1,
weaker restrictions. The intuitive condition (Cho because they are not based on dominance argu-
and Kreps 1987) requires that b ðt, sÞ ¼ 0 when ments and lack general existence properties.
Dðt, sÞ ¼ ∅ and at least one other D(t0 , s) is
nonempty. Divinity (Banks and Sobel 1987)
requires that if D(t, s) is strictly contained in Cheap Talk
D(t0 , s), then b ðt0 , sÞ=b ðt, sÞ pðt0 Þ=pðtÞ, so
that the relative probability of the types more Models in which preferences satisfy the single-
likely to deviate increases. crossing property are central in the literature, but
258 Signaling Games
the assumption is not appropriate in some inter- introduced a similar game in an article circulated
esting settings. This section describes an extreme in 1981.) In this entry, A and T are the unit interval
case in which there is no direct cost of signaling. and M can be taken to be the unit interval without
In general, a cheap-talk model is a signaling loss of generality. The sender’s private information
model in which Ui(t, s, a) is independent of s for or type, t, is drawn from a differentiable probability
all (t, a). Two facts about this model are immediate. distribution function, FðÞ , with density f ðÞ ,
First, if equilibrium exists, then there always exists supported on [0, 1]. S and R have twice continuously
an equilibrium in which no information is commu- differentiable von Neumann-Morgenstern utility
nicated. To construct this “babbling” equilibrium, functions Ui(t, a) that are strictly concave in a and
assume that b(t, s) is equal to the prior independent have a strictly positive mixed partial derivative. For
of the signal s. R’s best response will be to take an i ¼ R, S, ai ðtÞ denotes the unique solution to
action that is optimal conditional only on his prior max Ui ðt, aÞ and further assume that aS ðtÞ >
a
information. Hence, R’s action can be taken to be aR ðtÞ for all t. (The assumptions on U i ðÞ guarantee
constant. In this case, it is also a best response for that U i ðÞ is well defined and strictly increasing).
S to send a signal that is independent of type, which In this model, the interests of the sender and
makes b(t, s) the appropriate beliefs. Hence, even if receiver are partially aligned because both would
the interests of S and R are identical, so that there are like to take a higher action with a higher t. The
strong incentives to communicate, there is a possi- interests are different because S would always like
bility of complete communication breakdown. the action to be a bit higher than R’s ideal action. In
Second, it is clear that nontrivial communica- a typical application, t represents the ideal action
tion requires that different types of S have differ- for R, such as the appropriate expenditure on a
ent preferences over R’s actions. If it is the case public project. Both R and S want actual expendi-
that whenever some type t prefers action a to ture to be close to the target value, but S has a bias
action a0 , then so do all other types, then (ruling in favor of additional expenditure.
out indifference1) it must be the case that in equi- For 0 t0 < t00 1, let ā(t0 , t00 ) be the unique
librium the receiver takes only one action with ð t00
positive probability. To see this, note that other- solution to max U R ða, tÞdFðtÞ. By convention,
a
t0
wise one type of sender is not selecting a best
aðt, tÞ ¼ aR ðtÞ.
response. The second observation shows that
Without loss of generality, limit attention to
cheap talk is not effective in games, like the stan-
pure-strategy equilibria. The concavity assump-
dard labor-market story, in which the sender’s
tion guarantees that R’s best responses will be
preferences are monotonic in the action of the
unique, so R will not randomize in equilibrium.
receiver. With cheap communication, the poten-
An equilibrium with strategies (m*, a*) induces
tial employee in the labor market will always
action a if ft : a ðm ðtÞÞ ¼ ag has positive
select a signal that leads to the highest possible
prior probability. Crawford and Sobel (1982)
wage and consequently, in equilibrium, all types
characterize equilibrium outcomes.
of workers will receive the same wage.
U S ðti , aðti , tiþ1 ÞÞ US ðti , aðti1 , ti ÞÞ ¼ 0; ð6Þ R interprets m0 the same way as m and sender
types previously sending m randomize equally
mðtÞ ¼ mi f or t ðti1 , ti ; ð7Þ between m and m0 .
In the basic model messages take on meaning
and only through their use in an equilibrium. Unlike
natural language, they have no external meaning.
aðmi Þ ¼ aðti1 , ti Þ: ð8Þ There have been several attempts to formalize the
notion that messages have meanings that, if con-
Furthermore, essentially all equilibrium out- sistent with strategic aspects of the interaction,
comes can be described in this way. should be their interpretation inside the game.
In an equilibrium, adjacent types pool together The first formulation of this idea is due to
and send a common message. Condition (6) states Farrell (1993).
that sender types on the boundary of a partition
element are indifferent between pooling with Definition 4 Given an equilibrium with sender
types immediately below or immediately above. expected payoffs U ðÞ, the subset G
T isself-
Condition (7) states that types in a common ele- signaling if G ¼ t : US ðt, BRðGÞÞ > u ðtÞ .
ment of the partition send the same message.
Condition (8) states that R best responds to the That is, G is self-signaling if precisely the types
information in S’s message. in G gain by making a statement that induces the
Crawford and Sobel make another monotonic- action that is a best response to the information
ity assumption, which they call condition (M). that t G. (When BR(t) is not single valued, it is
(M) is satisfied in leading examples and implies necessary to refine the definition somewhat and
that there is a unique equilibrium partition for permit the possibility that U S ðt, BRðGÞÞ ¼ U ðtÞ
each N ¼ 1, . . . , N , the ex ante equilibrium for some t. See Mathews et al. (1991).) Farrell
expected utility for both S and R is increasing in argues that the existence of a self-signaling set
N, and N* increases if the preferences of S and would destroy an equilibrium. If a subset G had
R become more aligned. These conclusions pro- available message that meant “my type is in G,”
vide justification for the view that with fixed pref- then relative to the equilibrium R could infer that
erences “more” communication (in the sense of if he were to interpret the message literally, then it
more actions induced) is better for both players would be sent only by those types in G (and hence,
and that the closer the interests of the players are, the literal meaning would be accurate). With this
the greater the possibilities for communication. motivation, Farrell proposes a refinement.
As in the case of models with costly signaling,
there are multiple equilibria in the cheap-talk Definition 5 An equilibrium is neologism proof
model. The multiplicity is qualitatively different. if there exist no self-signaling sets relative to the
Costly signaling models have a continuum of equilibrium.
Nash equilibrium outcomes. Cheap-talk models
have only finitely many. Refinements that impose Rabin (1990) argues convincingly that
restrictions on off-the-equilibrium path beliefs Farrell’s definition rules out too many equilibrium
work well to identify a single outcome in costly outcomes. Indeed, for leading examples of the
signaling models. These refinements have no cut- basic cheap-talk game, there are no neologism-
ting power in cheap-talk models because any proof equilibria. Specifically, in the Crawford-
equilibrium distribution on type-action pairs can Sobel model in which S has a bias toward higher
arise from signaling strategies in which all mes- actions, there exist self-signaling sets of the form
sages are sent with positive probability. To prove [t, 1]. On the other hand, Chen et al. (2007) dem-
this claim, observe that if message m0 is unused in onstrate that the N *-step equilibrium always sat-
equilibrium, while message m is used, then one isfies the no incentive to separate (NITS)
can construct a new equilibrium in which condition,
260 Signaling Games
US ð0, a ðm ð0ÞÞÞ U S 0, aR ð0Þ ; ð9Þ Communication is nontrivial if some sender
type strictly prefers to induce one equilibrium
and that under condition (M) this is the only action over another. Nontrivial communication
equilibrium that satisfies condition (9). typically requires that different types have differ-
NITS states that the lowest type of sender prefers ent preferences over outcomes. In standard signal-
her equilibrium payoff to the payoff she would ing models, the heterogeneity arises because
receive if the receiver knew her type (and responded different sender types have different costs of send-
optimally). Kartik (2009) introduced and named this ing messages. In cheap-talk models with one-
condition. The NITS condition can be shown to rule dimensional actions, the heterogeneity arises if
out equilibria that admit self-signaling sets of the different sender types have different ideal actions.
form [0, t]. Chen (2011) and Kartik (2009) show that With multidimensional actions, heterogeneity
the condition holds in the limits of perturbed ver- could come simply from different sender types
sions of the basic cheap-talk game. having different preferences over the relative
Condition 9 holds automatically in any perfect importance of the different issues. Another simple
Bayesian equilibrium of the standard signaling variation is to assume the existence of more than
model. This follows because when R’s actions one sender. In the two-sender game, nature picks
are monotonic in type and S’s preferences are t as before, both senders learn t and simulta-
monotonic in action, the worst outcome for S is neously send a message to the receiver, who
to be viewed as the lowest type. This observation makes a decision based on the two messages.
would not be true in Nash equilibrium, where it is The second sender has preferences that depend
possible for R to respond to an out-of-equilibrium on type and the receiver’s action, but not directly
message with an action a < BRð0Þ. on the message sent. In this environment, assume
that M ¼ T, so that the set of available messages is
equal to the type space (this is essentially without
Variations on Cheap Talk loss of generality). One can look for equilibria in
In standard signaling models, there is typically an which the senders report honestly. Denote by a *
equilibrium that is fully revealing. This is not the (t, t0 ) R’s response to the pair of messages (t, t0 ). If
case in the basic cheap-talk model. This leads to an equilibrium in which both senders report hon-
the question of whether it is possible to obtain estly exists, then R’s response to identical mes-
more revelation in different environments. sages is a ðt, tÞ ¼ aR ðtÞ. It must be the case that
One idea is to consider the possibility of sig- there exists a specification of a * (t, t0 ) for t 6¼ t0
naling over many dimensions. Chakraborty and such that for all i ¼ 1 and 2 and t 6¼ t0 ,
Harbaugh (2007) consider a model in which T and
A are multidimensional. A special case of their U Si ðt, a ðt, tÞÞ U Si ðt, a ðt, t0 ÞÞ: ð10Þ
model is one in which the components of T are
independent draws from the same distribution and It is possible to satisfy condition (10) if the biases
A involves taking a real-valued action for each of the senders are small relative to the set of
component of T. If preferences are additively sep- possible best responses. Krishna and Morgan
arable across types and actions, Chakraborty and (2001a) study a one-dimensional model of infor-
Harbaugh provide conditions under which cate- mation transmission with two informed players.
gorical information transmission, in which Ambrus and Takahashi (2008) and Battaglini
S transmits the order of the components of T, is (2002) provide conditions under which full reve-
credible in equilibrium even when it would not be lation is possible when there are two informed
possible to transmit information if the dimensions players and possibly multiple dimensions of
were treated in isolation. It may be credible for information.
S to say “t1 > t2 ,” even if she could not credibly In many circumstances, enriching the commu-
provide information about the absolute value of nication structure either by allowing more rounds
either component of t. of communication (Aumann and Hart 2003;
Signaling Games 261
Forges 1990), mediation (Ben-Porath 2003), or Condition 11 requires that beliefs place posi-
exogenous uncertainty (Blume et al. 2007) tive probability only on types capable of sending
enlarges the set of equilibrium outcomes. the message “my type is an element of C.”
Aumann (1990) argues that one cannot rely on fully reveal quality. As in all costly signaling
pre-play communication to select a Pareto- models, it is not important that there be a direct
efficient equilibrium. He considers a simple two- relationship between quality and signal, it is only
player game with Pareto-ranked equilibria and necessary that firms with higher quality have
argues that no “cheap” pre-play signal would be lower marginal costs of advertising. Hence, sim-
credible. ply “burning money” or sending a signal that
Ben-Porath and Dekel (1992) show that adding lowers utility by an amount independent of quality
a stage of “money burning” (a signal that reduces and response can be informative. The consumer
all future payoffs by the same amount) when may obtain full information in equilibrium, but
combined with an equilibrium refinement can someone must pay the cost of advertising. There
select equilibria in a complete information game. are other situations where it is natural for the
Although no money is burned in the selected signal to be linked to the quality of the item.
equilibrium outcome, the potential to send costly Models of verifiable information are appropriate
signals creates dominance relationships that lead in this case. When the assumptions of Proposition
to a selection. 5 hold, one would expect consumers to obtain all
Vida (2006) synthesizes a literature that com- relevant information through disclosures without
pares the set of equilibrium outcomes available wasteful expenditures on signaling. Finally, cheap
when communication possibilities are added to a talk plays a role in some markets. One would
game to the theoretically larger set of equilibrium expect costless communication to be informative
outcomes if there is a reliable mediator available in environments where heterogeneous consumers
to collect information and recommend actions to would like to identify the best product. Cheap talk
the players. can create more efficient matching of product to
consumer. Here communication is free although
in leading models separating equilibria do not
Applications exist.
existence and duration of strikes (Fudenberg and this behavior comes from a model in which inves-
Tirole 1983; Sobel and Takahashi 1983). If a firm tors have imperfect information about the future
with private information about its profitability profitability of the firm and profitable firms are
makes a take-it-or-leave-it offer to a union, then more able than less profitable firms to distribute
the strategic interaction is a simple signaling profits in the form of dividends (see Bhattacharya
model in which the magnitude of the offer may 1979).
serve as a signal of the firm’s profitability. Firms
with low profits are better able to make low wage Reputation
offers to the union because the threat of a strike is Dynamic models of incomplete information cre-
less costly to a firm with low profits than one with ate the opportunity for the receiver to draw infer-
high profits. Consequently settlement offers may ences about the sender’s private information while
reveal information. Natural extensions of this engaging in an extended interaction. Kreps and
model permit counter offers. The variation of the Wilson (1982) and Milgrom and Roberts (1982b)
model in which the uninformed agent makes provided the original treatments of reputation for-
offers and the uninformed agent accepts and mation in games of incomplete information. Moti-
rejects is formally almost identical to the canoni- vated by the limit pricing, their models examined
cal model of price discrimination by a durable- the interaction of a single long-lived incumbent
goods monopolist (Ausubel and Deneckere 1989; facing a sequence of potential entrants. The
Gul et al. 1986). entrants lack information about the willingness
of the incumbent to tolerate entry. Pricing deci-
Finance sions of the incumbent provide information to the
Simple signaling arguments provide potential entrants about the profitability of the market.
explanations for firms’ choices of financial struc- In these models, signals have implications for
ture. Classic arguments due to Modigliani and both current and future utility. The current cost is
Miller (1958) and imply that firms’ profitability determined by the effect the signal has on current
should not depend on their choice of capital struc- payoffs. In Kreps-Wilson and Milgrom-Roberts,
ture. Hence, this theory cannot organize empirical this cost is the decrease in current profits associ-
regularities about firm’s capital structure. The ated with charging a low price. In other models
Modigliani-Miller theorem assumes that the (e.g., Morris 2001; Sobel 1985) the actual signal is
firm’s managers, shareholders, and potential costless, but it has immediate payoff implications
shareholders all have access to the same informa- because of the response it induces. Signals also
tion. An enormous literature assumes instead that have implications for future utility because infer-
the firm’s managers have superior information ences about the sender’s private information will
and use corporate structure to signal profitability. influence the behavior of the opponents in future
Leland and Pyle (1977) assume that insiders periods. Adding concern for reputation to a sig-
are risk averse so they would prefer to diversify naling game will influence behavior, but whether
their personal holdings rather than maintain large it leads to more or less informative signaling
investments in their firm. The greater the value of depends on the application.
diversification, the lower the quality of the firm.
Hence, when insiders have superior information Signaling in Biology
than investors, there will be an incentive for the Signaling is important in biology. In independent
insiders of highly profitable firms to maintain and almost contemporaneous work, Zahavi
inefficiently large investments in their firm in (1975) proposed a signaling model that shared
order to signal profitability to investors. the essential features of Spence’s (1974) model
Dividends are taxed twice under the US tax of labor-market signaling. Zahavi observed that
code, which raises the question of why firms there are many examples in nature of animals’
would issue dividends when capital gains are apparently excessive physical displays. It takes
taxed at a lower rate. A potential explanation for energy to produce colorful plumage, large antlers,
264 Signaling Games
or loud cries. Having a large tail may actually While most of the literature on signaling in
make it harder for peacocks to flea predators. If a biology focuses on the use of costly signals,
baby bird makes a loud sound to get his mother’s there are also situations in which cheap talk is
attention, he may attract a dangerous predator. effective. A leading example is the “Sir Philip
Zahavi argued that costly signals could play a Sidney Game,” originally developed by John
role in sexual selection. In Zahavi’s basic model, Maynard Smith (1991) to illustrate the value of
the sender is a male and the receiver is a female of costly communication between a mother and
the same species. Females who are able to mate child. The child has private information about its
with healthier males are more likely to have stron- level of hunger, and the mother must decide to
ger children, but often the quality of a potential feed the child or keep the food for itself. Since the
mate cannot be observed directly. Zahavi argued players are related, survival of one positively
that if healthier males could produce visible dis- influences the fitness of the other. This creates a
plays more cheaply than less healthy males, then common interest needed for cheap-talk communi-
females would be induced to use the signals when cation. There are two ways to model communica-
deciding upon a mate. Displays may impose costs tion in this environment. The first is to assume that
that “handicap” a signaler, but displays would signaling is costly, with hungrier babies better
persist when additional reproductive success com- able to communicate their hunger. This could be
pensates for their costs. Zahavi identifies a single- because the sound of a hungry baby is hard for
crossing condition as a necessary condition for the sated babies to imitate, or it could be that crying
existence of costly signals. for food increases the risk of predation and that
The development of signaling in biology paral- this risk is relatively more dangerous to well-fed
lels that in economics, but there are important chicks than to starving ones (because the starving
differences. Biology replaces the assumption of chicks have nothing to lose). This game has mul-
utility maximization and equilibrium with fitness tiple equilibria in which signals fully reveal the
maximization and evolutionary stability. That is, state of the baby over a range of values (see
their models do not assume that animals con- Lachmann and Bergstrom 1998; Maynard Smith
sciously select their signal to maximize a payoff. 1991). These papers look at a model in which both
Instead, the biological models assume that the pro- mother and child have private information. Alter-
cess of natural selection will lead to strategy pro- natively, Bergstrom and Lachmann (1998) study a
files in which mutant behavior has lower cheap-talk version of the game. Here there may be
reproductive fitness than equilibrium behavior. an equilibrium outcome in which the baby bird
This notion leads to static and dynamic solution credibly signals whether or not he is hungry.
concepts similar to Nash equilibrium and its refine- Those who signal hunger get fed. The others do
ments. Fitness in biological models depends on not. Well-fed baby birds may wish to signal that
contributions from both parents. Consequently, a they are not hungry in order to permit the mother
full treatment of signaling must take into account to keep food for herself. Such an equilibrium
population genetics. Grafen (1990b) discusses exists if the fraction of genes that mother and
these issues, and Grafen (1990a) and Siller (1998) child share is large and the baby is already
provide further theoretical development of the well fed.
handicap theory. Finally, one must be careful in
interpreting heterogeneous quality in biological
models. Natural selection should operate to elimi- Political Science
nate the least fit genes in a population. To the extent Signaling games have played an important role in
that this arises, there is pressure for quality varia- formal models of political science. Banks (1991)
tion within a population to decrease over time. The reviews models of agenda control, political rhe-
existence of unobserved quality variations needed toric, voting, and electoral competition. Several
for signaling may be the result of relatively small important models in this area are formally inter-
variations about a population norm. esting because they violate the standard
Signaling Games 265
assumptions frequently satisfied in economic his ideal point is high, P makes a compromise
models. I describe two such models in this offer that is sometimes accepted and sometimes
subsection. rejected.
Banks (1990) studies a model of agenda setting
in which the informed sender proposes a policy to
a receiver (decision-maker), who can either accept Future Directions
or reject the proposal. If the proposal is accepted,
it becomes the outcome. If not, then the outcome The most exciting developments in signaling
is a fallback policy. The fallback policy is known games in the future are likely to come from inter-
only to the sender. In this environment, the action between economics and other disciplines.
sender’s strategy may convey information to the Over the last 10 years the influence of behav-
decision-maker. Signaling is costly, but, because ioral economists has led the profession to rethink
the receiver’s set of actions is binary, fully reveal- many of its fundamental models. An explosion of
ing equilibria need not exist. Refinements limit the experimental studies has already influenced the
set of predictions in this model to a class of out- interpretation of signaling models and has led to
comes in which only one proposal is accepted in a reexamination of basic assumptions. There is
equilibrium (and that this proposal is accepted evidence that economic actors lack the strategic
with probability one), but there are typically a sophistication assumed in equilibrium models.
continuum of possible equilibrium outcomes. Further, economic agents may be motivated by
Matthews (1989) develops a cheap-talk model more than their material well-being. Existing
of veto threats. There are two players, a Chooser experimental evidence provides broad support
(C), who plays the role of receiver, and a Proposer for many of the qualitative predictions of the
(P), who plays the role of sender. The players have theory (Banks et al. 1994; Brandts and Holt
preferences that are represented by single-peaked 1992), but also suggests ways in which the theory
utility functions which depend on the real-valued may be inadequate.
outcome of the game and an ideal point. P’s ideal The driving assumption of signaling models is
point is common knowledge. C’s ideal point is her that when informational asymmetries exist,
private information, drawn from a prior distribu- senders will attempt to lie for strategic advantage
tion that has a smooth positive density on a com- and that sophisticated receivers will discount
pact interval. The game form is simple: C learns statements. These assumptions may be
her type and then sends a cheap-talk signal to P, reconsidered in light of experimental evidence
who responds with a proposal. C then either that some agents will behave honestly in spite of
accepts or rejects the proposal. Accepted pro- strategic incentives to lie. For example, Gneezy
posals become the outcome of the game. If (2005) and Hurkens and Kartik (2009) present
C rejects the proposal, then the outcome is the experimental evidence that some agents are reluc-
status quo point. tant to lie even when there is a financial gain from
As usual in cheap-talk games, this game has a doing so. There is evidence from other disciplines
babbling outcome in which C’s message contains that some agents are unwilling or unable to manip-
no information and P makes a single, take-it-or- ulate information for strategic advantage and that
leave-it offer that is accepted with probability people may be well equipped to detect these
strictly between 0 and 1. Matthews shows there manipulations in ways that are not captured in
may be equilibria in which two outcomes are standard models (see, e.g., Ekman 2001 or Trivers
induced with positive probability (size-two equi- 1971). Experimental evidence and, possibly,
libria), but size n > 2 (perfect Bayesian) equilibria results from neuroscience may demonstrate that
never exist. In a size-two equilibrium, P offers his the standard assumption that some agents cannot
ideal outcome to those types of C whose message manipulate information for their strategic advan-
indicates that their ideal point is low; this offer is tage (or that other agents have ability to see
always accepted in equilibrium. If C indicated that through deception) will inform the development
266 Signaling Games
of novel models of communication that include Aumann RJ (1990) Nash equilibria and not self-enforcing.
behavioral types. Several papers study the impli- In: Gabszewicz JJ, Richard J-F, Wolsey LA (eds) Eco-
nomic decision making: games, econometrics and opti-
cations of including behavioral types into the misation. Elsevier, Amsterdam, pp 201–206
standard paradigm. The reputation models of Aumann R, Hart S (2003) Long cheap talk. Econometrica
Kreps and Wilson (1982) and Milgrom and Rob- 71(6):1619–1660
erts (1982a) are two early examples. Papers on Ausubel LM, Deneckere RJ (1989) Reputation in
bargaining and durable goods monopoly. Econometrica
communication by Chen (2011), Crawford 57(3):511–531
(2003), Kartik (2005), and Olszewski (2004) are Banks JS (1990) Monopoly agenda control and asymmet-
more recent examples. New developments in ric information. Q J Econ 105(2):445–464
behavioral economics will inform future theoreti- Banks JS (1991) Signaling games in political science.
Routledge, Langhorne
cal studies. Banks JS, Sobel J (1987) Equilibrium selection in signal-
There is substantial interest in signaling in ing games. Econometrica 55(3):647–661
philosophy. Indeed, the philosopher David Banks J, Camerer C, Porter D (1994) An experimental
Lewis (2002) (first published in 1969) introduced analysis of nash refinements in signaling games.
Games Econ Behav 6(1):1–31
signaling games prior to the contributions of Battaglini M (2002) Multiple referrals and multidimensional
Spence and Zahavi. Recently linguists have been cheap talk. Econometrica 70(4):1379–1401
paying more attention to game-theoretic ideas. Ben-Porath E (2003) Cheap talk in games with incomplete
Benz et al. (2005) collect recent work that information. J Econ Theory 108:45–71
Ben-Porath E, Dekel E (1992) Signaling future actions and
attempts to formalize ideas from linguistic philos- the potential for sacrifice. J Econ Theory 57:36–51
ophy due to Grice (1991). While there have been a Benz A, Jäger G, Van Rooij R (eds) (2005) Game theory
small number of contributions by economists in and pragmatics. Palgrave MacMillan, Houndmills
this area (Rubinstein 2000 and Sally 2005 are Bergstrom CT, Lachmann M (1998) Signalling among
relatives. III. Talk is cheap. Proc Natl Acad Sci U S A
examples), there is likely to be more active inter- 95:5100–5105
action in the future. Bhattacharya S (1979) Imperfect information, dividend
Finally, future work may connect strategic policy, and the bird in the hand fallacy. Bell J Econ
aspects of communication to the actual structure 10(1):259–270
Blume A (2000) Coordination and learning with a partial
of language. Blume (2000), Cucker et al. 2004), language. J Econ Theory 95:1–36
and Nowak and Krakauer (1999) present dramat- Blume A, Board O, Kawamura K (2007) Noisy talk. Theor
ically different models on how structured commu- Econ 2(4):396–440
nication may result from learning processes. Brandts J, Holt CA (1992) An experimental test of equi-
librium dominance in signaling games. Am Econ Rev
Synthesizing these approaches may lead to funda- 82(5):1350–1365
mental insights on how the ability to send and Chakraborty A, Harbaugh R (2007) Comparative cheap
receive signals develops. talk. J Econ Theory 132(1):70–94
Chakraborty A, Harbaugh R (2010) Persuasion by Cheap
Talk. Am Econ Rev 100(5):2361–2382.
Acknowledgments I thank the Guggenheim Foundation, Chen Y (2011) Perturbed communication games with hon-
NSF, and the Secretaría de Estado de Universidades e est senders and naive receivers. J Econ Theory
Investigación del Ministerio de Educación y Ciencia 146(2):401–424.
(Spain) for financial support and Richard Brady, Kanako Chen Y, Kartik N, Sobel J (2007) On the robustness of
Goulding Hotta, and Jose Penalva for their comments. I am informative cheap talk. Technical report, UCSD.
grateful to the Departament d’Economia i d’Història Econometrica 76(1):117–136.
Econòmica and Institut d’Anàlisi Econòmica of the Cho I-K, Kreps DM (1987) Signaling games and stable
Universitat Autònoma de Barcelona for hospitality and equilibria. Q J Econ 102(2):179–221
administrative support. Cho I-K, Sobel J (1990) Strategic stability and uniqueness
in signaling games. J Econ Theory 50(2):381–413
Crawford VP (2003) Lying for strategic advantage: rational
Bibliography and boundedly rational misrepresentation of intentions.
Am Econ Rev 93(1):133–149
Crawford VP, Sobel J (1982) Strategic information trans-
Primary Literature mission. Econometrica 50(6):1431–1451
Ambrus A, Takahashi S (2008) Multi-sender cheap talk Cucker F, Smale S, Zhou D-X (2004) Modeling language
with restricted state space. Theor Econ 3(1):1–27 evolution. Found Comput Math 4(3):315–343
Signaling Games 267
Edlin AS, Shannon C (1998) Strict single-crossing and the Mailath GJ, Okuno-Fujiwara M, Postlewaite A (1993) On
strict spence-mirrlees condition: a comment on mono- belief based refinements in signaling games. J Econ
tone comparative statics. Econometrica 66(6): Theory 60(2):241–276
1417–1425 Mathews SA, Okuno-Fujiwara M, Postlewaite A (1991)
Ekman P (2001) Telling lies. W.W. Norton, New York Refining cheap-talk equilibria. J Econ Theory
Farrell J (1993) Meaning and credibility in cheap-talk 55(2):247–273
games. Games Econ Behav 5(4):514–531 Matthews SA (1989) Veto threats: rhetoric in a bargaining
Forges F (1990) Equilibria with communication in a job game. Q J Econ 104(2):347–369
market example. Q J Econ 105(2):375–398 Maynard Smith J (1991) Honest signalling: the Philip
Fudenberg D, Tirole J (1983) Sequential bargaining with Sidney game. Anim Behav 42:1034–1035
incomplete information. Rev Econ Stud 50(2): Milgrom PR (1981) Good news and bad news: representa-
221–247 tion theorems. Bell J Econ 21:380–391
Giovannoni F, Seidmann DJ (2007) Secrecy, two-sided Milgrom P, Roberts J (1982a) Limit pricing and entry
bias and the value of evidence. Games Econ Behav under incomplete information: an equilibrium analysis.
59(2):296–315 Econometrica 50(2):443–459
Gneezy U (2005) Deception: the role of consequences. Am Milgrom P, Roberts J (1982b) Predation, reputation, and
Econ Rev 95(1):384–394 entry deterrence. J Econ Theory 27:280–312
Grafen A (1990a) Biological signals as handicaps. J Theor Milgrom PR, Shannon C (1994) Monotone comparative
Biol 144:517–546 statics. Econometrica 62(1):157–180
Grafen A (1990b) Sexual selection unhandicapped by the Modigliani F, Miller MH (1958) The cost of capital, cor-
fisher process. J Theor Biol 144:473–516 poration finance and the theory of investment. Am
Green JR, Stokey NL (2007) A two-person game of Econ Rev 48(3):261–297
information transmission. J Econ Theory Morris S (2001) Political correctness. J Polit Econ
135(1):90–104 109:231–265
Grice HP (1991) Studies in the way of words. Harvard Nowak MA, Krakauer DC (1999) The evolution of lan-
University Press, Cambridge guage. Proc Natl Acad Sci 96(14):8028–8033
Grossman S (1981) The role of warranties and private Olszewski W (2004) Informal communication. J Econ
disclosure about product quality. J Law Econ Theory 117:180–200
24:461–483 Rabin M (1990) Communication between rational agents.
Grossman S, Perry M (1987) Perfect sequential equilibria. J Econ Theory 51:144–170
J Econ Theory 39:97–119 Rabin M, Farrell J (1996) Cheap talk. J Econ Perspect
Gul F, Sonnenchein H, Wilson R (1986) Foundations of 10(3):103–118
dynamic monopoly and the coase conjecture. J Econ Riley JG (2001) Silver signals: twenty-five years of screen-
Theory 39:155–190 ing and signaling. J Econ Lit 39(2):432–478
Hurkens S, Kartik N (2009) Would I Lie to You: On Social Rothschild M, Stiglitz J (1976) Equilibrium in competitive
Preferences and Lying Aversion. Experimental Econ insurance markets: an essay on the economics of imper-
12(2):180–192. fect information. Q J Econ 90(4):629–649
Kartik N (2009) Information transmission with almost- Rubinstein A (2000) Economics and language. Cambridge
cheap talk. Rev Econ Stud 76(4):1359–1395. University Press, New York
Kohlberg E, Mertens J-F (1986) On the strategic stability Sally D (2005) Can I say, bobobo and mean, there’s no such
of equilibria. Econometrica 54(5):1003–1037 thing as cheap talk? J Econ Behav Organ 57(3):245–266
Kreps DM, Sobel J (1994) Signalling. In: Aumann RJ, Hart Seidmann DJ, Winter E (1997) Strategic information trans-
S (eds) Handbook of game theory: with economics mission with verifiable messages. Econometrica
applications, vol 2, Handbooks in economics, No 11. 65(1):163–169
Elsevier, Amsterdam, pp 849–868, chap 25 Siller S (1998) A note on errors in grafen’s strategic hand-
Kreps DM, Wilson R (1982) Reputation and imperfect icap models. J Theor Biol 195:413–417
information. J Econ Theory 27:253–277 Sobel J (1985) A theory of credibility. Rev Econ Stud
Krishna V, Morgan J (2001a) A model of expertise. Q J 52(4):557–573
Econ 116(2):747–775 Sobel J, Takahashi I (1983) A multistage model of
Lachmann M, Bergstrom CT (1998) Signalling among bargaining. Rev Econ Stud 50(3):411–426
relatives. II. Beyond the tower of babel. Theor Popul Spence AM (1974) Market signaling. Harvard University
Biol 54:146–160 Press, Cambridge
Leland HE, Pyle DH (1977) Informational asymmetries, Trivers RL (1971) The evolution of reciprocal altruism.
financial structure, and financial intermediation. Q Rev Biol 46:35–58
J Finance 32(2):371–387 Vida P (2006) Long pre-play communication in games. Ph
Lewis D (2002) Convention: a philosophical study. Black- D thesis, Autonomous University of Barcelona
well, Oxford Wilson C (1977) A model of insurance markets with
Mailath GJ (1987) Incentive compatibility in signaling incomplete information. J Econ Theory 16:167–207
games with a continuum of types. Econometrica Zahavi A (1975) Mate selection- a selection for a handicap.
55(6):1349–1365 J Theor Biol 53:205–214
268 Signaling Games
Books and Reviews Milgrom P, Roberts J (1986) Price and advertising signals
Admati A, Perry M (1987) Strategic delay in bargaining. of product quality. J Polit Econ 94(4):796–821
Rev Econ Stud 54:345–364 Noldeke G, Samuelson L (1997) A dynamic model of
Austen-Smith D (1990) Information transmission in equilibrium selection in signaling markets. J Econ The-
debate. Am J Polit Sci 34(1):124–152 ory 73(1):118–156
Battigalli P, Siniscalchi M (2002) Strong belief and forward Noldeke G, Van Damme E (1990) Signalling in a dynamic
induction reasoning. J Econ Theory 106(2):356–391 labour market. Rev Econ Stud 57(1):1–23
Bernheim BD (1994) A theory of conformity. J Polit Econ Ottaviani M, Sørensen PN (2006) Professional advice.
102(5):841–877 J Econ Theory 126(1):120–142
Blume A, Kim Y-G, Sobel J (1993) Evolutionary stability Rabin M (1994) A model of pre-game communication.
in games of communication. Games Econ Behav J Econ Theory 63(2):370–391
5:547–575 Ramey G (1996) D1 signaling equilibria with multiple
Fudenberg D, Tirole J (1991) Game theory. MIT Press, signals and a continuum of types. J Econ Theory
Cambridge, MA 69(2):508–531
Gibbons R (1992) Game theory for applied economists. Rasmusen EB (2006) Games and information: an introduc-
Princeton University Press, Princeton tion to game theory, 4th edn. Blackwell, New York
Kartik N, Ottaviani M, Squintani F (2007) Credulity, lies, Riley JG (1979) Informational equilibrium. Econometrica
and costly talk. J Econ Theory 134(1):93–116 47:331–359
Krishna V, Morgan J (2001b) A model of expertise. Q J Sobel J, Stole L, Zapater I (1990) Fixed-equilibrium
Econ 116(2):747–775 rationalizability in signaling games. J Econ Theory
Lo P-Y (2006) Common knowledge of language and iter- 52(2):304–331
ative admissibility in a sender-receiver game. Technical Spence AM (1973) Job market signaling. Q J Econ
report. Brown University 90:225–243
Manelli AM (1996) Cheap talk and sequential equilibria in Swinkels JM (1999) Education signalling with preemptive
signaling games. Econometrica 64(4):917–942 offers. Rev Econ Stud 66(4):949–970
noncooperative game is a Nash equilibrium
Inspection Games which is either unique, or which, for some
reason, has been selected among alternative
Rudolf Avenhaus1 and Morton J. Canty2 equilibria.
1
Armed Forces University Munich, Neubiberg, Noncooperative game An n-person noncooper-
Germany ative game in normal or strategic form is a list
2
Institute for Chemistry and Dynamics of the of actions, called pure strategies, for each of
Geosphere, Forschungszentrum Jülich, Jülich, n players, together with a rule for specifying
Germany each player’s payoff (utility) when every
player has chosen a specific action. Each player
seeks to maximize her own payoff.
Article Outline Saddle point A saddle point is a Nash equilib-
rium of a two-person zero-sum game. The
Glossary value of the game is the (unique) equilibrium
Definition payoff to the first player.
Introduction Utility Utilities are sequences of numbers
Selected Inspection Models assigned to the outcomes of any strategy com-
Future Directions bination which mirror the order of preferences
Bibliography of each player and which fulfill the axioms of
von Neumann and Morgenstern.
Glossary Verification Verification is the independent con-
firmation by an inspector of the information
Deterrence In an inspection game, deterrence is reported by an inspectee. It is used most com-
said to be achieved by a Nash equilibrium in monly in the context of arms control and dis-
which the inspectee behaves legally, or in armament agreements.
accordance with the agreed rule. Zero-sum game A zero-sum game is a noncoop-
Extensive form The extensive form of a nonco- erative game in which the payoffs of all players
operative game is a graphical representation sum to zero for any specific combination of
which describes a succession of moves by dif- pure strategies.
ferent players, including chance moves, and
which can handle quite intricate information
patterns. Definition
Inspector leadership Leadership in inspection
games is a strategic concept by which, through Inspection games deal with the problem faced by
persuasive announcement of her strategy, the an inspector who is required to control the com-
inspector can achieve deterrence. pliance of an inspectee to some legal or otherwise
Mixed strategy A mixed strategy for a player in formal undertaking. One of the best examples of
a noncooperative game is a probability distri- an inspector, in the institutional sense, is the Inter-
bution over that player’s pure strategies. national Atomic Energy Agency (IAEA) which,
Nash equilibrium A Nash equilibrium in a non- under a United Nations mandate, verifies the
cooperative game is a specification of strate- activities of all States – inspectees – signatory to
gies for all players with the property that no the Nuclear Weapons Non-proliferation Treaty.
player has an incentive to deviate unilaterally An inspection regime of this kind is a true conflict
from her specified strategy. A solution of a situation, even if the inspectee voluntarily submits
to inspection (in multinational or bilateral treaties seminal for later work it is presented it in some
this is invariably the case), because the raison detail in Subsection “Customs and Smugglers.”
d’être of any control authority must be the A second phase of inspection game develop-
assumption that the inspected party has a real ment started around 1968 in connection with the
incentive to violate its commitments. The primary verification of the Treaty on the Non-Proliferation
objective of the inspector is to deter the inspectee of Nuclear Weapons (NPT). There was no model
from illegal behavior or, barring this, to catch him for this verification system, therefore new princi-
out. It is thus natural that quantitative models of ples and tools had to be developed and analyzed,
inspections should be non-cooperative games see, e.g., (Bierlein 1968, 1969; Höpfinger 1974).
with at least two players, inspector and inspectee Because of its importance for the further develop-
(s). This survey will be limited to just one ment of the whole discipline of inspection games,
inspectee, that is, we shall restrict ourselves to one of the major components of NPT verification
two-person non-cooperative inspection games. measures, namely material accountancy, will be
Inspection games should be distinguished from discussed in Subsection “Diversion of Nuclear
related topics such as quality control or the pre- Material.”
vention of random accidents, for which there are In Economics game-theoretic work on
no adversaries that act strategically, or from Accounting and Auditing was begun in the late
search-and-destroy problems. The salient feature 1960s. A first survey was given by Borch (1990).
of an inspection game is that an inspector tries to Since that time papers and books have been
prevent an inspectee from behaving illegally in published regularly but on a limited scale along
terms of some commitment. The inspectee might, similar lines, placing emphasis on auditing prac-
for example, decide not to violate, so that there is tice, see, e.g., (Cavasoglu and Raghunatahan
nothing to search for. In fact, deterrence is gener- 2004; Cook et al. 1997; Wilks and Zimbelman
ally the inspector’s highest priority. Nevertheless, 2004) provided an updated review of theoretical
a sharp distinction between, e.g., inspection and empirical research. In economic models
games and quality control models cannot be known as principal-agent problems, in which
made in all cases, as we will see in Subsection inspections of economic transactions raise the
“Illegal Production.” question of their most efficient design, game-
theoretic methods have been applied, early sur-
veys having been given by Baiman (1982),
Introduction Kanodia (1985) and Dye (1986).
In the last decade, new models have been devel-
Immediately after von Neumann and oped and analyzed again in the context of recent
Morgenstern’s pioneering book Theory of Games ACD verification developments, in particular the
and Economic Behavior (von Neumann and more stringent requirements on re-negotiated NPT
Morgenstern 1947), Arms Control and Disarma- verification measures (IAEA 1997). Whereas
ment (ACD) inspections may have been analyzed under the previous regime purely technical aspects
game-theoretically as classified military research; like size of the nuclear fuel cycle and accuracy of
this is not known for sure but may be inferred the measurement systems were considered, now
from papers published later. Non-classified work qualitative features such as behavior and intentions
started vigorously in the early 1960s with analyses of States had to be taken into account. This
for the United States Arms Control and Disarma- required the introduction of State-specific utilities,
ment Agency (ACDA). These dealt with very for first analyses of this kind see, e.g., (Avenhaus
general ACD problems, and also with concrete and Canty 1996; Kilgour 1992). Independently of
problems of test ban treaty verification. In that concrete applications there has been an ongoing
context probably the first genuine inspection interest of mathematicians in the refinement and
game in the open literature was the recursive generalization of existing models, a few examples
game developed by Dresher (1962). Since it was will be given below.
Inspection Games 271
Inspections cause conflicts in many real world cooperative game each time he boards a train.
situations. In economics, there are services of He can buy a valid ticket or travel “schwarz,”
many kinds the fulfillment or payment of which risking a fine if he is controlled (checked). In its
has to be verified. For example one is concerned edition of July 18th, 1996, the daily Süddeutsche
with the central problem of principal-agent rela- Zeitung reported the complaint of the Munich City
tionships, where the principal, e.g., an employer, Treasurer to the effect that the deployment of
delegates work or responsibility to the agent, the ticket inspectors by the local transit authority
employee, and chooses a payment schedule that (MMV) was not worthwhile, the collected fines
best exploits the agent’s self-interests. The agent, paying for only about half the cost of the inspec-
of course, behaves so as to maximize her own tors themselves.
utility given the fee schedule proposed by the From the game theorist’s viewpoint there is
principal. obviously an optimum control intensity that
Environmental agreements obviously give rise would alleviate this problem: employing just a
to inspection problems, but these have not yet single inspector would encourage many violations
received as much attention from modelers as one and result in her collecting more than enough fines
might have expected (and as they might deserve). to pay for herself, but would clearly not be in the
To date most methodological analyses of inspec- MVV’s interests. Using an army of inspectors
tion games have been performed in the context of would ensure compliance, but, there now being
arms control and disarmament. no fines at all, would not finance the inspectors.
There exist previous reviews of inspection The optimum must lie between these two
games with objectives somewhat different to extremes.
those of this survey. Avenhaus et al. (1996) restrict
discussion to arms control and disarmament, and Solution
emphasize the historical development. Avenhaus The problem can be formulated as a two-person
et al. (2002) stress the methodological, and in game in normal form involving the MVV as
particular the mathematical aspects. Here we player 1 and the transit passenger as player
undertake a new approach to organize the mate- 2 (Avenhaus 1997). The pure strategies for the
rial, one which gives more credit to the diversity MVV are to control or not to control, whereas
of the applications and the techniques necessary the passenger will decide whether or not to buy a
for their solution. We focus on selected game valid ticket. Let f be the fare, b > f the fine and
theoretic inspection models which, together with e < b the control costs per passenger, all in euros.
their variants and generalizations, we hope will The game’s normal form (also called bimatrix
span the full range of the subject. form) is shown in Fig. 1.
In the figure, the pure strategies of player
1 (control/no control) are depicted as rows and
Selected Inspection Models those of player 2 (legal/illegal) as columns. The
payoffs to player 1 for each pure strategy combi-
In the following, five inspection problems are nation are shown in the lower left hand corners of
chosen to illustrate applications of inspection the corresponding squares, those for player 2 in
games together with their analysis. They are the upper right hand corners. (This simple formu-
complemented with discussion of some of their lation ignores MVVoverhead costs and any mate-
variants and generalizations. In the last illustra- rial gain the passenger may have from his trip.)
tion, the special role of the leadership concept in A solution of the game will be a Nash equilib-
inspection models is emphasized. rium (1951), that is, a pair of strategies, called
equilibrium strategies, with the property that nei-
Passenger Ticket Control ther player has an incentive to deviate unilaterally
A commuter on the Munich subway is, con- from his or her equilibrium strategy. Equivalently,
sciously or unconsciously, involved in a non- the strategies are said to be mutual best replies. In
272 Inspection Games
−f
At equilibrium the passenger behaves illegally
0
no control
with positive probability 1 q* = e/b, that is,
1−p deterrence is not possible. We will return to this
f 0 issue in Subsection “Sharing Common Pool
Resources.” Nevertheless on average he enjoys
the same payoff as he would receive by paying
Inspection Games, Fig. 1 Normal form of the two- his fare every time, namely f.
person game between MVV (player 1) and passenger
(player 2). The arrows indicate the preference directions
for the two players, the horizontal arrows for player 2, the Remarks
vertical arrows for player 1 The average control expenditure for the MVV is
ep, whereas the mean profit from collection of
fines is bp(1 q). The difference is (e b
the Figure the preference directions, i.e., the devi-
(1 q))p If the passenger plays his equilibrium
ation incentives, are seen to be cyclical. This strategy as given by Eq. (3), then
means that there can be no Nash equilibrium
involving pure strategies. However the equilib-
ðe bð1 q ÞÞp ¼ 0 (4)
rium concept can be generalized to involve
mixed strategies which are probability distribu- for any control probability p. The control costs are
tions over the sets of pure strategies. In the present thus exactly compensated by the collected fines. It
case, the MVV controls with some probability might be mentioned that the actual figures for
p and the passenger behaves legally with proba- inspection probability and income per passenger
bility q. The expected payoffs to the two players for Munich approximately satisfy (3). We may
are then given by speculate that the City Treasurer’s complaint
arises from the fact that regular violators develop
E1 ðp, qÞ ¼ ðf eÞpq þ ðb eÞpð1 qÞ þ f ð1 pÞ
strategies to recognize and avoid inspectors. Other
qE2 ðp, qÞ ¼ fpq bpð1 qÞ
f ð1 pÞq: complications not taken into account in the model
are variations in frequency, hour of day and dwell-
(1)
ing time of passengers using the system.
If we designate the mixed equilibrium strate- There are many inspection problems which can
gies by p* and q* and the equilibrium payoffs as be described with models equivalent or similar to
Ei ¼ Ei ðp , q Þ, i ¼ 1, 2 then the conditions for the one presented here. Inspection of metered
Nash equilibrium are parking spaces provides another example. Control
of the sharing of common pool resources, as
E1 E1 ðp, q Þ for all p ½0, 1 discussed in the last inspection model, below,
E2 E2 ðp, q Þ for all q ½0, 1: belongs to the same category.
(2)
Illegal Production
For this situation the equilibrium strategies can In treaties prohibiting the production, acquisition
be determined simply by requiring that each and/or proliferation of weapons of mass
Inspection Games 273
destruction, one is often concerned with the mis- Nash equilibrium conditions, or, since we are
use of ostensibly legitimate production facilities. dealing with a zero-sum game, the saddle point
A commercial chemical plant, for example, may criteria determining the equilibrium mixed strat-
be used for production of forbidden precursors of egies G* and F*, are given by
chemical weapons, or a uranium enrichment facil-
ity may illegally enrich its product to weapons- EðG , xÞ EðG , F Þ Eðy, F Þ
(6)
grade U-235. The consequences of non-detection for all x, y ½0, 1:
by an inspecting authority can be dire, and the
timeliness of control procedures – the interval Since the final inspection is certain, clearly the
between the onset of plant misuse and its operator shouldn’t wait too long to act. Rather, in
detection – may be especially important. constructing his optimal probability distribution
To illustrate, consider the following simple F*(x), he might plausibly choose x randomly on an
model. At the end of some reference time interval, interval [0, b] with b < 1 Consequently the inspector
for instance a calendar year or a production cam- will not act later than b either. Let us assume that she
paign, a major inspection takes place at a facility, chooses her inspection time y according to the prob-
one which would detect prior illegal production with ability density function g*(y), where
certainty. Additionally, a single interim inspection is ðb
carried out, timed at the inspector’s discretion, g ðyÞdy ¼ 1: (7)
0
which will likewise detect prior violation with cer-
tainty. The interim inspection is intended to enhance The expected payoff to the operator for diver-
the timeliness of detection should illegal activity be sion at time x [0, b] is then
underway. The inspector would like to know pre- ðx
cisely when it should take place. Eð G , x Þ ¼ ð1 xÞg ðyÞdy
0
ðb
Solution þ ðy xÞg ðyÞ dy
x
This example entails solution of a zero-sum game ðx ðb
on the unit square. The onset of illegal production ¼ g ðyÞdy þ yg ðyÞdy
and the time of the interim inspection and are 0 x
ðb
chosen on the interval strategically by the respec-
x g ðyÞdy: (8)
tive protagonists, plant operator and inspector. We 0
take the payoff to the former as the time to detec-
tion of illegal production, and to the latter to be the But if the operator randomizes across the interval
negative of that quantity. [0, b] as assumed, this payoff must be constant for
Representing the reference time by the interval all x [0, 6] and equal to the value of the game,
[0, 1], the operator’s so-called payoff kernel is i.e., the equilibrium payoff to the operator. If this
were not the case for some x in the interval, the
y x for xy operator would not have included it in his mixed
Aðy, xÞ ¼ (5)
1 x for x y, strategy F* in the first place. Thus the derivative
of Eq. (8) with respect to x must vanish. This gives
here x [0, 1] denotes a pure strategy for the immediately
operator, the onset of illegal production and sim-
ilarly y [0, 1] is a pure inspection strategy. Let 1
g ð y Þ ¼ (9)
the inspector’s and operator’s mixed strategy dis- 1y
tribution functions be G(y) and F(x), respectively.
G(y) is the probability of an inspection taking and from Eq. (7), b = 1 1/e. The expected payoff
place at time y or earlier, F(x) the probability of to the operator and value of the game is then easily
illegal activity beginning at time x or earlier. The seen to be E(G*, x) = 1/e for all x [0, b].
274 Inspection Games
Getting the operator’s optimal strategy F* is a Both models were motivated by reliability con-
bit more subtle because it requires a so-called trol problems: production units have to be
atom at X = 0, that is to say, a finite probability inspected regularly and the earlier a failure is
of starting illegal production at precisely the detected, the less costly it is for the production
beginning of the interval, as well as the probabil- facility owner. Of course the production unit is
ity density f*(x) on the remaining half-open inter- not acting strategically, but in order to be on the
val [0, b]. In terms of the distribution function safe side a minimax approach was chosen, which
F*(x) that characterizes this mixed strategy, the was then generalized by Diamond to give a saddle
atom is F*(0) and f*(x) is the derivative of F*(x) point solution. Thus we see that an approach which
on [0, b]. The operator’s payoff for some inspec- originally was not an inspection game according to
tion time y [0, b] is our definition in Section “Definition” turned out to
become one, with interesting applications such as
ðy
the one discussed above.
Eðy, F Þ ¼ yF ð0Þ þ ðy xÞf ðxÞdx
0 Rothenstein and Zamir (2002) extended Dia-
ðb mond’s model with a single inspection to include
þ ð1 xÞf ðxÞdx errors of the first and second kind. Krieger (2008)
y
ðy considered a time-discrete variant of the model.
All variants require that both the operator and
¼ yF ð0Þ þ y f ðxÞdx
0 inspector commit themselves before the reference
ðb ðb period begins. Thus if in the solution in Subsec-
þ f ðxÞdx xf ðxÞdx tion “Solution” the operator simply waits for the
y 0
interim inspection and then violates, he will
¼ ðy 1ÞF ðyÞ þ F ðbÞ achieve an expected time to detection of
ðb
ðb
xf ðxÞdx: (10) 1 1
0 ð1 xÞf ðxÞdx ¼ b ¼ 1 > ,
0 e e
Arguing as before, if the inspector randomizes and the inspector’s advantage will have evapo-
over the interval, this expression must be constant rated. But of course f*(x) is not the inspector’s
for all y [0, b]. This is true if (y 1)F*(y) is equilibrium in such a sequential game. Prior com-
independent of y. The requirement that F*(b) = 1 mitment may be justified in some cases, but not in
then leads to others. If there is no requirement for commitment,
the operator may prefer to start his illegal action
1 1 immediately, i.e., at the beginning of the reference
F ðxÞ ¼ (11)
e 1x period, or delay his decision until the first inter-
mediate inspection. This situation has to be
For x [0, b] and F*(x) = l for x > b The atom is modeled as an extensive form game. Its time-
F*(0) = 1/e; and the construction is complete. continuous version, which also considered errors
Verifying that F* and G* satisfy Eq. (6) is of the first and second kind, was studied by
straightforward. Avenhaus and Canty (2005). Surprisingly it
turned out that an equilibrium strategy of the
inspector is a pure strategy, contrary to the equi-
Remarks
librium strategies of the Diamond-type models.
Owen (1968) discussed the existence of Nash
equilibria for continuous games on the unit square
and methods for their solution. The above result Diversion of Nuclear Material
was first obtained by Diamond (1982), who gave a As already mentioned in the introduction, a large
generalization to k > 2 inspections. Prior to Dia- number of game theoretic models of inspection
mond’s work, Derman (1961) treated a somewhat situations have been developed in the framework
similar minimax inspection problem. of IAEA verification activities. The basis of the
Inspection Games 275
IAEA inspection system is the verification of the is measured in the facility. Then, during the ith
continuing presence of fissile material in the period, i = 1,. . ., n, some net measured amount Yi
peaceful nuclear fuel cycle of the State under of material enters the area. At the end of that
consideration (IAEA 1972), therefore statistical period the amount of material, now Ii is again
measurement errors and, consequently, statistical measured. The quantity
decision theory must be taken into account.
Over a single accounting period, typically Zi ¼ I ii þ Y i I i , i ¼ 1 . . . n,
1 year, we define the material flow as the mea-
sured net transfer of fissile material across the is the material balance test statistic for the ith
facility boundary, consisting of inputs (receipts inventory period. Under the null hypothesis that
of raw material) R and outputs (shipments of no material was diverted, its expected value is, as
purified product and waste) S. If the physical before,
inventory within the facility at the start of the
period was I0, then the book inventory at the end E0 ðZi Þ ¼ 0, i ¼ 1 . . . n:
of the accounting period is defined as
The alternative hypothesis is that material is
B ¼ I 0 þ R S ¼ I 0 þ Y, (12) diverted from the balance area according to some
specific pattern. Thus
where Y = R S is the net material flow into the
facility. X
n
that is the covariance matrix of the multivariate and for fixed a is, according to the Lemma of
normally distributed random vector Z = (Z1, Neyman and Pearson, given by
Z2,. . ., Zn)T and define e = (1, 1,. . . 1)T. Then the
( )
equilibrium strategies are in fact given by f ðzÞ X
1
zj 1 > k00a ¼ zj mT z > k0a ,
X f 0 ðzÞ
m
m ¼ P e, (16)
eT e
where fi (z) are the joint density functions under
and by the test da characterized by the critical hypothesis i,i = 0,1 But from Eq. (16)
region X T X1
X1
mT z/ e z ¼ eT z:
zj e z > ka
T
(17)
where ka is determined by a. The value of the Thus dais indeed a best reply to m and the right
game, that is, the guaranteed probability of detec- hand inequality (15) is fulfilled as well.
tion, is given by
! Remarks
m
1 b da , m ¼ F P 1 U ð 1 aÞ , According to Eq. (17), the inspector’s optimal test
ð eT eÞ2 statistic is
(18)
X
n X
n
closes the material balance with the help of the of significance thresholds for fixed false alarm
operator’s reported data. Thus, along with mate- probabilities and, from them, the associated detec-
rial accountancy, data verification comprises the tion probabilities.
second foundation of the IAEA safeguards sys- Later on (Avenhaus and Canty 1996) it was
tem. Due to the possibility that data may be inten- proven, again using the saddle point criterion
tionally falsified to make the material balance and the Lemma of Neyman and Pearson, that the
appear to be correct, data verification again use of the D-statistic is optimal for a “reasonable”
poses game theoretic problems. class of data falsification strategies, and it was
Two kinds of sampling procedures have to be shown how the sample sizes can be determined
considered in this context. In case of identifying such that they maximize the overall probability of
items or checking of seals, so-called attribute detecting a given total falsification for a total
sampling procedures are used in which only sam- given inspection effort.
pling errors have to be minimized. This leads on
the one hand to stratified sampling solutions sim- Customs and Smugglers
ilar to those found in the context of accounting In the previous examples evidence of violation
and auditing. One of these solutions became well- (illegal production, diversion) is assumed to per-
known, at least in expert circles, under the name sist: the illegal action can be detected after the
IAEA formula, see e.g., (Avenhaus and Canty fact. There are of course inspection problems
1989). On the other hand, in case of quantitative where this kind of model is not appropriate. Prob-
destructive and non-destructive verification mea- ably the first genuine inspection game in the open
surements, statistical measurement errors can no literature was a recursive zero-sum game devel-
longer be avoided, leading to consideration of oped by Dresher (1962) which treated the situa-
variable sampling procedures. A decision prob- tion in which the violator can only be caught red-
lem arises in this instance, since discrepancies handed, that is, if the illegal act is actually in
between reported and independently verified progress when an inspection occurs. It’s not diffi-
data can be caused either by measurement errors cult to imagine real situations where this is the
or by real and intentionally generated differences case, a much-discussed example being the
(data falsification). Stewart (1971) was the first to customs-smuggler problem (Thomas and Nisgav
propose the so-called D-statistic for use in IAEA 1976). In its simplest form, a smuggler has
data verification. For one class of data consisting n nocturnal opportunities to bring his goods safely
of N items, n of which are verified, the D-statistic across a channel. The customs office, equipped
is the sum of the differences of reported data Xj with a patrol boat, would very much like to appre-
and independently measured data Yj, extrapolated hend him, but budget restrictions require that the
to the whole class population, i.e., boat can only patrol on m < n nights. If a patrol
coincides with a crossing attempt, the smuggler
NX n will be apprehended with certainty. Moreover the
D1 ¼ Xj Y j : smuggler observes all patrols that take place. All
n j¼1
that being the case, one can ask how customs
should time its patrols.
For K classes of data (for instance one class for
each component of a closed material balance) the
Solution
D-statistic is given by
The game theoretic model developed by Dresher
XK
Ni X ni
(1962) at the RAND Corporation mentioned
DK Xij Y ij : above fits this situation rather well. It illustrates
i¼1
n i j¼1
nicely the special character of sequential games,
These quantities then form the basis for the test and has an elegant recursive solution. We summa-
procedure of the inspector, which goes along sim- rize it here, see von Stengel (1991) for a more
ilar lines as outlined before: Two hypotheses have thorough discussion as well as some interesting
to be formulated which permit the determination variations on the same theme.
278 Inspection Games
In Dresher’s model there are n time periods, case with a payoff of either 0 (legal behavior) or 1
during each of which the inspector can decide (illegal behavior) to the inspector.
whether or not to control the inspectee, using up The function v(n, m) is subject to two boundary
one of a total of m inspections available to her if conditions. If there are no periods left, and no
she does so. The inspectee knows at each stage the violation has occurred,
number of past inspections. He can violate at most
once, but can also choose to behave legally. Detec- vð0, mÞ ¼ 0, m > 0: (19)
tion occurs only if violation and inspection coincide
in the same period. The conflicting interests of the If the inspector has no inspections left, then the
two players are again modeled as a zero-sum game, inspectee is aware of this and can safely violate
that is, the inspectee’s loss on detection is the nega- (and will do so, since his payoff is higher):
tive of the inspector’s gain. Legal action gives a
payoff of nil to both players. The game is shown in vðn, 0Þ ¼ 1, n > 0: (20)
reduced extensive form, i.e., as a decision tree, in
Fig. 2. We shall now seek an equilibrium of the game
The inspectee’s information set, shown as an in the domain of mixed strategies. Let p(n, m)
oval in the figure, encompasses both of her deci- [0, l] be the probability with which the inspector
sion points. This is meant to imply that she doesn’t chooses to inspect in the first period. The equilib-
know at which node she is situated when choosing rium choice for p(n, m) makes the inspectee indif-
her strategy. ferent to legal or illegal behavior, so that he
The entries at the leaf nodes of the tree are the receives the same payoff v(n, m) given by
payoffs to the inspector. The value of the game prior
to the first period is denoted v(n, m). If the single vðn, mÞ ¼ pðn, mÞvðn l, m 1Þ
violation occurs, the inspector achieves +1 if an þð1 pðn, mÞÞvðn 1, mÞ ðlegalÞvðn, mÞ ¼ pðn, mÞ
inspection takes place, otherwise 1. In the latter ðþ1Þ þ ð1 pðn, mÞÞ ð1Þ ðillegalÞ:
case, the game proceeds trivially with the inspectee
In a similar way the inspector chooses her
behaving legally (he has already violated) and the
probability q(n, m) for violation at the first stage
inspector inspecting or not, as she chooses. If the
so as to make the inspectee indifferent as to con-
inspectee behaves legally, the continuation of the
trol or no control, leading to
game has, by definition, value v(n 1, m 1) to
the inspector if she decided to control in the first
period, otherwise value v(n 1, m). These values qðn, mÞvðn 1, m 1Þ þ ð1 qðn, mZ ÞÞ1
are the corresponding payoffs to the inspector after ¼ qðn, mÞvðn 1, mÞ þ ð1 qðn, mÞÞð1Þ:
the first period, thus giving the recursive form of the
game tree shown. The game terminates after Solving these three equations for p(n, m),
detected violation or after the n periods, in the latter q(n, m) and v(m, n) we obtain
insspectee
v(n − 1, m − 1) +1 v(n − 1, m) −1
Inspection Games 279
However it was postulated that McCoy’s pure strategies, that is, his choices of which mon-
highest priority was to keep Hatfield honest, and itoring probability 1 b he will announce in
there is in fact a way for him to do it. Suppose that advance, is infinite. Hatfield’s set of strategies on
McCoy takes the initiative and announces credi- the other hand consists of all functions which
bly with what precise probability he intends to assign to each value of 1 b the decision “take
monitor Hatfield’s activities. This so-called lead- share” or “take more.” The appropriate represen-
ership game can no longer be expressed as a tation is the extensive form game shown in Fig. 5.
bimatrix as in Fig. 4, because McCoy’s set of Due to its structure (formally, an extensive
form game with perfect information in which all
information sets are singletons) the game can be
2 take share take more
solved by backward induction. If this procedure
1 1−t t leads to an equilibrium, then that equilibrium is
said to be subgame perfect. A subgame perfect
0 +d Nash equilibrium is one in which every subgame
no control is also a Nash equilibrium.
β
−c
The first step is to replace the outcome payoffs
0
in Fig. 5 by their expected values, as shown in
Fig. 6. The argument then proceeds as follows:
0 −b Hatfield, knowing the probability 1 b of being
control
controlled, decides
1−β
−e −a
takeshare if 0 > b þ ðb þ d Þb
indifferent if 0 ¼ b þ ðb þ d Þb (30)
Inspection Games, Fig. 4 Normal or bimatrix form of takemore if 0 < b þ ðb þ d Þb,
the irrigation game. Left lower numbers are the payoffs to
player 1 (McCoy), upper right numbers the payoffs to
player 2 (Hatfield), whereby C > a > e and (a, b, c, d, since this strategy will always maximize his
e) > (0, 0, 0, 0, 0). The horizontal arrows are incentive
directions for player 2, the vertical arrows for player 1. The
expected payoff. McCoy’s equilibrium strategy
variables t and b define mixed strategies will be shown below to be
−a −c −e 0
−b +d 0 0
282 Inspection Games
Hatfield
− a (1 − β) − cb − e (1 − β)
− b (1 − β) + db 0
b 0n, ¼ n, take share for b b
b ¼ , (31) t ðbÞ ¼
bþd 1n, ¼ n, take more for b > b ,
(33)
so (30) is equivalent to Hatfield’s following
decision: where b* is given by (31). This is just the conclu-
sion we reached in Eq. (32) by backward induc-
takeshare if b < b tion, except for the case b = b*. We still have to
indifferent if b ¼ b (32) show that t*(b*) = 0, that is, that Hatfield stays
takemore if b > b : honest at equilibrium. To do this, we now have to
consider McCoy’s payoffs. The expected payoff
to McCoy, as a function of b, is given by
What is Hatfield’s equilibrium strategy? In
order to determine it, we must first define his ðb, t Þ
IM8
set of strategies a little more carefully. A typical >
> eð1 bÞ if b < b
<
eð1 b Þð1 tðb ÞÞ
element of the set will be a recipe which tells ¼
>
> þðcb að1 b ÞÞtðb Þ if b ¼ b
him, for every conceivable announcement by :
cb að1 bÞ if b > b :
McCoy of a value of the non-monitoring prob-
ability b, whether or not to take more than his (34)
share. Recipe (30) is certainly one such,
although it leaves an ambiguity for b = b*. In It is plotted in Fig. 7. McCoy also wishes to
that case Hatfield may decide to make his choice maximize his expected payoff for b < b*, when
randomly, in other words to use a mixed strat- Hatfield stays honest, it is at least e and
egy. In general, such a mixed strategy is given increases with increasing b. For b > b*, when
by a probability t(b) for “take more,” and the Hatfield takes more than his share, it is between
complementary probability 1 t(b) for “take c and a and certainly worse for McCoy. For
share,” for all b) [0, 1] Hatfield’s complete b = b*, McCoy’s payoff is something intermedi-
strategy set is therefore the set of all functions ate, depending on Hatfields behavior. The argu-
which map the unit interval onto itself, {t(b) | t: ment seems to be getting circular, but we are
[0, 1] 7! [0, 1]}. almost done.
We now assert that Hatfield’s equilibrium McCoy’s equilibrium strategy, if he has one at
strategy t* is in fact always a pure strategy, all, has to be b* as given by (31): for b > b*
namely McCoy would do better by choosing alternatively
Inspection Games 283
Inspection Games,
Im (β, t∗)
Fig. 7 McCoy’s payoff as
1
a function of b according to
0 β∗ β
( 34). The • indicates the
equilibrium payoff of the
simultaneous game,
Eq. (28)
−e
−a
−c
a small b to make Hatfield act honestly, and for for one’s self, and to destroy the other’s ability to do
b < b* he could always do a little better by the same.
choosing a larger b, closer to b*. So the only Simaan and Cruz (1973) and Wölling (2002)
maximum of his payoff curve, as seen in Fig. 7, refined the concept and formulated conditions for
is at b = b*. However, as Eq. (34) shows, the the existence of Nash equilibria in leadership
maximum exists only if Hatfield’s equilibrium games. Maschler was the first to apply the leader-
strategy is such that t*(b*) = 0. The unique sub- ship concept to inspection games (Maschler 1966,
game perfect equilibrium strategies must therefore 1967). Later on it was widely used in the analysis
be (b*, t*) given by (31) and (33), with payoffs of IAEA verification procedures, in particular for
variable sampling inspection problems, see
d
IM ðb , t Þ ¼ c , IH ðb , t Þ ¼ 0: (35) Avenhaus et al. (1991).
bþd The importance of deterring the inspectee from
illegal behavior, or more positively, of inducing him
Thus McCoy’s equilibrium strategy is the same to behave legally, depends on the specific nature of
as before, as is Hatfield’s payoff, the decisive the problem. For example in the ticket control prob-
difference being that Hatfield does not take more lem of Subsection “Passenger Ticket Control,” max-
than his share of the water. imization of intake – fares plus fines – is no doubt
the highest priority of the transit authority, even
Remarks though the inspector leadership concept would
The leadership concept was first introduced by work here as well, at least in theory. In principal
von Stackelberg (1934) in the context of eco- agent models the situation is similar. In the context
nomic theory, well before game theory became a of arms control and disarmament, on the other hand,
scientific discipline. In game-theoretic terminol- deterrence is fundamental: the community of States
ogy Schelling (1960) probably was the first for- party to such an agreement have a vital interest that
mulate its importance: all members adhere to its provisions. There exists a
large literature on the subject; a comprehensive sur-
A strategic move is one that influences the other
vey of the leadership concept in game theory in
person’s choice in a manner favorable to one’s self, general has been given by Wölling (2002).
by affecting the other person’s expectations on how
one’s self will behave. One constrains the partner’s
choice by constraining one’s own behavior. The Future Directions
object is to set up for one’s self and communicate
persuasively to the other player a mode of behavior
(including conditional responses to the other’s
We hope that, with our chosen examples, we have
behavior) that leaves the other a simple maximiza- given a representative, although certainly not exhaus-
tion problem whose solution for him is the optimum tive, overview of the concepts and models making up
284 Inspection Games
the area of inspection games. At the same time we Avenhaus R, Canty MJ, Kilgour DM, von Stengel B, Zamir
have tried to give some idea of its wide range of S (1996) Inspection games in arms control. Eur J Oper
Res 90:383–394
applications. Avenhaus R, von Stengel B, Zamir S (2002) Inspection
Of course we cannot predict future develop- games. In: Aumann R, Hart S (eds) Handbook of game
ments in the field. To be sure, there are still theory. Elsevier, Amsterdam, pp 1947–1987
unsolved mathematical problems associated with Baiman S (1982) Agency research in managerial account-
ing: a survey. J Account Lit 1:154–213
present inspection models, in particular in arms Baston VJ, Bostock FA (1991) A remark on the customs
control and disarmament. For example in Subsec- smuggler game. Nav Res Logist 41:287–293
tion “Diversion of Nuclear Material” it was Bierlein D (1968) Direkte Überwachungssysteme. Oper
pointed out that near real time accountancy Res Verfahr 6:57–68
Bierlein D (1969) Auf Bilanzen und Inventuren
poses fundamental difficulties that have not yet basierenden Safeguards-Systeme. Oper Res Verfahr
been solved satisfactorily. Active research is pro- 6:36–43
ceeding and interesting results may be expected. Borch K (1990) Economics of insurance. North-Holland,
As mentioned at the outset, in the area of envi- Amsterdam
Cavasoglu H, Raghunatahan S (2004) Configuration of
ronmental control the number of published inves- detection software: a comparison of decision and
tigations is surprisingly small. With the growing game theory. Decis Anal 1:131–148
awareness of the importance of international Cook J, Nadeau L, Thomas LC (1997) Does cooperation in
agreement on environmental protection the need auditing matter? A comparison of a non-cooperative
and a cooperative game model of auditing. Eur J Oper
for effective and efficient control mechanisms will Res 103:470–482
become more and more apparent. Here we expect Derman C (1961) On minimax surveillance schedules. Nav
that the inspection game approach, especially as a Res Logist 8:415–419
means of structuring verification systems, can and Diamond H (1982) Minimax policies for unobservable
inspections. Math Oper Res 7(1):139–153
will play a useful role. As Barry O’Neill con- Dresher M (1962) A sampling inspection problem in arms
cludes his examination of game theory in peace control agreements: a game theoretical analysis. Mem-
and war (O’Neill 1994): orandum RM-2972-ARPA. RAND Corporation, Santa
Monica
. . . game theory clarifies international problems Dye RA (1986) Optimal monitoring policies in agencies.
exactly because they are more complicated. [. . .] RAND J Econ 17:339–350
The contribution of game models is to sort out Ferguson TS, Melolidakis C (1998) On the inspection
concepts and figure out what the game might be. game. Nav Res Logist 45:327–334
Garnaev AY (1991) A generalized inspection game. Nav
Res Logist 28:171–188
Goutal P, Garnaev A, Garnaeva G (1997) On the infiltration
Bibliography game. Int J Game Theory 26(2):215–221
Höpfinger E (1971) A game-theoretic analysis of an
Avenhaus R (1997) Entscheidungstheoretische Analyse inspection problem, University of Karlsruhe
der Fahrgast-Kontrollen. Der Nahverkehr 9:27 (unpublished manuscript)
Avenhaus R, Canty MJ (1989) Re-examination of the Höpfinger E (1974) Zuverlässige Inspektionsstrategien.
IAEA formula for stratified attribute sampling. In: Pro- Z Wahrscheinlichkeitstheorie Verw Geb 31:35–46
ceedings of the 11th ESARDA symposium, JRC, Ispra, Hozaki R, Kuhdoh D, Komiya T (2006) An inspection
pp 351–356 game: taking account of fulfillment probabilities of
Avenhaus R, Canty MJ (1996) Compliance quantified. players. Nav Res Logist 53:761–771
Cambridge University Press, Cambridge IAEA (1972) The structure and content of agreements
Avenhaus R, Canty MJ (2005) Playing for time: a sequen- between the agency and states required in connection
tial inspection game. Eur J Oper Res 167(2):474–492 with the treaty on the non-proliferation of nuclear
Avenhaus R, Jaech JL (1981) On subdividing material weapons. IAEA, Vienna, INF/CIRC 153 (corrected)
balances in time and/or space. J Inst Nucl Manag IV IAEA (1997) Model protocol additional to the agreement
(3):24–33 (s) between state(s) and the international atomic energy
Avenhaus R, Okada A, Zamir S (1991) Inspector leader- agency for the application of safeguards. IAEA,
ship with incomplete information. In: Selten Vienna, INF/CIRC 140
R (ed) Game equilibrium models, vol IV. Springer, Kanodia CS (1985) Stochastic and moral hazard.
Heidelberg, pp 319–361 J Account Res 23:175–293
Inspection Games 285
Kilgour DM (1992) Site selection for on-site inspection in Rohatgi VK (1976) An introduction to probability theory
arms control. Arms Control 13:439–462 and mathematical statistics. Wiley, New York
Krieger T (2008) On the asymptotic behavior of a discrete Rothenstein D, Zamir S (2002) Imperfect inspection games
time inspection game. Math Model Anal 13(1):37–46 over time. Ann Oper Res 109:175–192
Kuhn HW (1953) Extensive games and the problem of Sakaguchi M (1994) A sequential game of multi-
information. In: Kuhn HW, Tucker AW (eds) Contri- opportunity infiltration. Math Jpn 39:157–166
butions to the theory of games, vol II. Princeton Uni- Schelling TC (1960) The strategy of conflict. Harvard
versity Press, Princeton, pp 193–216 University Press, Cambridge, MA
Maschler M (1966) A price leadership method for solving Simaan M, Cruz JB (1973) On the Stackelberg strategy
the inspector’s non-constant-sum game. Nav Res in nonzero-sum games. J Optim Theory Appl 11(5):
Logist 13:11–33 533–555
Maschler M (1967) The inspector’s non-constant-sum- von Neumann J, Morgenstern O (1947) Theory of games
game: its dependence on a system of detectors. Nav and economic behavior. Princeton University Press,
Res Logist 14:275–290 Princeton
Morris P (1994) Introduction to game theory. Springer, von Stackelberg H (1934) Marktform und Gleichgewicht.
New York Springer, Vienna
Nash JF (1951) Non-cooperative games. Ann Math von Stengel B (1991) Recursive inspection games, Report
54:286–295 No. S 9106. Computer Science Faculty, Armed Forces
O’Neill B (1994) Game theory models of peace and war. University Munich
In: Aumann R, Hart S (eds) Handbook of game theory. Stewart KB (1971) A cost-effectiveness approach to inven-
Elsevier, Amsterdam, pp 995–1053 tory verification. In: Proceedings of the IAEA sympo-
Ostrom E, Gardner R, Walker J (1994) Rules, games and sium on safeguards techniques, vol II. International
common pool resources. University of Michigan Press, Atomic Energy Agency, Vienna, pp 387–409
Ann Arbor Thomas MU, Nisgav Y (1976) An infiltration game with
Owen G (1968) Game theory. W. B. Saunders, Philadelphia time-dependent payoff. Nav Res Logist 23:297–320
Pavlovic L (2002) More on the search for an infiltrator. Nav Wilks TJ, Zimbelman MF (2004) Using game theory and
Res Logist 49:1–14 strategic reasoning concepts to prevent and detect
Rinderle K (1996) Mehrstufige sequentielle Inspektion- fraud. Account Horiz 18(3):173–184
sspiele mit statistischen Fehlern erster und zweiter Wölling A (2002) Das Führerschaftsprinzip bei
Art. Kovac, Hamburg Inspektionsspielen. Kovac, Hamburg
Asymmetric Information In a relationship or a
Principal-Agent Models transaction, there is asymmetric information
when one party has more or better information
Inés Macho-Stadler1 and David Pérez-Castrillo2 than the other party concerning relevant char-
1
Universitat Autònoma de Barcelona and acteristics of the relationship or the transaction.
Barcelona GSE, Barcelona, Spain There are two types of asymmetric information
2
Dept. of Economics and Economic History, problems: moral hazard and adverse selection.
Universitat Autònoma de Barcelona and Information Economics Information economics
Barcelona GSE, Barcelona, Spain studies how information and its distributions
among the players affect economic decisions.
Moral Hazard (Hidden Action) The term moral
Article Outline hazard initially referred to the possibility that the
redistribution of risk (such as insurance which
Glossary transfers risk from the insured to the insurer)
Definition of the Subject changes people’s behavior. This term, which has
Introduction been used in the insurance industry for many
The Base Game years, was studied first by Kenneth Arrow.In
Moral Hazard principal-agent models, the term moral hazard is
Adverse Selection used to refer to all environments where the igno-
Future Directions rant party lacks information about the behavior of
Bibliography the other party once the agreement has been
signed, in such a way that the asymmetry arises
after the contract is settled.
Keywords Principal Agent The principal-agent model iden-
tifies the difficulties that arise in situations where
Incentives · Contracts · Asymmetric
there is asymmetric information between two
Information · Moral Hazard · Adverse
parties and finds the best contract in such envi-
Selection
ronments. The “principal” is the name used for the
contractor, while the “agent” corresponds to the
contractee. Both principal and agent could be
Glossary individuals, institutions, organizations, or decision
centers. The optimal solutions propose mecha-
Adverse Selection (Hidden Information) The nisms that try to align the interests of the agent
term adverse selection was originally used in with those of the principal, such as piece rates or
insurance. It describes a situation where, as a profit sharing, or that induce the agent to reveal
result of private information, the insured are the information, such as self-reporting contracts.
more likely to suffer a loss than the uninsured
(such as offering a life insurance contract at a Definition of the Subject
given premium may imply that only the people
with a risk of dying over the average take it).In Principal-agent models provide the theory of con-
principal-agent models, we say that there is an tracts under asymmetric information. Such a the-
adverse selection problem when the ignorant ory analyzes the characteristics of optimal
party lacks information while negotiating a contracts and the variables that influence these
contract, in such a way that the asymmetry is characteristics, according to the behavior and infor-
previous to the relationship. mation of the parties to the contract. This approach
© Springer Science+Business Media, LLC, part of Springer Nature 2020 287
M. Sotomayor et al. (eds.), Complex Social and Behavioral Systems,
https://doi.org/10.1007/978-1-0716-0368-0_416
Originally published in
R. A. Meyers (ed.), Encyclopedia of Complexity and Systems Science, © Springer Science+Business Media New York 2015
https://doi.org/10.1007/978-3-642-27737-5_416-3
288 Principal-Agent Models
has a close relation to game theory and mechanism Prize in Economics in 1996 “for their fundamental
design: it analyzes the strategic behavior by agents contributions to the economic theory of incentives
who hold private information and proposes mech- under asymmetric information.” Five years later, in
anisms that minimize the inefficiencies due to such 2001, George A. Akerlof, A. Michael Spence, and
strategic behavior. The costs incurred by the prin- Joseph E. Stiglitz also obtained the Nobel Prize in
cipal (the contractor) to ensure that the agents (the Economics “for their analyses of markets with
contractees) will act in her interest are some type of asymmetric information.”
transaction cost. These costs include the tasks of
investigating and selecting appropriate agents,
gaining information to set performance standards, Introduction
monitoring agents, bonding payments by the
agents, and residual losses. The objective of the principal-agent literature is to
Principal-agent theory (and information eco- analyze situations in which a contract is signed
nomics in general) is possibly the area of econom- under asymmetric information, that is, when one
ics that has evolved the most over the past party knows certain relevant things of which the
25 years. It was initially developed in parallel other party is ignorant. The simplest situation
with the new economics of industrial organiza- concerns a bilateral relationship: the contract
tion, although its applications include now almost between one principal and one agent. The objec-
all areas in economics, from finance and political tive of the contract is for the agent to carry out
economy to growth theory. actions on behalf of the principal and to specify
Some early papers centered on incomplete the payments that the principal will pass on to the
information in insurance contracts, and more par- agent for such actions.
ticularly on moral hazard problems, are Spence and In the literature, it is always assumed that the
Zeckhauser (1971) and Ross (1973). The theory principal is in charge of designing the contract.
soon generalized to dilemmas associated with con- The agent receives an offer and decides whether
tracts in other contexts (Harris and Raviv 1978; or not to sign the contract. He will accept it when-
Jensen and Meckling 1976). It was further devel- ever the utility obtained from it is greater than the
oped in the mid-1970s by authors such as Pauly utility that the agent would get from not signing.
(1968, 1974), Mirrlees (1975), Harris and Raviv This utility level that represents the agent’s out-
(1979), and Holmström (1979). Arrow (1985) side opportunities is his reservation utility. In
worked on the analysis of the optimal incentive order to simplify the analysis, it is assumed that
contract when the agent’s effort is not verifiable. the agent cannot make a counteroffer to the prin-
A particular case of adverse selection is the one cipal. This way of modeling implicitly assumes
where the type of the agent relates to his valuation that the principal has all the bargaining power,
of a good. Asymmetric information about buyers’ except for the fact that the reservation utility can
valuation of the objects sold is the fundamental be high in those cases where the agent has excel-
reason behind the use of auctions. Vickrey (1961) lent outside opportunities.
provides the first formal analysis of the first- and If the agent decides not to sign the contract, the
second-prize auctions. Akerlof (1970) highlighted relationship does not take place. If he does accept
the issue of adverse selection in his analysis of the the offer, then the contract is implemented. It is
market for secondhand goods. Further analyses crucial to notice that the contract is a reliable
include the early work of Mirrlees (1971), Spence promise by both parties, stating the principal and
(1974), Rothschild and Stiglitz (1976), Mussa and agent’s obligations for all (contractual) contingen-
Rosen (1978), Baron and Myerson (1982), and cies. It can only be based on verifiable variables,
Guesnerie and Laffont (1984). that is, those for which it is possible for a third
The importance of the topic has also been recog- party (a court) to verify whether the contract has
nized by the Nobel Foundation. James A. Mirrlees been fulfilled. When some players know more
and William Vickrey were awarded with the Nobel than others about relevant variables, we have a
Principal-Agent Models 289
Principal-Agent Models,
P designs
Fig. 1 The figure
the contract N determines
summarizes the timing of
or the menu the state of the
the relationship and the of contracts
three cases as a function of word
(a)
the information available to
the participants
Time
situation with asymmetric information. In this concept to use is subgame (Bayesian) perfect
case, incentives play an important role (Fig. 1). equilibrium.
Given the description of the game played The setup gives rise to three possible scenarios:
between the principal and agent, we can summa-
rize its timing in the following steps: 1. The symmetric information case, where the
two players share the same information, even
1. The principal designs the contract (or set of if they both may ignore some important ele-
contracts) that she will offer to the agent, the ments (some elements may be uncertain).
terms of which are not subject to bargaining. 2. The moral hazard case, where the asymmetry
2. The alternatives opened to the agent are to of information arises once the contract has
accept or to reject the contract. The agent been signed: the decision or the effort of
accepts it if he desires so, that is, if the contract agent is not verifiable and hence it cannot be
guarantees him greater expected utility than included in the contract.
any other (outside) opportunities available 3. The adverse selection case, where the asym-
to him. metry of information is previous to the signa-
3. The agent carries out an action or effort on ture of the contract: a relevant characteristic of
behalf of the principal. the agent is not verifiable and hence the prin-
4. The outcome is observed and the payments cipal cannot include it in the contract.
are done.
From these elements, it can be seen that the To see an example of moral hazard, consider a
agent’s objectives may be in conflict with those of laboratory or research center (the principal) that
the principal. When the information is asymmet- contracts a researcher (the agent) to work on a
ric, the informed party tries to take advantage, certain project. It is difficult for the principal to
while the uninformed player tries to control this distinguish between a researcher who is thinking
behavior via the contract. Since a principal-agent about how to push the project through and a
problem is a sequential game, the solution researcher who is thinking about how to organize
290 Principal-Agent Models
his evening. It is precisely this difficulty in con- principal, who owns the result and must pay the
trolling effort inputs, together with the inherent agent, has preferences represented by the utility
uncertainty in any research project, what gener- function
ates a moral hazard problem, which is a nonstan-
dard labor market problem. Bðx wÞ;
For an example of adverse selection, consider a
regulator who wants to set the price of the service where w represents the payoff made to the agent.
provided by a public monopoly equal to the aver- BðÞ is assumed to be increasing and concave:
age costs in the firm (to avoid subsidies). This B0 > 0 , B00 0 (where the primes represent,
policy (as many others) is subject to important respectively, the first and second derivatives).
informational requirements. It is not enough that The concavity of the function BðÞ indicates that
the regulator asks the firm to reveal the required the principal is either risk neutral or risk averse.
information in order to set the adequate price, The agent receives a monetary payoff for his
since the firm would attempt to take advantage participation in the relationship, and he supplies an
of the information. Therefore, the regulator should effort which implies some cost to him. For the sake
take this problem into account. of simplicity, we represent his utility function as
reservation utility. Since the principal’s problem is ratio of marginal utilities of the principal and the
to design a contract that the agent will accept agent to be constant irrespective of the final result.
(by backward induction), the optimal contract If the principal is risk neutral (B00 ðÞ ¼ 0), then
must satisfy the participation constraint, and it the optimal contract has to be such that
is the solution to the following maximization u0 ðw∘ ðxi ÞÞ ¼ constant f or all i . In addition, if
problem: the agent is risk averse (u00 ðÞ < 0 ), he receives
the same wage, say w∘, in all contingencies. This
X
n wage only depends on the effort demanded and is
Max pi ðeÞBðxi wðxi ÞÞs:t: determined by the participation constraint. If the
½e, fwðxi Þgi¼1,...,n i¼1
agent is risk neutral (u00 ðÞ ¼ 0) and the principal
X n
pi ðeÞuðwðxi ÞÞ vðeÞ U: is risk averse (B00 ðÞ < 0 ), then we are in the
i¼1 opposite situation. In this scenario, the optimal
contract requires the principal’s profit to be inde-
The above problem corresponds to a Pareto opti- pendent of the result. Consequently, the agent
mum in the usual sense of the term. The solution to bears all the risk, insuring the principal against
this problem is conditional on the value of the variations in the result. When both the principal
parameter U, so that even those cases where the and the agent are risk averse, each of them needs
agent can keep a large share of the surplus are to accept a part of the variability of the result. The
taken into account. precise amount of risk that each of them supports
The principal’s program is well behaved with depends on their degrees of risk aversion. Using
respect to payoffs given the assumptions on u(w). the Arrow-Pratt measure of absolute risk aversion
Hence the Kuhn-Tucker conditions will be both r p ¼ B00 =B0 and r a ¼ u00 =u0 for the principal
necessary and sufficient for the global solution of and the agent, respectively, we can show that
the problem. However, we cannot ascertain the
concavity (or quasi-concavity) of the functions dw∘ rp
¼ ;
with respect to effort given the assumptions on dxi rp þ ra
v(e), because these functions also depend on all
the pi(e). Hence it is more difficult to obtain global which indicates how the agent’s wage changes
conclusions with respect to this variable. given an increase in the result xi. Since
Let us denote by e∘ the efficient effort level. r p = r p þ r a ð0, 1Þ, when both participants are
From the first-order Kuhn-Tucker conditions with risk averse, the agent only receives a part of the
respect to the wages in the different contingencies, increased result via a wage increase. The more
we can analyze the associated payoffs risk averse is the agent, that is to say, the greater
n o
∘ is ra, the less the result influences his wage. On the
w ðxi Þi¼1,...,n : We obtain the following
other hand, as the risk aversion of the principal
condition: increases, greater rp, changes in the result corre-
spond to more important changes in the wage.
B0 ðxi w∘ ðxi ÞÞ
l∘ ¼ , f or all i f1, 2, . . . , ng;
u0 ðw∘ ðxi ÞÞ
Moral Hazard
where l∘ is the multiplier associated with the Basic Moral Hazard Model
participation constraint. When the agent’s utility Here we concentrate on moral hazard, which is
is additively separable, the participation constraint the case in which the informational asymmetry
binds (l∘ is positive). The previous condition relates to the agent’s behavior during the relation-
equates marginal rates of substitution and indicates ship. We analyze the optimal contract when the
that the optimal distribution of risk requires that the agent’s effort is not verifiable. This implies that
292 Principal-Agent Models
effort cannot be contracted upon, because in case The idea underlying an incentive contract is
of the breach of contract, no court of law could that the principal can make the agent interested
know if the contract had really been breached or in the consequences of his behavior by making his
not. There are many examples of this type of payoff dependent on the result obtained. Note that
situation. A traditional example is accident insur- this has to be done at the cost of distorting the
ance, where it is very difficult for the insurance optimal risk sharing among both participants.
company to observe how careful a client has been The trade-off between efficiency, in the sense of
to avoid accidents. the optimal distribution of risk, and incentives
The principal will state a contract based on determines the optimal contract.
any signals that reveal information on the agent’s Formally, since the game has to be solved by
effort. We will assume that only the result of the backward induction, the optimal contract under
effort is verifiable at the end of the period and, moral hazard is the solution to the maximization
consequently, it will be included in the contract. problem:
However, if possible, the contract should be con-
tingent on many other things. Any information X
n
related to the state of nature is useful, since it Max pi ðeÞBðxi wðxi ÞÞs:t:
allows better estimations of the agent’s effort ½e, fwðxi Þgi¼1,...,n i¼1
thus reducing the risk inherent in the relation- X n
pi ðeÞuðwðxi ÞÞ vðeÞ U
ship. This is known as the sufficient statistic
i¼1
result, and it is perhaps the most important con- ( )
X
n
clusion in the moral hazard literature e Arg Max pi ðebÞuðwðxi ÞÞ vðebÞ :
(Holmström 1979). The empirical content of the be i¼1
sufficient statistic argument is that a contract
should exploit all available information in order The second restriction is the incentive compatibil-
to filter out risk optimally. ity constraint and the first restriction is the partic-
The timing of a moral hazard game is the ipation constraint. The incentive compatibility
following. In the first place, the principal decides constraint, and not the principal as under symmet-
what contract to offer the agent. Then the agent ric information, determines the effort of the agent.
decides whether or not to accept the relationship, The first difficulty in solving this program is
according to the terms of the contract. Finally, if related to the fact that the incentive compatibility
the contract has been accepted, the agent chooses constraint is a maximization problem. The second
the effort level that he most desires, given the difficulty is that the expected utility may fail to be
contract that he has signed. This is a free decision concave in effort. Hence, to use the first-order
by the agent since effort is not a contracted vari- condition of the incentive compatibility constraint
able. Hence, the principal must bear this in mind may be incorrect. In spite of this, there are several
when she designs the contract that defines the ways to proceed when facing this problem.
relationship. (a) Grossman and Hart (1983) propose to solve it
To better understand the nature of the problem in steps, identifying first the optimal payment
faced by the principal, consider the case of a risk- mechanism for any effort and then, if possible,
neutral principal and a risk-averse agent, which the optimal effort. This can be done since the
implies that, under the symmetric information, the problem is concave in payoffs. (b) The other pos-
optimal contract is to completely insure the agent. sibility is to consider situations where the agent’s
However, if the principal proposes this contract maximization problem is well defined. One pos-
when the agent’s effort is not a contracted vari- sible scenario is when the set of possible efforts is
able, once he has signed the contract, the agent finite, in which case the incentive compatibility
will exert the effort level that is most beneficial for constraint takes the form of a finite set of inequal-
him. Since the agent’s wage does not depend on ities. Another scenario is to write the incentive
his effort, he will use the lowest possible effort. compatibility as the first-order condition of the
Principal-Agent Models 293
maximization problem and introduce assumptions Radner (1981) and Rubinstein and Yaari
that allow doing it. The last solution is known as (1983) consider infinitely repeated relationships
the first-order approach. and show that frequent repetition of the relation-
Let us assume that the first-order approach is ship allows us to converge toward the efficient
adequate, and substitute the incentive compatibil- solution. Incentives are not determined by the
ity constraint in the previous program by payoff scheme contingent on the result of each
period, but rather on average effort, and the infor-
X
n mation available is very precise when the number
p0i ðebÞuðwðxi ÞÞ v0 ðebÞ ¼ 0: of periods is large. A sufficiently threatening pun-
i¼1
ishment, applied when the principal believes that
Solving the principal program with respect to the the agent on average does not fulfill his task, may
payoff scheme and denoting by l (resp., m) the be sufficient to dissuade him from shirking.
Lagrangian multiplier of the participation con- When the relationship is repeated a finite num-
straint (resp., the incentive compatibility con- ber of times, the analysis of the optimal contract
straint), we obtain that for all i: concentrates on different issues relating long-term
agreements and short-term contracts. Note that in a
1 p0i ðeÞ repeated setup, the agent’s wage and the agent’s
¼ l þ m :
u0 ðwðxi ÞÞ pi ð e Þ consumption in a period need not be equal. Lam-
bert (1983), Rogerson (1985), and Chiappori and
This condition shows that the wage should not Macho-Stadler (1990) show that long-term con-
depend at all on the value that the principal places tracts have memory (i.e., the payoffs in any single
on the result. It depends on the results as a mea- period will depend on the results of all previous
sure of how informative they are as to effort, in periods) since they internalize agent’s consumption
order to serve as an incentive for the agent. The over time, which depends on the sequence of pay-
wage will be increasing in the result as long as the ments received (as a function of the past contin-
result is increasing in effort. Hence, it is optimal gencies). Malcomson and Spinnewyn (1988),
that the wage will be increasing in the result only Fudenberg et al. (1990), and Rey and Salanié
in particular cases. The necessary condition for a (1990) study when the optimal long-term contract
wage to be increasing with results is pi0 (e)/pi(e) to can be implemented through the sequence of opti-
be decreasing in i. In statistics, this is called the mal short-term contracts. Chiappori et al. (1994)
monotonous likelihood quotient property. It is a show that, in order for the sequence of optimal
strong condition; for example, first-order stochas- short-term contracts to admit the same solution as
tic dominance does not guarantee the monotonous the long-term contract, two conditions must be
likelihood property. met. First, the optimal sequence of single-period
contracts should have memory. That is why, when
Extensions of Moral Hazard Models the reservation utility is invariant (is not history
The basic moral hazard setup, with a principal dependent), the optimal sequence of short-term
hiring and an agent performing effort, has been contracts will not replicate the long-term optimum
extended in several directions to take into account unless there exist means of smoothing consump-
more complex relationships. tion, that is, the agent has access to credit markets.
Second, the long-term contract must be renegotia-
Repeated Moral Hazard tion proof. A contract is said to be renegotiation
Certain relationships in which a moral hazard proof if at the beginning of any intermediate
problem occurs do not take place only once, but period, no new contract or renegotiation that
they are repeated over time (e.g., work relation- would be preferred by all participants is possible.
ships, insurance, etc.). The duration aspect (the When the long-term contract is not renegotiation
repetition) of the relationship gives rise to new proof (i.e., if it is not possible for participants to
elements that are absent in static models. change the clauses of the contract at a certain
294 Principal-Agent Models
moment of time even if they agree), it cannot Itoh 1990; Macho-Stadler and Pérez-Castrillo
coincide with the sequence of short-term contracts. 1993).
Another principal’s decision when she hires
several agents is the organization with which she
One Principal and Several Agents will relate. This includes such fundamental deci-
When a principal contracts with more than on agent, sions as how many agents to contract and how
the stage where agents exert their effort, which is should they be structured. These issues have been
translated into the incentive compatibility, depends studied by Demski and Sappington (1986),
on the game among the agents. If the agents behave Melumad and Reichelstein (1987), and Macho-
as a coordinated and cooperating group, then the Stadler and Pérez-Castrillo (1998).
problem is similar to the previous one where the Holmström and Milgrom (1991) analyze a sit-
principal hires a team. A more interesting case uation in which the agent carries out several tasks,
appears when agents play a noncooperative game each one of which gives rise to a different result.
and their strategies form a Nash equilibrium. They study the optimal contract when tasks are
Holmström (1979) and Mookherjee (1984), in complementary (in the sense that exerting effort in
models where there is personalized information one reduces the costs of the other) or substitutes.
about the output of each agent, show that the Their model allows to build a theory of job design
principal is interested in paying each agent and to explain the relationship among responsi-
according to his own production and that of the bility and authority.
other agents if these other results can inform on
the actions of the agent at hand. Only if the Several Principals and One Agent
results of the other agents do not add information When one agent works for (or signs his contracts
or, in other words, if an agent’s result is a suffi- with) several principals simultaneously (common
cient statistic for his effort, then he will be paid agency situation), in general, the principals are
according to his own result. better off if they cooperate. When the principals
When the only verifiable outcome is the final are not able to achieve the coordination and com-
result of teamwork (joint production models), the mitment necessary to act as a single individual and
optimal contract can only depend on this informa- they do not value the results in the same way, they
tion, and the conclusions are similar to those each demand different efforts or actions from the
obtained in models with only one agent. Alchian agent. Bernheim and Whinston (1986) show that
and Demsetz (1972) and Holmström (1982) show the effort that principals obtain when they do not
that joint production cannot lead to efficiency cooperate is less than the effort that would maxi-
when all the income is distributed among the mize their collective profits. However, the final
agents, i.e., if the budget constraint always contract that is offered to the agent minimizes the
binds. Another player should be contracted to cost of getting the agent to choose the contractual
control the productive agents and act as the resid- effort.
ual claimant of the relationship.
Tirole (1986) and Laffont (1990) have studied
the effect of coalitions among the agents in an Adverse Selection
organization on their payment scheme. If collu-
sion is bad for the organization, it adds another Basic Adverse Selection Model
dimension of moral hazard (the colluding behav- Adverse selection is the term used to refer to
ior). The principal may be obliged to apply rules problems of asymmetric information that appear
that are collusion proof, which implies more con- before the contract is signed. The classic example
straints and simpler contracts (more bureaucratic). of Akerlof (1970) illustrates very well the issue:
When coordination can improve the input of a the buyer of a used car has much less information
group of agents, the optimal contract has to find about the state of the vehicle than the seller. Sim-
payment methods that strengthen group work (see ilarly, the buyer of a product knows how much he
appreciates the quality, while the seller only has statistical information about a typical buyer's taste (Mussa and Rosen 1978), or the regulated firm has more accurate information about the marginal cost of production than the regulator.

A convenient way to model adverse selection problems is to consider that the agent can be of different types and that the agent knows his type before any contract is signed while the principal does not know it. In the previous examples, the agent's type is the particular quality of the used car, the level of appreciation of quality, or the firm's marginal cost. How can the principal deal with this informational problem? Instead of offering just one contract for every (or several) type of agent, she can propose several contracts so that each type of agent chooses the one that is best for him. A useful result in this literature is the revelation principle (Gibbard 1973; Green and Laffont 1977; Myerson 1979), which states that any mechanism that the principal can design is equivalent to a direct revelation mechanism by which the agent is asked to reveal his type and a contract is offered according to his declaration. That is, a direct revelation mechanism offers a menu of contracts to the agent (one contract for each possible type), and the agent can choose any of the proposed contracts. Clearly, the mechanism must give the agent the right incentives to choose the appropriate contract; that is, it must be a self-selection mechanism. Menus of contracts are not unusual. For instance, insurance companies offer several possible insurance contracts among which clients may freely choose their most preferred. For example, car insurance contracts can be with or without deductible clauses. The second goes to more risk-averse or more frequent drivers, while deductibles attract less risk-averse or less frequent drivers.

Therefore, the timing of an adverse selection game is the following. In the first place, the agent's characteristics (his "type") are realized, and only the agent learns them. Then, the principal decides the menu of contracts to offer to the agent. Having received the proposal, the agent decides which one of the contracts (if any) to accept. Finally, if one contract has been accepted, the agent chooses the predetermined effort and receives the corresponding payment.

A simple model of adverse selection is the following. A risk-neutral principal contracts an agent (who could be risk neutral or risk averse) to carry out some verifiable effort on her behalf. Effort e provides an expected payment to the principal of P(e), with P'(e) > 0 and P''(e) < 0. The agent could be either of two types that differ with respect to the disutility of effort, which is v(e) for type G (good) and kv(e), with k > 1, for type B (bad). Hence, the agent's utility function is either U^G(w, e) = u(w) - v(e) or U^B(w, e) = u(w) - kv(e). The principal considers that the probability that an agent is of type G is q, where 0 < q < 1.

The principal designs a menu of contracts {(e_G, w_G), (e_B, w_B)}, where (e_G, w_G) is directed toward the most efficient type of agent, while (e_B, w_B) is intended for the least efficient type. For the menu of contracts to be a sensible proposal, the agent must be better off by truthfully revealing his type than by deceiving the principal. The principal's problem is therefore to maximize her expected profits subject to the restrictions that (a) after considering the contracts offered, the agent decides to sign with the principal (participation constraints) and (b) each agent chooses the contract designed for his particular type (incentive compatibility constraints):

\max_{\{(e_G, w_G), (e_B, w_B)\}} \; q\,[P(e_G) - w_G] + (1 - q)\,[P(e_B) - w_B]

\text{s.t.} \quad u(w_G) - v(e_G) \ge \underline{U}, \qquad u(w_B) - k\,v(e_B) \ge \underline{U},

\qquad u(w_G) - v(e_G) \ge u(w_B) - v(e_B), \qquad u(w_B) - k\,v(e_B) \ge u(w_G) - k\,v(e_G),

where \underline{U} denotes the agent's reservation utility.

The main characteristics of the optimal contract {(e_G, w_G), (e_B, w_B)} are the following:

1. The contract offered to the good agent (e_G, w_G) is efficient (no distortion at the top). The optimal salary w_G, however, is higher than under symmetric information: this type of agent receives an informational rent. That is, the most efficient agent profits from his private information, and in order to reveal this
information, he has to receive a utility greater than his reservation level.
2. The participation condition binds for the agent when he has the highest costs (he just receives his reservation utility). Moreover, a distortion is introduced into the efficiency condition for this type of agent. By distorting, the principal loses efficiency with respect to type-B agents, but she pays less informational rent to the G types.

Principals Competing for Agents in Adverse Selection Frameworks
Starting with the pioneering work by Rothschild and Stiglitz (1976) on insurance markets, there have been many studies on markets with adverse selection problems where there is competition among principals to attract agents. We move from a model where one principal maximizes her profits subject to the above constraints to a game-theoretic environment where each principal has to take into account the actions of others when deciding which contract to offer. In this case, the adverse selection problem may be so severe that we may find ourselves in situations in which no equilibrium exists.

To highlight the main results in this type of model, consider a simple case in which there are two possible risk-averse agent types: good (G) and bad (B), with G being more productive than B. In particular, we assume that G is more careful than B, in the sense that he commits fewer errors. When the agent exerts effort, the result could be either a success (S) or a failure (F). The probability that it is successful is p_G when the agent is type G and p_B when he is type B, where p_G > p_B. The principal values a successful result more than a failure. The result is observable, so that the principal can pay the agent according to the result, if she so desires.

There are several risk-neutral principals. Therefore, we look for the set of equilibrium contracts in the game played by principals competing to attract agents. Equilibrium contracts must satisfy that there does not exist a principal who can offer a different contract that would be preferred by all or some of the agents and that gives that principal greater expected profits. This is why, if information were symmetric, the equilibrium contracts would be characterized by the following properties: (i) principals' expected profits are zero, and (ii) each contract must be efficient. Hence the agent receives a fixed contract insuring him against random events. In particular, the equilibrium salary that the agent receives under symmetric information is higher when he is of type G than when he is of type B.

When the principals cannot observe the type of the agent, the previous contracts can no longer be an equilibrium: all the agents would claim to be of the good type. An equilibrium contract pair {C_G, C_B} must satisfy the condition that no principal can add a new contract that would yield positive expected profits from the agents who prefer this new contract to C_G and C_B. If the equilibrium contracts for the two agent types turn out to be the same, that is, there is only one contract that is accepted by both agent types, then the equilibrium is said to be pooling. On the other hand, when there is a different equilibrium contract for each agent type, we have a separating equilibrium. In fact, pooling equilibria never exist, since pooling contracts always give room for a principal to propose a profitable contract that would only be accepted by the G types (the best agents). If an equilibrium does exist, it must be such that each type of agent is offered a different contract.

If the probability that the agent is "good" is large enough, then a separating equilibrium does not exist either. That is, an adverse selection problem in a market may provoke the absence of any equilibrium in that market. When separating equilibria do exist, the results are similar to the ones under moral hazard, in spite of the differences in the type of asymmetric information and in the method of solving. That is, contingent payoffs are offered to the best agents to allow the principal to separate them from the less efficient ones. In this equilibrium, the least efficient agents obtain the same expected utility (and even sign the same contract) as under symmetric information, while the best agents lose expected utility due to the asymmetric information.
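The screening program stated above has no closed-form solution in general, but for concrete functional forms its main properties can be checked numerically. The following sketch is not part of the original model: the functional forms P(e) = 2*sqrt(e), v(e) = e^2/2, a risk-neutral agent u(w) = w, k = 2, q = 1/2 and a zero reservation utility are illustrative assumptions. Under these choices the solver reproduces the two properties listed above: no distortion at the top and a downward-distorted effort for the B type.

```python
# Numerical sketch of the two-type screening problem (illustrative assumptions,
# not part of the original text): risk-neutral agent u(w) = w, P(e) = 2*sqrt(e),
# v(e) = e**2 / 2, k = 2, q = 0.5, reservation utility U_bar = 0.
import numpy as np
from scipy.optimize import minimize

q, k, U_bar = 0.5, 2.0, 0.0
P = lambda e: 2.0 * np.sqrt(e)
v = lambda e: 0.5 * e ** 2

def neg_expected_profit(z):
    eG, wG, eB, wB = z
    return -(q * (P(eG) - wG) + (1 - q) * (P(eB) - wB))  # minimize the negative

constraints = [
    {"type": "ineq", "fun": lambda z: z[1] - v(z[0]) - U_bar},               # participation, type G
    {"type": "ineq", "fun": lambda z: z[3] - k * v(z[2]) - U_bar},           # participation, type B
    {"type": "ineq", "fun": lambda z: (z[1] - v(z[0])) - (z[3] - v(z[2]))},  # incentive compatibility, G
    {"type": "ineq", "fun": lambda z: (z[3] - k * v(z[2])) - (z[1] - k * v(z[0]))},  # incentive compatibility, B
]

res = minimize(neg_expected_profit, x0=[0.5, 0.5, 0.5, 0.5],
               bounds=[(1e-6, None)] * 4, constraints=constraints, method="SLSQP")
eG, wG, eB, wB = res.x
print("type G contract:", eG, wG, "   type B contract:", eB, wB)
# Expected pattern: e_G equals the first-best effort for type G (no distortion
# at the top), e_B is distorted downward, and type B's participation binds.
```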
if there was no asymmetric information. Starting firms have private information related to their
with Maskin and Riley (1989), several authors costs. They show that structures that concentrate
have also analyzed auctions of multiple units. all tasks to a single agent are superior, since the
Finally, Clarke (1971) and Groves (1973) ini- incentives to dishonestly reveal the costs of each
tiated another group of models in which the prin- of the phases are weaker. Da-Rocha-Alvarez and
cipal contracts with several agents simultaneously De-Frutos (1999) argue that the absolute advan-
but does not attempt to maximize her own profits. tage of the centralized hierarchy is not maintained
This is the case of the provision of a public good if the differences in costs between the different
through a mechanism provided by a benevolent phases are sufficiently important.
regulator.
Several Principals
Relationships with Several Agents: Other Models Stole (1991) and Martimort (1996) point out the
and Organizational Design difficulty of extending the revelation principle to
Adverse selection models have attempted to ana- situations where an agent with private information
lyze the optimal task assignment, the advantages is contracted by several principals who act sepa-
of delegation, or the optimal structure of contrac- rately. Given that not only one contract (or menu
tual relationships, when the principal contracts of contracts) is offered to the agent, but several
with several agents. Riordan and Sappington contracts coming from different principals, it is
(1987) analyze a situation where two tasks have no longer necessarily true that the best a principal
to be fulfilled and show that if the person in charge can do is to offer a “truth-telling mechanism.”
of each task has private information about the Consider a situation with two principals that
costs associated with the task, then the assignment are hiring a single agent. If we accept that agent’s
of tasks within the organization is an important messages are restricted to the set of possible types
decision. For example, when the costs are posi- that the agent may have, we can obtain some
tively correlated, then the principal will prefer to conclusions. If the activities or efforts that the
take charge of one of the phases herself, while she agent carries out for the two principals are sub-
will prefer to delegate the task when the costs are stitutes (e.g., a firm producing for two different
negatively correlated. customers), then the usual result on the distortion
In a very general framework, Myerson (1982) of the decision holds: the most efficient type of
shows a powerful result: in adverse selection sit- agent supplies the efficient level of effort, while
uations, centralization cannot be worse than the effort demanded from the least efficient type is
decentralization, since it is always possible to distorted. However, due to the lack of cooperation
replicate a decentralized contract with a central- between principals, the distortion induced in the
ized one. This result is really a generalization of effort demanded from the less efficient type of
the revelation principle. Baron and Besanko agent is lower than the one maximizing the prin-
(1992) and Melumad et al. (1995) show that if cipals’ aggregate profits. On the other hand, if the
the principal can offer complex contracts in a activities that the agent carries out for the princi-
decentralized organization, then a decentralized pals are complementary (e.g., the firm produces a
structure can replicate a centralized organization. final good that requires two complementary inter-
When there are problems of communication mediate goods in the production process), then the
between principal and agents, the equivalence comparison of the results under cooperation and
result does not hold: Melumad and Reichelstein under no cooperation between the principals
(1987) show that delegation of authority can be reveals that if a principal reduces the effort
preferable if communication between the princi- demanded from the agent, in the second case,
pal and the agents is difficult. Still concerning the this would imply that it is also profitable for the
optimal design of the organization, Dana (1993) other principal to do the same. Therefore, the
analyzes the optimal hierarchical structure in distortion in decisions is greater to that produced
industries with several productive phases, when in the case in which principals cooperate.
Models of Moral Hazard and Adverse Selection hazard and adverse selection, where there is only
The analysis of principal-agent models where one dimension in which information is
there are simultaneously elements of moral hazard asymmetric. A great deal of effort is devoted to
and adverse selection is a complex extension of try to ascertain whether it is moral hazard, or
classic agency theory. Conclusions can be adverse selection, or both prevalent in the market.
obtained only in particular scenarios. One com- This is a difficult task because both adverse selec-
mon class of models considers situations where tion and moral hazard generate the same predic-
the principal cannot distinguish the part tions in a cross section. For instance, a positive
corresponding to effort from the part correlation between insurance coverage and prob-
corresponding to the agent’s efficiency character- ability of accident can be due to either the intrin-
istic because both variables determine the produc- sically riskier drivers selecting into contracts with
tion level. Picard (1987) and Guesnerie et al. better coverage (as the (Rothschild and Stiglitz
(1989) propose a model with risk-neutral partici- 1976) model of adverse selection will predict) or
pants and show that if the effort demanded from to drivers with better coverage exerting less effort
the different agents is not decreasing in character- to drive carefully (as the canonical moral hazard
istic (if a higher value of this parameter implies model will predict). Chiappori et al. (2006) have
greater efficiency), then the optimal contract is a shown that the positive correlation between cov-
menu of distortionary deductibles designed to erage and risk holds more generally in the canon-
separate the agents. The menu of contracts ical models as long as the competitive assumption
includes one where the principal sells the firm to is maintained.
the agent (aiming at the most efficient type) and Future empirical approaches are likely to
another contract where she sells only a part of the incorporate market power (as in Chiappori et al.
production at a lower price (aiming at the least erage and risk holds more generally in the canon-
efficient type). However, there are also cases mation (as in Finkelstein and McGarry 2006), as
where fines are needed to induce the agents to well as different measures of asymmetric informa-
honestly reveal their characteristic. tion (as in Vera-Hernandez 2003). These advances
In fact, the main message of the previous liter- will be partly possible thanks to richer surveys
ature is that the optimal solution for problems that which collect subjective information regarding
mix adverse selection and moral hazard does not agents’ attributes usually unobserved by princi-
imply efficiency losses with respect to the pure pals or agent’s subjective probability distribu-
adverse selection solution when the agent’s effort tions. The wider availability of panel data will
is observable. However, in other frameworks (see mean that it will become easier to disentangle
Laffont and Tirole 1986), a true problem of asym- moral hazard from adverse selection (as in
metric information appears only when both prob- Abbring et al. 2003). Much is to be learned by
lems are mixed when, and efficiency losses are using field experiments that allow randomly vary-
evident. Therefore, the same solution as when ing contract characteristics offered to individuals
only the agent’s characteristic is private informa- and hence disentangling moral hazard from
tion cannot be achieved. adverse selection (as in Karlan and Zinman 2009).
empirical side, Ackerberg and Botticini (2002) Bolton G, Ockenfels A (2000) ERC: a theory of
find strong evidence for endogenous matching equity, reciprocity and competition. Am Econ Rev
90:166–193
between landlords and tenants and that risk shar- Chiappori PA, Macho-Stadler I (1990) Contrats de Travail
ing is an important determinant of contract choice. Répétés: Le Rôle de la Mémoire. Ann Econ Stat
Future research will extend the general equi- 17:4770
librium analysis of principal-agent contracts to Chiappori PA, Salanié B (2003) Testing contract theory: a
survey of some recent work. In: Dewatripont H,
other markets. In addition, the literature has only Turnovsky (eds) Advances in economics and econo-
studied one-to-one matching models. This should metrics, vol 1. Cambridge University Press, Cam-
be extended to situations where each principal can bridge, pp 115–149
hire several agents or where each agent deals with Chiappori PA, Macho-Stadler I, Rey P, Salanié B (1994)
Repeated moral hazard: the role of memory, commit-
several principals. The interplay between ment and access to credit markets. Eur Econ Rev
(external) market competition and (internal) col- 38:1527–1553
laboration between agents or principals can pro- Chiappori PA, Jullien B, Salanié B, Salanié F (2006)
vide useful insights about the characteristics of Asymmetric information in insurance: general testable
implications. Rand J Econ 37:783–798
optimal contracts in complex environments. Clarke E (1971) Multipart pricing of public goods. Public
Choice 11:17–33
Dam K, Pérez-Castrillo D (2006) The principal-agent
Bibliography matching market. Frontiers Econ Theory Berkeley
Electr 2(1):1–34
Dana JD (1993) The organization and scope of agents:
Primary Literature regulating multiproduct industries. J Econ Theory
Abbring J, Chiappori PA, Heckman JJ, Pinquet J (2003) 59:288–310
Adverse selection and moral hazard in insurance: can Da-Rocha-Alvarez JM, De-Frutos MA (1999) A note on
dynamic data help to distinguish? J Eur Econ Assoc the optimal structure of production. J Econ Theory
1:512–521 89:234–246
Ackerberg DA, Botticini M (2002) Endogenous matching Demski JS, Sappington D (1986) Line-item reporting,
and the empirical determinants of contract form. J Pol factor acquisition and subcontracting. J Account Res
Econ 110:564–592 24:250–269
Akerlof G (1970) The market for ‘Lemons’: qualitative Desiraju R, Sappington D (2007) Equity and adverse selec-
uncertainty and the market mechanism. Q J Econ tion. J Econ Manag Strategy 16:285–318
89:488–500 Dufwenberg M, Kirchsteiger G (2004) A theory of sequen-
Akerlof G (1982) Labor contracts as a partial gift tial reciprocity. Games Econ Behav 47:268–298
exchange. Q J Econ 97:543–569 Fehr E, Schmidt K (1999) A theory of fairness, competition
Alchian A, Demsetz H (1972) Production, information and cooperation. Q J Econ 114:817–868
costs, and economic organization. Am Econ Rev Fehr E, Kirchsteiger G, Riedl A (1993) Does fairness
62:777–795 prevent market clearing? Q J Econ 108:437–460
Alonso-Paulí E, Pérez-Castrillo D (2012) Codes of best Finkelstein A, McGarry K (2006) Multiple dimensions of
practice in competitive markets. Econ Theory private information: evidence from the long-term care
49(1):113–141 insurance market. Am Econ Rev 96:938–958
Arrow K (1985) The economics of agency. In: Pratt J, Freixas X, Guesnerie R, Tirole J (1985) Planning under
Zeckhauser R (eds) Principals and agents: the structure information and the ratchet effect. Rev Econ Stud
of business. Harvard University Press, Boston 52:173–192
Baron D, Besanko D (1984) Regulation and information in Fudenberg D, Holmström B, Milgrom B (1990) Short-term
a continuing relationship. Inf Econ Pol 1:267–302 contracts and long-term agency relationships. J Econ
Baron D, Besanko D (1987) Commitment and fairness in a Theory 51:1–31
dynamic regulatory relationship. Rev Econ Stud Gibbard A (1973) Manipulation for voting schemes.
54:413–436 Econometrica 41:587–601
Baron D, Besanko D (1992) Information, control and orga- Green JR, Laffont JJ (1977) Characterization of satisfac-
nizational structure. J Econ Manag Strategy tory mechanisms for the revelation of preferences for
1(2):237–275 public goods. Econometrica 45:427–438
Baron D, Myerson R (1982) Regulating a monopoly with Grossman SJ, Hart OD (1983) An analysis of the principal-
unknown costs. Econometrica 50:911–930 agent problem. Econometrica 51:7–45
Bernheim BD, Whinston MD (1986) Common agency. Groves T (1973) Incentives in teams. Econometrica
Econometrica 54:923–942 41:617–631
Bewley T (1999) Why rewards don’t fall during a reces- Grund C, Sliwka D (2005) Envy and compassion in tour-
sion. Harvard University Press, Cambridge naments. J Econ Manag Strateg 14:187–207
Guesnerie R, Laffont JJ (1984) A complete solution to a Martimort D (1996) Exclusive dealing, common agency
class of principal-agent problems with application to and multiprincipals incentive theory. Rand J Econ
the control of a self-managed firm. J Public Econ 27:1–31
25:329–369 Maskin E, Riley J (1989) Optimal multi-unit auctions. In:
Guesnerie R, Picard P, Rey P (1989) Adverse selection and Hahn F (ed) The economics of missing markets, infor-
moral hazard with risk neutral agents. Eur Econ Rev mation, and games. Oxford University Press, Oxford,
33:807–823 pp 312–335
Harris M, Raviv A (1978) Some results on incentive con- McAfee P, McMillan J, Reny P (1989) Extracting the
tracts with applications to education and employment, surplus in the common value auction. Econometrica
health insurance and law enforcement. Am Econ Rev 5:1451–1459
68:20–30 Melumad N, Reichelstein S (1987) Centralization vs dele-
Harris M, Raviv A (1979) Optimal incentive contracts with gation and the value of communication. J Account Res
imperfect information. J Econ Theory 2:231–259 25:1–18
Holmström B (1979) Moral hazard and observability. Bell Melumad N, Mookherjee D, Reichestein S (1995) Hierar-
J Econ 10:74–91 chical decentralization of incentive contracts. Rand
Holmström B (1982) Moral hazard in teams. Bell J Econ J Econ 26:654–672
13:324–340 Milgrom P, Weber RJ (1982) A theory of auctions and
Holmström B, Milgrom P (1991) Multitask principal-agent competitive bidding. Econometrica 50:1089–1122
analysis: incentive contracts, assets ownership, and job Mirrlees J (1971) An exploration in the theory of optimum
design. J Law Econ Organ 7(Suppl):24–52 income taxation. Rev Econ Stud 38:175–208
Huck S, Rey-Biel P (2006) Endogenous leadership in Mirrlees J (1975) The theory of moral hazard and
teams. J Inst Theor Econ 162:1–9 unobservable behavior, part I. WP Nuffield College,
Itoh H (1990) Incentives to help in multi-agent situations. Oxford
Econometrica 59:611–636 Mookherjee D (1984) Optimal incentive schemes with
Itoh H (2004) Moral hazard and other-regarding prefer- many agents. Rev Econ Stud 51:433–446
ences. Jpn Econ Rev 55:18–45 Mussa M, Rosen S (1978) Monopoly and product quality.
Jensen M, Meckling W (1976) The theory of the firm, J Econ Theory 18:301–317
managerial behavior, agency costs and ownership Myerson R (1979) Incentive compatibility and the
structure. J Finan Econ 3:305–360 bargaining problem. Econometrica 47:61–73
Karlan D, Zinman J (2009) Observing unobservables: Myerson R (1981) Optimal auction design. Math Oper Res
identifying information asymmetries with a con- 6:58–73
sumer credit field experiment. Econometrica Myerson R (1982) Optimal coordination mechanisms in
77:1993–2008 generalized principal-agent models. J Math Econ
Klemperer P (2004) Auctions: theory and practice. 10:67–81
Princeton University Press, Princeton Pauly MV (1968) The economics of moral hazard. Am
Laffont JJ (1990) Analysis of hidden gaming in a three Econ Rev 58:531–537
levels hierarchy. J Law Econ Organ 6:301–324 Pauly MV (1974) Overinsurance and public provision of
Laffont JJ, Tirole J (1986) Using cost observation to regu- insurance: the roles of moral hazard and adverse selec-
late firms. J Polit Econ 94:614–641 tion. Q J Econ 88:44–62
Laffont JJ, Tirole J (1987) Comparative statics of the Picard P (1987) On the design of incentive schemes under
optimal dynamic incentive contract. Eur Econ Rev moral hazard and adverse selection. J Public Econ
31:901–926 33:305–331
Laffont JJ, Tirole J (1988) The dynamics of incentive Rabin M (1993) Incorporating fairness into game theory
contracts. Econometrica 56:1153–1176 and economics. Am Econ Rev 83:1281–1302
Laffont JJ, Tirole J (1990) Adverse selection and renegoti- Radner R (1981) Monitoring cooperative agreements in a
ation in procurement. Rev Econ Stud 57:597–626 repeated principal-agent relationship. Econometrica
Lambert R (1983) Long term contracts and moral hazard. 49:1127–1148
Bell J Econ 14:441–452 Rey P, Salanié B (1990) Long term, short term and rene-
Lazear E (1995) Personnel economics. MIT Press, gotiation. Econometrica 58:597–619
Cambridge Rey-Biel P (2008) Inequity aversion and team incentives.
Macho-Stadler I, Pérez-Castrillo D (1993) Moral hazard ELSE WP Scand J Econ 110:297–320
with several agents: the gains from cooperation. Int Riordan MH, Sappington DE (1987) Information, incen-
J Ind Organ 11:73–100 tives, and the organizational mode. Q J Econ
Macho-Stadler I, Pérez-Castrillo D (1998) Centralized and 102:243–263
decentralized contracts in a moral hazard environment. Rogerson W (1985) Repeated moral hazard. Econometrica
J Ind Econ 46:489–510 53:69–76
Malcomson JM, Spinnewyn F (1988) The multiperiod Ross SA (1973) The economic theory of agency: the prin-
principal-agent problem. Rev Econ Stud 55:391–408 cipal’s problem. Am Econ Rev 63:134–139
Roth AE, Sotomayor M (1990) Two-sided matching: a Vickrey W (1961) Counterspeculation, auctions and com-
study in game-theoretic modeling and analysis. Cam- petitive sealed tenders. J Finan 16:8–37
bridge University Press, New York Vickrey W (1962) Auction and bidding games. In: Recent
Rothschild M, Stiglitz J (1976) Equilibrium in competitive advances in game theory. The Princeton University Con-
insurance markets: an essay in the economics of imper- ference Proceedings, Princeton, New Jersey, pp 15–27
fect information. Q J Econ 90:629–650
Rubinstein A, Yaari ME (1983) Repeated insurance con-
tracts and moral hazard. J Econ Theory 30:74–97 Books and Reviews
Serfes K (2008) Endogenous matching in a market with Hart O, Holmström B (1987) The theory of contracts. In:
heterogeneous principals and agents. Int J Game The- Bewley T (ed) Advances in economic theory, fifth
ory 36:587–619 world congress. Cambridge University Press,
Shapley LS, Shubik M (1972) The assignment game I: the Cambridge
core. Int J Game Theory 1:111–130 Hirshleifer J, Riley JG (1992) The analytics of uncertainty
Spence M (1974) Market signaling. Harvard University and information. Cambridge University Press,
Press, Cambridge Cambridge
Spence M, Zeckhauser R (1971) Insurance information, Laffont JJ, Martimort D (2002) The theory of incentives:
and individual action. Am Econ Rev 61:380–387 the principal-agent model. Princeton University Press,
Stole L (1991) Mechanism design under common agency. Princeton
WP MIT, Cambridge Laffont JJ, Tirole J (1993) A theory of incentives in pro-
Tirole J (1986) Hierarchies and bureaucracies: on the role curement and regulation. MIT Press, Cambridge
of collusion in organizations. J Law Econ Organ Macho-Stadler I, Pérez-Castrillo D (1997) An introduction
2:181–214 to the economics of information: incentives and con-
Vera-Hernandez M (2003) Structural estimation of a tracts. Oxford University Press, Oxford
principal-agent model: moral hazard in medical insur- Milgrom P, Roberts J (1992) Economics, organization and
ance. Rand J Econ 34:670–693 management. Prentice-Hall, Englewood Cliffs
Differential Games

Marc Quincampoix
Laboratoire de Mathématiques de Bretagne Atlantique (LMBA), Université de Brest, Brest, France

Article Outline

Glossary
Definition of the Subject and Its Importance
Introduction
Qualitative and Quantitative Differential Games
Existence of a Value for Zero Sum Differential Games
Nonantagonist Differential Games
Stochastic Differential Games
Differential Games with Incomplete Information
Miscellaneous
Bibliography

Glossary

Dynamics This is the law which governs the evolution of the system: for differential games it is a differential equation.
Strategies This is the way a player chooses his control as a function of the state of the system and of the actions of his opponents.
Information This is the set of parameters known by the player in order to build his strategy.

Definition of the Subject and Its Importance

Differential games can be viewed in two ways, which come mainly from the field of applications. Firstly, they can be considered as games where time is continuous. This aspect is often relevant for applications in economics or management sciences. Secondly, they can also be viewed as control problems with several controllers having different objectives; in this way, differential games are a part of control theory with conflicts between the players. This second aspect concerns the classical applications of control theory: the engineering sciences.

The importance of the subject was emphasized by J. Von Neumann in 1946 in his pioneering book "Theory of Games and Economic Behaviour" (Von Neumann and Morgenstern 1946): "We repeat most emphatically that our theory is thoroughly static. A dynamic theory would unquestionably be more complete and therefore preferable." But at that date, the main efforts were devoted to the static aspects of game theory. The true birth of the domain was pursuit differential games (motivated by military applications during the "Cold War"), developed in the 1950s concurrently by R. Isaacs in the USA and by L. Pontryagin in the Soviet Union (Pontryagin 1968). Differential games now have a wide range of applications, from economics to the engineering sciences but also, more recently, to biology, behavioral ecology and population dynamics. The present article is focused on two-player zero-sum and antagonist differential games.

Introduction
Consider the classical pursuit game between a Lion and a Man: the Man, at position y, moves with maximal speed M, and the Lion, at position z, moves with maximal speed L (here L > M). The objective of the Lion is to catch the Man as soon as possible; the aim of the Man is to escape from the Lion as long as possible. In this very intuitive game, we wish to introduce the important elements needed for defining a differential game: the dynamics, the actions of the players, the objectives, and the rules of the game (the strategies). The dynamics is

y'(t) = u(t), \quad u(t) \in U, \qquad z'(t) = v(t), \quad v(t) \in V, \qquad t \ge 0, \qquad (1)

where u(t) is the action chosen by the first player (the Man) and belongs to a set U = \{u : \|u\| \le M\}; similarly, v(t) belongs to the set V = \{v : \|v\| \le L\}. The objectives of the two players are of antagonist nature: the Lion wants to minimize the time before capture while the Man wants to maximize it. The rule of the game is the way the two players choose their actions: here we can assume that they choose their instantaneous velocity as a function of the current positions,

u(t) = u(y(t), z(t)), \qquad v(t) = v(y(t), z(t)). \qquad (2)

Clearly the time of capture T is a function of the strategies u and v; the Lion wants to minimize this function, the Man wants to maximize it. Solving the game means finding the optimal value, namely, the time of capture which is minimal over all strategies v and maximal over all strategies u. In the elementary Lion and Man example, one can easily check that

v(y, z) = L \,\frac{y - z}{\|y - z\|}, \qquad u(y, z) = M \,\frac{y - z}{\|y - z\|}, \qquad T = \frac{\|y(0) - z(0)\|}{L - M},

which is a rather intuitive solution: the Lion runs in the direction of the Man at his maximal possible speed L, while the Man runs in the opposite direction at his maximal speed M. We now leave this illustrative example, which has allowed us to underline the concepts of differential games that we present next.
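As a quick sanity check of this closed-form solution, one can simulate the pursuit under the feedback strategies above. In the sketch below, the speeds, initial positions, and step size are illustrative choices, not taken from the text.

```python
# Minimal simulation of the Lion and Man pursuit with the feedback strategies
# given above (speeds and initial positions are illustrative assumptions).
import numpy as np

M, L = 1.0, 1.5                      # Man's and Lion's maximal speeds (L > M)
y = np.array([0.0, 0.0])             # Man's position
z = np.array([3.0, 4.0])             # Lion's position
dt, t = 1e-3, 0.0

while np.linalg.norm(y - z) > 1e-2:
    d = (y - z) / np.linalg.norm(y - z)   # unit vector from Lion to Man
    y = y + dt * M * d                    # Man flees along d at speed M
    z = z + dt * L * d                    # Lion chases along d at speed L
    t += dt

print("simulated capture time:", t)
print("predicted ||y(0)-z(0)||/(L-M):", 5.0 / (L - M))
```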
The present article is mainly restricted to the two-player case. A state variable x is governed by a differential equation

x'(t) = f(x(t), u(t), v(t)). \qquad (3)

The players act on the state variable x by choosing two controls u and v, which are functions of the time variable t taking their values in spaces U and V. (Note that for the Lion and the Man, x = (y, z) and each player controls a part of the state: the game is said to be separated.) To make the notations clearer, the first player, who chooses u, is called Ursula, and the second player, who chooses v, is called Victor.

A classification of differential games according to the level of conflict between the players is very relevant. The case of opposite objectives of the two players corresponds to antagonist or zero-sum games, while the case of non-opposite objectives corresponds to cooperative or nonantagonist differential games.

The objectives of the game can be quantitative – players want to minimize or maximize a payoff (a function depending on the initial state and their actions, for instance the time of capture in the Lion and Man case) – or qualitative – for instance, one player wants the state to stay in a subset of the state space while the other player wants the state to reach a given target. In Isaacs' terminology, these are games of degree or games of kind. The definition of the rules of the game (the strategies) is one of the most difficult topics in differential games; this problem does not exist for static games or for elementary pursuit games like the Lion and Man. To illustrate this fact, suppose that we have solved the game with a strategy in feedback form, i.e., when controls depend on the current state (as in Eq. (2)); then we have to solve the differential equation

x'(t) = f(x(t), u(x(t)), v(x(t))),

which can have neither uniqueness nor existence of solutions, even if Eq. (3) enjoys good properties of existence and uniqueness of solutions. This point is crucial and is of a purely mathematical nature. A way to overcome this difficulty could be to impose regularity properties on the feedback strategies. In the – young – history of differential games, this was the "time of paradoxes" of the 1960s–1970s, when a specific regularity of feedback was adequate for one class of problems but led to "paradoxes" in slightly modified problems (for quantitative games, nonexistence of the value; for qualitative games, existence of an initial position starting from which the players are able neither to win nor to lose). This was solved in the 1970s by enlarging the class of feedback strategies, allowing the choice of the control to depend not only on the current state but also on past values of the control of the opponent (these are the so-called nonanticipative strategies introduced by Varaiya, Roxin, Elliot and Kalton (Varaiya 1967; Varaiya and Lin 1967; Roxin 1969; Elliot and Kalton 1972)); another class of strategies was introduced by Krasovski (Breakwell 1977).

For differential games, another important feature is to express the fact that the two players act simultaneously on the system. One way to translate this fact mathematically is to prove that the order of the actions of the players does not modify the final result. Section "Qualitative and Quantitative Differential Games" deals with this question. For instance, in the case of quantitative games like the Lion and Man, the question is whether interchanging the order of the operations "minimum" and "maximum" yields the same result. This is the problem of the existence of the value, which is of first importance for differential games. We discuss this question in section "Existence of a Value for Zero Sum Differential Games." The fifth section is devoted to cooperative quantitative games. Some other aspects and applications of differential games are evoked in section "Stochastic Differential Games." In the last section, we give short descriptions of very active domains of research, such as problems of information or impulsive differential games.

Qualitative and Quantitative Differential Games

Throughout this article, we make assumptions ensuring that, as soon as the initial position is fixed and the controls u and v are given, there exists a unique associated trajectory defined for every time (this can easily be obtained with assumptions on the function f). We also denote by X the state space to which the state variable x belongs.

A nonanticipative strategy for the first player associates to any action of the second player (to any control v) a control u of the first player such that, at any time, the value of u depends only on past values of v. It can be shown that every regular feedback strategy is such a nonanticipative strategy. So a nonanticipative strategy of the first player is a map α from V (the set of measurable controls v: [0, +∞) → V) to U (the set of measurable controls u) which has, furthermore, the nonanticipativity property. Similarly, one can define a strategy of the second player as a nonanticipative map β from U to V.

We now present a rather general description of a target game viewed as a game of kind (qualitative game) or as a game of degree. For this, we consider a given set in the state space – called the target – that Victor wants the state of the system to reach while Ursula wants to avoid it. For the Lion and Man pursuit game, the target can be taken as the set of (y, z) such that y = z.

Qualitative Target Games
The problem consists in finding the victory domains, i.e., the sets of initial positions from which one player can win. This leads to the following precise definition.

Victor's victory domain W_V is the set of initial positions x_0 (not in C) such that he can find a strategy β and a time T such that, whatever the control u chosen by Ursula, the target is reached before the time T (cf. Aubin 1991).

In fact, to make this definition mathematically correct, we have to add a small number ε and say that the trajectory reaches an ε-neighborhood of C:

W_V := \{ x_0 \notin C : \exists \beta, \exists T, \forall \varepsilon > 0, \forall u \in U, \exists t \in [0, T], \ \mathrm{dist}(x[x_0, u, \beta(u)](t), C) \le \varepsilon \}.

In the above formula, dist means the distance and t ↦ x[x_0, u, β(u)](t) denotes the trajectory associated with the control u and the strategy β.
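To make the notion of nonanticipative strategy more concrete, here is a small discretized illustration (an added sketch, not part of the original exposition): a strategy is a map from the opponent's control path to one's own control, computed using only past values, so that two control paths which coincide up to step k produce responses that also coincide up to step k.

```python
# Discretized illustration of a nonanticipative strategy (a sketch): Ursula's
# control at step k is computed only from Victor's controls at steps 0..k-1,
# never from future values of v.
from typing import Sequence, List

def ursula_strategy(v_past: Sequence[float]) -> float:
    # Example rule: react to the average of the opponent's past actions.
    return -sum(v_past) / len(v_past) if v_past else 0.0

def apply_nonanticipative(alpha, v_path: Sequence[float]) -> List[float]:
    # Build u(k) = alpha(v(0), ..., v(k-1)); changing v_path beyond step k
    # cannot change u(0..k), which is exactly the nonanticipativity property.
    return [alpha(v_path[:k]) for k in range(len(v_path))]

v_path = [1.0, -0.5, 0.25, 0.0]
print(apply_nonanticipative(ursula_strategy, v_path))
```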
In a parallel way, the victory domain W_U of Ursula is the set of initial positions x_0 for which Ursula can find a strategy α such that, whatever the control played by Victor, the associated trajectory never reaches the target C:

W_U := \{ x_0 \notin C : \exists \alpha \ \text{such that} \ \forall v \in V, \ x[x_0, \alpha(v), v](t) \notin C \ \forall t \ge 0 \}.

Before going further, it could be surprising to the reader that the game is not "symmetric," due to the presence of ε > 0. This is a mathematical issue that exceeds the scope of the present paper; for a first understanding the reader can imagine that ε = 0, and see Cardaliaguet (1996) for a deeper analysis.

The main problem of such a qualitative game is the following alternative problem. Roughly speaking, if one player does not win, the other player must win. Of course this appears to be a minimal requirement for the modelization of the problem to be correctly formulated. Nevertheless, the alternative is not an easy problem; for instance, the intuitive notion of feedback strategies is not suitable for giving a positive answer to this question. The alternative can be expressed by the fact that the sets C, W_U and W_V form a partition of the whole space X:

X = C \cup W_U \cup W_V, \qquad W_U \cap W_V = \emptyset.

The above alternative is valid under several technical conditions (for instance, C is an open set), which we do not describe here, and under the following crucial condition – called the Isaacs condition –

\min_{v \in V}\max_{u \in U} \langle f(x, u, v), p \rangle = \max_{u \in U}\min_{v \in V} \langle f(x, u, v), p \rangle \qquad (4)

for any direction p. This can be understood as the existence of a saddle point for the static game with payoff ⟨f(x, u, v), p⟩ (⟨·,·⟩ is the scalar product). The expression in Eq. (4) above is called the Hamiltonian of the game; it is a function (x, p) ↦ H(x, p).

The alternative theorem was first obtained by Krassovskii and Subbotin for a slightly different notion of strategy, and by Cardaliaguet (1996) for nonanticipative strategies. It is worth pointing out that even for the elementary Lion and Man game played in a circular "arena" (instead of the whole plane), the problem is not obvious (Flynn 1973). For more complex forms of "arenas" and for general differential games with a restricted state space, the alternative problem was only solved in 2001 (Cardaliaguet et al. 2001).

Once the alternative property is known, the second interesting step is to describe the victory domains. In several examples, Isaacs discovered that the boundaries of the domains satisfy a geometric equation called the Isaacs equation

H(x, n_x) = 0 \quad \text{for any } x \text{ on the boundary of the victory domain,} \qquad (5)

where n_x denotes the normal to the victory domain. Hypersurfaces satisfying such an equation are semipermeable barriers; one player can prevent the other player from making the state cross the barrier. From this property, it is possible to obtain much information on the winning strategies. But unfortunately, the victory domains are seldom regular enough to have a normal. Very often they have "corners," even for very elementary games. So there are two different ways to treat this question. A first approach, initiated by Isaacs himself and developed by Breakwell, Bernhard, Melykian, consists in a precise and fine geometrical analysis of the semipermeable barriers (Breakwell 1977; Bernhard 1988; Buckdahn et al. 2009a). When this approach is possible, it gives very precise information on the behavior of the players. Unfortunately, this is hardly possible in high-dimensional games and/or when the victory domain is not smooth enough. A second approach consists in proving that the boundary of the victory domain satisfies (5) in a suitable generalized sense (Quincampoix 1992; Cardaliaguet 1997) and in using this information to approximate the victory domain numerically (cf. Cardaliaguet et al. (1999) for a detailed exposition of this method).
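The Isaacs condition (4) can also be checked numerically on discretized control sets. In the sketch below, the dynamics f and the direction p are illustrative assumptions; f is separated in (u, v), a case in which the min-max and max-min always coincide.

```python
# Numerical check of the Isaacs condition (4) on discretized control sets
# (a sketch; the dynamics f and the direction p are illustrative choices).
import numpy as np

U = np.linspace(-1.0, 1.0, 41)   # discretized control set of Ursula
V = np.linspace(-1.0, 1.0, 41)   # discretized control set of Victor

def f(x, u, v):
    # Separated dynamics f(x,u,v) = g(x,u) + h(x,v), for which the Isaacs
    # condition holds for every direction p.
    return np.array([u, v - x[0]])

x, p = np.array([0.3, -0.2]), np.array([1.0, 2.0])
payoff = np.array([[np.dot(f(x, u, v), p) for u in U] for v in V])  # rows: v, cols: u

minmax = payoff.max(axis=1).min()   # min over v of max over u
maxmin = payoff.min(axis=0).max()   # max over u of min over v
print("min_v max_u =", minmax, "   max_u min_v =", maxmin)
```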
Quantitative Target Games
Here the goal of the players is of a quantitative nature: Victor wants to minimize the time to reach the target C, while Ursula wants to maximize it. For describing the game, we associate to any trajectory of the dynamics t ∈ [0, +∞) ↦ x(t) the first time x(·) reaches C:

\vartheta_C(x(\cdot)) = \inf\{ t \ge 0 \ | \ x(t) \in C \},

with, by convention, ϑ_C(x(·)) = +∞ if x(·) does not reach C.

The modelization problem of such a quantitative game is to express the fact that both players act simultaneously on the system. So, roughly speaking, we must check that if Ursula chooses her strategy before the second player, the result is the same as in the case when Victor chooses his strategy first. This means that the following value functions of the game coincide:

\vartheta^\flat(x_0) = \inf_{\beta} \sup_{u \in U} \vartheta_C(x[x_0, u, \beta(u)]) \quad \text{(lower value)},

\vartheta^\sharp(x_0) = \sup_{\alpha} \inf_{v \in V} \vartheta_C(x[x_0, \alpha(v), v]) \quad \text{(upper value)}.

When ϑ♭ = ϑ♯, one says that the game has a value. Because the question of the existence of a value in differential games is an essential question in game theory, the section "Existence of a Value for Zero Sum Differential Games" is devoted to the exposition of this feature.

It is worth pointing out that, oppositely to many static games, this quantitative differential game is not in normal form, namely, a player does not play a strategy against a strategy of his opponent. This is another main difference with static games. In fact, it is sometimes possible to write the game in normal form by allowing the nonanticipative strategies to have a small delay (Cardaliaguet and Quincampoix 2008). The normal form will be used in section "Nonantagonist Differential Games" for nonzero-sum games.

The dynamic programming principle says, roughly speaking, that the game maintains the same structure as time changes. Indeed, if starting from time 0 the players play the game until time t, then at time t both players are facing a differential game of the same nature as the initial game. This can be expressed by the dynamic programming equations

\vartheta^\flat(x_0) = \inf_{\beta} \sup_{u \in U} \big\{ t + \vartheta^\flat(x[x_0, u, \beta(u)](t)) \big\}, \qquad \vartheta^\sharp(x_0) = \sup_{\alpha} \inf_{v \in V} \big\{ t + \vartheta^\sharp(x[x_0, \alpha(v), v](t)) \big\}, \qquad (6)

which are available as soon as the trajectories do not reach the target.

Existence of a Value for Zero Sum Differential Games

Using the dynamic programming principle, it is possible to deduce an infinitesimal characterization of the value functions (formally, we subtract the two sides of one line of (6), divide by t and let t tend to 0+). Under the Isaacs condition (4), when the value functions are differentiable, it is not difficult to show that the values ϑ♯ and ϑ♭ are solutions of a partial differential equation, the following Hamilton-Jacobi-Isaacs equation:

H(x, D\vartheta(x)) = -1 \quad \text{in } \mathbb{R}^N \setminus C. \qquad (7)

This fact was noticed at the very beginning of the history of differential games. Furthermore, if the Hamilton-Jacobi equation has a continuously differentiable solution, then this solution is the value function and the game can be solved using feedback strategies (for instance, the Lion and Man game). This is the famous Isaacs verification theorem. This gives a completely satisfactory solution of the game when the value is smooth. Unfortunately, it was noticed very early that the values are in general not differentiable, and so the previous approach is not possible.

Before going further, it is worth pointing out that this is not only a mathematical question of regularity of the value functions. Indeed, the smoothness of the value function is closely related to the number of optimal strategies in the game. Most differential games, even very elementary ones, do not have smooth values. The reader can convince himself by considering the Lion and Man game in a circular arena with, furthermore, a round pillar in the center of the arena (that, of course, the players cannot cross).
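The dynamic programming relation (6) also suggests a simple numerical scheme. The sketch below is an added illustration with assumed parameters: it reduces the Lion and Man game to the scalar gap d = ||y - z||, iterates a discrete-time analogue of (6) on a grid, and recovers the minimal capture time d/(L - M).

```python
# Value iteration for the minimal capture time of the 1D "gap" reduction of the
# pursuit game (illustrative parameters): the gap d evolves as d' = a_M - a_L
# with |a_M| <= M (Man, maximizer) and |a_L| <= L (Lion, minimizer); the loop
# iterates a discrete-time analogue of the dynamic programming relation (6).
import numpy as np

M, L, dt = 1.0, 1.5, 0.02
grid = np.linspace(0.0, 5.0, 251)
theta = np.zeros_like(grid)              # initial guess of the value function
AM = np.array([-M, 0.0, M])              # discretized controls of the Man
AL = np.array([-L, 0.0, L])              # discretized controls of the Lion

for _ in range(800):
    best = None
    for al in AL:                        # Lion minimizes ...
        worst = None
        for am in AM:                    # ... the worst case over the Man's controls
            nxt = np.clip(grid + dt * (am - al), 0.0, grid[-1])
            val = np.interp(nxt, grid, theta)
            worst = val if worst is None else np.maximum(worst, val)
        best = worst if best is None else np.minimum(best, worst)
    new = np.where(grid <= 1e-9, 0.0, dt + best)   # capture at d = 0 costs nothing more
    if np.max(np.abs(new - theta)) < 1e-9:
        break
    theta = new

print("theta(2.0) ~", np.interp(2.0, grid, theta), "   exact 2/(L-M) =", 2.0 / (L - M))
```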
The linear quadratic case was extensively studied in the book (Basar and Bernhard 1995). In the fully nonlinear case, the Hamilton-Jacobi-Isaacs equation becomes infinite dimensional. An alternative approach can be obtained by setting Hamilton-Jacobi equations on the space of closed sets (Quincampoix and Veliov 2005).

Impulsive Games
Impulsive differential games are games where the players can at any time either follow a continuous dynamics of the form (3) or make a jump. These games combine the effects and difficulties of static and differential games. Only very preliminary results are available (Buckdahn et al. 2004, pp. 223–249), for pursuit games in which both players cannot "jump" simultaneously. Another interesting class of pursuit impulsive games is the case where each player can choose at every time between two dynamics. This domain is still widely open.

Nonantagonist Differential Games

This section is devoted to a two-player game on [0, T] with different objectives: Ursula wants to maximize a cost

C_1(t_0, x_0, u, v) := \int_{t_0}^{T} L_1(x(s), u(s), v(s))\,ds + g_1(x(T)),

while Victor wants to maximize a payoff

C_2(t_0, x_0, u, v) := \int_{t_0}^{T} L_2(x(s), u(s), v(s))\,ds + g_2(x(T)).

Observe that if C_2 + C_1 = 0 (i.e., L_1 + L_2 = 0 and g_1 + g_2 = 0), the game reduces to the game studied in section "Existence of a Value for Zero Sum Differential Games." Here we consider the general case where C_1 + C_2 is not necessarily equal to 0. In a nonantagonist game, it is important to play a strategy against a strategy, so we introduce the concept of nonanticipative strategies with delay on the time interval [0, T]. A nonanticipative strategy with delay for the first player associates to any action of the second player (to any control v) a control u of the first player such that there exists a delay r > 0 such that, at any time t, if two controls v_1 and v_2 coincide on [0, t], then the associated controls u_1 = α(v_1) and u_2 = α(v_2) coincide on [0, t + r]. A nonanticipative strategy with delay β for the second player is defined in a similar way. Clearly, any nonanticipative strategy with delay is nonanticipative. Moreover, it is possible to prove (Cardaliaguet and Quincampoix 2008) that for a fixed initial condition (t_0, x_0) and for any pair (α, β) of nonanticipative strategies with delay, there exists a unique pair of controls (u, v) satisfying

\alpha(v) = u, \qquad \beta(u) = v.

Hence it is possible to associate a trajectory with (α, β) – the trajectory associated with (u, v) – and we denote by C_1(t_0, x_0, α, β) and C_2(t_0, x_0, α, β) the associated costs. This enables us to write the game in normal form.

The nonantagonist differential game problem consists in finding Nash equilibria, defined as follows. Fix an initial condition (t_0, x_0). A pair of real numbers (e_1, e_2) is a Nash equilibrium payoff if and only if there exists a pair of nonanticipative strategies with delay (\bar\alpha, \bar\beta) such that

e_1 = C_1(t_0, x_0, \bar\alpha, \bar\beta), \qquad e_2 = C_2(t_0, x_0, \bar\alpha, \bar\beta),

and such that for any other pair of nonanticipative strategies with delay (α, β) the following inequalities hold true:

C_1(t_0, x_0, \alpha, \bar\beta) \le C_1(t_0, x_0, \bar\alpha, \bar\beta), \qquad C_2(t_0, x_0, \bar\alpha, \beta) \le C_2(t_0, x_0, \bar\alpha, \bar\beta).

From a completely rigorous mathematical point of view, the above definition must be understood with a small error ϵ > 0 (the correct statement is: for all ϵ > 0, there exists (\bar\alpha, \bar\beta) such that

| e_i - C_i(t_0, x_0, \bar\alpha, \bar\beta) | \le \epsilon, \quad i = 1, 2,

and for any other pair of strategies (α, β) we have

C_1(t_0, x_0, \alpha, \bar\beta) \le C_1(t_0, x_0, \bar\alpha, \bar\beta) + \epsilon, \qquad C_2(t_0, x_0, \bar\alpha, \beta) \le C_2(t_0, x_0, \bar\alpha, \bar\beta) + \epsilon);

cf. Cardaliaguet and Plasckacz (2003). Nash equilibria were studied in detail in Klejmenov (1993)
for another concept of strategy. The remaining part of this section concerns first the characterization of the Nash equilibria through zero-sum games and second the existence of Nash equilibria. For doing this, we introduce the following value functions associated with two auxiliary zero-sum games where one player is the minimizer while his opponent is the maximizer:

V_1(t_0, x_0) = \inf_{\beta} \sup_{u \in U(t_0)} C_1(t_0, x_0, u, \beta(u)), \qquad V_2(t_0, x_0) = \inf_{\alpha} \sup_{v \in V(t_0)} C_2(t_0, x_0, \alpha(v), v).

Recall that under the Isaacs condition, the values of the above auxiliary games do exist. The characterization of Nash payoffs can now be expressed: the pair (e_1, e_2) is a Nash equilibrium payoff if and only if there exist two controls (u, v) such that

e_1 = C_1(t_0, x_0, u, v), \qquad e_2 = C_2(t_0, x_0, u, v),

and for all times t ∈ [t_0, T],

e_1 \ge V_1(t, x[t_0, x_0, u, v](t)), \qquad e_2 \ge V_2(t, x[t_0, x_0, u, v](t)).

Observe that this is not an intuitive result. (Once again, an error ϵ > 0 is needed to obtain a rigorous statement: for any ϵ there exists a pair (u, v) satisfying the above inequalities up to an error ϵ.) From this result, it is possible – but not easy – to obtain the existence of a Nash equilibrium pair under the Isaacs condition.

Observe that, at first glance, it is very surprising to obtain the existence of Nash equilibria assuming the Isaacs condition, which means the existence of saddle points of static games. One could assume instead the existence of Nash equilibria of static games. Unfortunately, except for very specific games like some linear quadratic differential games, this method fails. Another way to understand this lies in the fact that Nash equilibria are very unstable (cf. Buckdahn et al. 2004, pp. 57–67). One can also try to prove the existence of equilibria in feedback strategies; this is not always possible and, even if such an equilibrium exists, it is very unstable (Bressan 2011). We refer the reader to the bibliography for noncooperative games in specific cases and applications.

Stochastic Differential Games

This part is devoted to the description of stochastic differential games, which are games with randomness in the dynamics. The dynamics is given by a stochastic differential equation

dX_t = f(X_t, u_t, v_t)\,dt + \sigma(X_t, u_t, v_t)\,dW_t, \qquad (10)

where W is a Brownian motion on a given probability space. Victor and Ursula choose the controls u and v, which are adapted processes (with respect to the filtration generated by the Brownian motion). The notions of nonanticipative strategies are similar to the deterministic case, but the strategies are nonanticipative in time almost surely. We refer the reader to Cardaliaguet and Rainer (2013) for a precise formulation of strategies. The rigorous description of the game is rather technical due to the need for stochastic analysis; the reader can refer to the nice article (Rainer 2007) for a precise formulation. We want just to stress some important aspects of stochastic zero-sum games. Of course, the notions of qualitative and quantitative games remain relevant. The qualitative aspect is not yet well studied. Oppositely, the quantitative aspect is now well studied. Consider for instance a payoff of the form

C(t_0, x_0, u, v) := \mathbb{E}\Big[ \int_{t_0}^{T} L(x(s), u(s), v(s))\,ds + g(x(T)) \Big],

where E denotes the expectation. The existence of the value was obtained in Fleming and Souganidis (1989), by adopting the same scheme as that described in section "Quantitative Target Games": the values are the unique viscosity solution of a – second order – Hamilton-Jacobi-Isaacs equation.
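For a fixed pair of feedback controls, the expected payoff above can be estimated by an Euler-Maruyama simulation of the dynamics (10). In the sketch below, the drift f, the diffusion coefficient, the costs L and g, and the feedback rules are all illustrative assumptions, not taken from the text.

```python
# Euler-Maruyama estimation of the expected payoff E[ integral of L + g(x(T)) ]
# for the stochastic dynamics (10), with illustrative f, sigma, L, g and fixed
# feedback controls for both players (a sketch, not an optimal strategy).
import numpy as np

rng = np.random.default_rng(0)
T, N, n_paths, x0 = 1.0, 200, 5000, 0.5
dt = T / N

f = lambda x, u, v: u - v - x          # drift (illustrative)
sigma = lambda x, u, v: 0.3            # diffusion coefficient (illustrative)
L = lambda x, u, v: x ** 2 + 0.1 * u ** 2 - 0.1 * v ** 2
g = lambda x: x ** 2
u_fb = lambda x: -0.5 * x              # Ursula's feedback control (illustrative)
v_fb = lambda x: 0.5 * x               # Victor's feedback control (illustrative)

x = np.full(n_paths, x0)
running = np.zeros(n_paths)
for _ in range(N):
    u, v = u_fb(x), v_fb(x)
    running += L(x, u, v) * dt
    dW = rng.normal(0.0, np.sqrt(dt), n_paths)
    x = x + f(x, u, v) * dt + sigma(x, u, v) * dW

payoff = running + g(x)
print("estimated expected payoff:", payoff.mean(),
      "+/-", 1.96 * payoff.std() / np.sqrt(n_paths))
```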
Another approach, restricted to the case where the diffusion coefficient σ in Eq. (10) is non-degenerate, is possible with a different notion of strategy and by using backward stochastic differential equations (Hamadene and Lepeltier 1995).

The nonantagonist case can be solved similarly to the zero-sum case, up to hard technicalities due to stochastic analysis (Buckdahn et al. 2004).

Differential Games with Incomplete Information

This concerns the case where the players do not have the same information during the game. This was studied in the case of (discrete) repeated games since the pioneering work (Aumann and Maschler 1995), with the help of mixed strategies (Petrosjan 2004).

We describe now a zero-sum differential game with incomplete information on the initial position. Consider the two-player quantitative game of section "Qualitative and Quantitative Differential Games," with dynamics (3) and cost (8). We suppose that the initial position belongs to a set of I possible initial positions

x_0^i, \quad i = 1, \ldots, I.

Before the game starts, the integer i is chosen randomly according to a probability p (belonging to the set Δ(I) of probabilities on {1, 2, ..., I}). The index i is communicated to Ursula only, and not to Victor.

As previously, Victor wants to minimize the cost C and Ursula wants to maximize it. Both players observe – in a nonanticipative way – the actions (the controls) played by their opponent. It is worth pointing out that Victor does not have enough information to be able to compute the current position of the game. Nevertheless, by observing his opponent's behavior, he can try to deduce from this observation his missing information. Moreover, knowing this, Ursula will try to play in such a way that she hides her information as much as possible in order to keep an advantage (the way of hiding the information is the choice of random strategies).

This allows one to define values which depend on (t, x) but also on the probability p: V♯(t, x, p) and V♭(t, x, p). Under a suitable Isaacs condition, both values coincide because they are the unique viscosity solution of the following second-order Hamilton-Jacobi-Isaacs equation:

\min\Big\{ \frac{\partial V}{\partial t}(t, X, p) + H\Big(X, \frac{\partial V}{\partial X}(t, X, p)\Big) \ ; \ \lambda_{\min}\Big(\frac{\partial^2 V}{\partial p^2}(t, X, p)\Big) \Big\} = 0 \quad \text{for all } (t, X, p) \in [0, T] \times \mathbb{R}^{NI} \times \Delta(I),

V(T, X, p) = \sum_i p_i\, g(x_i) \quad \text{for all } (X, p) \in \mathbb{R}^{NI} \times \Delta(I), \qquad (11)

where X = (x_i)_{i = 1, 2, \ldots, I}, with the Hamiltonian

H(X, \xi) := \sum_{i=1}^{I} \min_{v \in V} \max_{u \in U} \big\{ L(x_i, u, v) + \langle f(x_i, u, v), \xi_i \rangle \big\},

and where λ_min(∂²V/∂p²(t, X, p)) denotes the smallest eigenvalue of the symmetric matrix ∂²V/∂p²(t, X, p). This shows the existence of a value in mixed nonanticipative strategies (Cardaliaguet 2007). This model was also extended to incomplete information on both sides: namely, the initial position is randomly chosen among x_0^{i,j}, i = 1, 2, ..., I, j = 1, ..., J, according to a probability p ⊗ q ∈ Δ(I) × Δ(J); the index i is communicated to Ursula but not to Victor, while the index j is communicated only to Victor. The information can also concern not only the initial positions x_0^{i,j} but also the dynamics f_{i,j} or the costs. The existence of the value is obtained in these cases, and also for stochastic differential games (Cardaliaguet and Rainer 2009).
314 Differential Games
Cardaliaguet P (2007) Differential games with asymmetric Pontryagin N (1968) Linear differential games, I–II. Soviet
information. SIAM J Control Optim 46(3):816–838 Math Dokl 8(3–4):769–771, 910–913
Cardaliaguet P (2017) The convergence problem in mean Quincampoix M (1992) Differential inclusions and target
field games with local coupling. Appl Math Optim problems. SIAM J Control Optim 30(2):324–335
76(1):177–215 Quincampoix M, Veliov V (2005) Optimal control of
Cardaliaguet P, Plasckacz S (2003) Existence and uniqueness uncertain systems with incomplete information for the
of a Nash equilibrium feedback for a simple non zero- disturbance. SIAM J Control Optim 43(4):1373–1399
sum differential game. Int J Game Theory 32(4):561–593 Rainer C (2007) On two different approaches to nonzero-sum
Cardaliaguet P, Quincampoix M (2008) Determinist differ- stochastic differential games. Appl Math Optim 56:131–144
ential games under probability knowledge of initial Roxin E (1969) The axiomatic approach in differential
condition. Int Game Theory Rev 10(1):1–16 games. J Optim Theory Appl 3:153–163
Cardaliaguet P, Rainer C (2009) Cardaliaguet, Pierre; Subbotin AI (1995) Generalized solutions of first-order
Rainer, Catherine stochastic differential games with PDEs. The dynamical optimization perspective. Trans-
asymmetric information. Appl Math Optim 59(1):136 lated from the Russian. Systems & control: foundations
Cardaliaguet P, Rainer C (2013) Pathwise strategies for sto- & applications. Birkhäuser, Boston
chastic differential games. Appl Math Optim 68(1):75–84 Varaiya P (1967) The existence of solution to a differential
Cardaliaguet P, Quincampoix M, Saint-Pierre P (1999) Set- game. SIAM J Control Optim 5:153–162
valued numerical analysis for optimal control and dif- Varaiya P, Lin J (1967) Existence of saddle points i n
ferential games. In: Bardi M, Parthasaranthy T, TES differential game. SIAM J Control Optim 7(1):141–157
R (eds) Numerical methods for optimal control and Von Neumann J, Morgenstern O (1946) Theory of games
numerical games. Annals of International Society of and economic behaviour. Princeton University Press,
Dynamical Games. Birkhäuser, Boston, pp 177–249 Princeton
Cardaliaguet P, Quincampoix M, Saint-Pierre P (2001)
Pursuit differential games with state constraints.
SIAM J Control Optim 39(5):1615–1632 Books and Reviews
Cardaliaguet P, Jimenez C, Quincampoix M (2014) Pure Bardi M, Raghavan TES, Parthasarath T (eds) (1999) Sto-
and random strategies in differential game with incom- chastic and differential games. Theory and numerical
plete informations. J Dyn Games 1(3):363–375 methods. Dedicated to Prof. A. I. Subbotin. Annals of
Elliot N, Kalton N (1972) The existence of value in differ- the International Society of Dynamic Games,
ential games. Mems Am Math Soc 126 vol 4. Birkhäuser, Boston
Evans LC, Souganidis PE (1984) Differential games and Basar T, Olsder GJ (1999) Dynamic noncooperative game
representation formulas for solutions of Hamilton- theory, 2nd edn. Classics in applied mathematics,
Jacobi equations. Indiana Univ Math J 282:487–502 vol 23. SIAM, Society for Industrial and Applied Math-
Fleming W, Souganidis P (1989) On the existence of value ematics, Philadelphia
functions of two-player, zero-sum stochastic differen- Blaquière A, Gérard F, Leitman G (1969) Quantitative and
tial games. Indiana Univ Math J 38(2):293–314 qualitative games. Academic, New York
Flynn J (1973) Lion and man: the boundary constraints. Buckdahn R, Cardaliaguet P, Quincampoix M (2011)
SIAM J Control 11:397 Some recent aspects of differential game theory. Dyn
Hamadene S, Lepeltier J-P (1995) Backward equations, Games Appl 1(1):74–114
stochastic control and zero-sum stochastic differential Dockner E, Jörgensen S, Van Long N, Sorger G (2000)
games. Stoch Int J Probab Stoch Process 54:221–231 Differential games in economics and management sci-
Jimenez C, Quincampoix M (2018) Hamilton Jacobi Isaacs ence. Cambridge University Press, Cambridge
equations for differential games with asymmetric infor- Hajek O (1975) Pursuit games. Academic, New York
mation on probabilistic initial condition. J Math Anal Isaacs R (1965) Differential games. Wiley, New York
Appl 457(2):1422–1451 Jorgensen S, Quincampoix M, Vincent T (eds) (2007)
Klejmenov AF (1993) Nonantagonistic positional differ- Advances in dynamic games theory. Annals of Interna-
ential games. Nauka, Ekaterinburg tional Society of Dynamic Games. Birkhäuser, Boston
Lasry J-M, Lions P-L (2007) Mean field games. Jpn J Math Krassovski NN, Subbotin AI (1988) Game-theorical con-
2(1):229–260 trol problems. Springer, New York
Marigonda A, Quincampoix M (2018) Mayer control prob- Melikyan AA (1998) Generalized characteristics of first
lem with probabilistic uncertainty on initial positions. order PDEs. Applications in optimal control and differ-
J Differ Equ 264(5):3212–3252 ential games. Birkhäuser, Boston
Petrosjan LA (2004) Cooperation in games with incomplete Patsko VS, Turova VL (2000) Numerical study of differ-
information. In: Nonlinear analysis and convex analysis. ential games with the homicidal chauffeur dynamics.
Yokohama Publishers, Yokohama, pp 469–479 Russian Academy of Sciences, Institute of Mathemat-
Plaskacz S, Quincampoix M (2000) Discontinuous Mayer ics and Mechanics, Ekaterinburg
control problem under state-constraints. Topol Petrosyan LA (1993) Differential games of pursuit. Series
Methods Nonlinear Anal 15:91–100 on optimization, vol 2. World Scientific, Singapore
VCG mechanisms A family of mechanisms that
Mechanism Design implement in dominant strategies the social
choice function that maximizes the social
Ron Lavi welfare.
The Technion – Israel Institute of Technology,
Haifa, Israel
Definition of the Subject
Article Outline
Mechanism design is a sub-field of economics and
game theory that studies the construction of social
Glossary
mechanisms in the presence of rational but selfish
Definition of the Subject
individuals (players/agents). The nature of the
Introduction
players dictates a basic contrast between the social
Formal Model and Early Results
planner, that aims to reach a socially desirable
Quasi-Linear Utilities and the VCG Mechanism
outcome, and the players, that care only about
The Importance of the Domain’s Dimensionality
their own private utility. The underlying question
Single-Dimensional Domains
is how to incentivize the players to cooperate, in
Multi-dimensional Domains
order to reach the desirable social outcomes.
Budget Balancedness and Bayesian Mechanism
A mechanism is a game, in which each agent is
Design
required to choose one action among a set of
Interdependent Valuations
possible actions. The social designer then chooses
Future Directions
an outcome, based on the chosen actions. This
Bibliography
outcome is typically a coupling ofa physical out-
come, and a payment given to each individual.
Glossary
Mechanism design studies how to design the
mechanism such that the equilibrium behavior of
A social choice function A function that deter-
the players will lead to the socially desired goal.
mines a social choice according to players’
The theory of mechanism design has greatly
preferences over the different possible
influenced several sub-fields of micro-economics,
alternatives.
for example auction theory and contract theory,
A mechanism A game in incomplete information,
and the 2007 Nobel prize in Economics was
in which player strategies are based on their
awarded to Leonid Hurwicz, Eric Maskin, and
private preferences. A mechanism implements a
Roger Myerson “for having laid the foundations
social choice function f if the equilibrium strate-
of mechanism design theory”.
gies yield an outcome that coincides with f.
Dominant strategies An equilibrium concept
where the strategy of each player maximizes
her utility, no matter what strategies the other Introduction
players choose.
Bayesian-Nash equilibrium An equilibrium It will be useful to start with an example of a
concept that requires the strategy of each mechanism design setting, the well-known “pub-
player to maximize the expected utility of the lic project” problem (Clarke [8]): a government is
player, where the expectation is taken over the trying to decide on a certain public project (the
types of the other players. common example is “building a bridge”). The
project costs C dollars, and each player, i, will are not fully observed by each player separately.
benefit from it to an amount of vi dollars, where Rather, each player receives a signal that gives a
this number is known only to the player herself. partial indication to her valuation. Mechanism
The government desires to build the bridge if and design for such settings is discussed in section
only if ivi > C. But how should this condition be “Interdependent Valuations”.
checked? Clearly, every player has an interest in One of the most impressive applications of the
over-stating its own vi, if this report is not accom- general mechanism design literature is auction the-
panied by any payment at all, and most probably ory. An auction is a specific form of a mechanism,
agents will understate their values, if asked to pay where the outcome is simply the specific allocation
some proportional amount to the declared value. of the goods to the players, plus the prices they are
Clarke describes an elegant mechanism that required to pay. Vickrey [27] initiated the study of
solves this problem. His mechanism has the fan- auctions in a mechanism design setting, and in fact
tastic property that, from the point of view of perhaps the study of mechanisms itself. After the
every player, no matter what the other players fundamental study general mechanism design in
declare, it is always in the best interest of the the 1970s, in the 1980s the focus of the research
player to declare his true value. Thus, truthful community returned to this important application,
reporting is a dominant-strategy equilibrium of and many models were studied.
the mechanism, and under this equilibrium, the We note that there are several other entries in this
government’s goal is fully achieved. A more for- book that are strongly related to the subject of
mal treatment of this result is given in section “mechanism design”. In particular, the entry on
“Quasi-Linear Utilities and the VCG Mechanism” “▶ Game Theory, Introduction to” gives a broader
below. background on the mathematical methods and tools
Clarke’s paper, published in the early 1970s, that are used by mechanism designers, and the entry
was part of a large body of work that started to on “▶ Implementation Theory” handles similar sub-
investigate mechanism design questions. Most of jects to this entry from a different point of view.
the early works used two different assumptions
about the structure of players’ utilities. Under the
assumption that utilities are general, and that the
influence of monetary transfers on the utility are Formal Model and Early Results
not well predicted, the literature have produced
mainly impossibilities, which are described in sec- A social designer wishes to choose one possible
tion “Formal Model and Early Results”. The outcome/alternative out of a set A of possible
assumption that utilities are quasi-linear in money alternatives. There are n players, each has her
was successfully used to introduce positive and own preference order i over A. This preference
impressive results, as discussed in detail in sections order is termed the player’s “type”. The set
“Quasi-Linear Utilities and the VCG Mechanism” (domain) of all valid player preferences is denoted
and “The Importance of the Domain’s Dimension- by Vi. The designer has a social choice function
ality” These mechanisms apply the solution con- f : V1 Vn ! A, that specifies the desired
cept of dominant-strategy equilibrium, which is a alternative, given any profile of individual prefer-
strong solution concept that may prevent several ences over the alternatives. The problem is that
desirable properties from being achieved. To over- these preferences are private information of each
come its difficulties, the weaker concept of a player – the social designer does not know them,
Bayesian–Nash equilibrium is usually employed. and thus cannot simply invoke f in order to deter-
This concept, and one main possibility result that it mine the social alternative. Players are assumed to
provides, are described in section “Budget be strategic, and therefore we are in a game-
Balancedness and Bayesian Mechanism Design” theoretic situation.
The last important model that this entry covers To implement the social choice function, the
aims to capture settings where the players’ values designer constructs a “game in incomplete
Mechanism Design 319
information”, as follows. Each player is required to the alternative set contains two alternatives
choose an action out of a set of possible actions A i, (candidate 100 and candidate 200 ), and each player
and a target function g : A 1 A n ! A spec- either prefers 1 over 2, 2 over 1, or is indifferent
ifies the chosen alternative, as a function of the between the two. It turns out that the majority
players’ actions. A player’s choice of action may, voting rule is the dominant strategy
of-course, depend on her actual preference order. implementable, by the following mechanism:
Furthermore, we assume an incomplete informa- each player reports her top candidate, and the can-
tion setting, and therefore it cannot depend on any didate that is preferred by the majority of the
of the other players’ preferences. Thus, to play the players is chosen. This mechanism is a “direct-
game, player i chooses a strategy si: V i ! A i . revelation” mechanism, in the sense that the action
A strategy si() dominates another strategy s0i ðÞ space of each player is to report a preference, and
if, for every tuple of actions ai of the other g is exactly f. In a direct-revelation mechanism, the
players, and for every preference i Vi, hope is that truthful reporting (i.e. si(i) ¼ i) is a
g(si(i), ai)i g(ai, ai)), for any ai A i . In dominant strategy. It is not hard to verify that in this
other words, no matter what the other players are two candidates setting, this is indeed the case, and
doing, the player cannot improve her situation by hence the mechanism implements in dominant-
using an action other than si(i). strategies the majority voting rule.
A mechanism implements the social choice An elegant generalization for the case of a
function f in dominant strategies if there exist “single-peaked” domain is as follows. Assume that
dominant strategies s1(), . . ., sn() such that f the alternatives are numbered as A ¼ {a1, . . ., an},
(1, . . ., n) ¼ g(s1(1), . . ., sn(n)), for and the valid preferences of a player are single-
any profile of preferences 1, . . ., n. In other peaked, in the sense that the preference order is
words, a mechanism implements the social choice completely determined by the choice of a peak
function f if, given that players indeed play their alternative, ap. Given the peak, the preference
equilibrium strategies (in this case the dominant between any two alternatives ai, aj is determined
strategies equilibrium), the outcome of the mech- according to their distance from ap, i.e. aii aj if
anism coincides with f‘s choice. and only if jj p j j i pj. Now consider the
The theory of mechanism design asks: given a social choice function f(p1, . . ., pn) ¼ median
specific problem domain (an alternative set and a (p1, . . ., pn), i.e. the chosen alternative is the
domain of preferences), and a social choice func- median alternative of all peak alternatives.
tion, how can we construct a mechanism that
implements it (if at all)? As we shall see below, Theorem 1 Suppose that the domain of prefer-
the literature uses a variety of “solution concepts”, ences is single-peaked. Then the median social
in addition to the concept of dominant strategies choice function is implementable in dominant
equilibrium, and an impressive set of understand- strategies.
ings have emerged.
The concept of implementing a function with a Proof Consider the direct revelation mechanism
dominant-strategy mechanism seems at first too in which each player reports a peak alternative,
strong, as it requires each player to know exactly and the mechanism outputs the median of all
what action to take, regardless of the actions the peaks. Let us argue that reporting the true peak
others take. Indeed, as we will next describe in alternative is a dominant strategy. Suppose the
detail, if we do not make any further assumptions other players reported pi, and that the true peak
then this notion yields mainly impossibilities. of player i is pi. Let pm be the median index. If
Nevertheless, it is not completely empty, and it pi ¼ pm then clearly player i cannot gain by
may be useful to start with a positive example, to declaring a different peak. Thus, assume that
illustrate the new notions defined above. pi < pm, and let us examine a false declaration
Consider a voting scenario, where the society p0i player i. If p0i pm then pm remains the median,
needs to choose one out oftwo candidates. Thus, and the player did not gain. If p0i > pm then the
320 Mechanism Design
new median is p0m pm, and since pi < pm, this is Theorem 3 (The Direct Revelation Principle)
less preferred by i. Thus, player i cannot gain by Any implementable social choice function can
declaring a false peak alternative if the true peak also be implemented (using the same solution
alternative is smaller or equal to the median alter- concept) by a direct-revelation mechanism.
native. A similar argument holds for the case of
pi > pm. □ Proof Given a mechanism M that implements f,
with dominant strategies si ðÞ , we construct a
In a voting situation with two candidates, the direct revelation mechanism M0 as follows: for
median rule becomes the same as the majority any tuple of preferences ¼ (1, . . ., ni,
rule, and the domain is indeed single-peaked. g0() ¼ g(s()). Since si ðÞ is a dominant strat-
When we have three or more candidates, it is not egy in M, we have that for any fixed i Vi
hard to verify that the majority rule is different than and any i Vi, the action ai ¼ si ði Þ is dom-
the median rule. In addition, one can also check inant when i’s type is i. Hence declaring any
that the direct-revelation mechanism that uses the other type ei that will “produce” an action aei ¼
majority rule does not have truthfulness as a dom- si ei , cannot increase i’s utility. Therefore, the
inant strategy. Of-course, many times one cannot strategy i in M0 is dominant. □
order the candidates on a line, and any preference
ordering over the candidates is plausible. What The proof uses the dominant-strategies solution
voting rules are implementable in such a setting? concept, but any other equilibrium definition will
This question was asked by Gibbard [12] and also work, using the same argumentation. Though
Satterthwaite [26], who provided a beautiful and technically very simple, the revelation principle is
fundamental impossibility. A domain of player fundamental. It states that, when checking if a
preferences is unrestricted if it contains all possible certain function is implementable, it is enough to
preference orderings. In our voting example, for check the direct-revelation mechanism that is asso-
instance, the domain is unrestricted if every order- ciated with it. If it turns out to be truthful, we still
ing of the candidates is valid (in contrast to the case may want to implement it with an indirect mecha-
of a single-peaked domain). A social choice func- nism that will seem more natural and “real”, but if
tion is dictatorial if it always chooses the top alter- the direct-revelation mechanism is not truthful,
native of a certain fixed player (the dictator). then there is no hope of implementing the function.
The proof of the theorem of Gibbard and
Theorem 2 ([12, 27)] Every social choice func- Satterthwaite relies on the revelation principle to
tion over an unrestricted domain preferences, with focus on direct-revelation mechanisms, but this is
at least three alternatives, must be dictatorial. just the beginning. The next step is to show that
any non-dictatorial function is non-implementable.
The proof this theorem, and in fact ofmost other The proof achieves this by an interesting reduction to
impossibility theorems in mechanism design, uses Arrow’s theorem, from the field of social choice
as a first step the powerful direct-revelation princi- theory. This theory is concerned with the possibilities
ple. Though the examples we have seen above use and impossibilities of social preference aggregations
a direct revelation mechanism, one can try to con- that will exhibit desirable properties. A social welfare
struct “complicated” mechanisms with “crazy” function F : V ! R aggregates the individuals’
action spaces and outcome functions, and by this preferences into a single preference order over all
obtain dominant strategies. How should one reason alternatives, where R is the set of all possible pref-
about such vast space of possible constructions? erence orders over A. Arrow [2] describes few desir-
The revelation principle says that one cannot gain able properties from a social welfare function, and
extra power by such complex constructions, since shows that no social choice function can satisfy all:
if there exists an implementation to a specific func-
tion then there exists a directrevelation mechanism Definition 1 (Arrow’s desirable properties)
that implements it. 1. A social welfare function satisfies “weak
Mechanism Design 321
Pareto” if whenever all individuals strictly pre- etc. The interesting exercise is to show that the
fer alternative a to alternative b then, in the implementability of implies that F satisfies
social preference, a is strictly preferred to b. Arrow’s conditions. In fact, as the proof shows
2. A social welfare function is “a dictatorship” if that any implementable social choice function
there exists an individual for which the social f entails a social welfare function F that “extends”
preference is always identical to his own f and satisfies Arrow conditions, it actually pro-
preference. vides a strong argument for the reasonability of
3. A social welfare function F satisfies the “Inde- Arrow’s requirement–they are simply implied by
pendence of Irrelevant Alternatives” property the implementability requirement.
(IIA) if, for any preference orders R, Re R and In view of these strong impossibility results, it
any a, b A, is natural to ask whether the entire concept of a
mechanism can yield positive constructions. The
answer is a big yes, under the “right” set of
a>FðRÞ b and b> e a ) ∃i : a>Ri b and b>Ri a assumptions, as discussed in the next sections.
F R
Definition 2 (Truthfulness, or Incentive Com- change i’s utility (if at all) is to declare some vei
patibility, or Strategy-Proofness) A direct rev- that will cause the project to be rejected. But in
elation mechanism is “truthful” (or incentive- this case i’s utility will be zero, hence she did not
compatible, or strategy-proof) if the dominant gain any benefit. □
strategy of each player is to reveal her true type,
i.e. if for every vi Vi and every vi , v0i V i, Subsequently, Groves [13] made the remark-
able observation that Clarke’s mechanism is in
vi ð f ðvi,
vi
ÞÞ piðvi , viÞ
fact a special case of a much more general mech-
vi f v0i , vi pi v0i , vi anism, that solves the welfare maximization
problem on any domain with private values and
quasi-linear utilities. For a given set of player
Using this framework, we can return to the types v1, . . ., vn, the welfare obtained by an
example from section “Introduction” (“building alternative a A is jvi(a). A social choice
a bridge”), and construct a truthful mechanism to function is termed a welfare maximizer if f(v)
solve it. Recall that, in this problem, a government is an alternative with maximal welfare, i.e.
Pn
is trying to decide on a certain public project, f ðvÞ argmaxa A v
i¼1 i ða Þ .
which costs C dollars. Each player, i, will benefit
from it to an amount of vi dollars, where this Definition 3 (VCG Mechanisms) Given a set of
number is known only to the player herself. The alternatives A, and a domain of players’ types
government desires to build the bridge if and only V ¼ V1 . . . Vn, a VCG mechanism is a direct
if ivi C. Clarke [8] designed the following revelation mechanism such that, for any v V,
mechanism. Each player reports a value, vei , and Pn
P
the bridge is built if and only if i vei C. Ifthe 1. f ðvÞ arg max a A i¼1 vi ðaÞ .
bridge is not built, the price of each player is 2. pi(v) ¼ j 6¼ ivj(f(v)) + hi(vi), where hi:
0. Ifthe bridge is built then each player, i, pays Vi ! ℜ is an arbitrary function.
the minimal value she could have declared to
maintain the positive decision. More precisely, if Ignore for a moment the term hi(vi) in the
P
ei0 C then she still pays zero, and other-
i0 6¼i v payment functions. Then the VCG mechanism
P
wise she pays C i0 6¼i vei0 . has a very natural interpretation: it chooses an
alternative with maximal welfare according to
Theorem 5 Bidding the true value is a dominant the reported types, and then, by making additional
strategy in the Clarke mechanism. payments, it equates the utility of each player to
that maximal welfare level.
f(vi, vi) ¼ a and f v0i , vi ¼ b. The above equation other different goals. It turns out that the answer
is now vi(a) + j 6¼Pi vj(a) < P vi(b) + j 6¼ i vj(b), depends on the “dimensionality” of the domain, as
or, equivalently, ni¼1 vi ðaÞ < ni¼1 vi ðbÞ, a con- is discussed in this section.
tradiction to the fact that f(vi, vi) ¼ a, since f() is
a welfare maximizer.
□ Single-Dimensional Domains
Thus, we see that the welfare maximizing Consider first a domain of preferences for which
social choice function can always be the type vi() can be completely described by a
implemented, no matter what the problem domain single number vi, in the following way. For each
is, under the assumption of quasi-linear utilities. player i, a subset of the alternatives are ‘iosing”
The VCG mechanism is named after Vickrey, alternatives, and her value for all these alternatives
whose seminal paper [27] on auction theory was is always 0. The other alternatives are “winning”
the first to describe a special case of the above alternatives, and the value for each “winning”
mechanism (this is the second price auction; see alternative is the same, regardless of the specific
the entry on auction theory for more details), after alternative. Such a domain is “single dimen-
Clarke, who provided the second example, and sional” in the sense that one single number
after Groves himself, that finally pinned down the completely describes the entire valuation vector.
general idea. As before, this single number (the value for win-
Clarke’s work can be viewed, in retrospect, as a ning), is private to the player, and here this is the
suggestion for one specific form of the function only private information of the player. The public
hi(vi), namely hi(vi) ¼ j 6¼ i vi(f(vi)) (this is a project domain discussed above is an example of a
slight abuse of notation, as f is defined for single-dimensional domain: the losing alternative
n players, but the intention is the straight-forward is the rejection of the project, and the winning
one f chooses an alternative with maximal wel- alternative is the acceptance of the project.
fare). This form for the hj()‘s gives the following A major drawback of the VCG mechanism, in
property: if a player does not influence the social general, and with respect to the public project
choice, her payment is zero, and, in general, a domain in particular, is the fact that the sum of
player pays the “monetary damage” to the other payments is not balanced (a broader discussion on
players (i.e. the welfare that the others lost) as a this is given in section “Budget Balancedness and
result of i’s participation. Additionally, with Bayesian Mechanism Design” below). In particu-
Clarke’s payments, a truthful player is guaranteed lar, the payments for the public project domain
a non-negative utility, no matter what the others may not cover the entire cost of the project. Is
declare. This last property is termed “individual there a different mechanism that always covers the
rationality”. entire cost? The positive answer that we shall soon
see crucially depends on the fact that the domain is
single-dimensional, and this turns out to be true
The Importance of the Domain’s for many other problem domains as well.
Dimensionality The following mechanism for the public project
problem assumes that the designer can decide not
The impressive property of the VCG mechanism only if the project will be built, but also which
is its generality with respect to the domain of players will be allowed to use it. Thus, we now
preferences – it can be used for any domain. On have many possible alternatives, that correspond to
the other hand, VCG is restrictive in the sense that the different subsets of players that will be allowed
it can be used only to implement one specific goal, to utilize the project. This is still a single-
namely welfare maximization. Given the possibil- dimensional domain, as each player only cares
ity that VCG presents, it is natural to ask if the about whether she is losing or winning, and so
assumption of quasi-linear utilities and private the alternatives, from the point of view ofa specific
values allows the designer to implement many player, can be divided to the two winning/losing
324 Mechanism Design
subsets. The following cost-sharing mechanism i is vi then bidding v0i instead of vi will increase i’s
was proposed by Moulin [20] in a general cost- utility, contradicting truthfulness.
sharing framework. The mechanism is a direct- We now show that a truthful mechanism must be
revelation mechanism, where each player, i, first value-monotone. Assume by contradiction that a
submits her winning value, vi. The mechanism then declaration of (vi, vi) will cause i to win, but a
continues in rounds, where in the first round all declaration of v0i , vi will cause i to lose, for
players are present, and in each round one or more some v0i > vi . Suppose that i pays pi for winning
players are declared losers and retire. Suppose that (when the others declare vi). Since we assume a
in a certain round x players remain. If all remaining normalized mechanism, truthfulness implies that
players have vi C/x then they are declared win- pi vi. But then when the true type of a player is
ners, and each one pays C/x. Otherwise, all players v0i , her utility from declaring the truth will be zero
with vi < C/x are declared losers, and “walk out”, (she loses), and she can increase her utility by declar-
and the process repeats. If no players remain then ing vi, which will cause her to win and to pay pi, a
the project is rejected. contradiction.
Clearly, the cost sharing mechanism always Thus, a truthful mechanism must be value-
recovers the cost of the project, if it is indeed monotone, and there exists a threshold value
accepted. But is it truthful? One can analyze it vi ðvi Þ . To see that this defioes pi, let us first
directly, to show that indeed the dominant strategy check the case of pi < vi ðvi Þ. In this case, if the
of each player is to declare her true winning value. type of i is vi with pi < vi < vi ðvi Þ, she will lose
Perhaps a better way is to understand a characteri- (by the definition of vi ðvi ÞÞ , and by bidding
zation of truthfulness for the general abstract setting some false large enough v0i she can win and get a
of a single-dimensional domain. For simplicity, we positive utility of vi pi. On the other hand, if
will assume that we require mechanisms to be “nor- pi > vi ðvi Þ then with type vi such that pi >
malized”, i.e. that a losing player will pay exactly vi > vi ðvi Þ a player will have negative utility of
zero to the mechanism. Now, a mechanism is said to vi pi) from declaring the truth, and she can
be “value-monotone” if a winner that increases her strictly increase it by losing, again a contradiction.
value will always remain a winner. More formally, Therefore, it must be that pi ¼ vi ðvi Þ.
for all vi Vi and vi Vi, if i is a winner in the To conclude, it only remains to show that a
declaration
(vi, vi) then i is a winner in the decla- value-monotone mechanism with a price for a win-
ration v0i , vi , for all v0i vi . Note that a value- ner pi ¼ vi ðvi Þ is indeed truthful. Suppose first
monotone mechanism casts a “threshold value” that with the truthful declaration i wins. Then vi >
function vi ðvi Þ such that, for every vi, player vi ðvi Þ ¼ pi and i has a positive utility. If she
i wins when declaring vi > vi ðvi Þ , and looses changes the declaration and remains a winner, her
when declaring vi < vi ðvi Þ . Quite interestingly, price does not change, and if she becomes a loser her
this structure completely characterizes incentive utility decreases to zero. Thus, a winner cannot
compatibility in single-dimensional domains: increase her utility. Similarly, a loser can change
her utility only by becoming a winner, i.e. by declar-
Theorem 7 A normalized direct-revelation ing v0i > vi ðvi Þ > vi , but since she will then pay
mechanism for a single-dimensional domain is vi ðvi Þ her utility will now decrease to be negative.
truthful if and only if it is value monotone and Thus, a loser cannot increase her utility either, and
the price of a winning player is vi ðvi Þ. the mechanism is therefore truthful. □
Proof The first observation is that the price of a This structure of truthful mechanisms is very
winner cannot depend on her declaration, vi (only powerful, and reduces the mechanism design prob-
on the fact that she wins, and on the declaration of lem to the algorithmic problem of designing mono-
the other players). Otherwise, if it can depend on tone social choice functions. Another strong
her declaration, then there are two possible bids vi implication of this structure is the fact that the pay-
and v0i such that i wins with both bids and pays pj ments of a truthful mechanism are completely
and p0i, where p0i < pi. But then if the true value of derived from the social choice rule. Consequently,
Mechanism Design 325
if two mechanisms always choose the same set of Fix a player i, and fix the declarations of the
winners and losers, then the revenues that they raise others to vi. Let us assume, without loss of gen-
must also be equal. Myerson [21] was perhaps the erality, that f is onto A (or, alternatively, define A0
first to observe that, in the context of auctions, and to be the range of f(, vi), and replace A with A0
named this the “revenue equivalence” theorem. for the discussion below). Since the prices of
As a result of this characterization, one can Eq. (1) now become constant, we simply seek an
easily verify that the above-mentioned cost-sharing assignment to the variables {pa}a A such that
mechanism is indeed truthful. It is not hard to vi(a) vi(b) pa pb for every a, b A and
check that the two conditions of the theorem vi Vi with f(vi, vi) ¼ a. This motivates the
hold, and therefore its truthfulness is concluded. following definition:
This is just one example of the usefulness of the
:
characterization. da,b ¼ inf fvi ðaÞ vi ðbÞjvi V i , f ðvi , vi Þ ¼ ag
ð2Þ
Multi-dimensional Domains
With this we can rephrase the above assign-
In the more general case, when the domain is multi- ment problem, as follows. We seek an assignment
dimensional, the simple characterization from above to the variables {pa}a A that satisfies:
does not fit, but it turns out that there exists a nice
generalization. We describe two properties, cyclic pa pb da,b 8a, b A ð3Þ
monotonicity (Rochet [25]) and weak monotonicity
(Bikhchandani et al. [7]), which achieve that. The By adding the two inequalities pa pb da, b
exposition here also relies on [14]. It will be conve- and pb pa db, a we get that a necessary
nient to use the abstract social choice setting condition to the existence of such prices is the
described above: there is a finite set A of alternatives, inequality da, b + db, a 0. Note that this inequal-
and each player has a type (valuation function) ity is completely determined by the social choice
v : A ! ℜ that assigns a real number to every function. This condition is termed the non-
possible alternative. vi(a) should be interpreted as negative 2-cycle requirement. Similarly, for any
i’s value for alternative a. The valuation function k distinct alternatives a1, . . .ak we have the
vi() belongs to the domain Vi of all possible valua- inequalities
tion functions.
Our goal is to implement in dominant strategies pa1 pa2 da1 ,a2
the social choice function f : V1 Vn ! A. As
⋮
before, it is not hard to verify that the required price
function of a player i may depend on her declaration pak1 pak dak1,ak
only through the choice of the alternative, i.e. that it pak pa1 dak ,a1
takes the form pi : Vi A ! ℜ, for every player i.
For truth-fulness, these prices should satisfy the and we get that any k-cycle must be non-negative,
P
following property. Fix any vi Vi, and any vi, i.e. that ki¼1 dai ,aiþ1 0 , where ak + 1 a1. It
v0i V i : Suppose that f(vi, vi) ¼ a and turns out that this is also a sufficient condition:
f v0i , vi ¼ b. Then it is the case that:
Theorem 8 There exists a feasible assignment to
vi ðaÞ pi ða, vi Þ vi ðbÞ pi ðb, vi Þ ð1Þ (3) if and only if there are no negative-length
cycles.
In other words, player i’s utility from declaring
his true vi is no less than his utility from declaring One constructive way to prove this is by
some lie, v0i , no matter what the other players looking at the allocation graph”: this is a directed
declare. Given a social choice function f, the weighted graph G ¼ (V, E) where V ¼ A and
underlying question is what conditions should it E ¼ A A, and an edge a ! b (for any a,
satisfy to guarantee the existence of such prices. b A) has weight da, b. A standard basic result
326 Mechanism Design
of graph theory states that there exists a feasible determined by the dab’s weights (who in turn are
assignment to (3) if and only if the allocation completely determined by the function f).
graph has no negative-length cycles. Furthermore, Cycle monotonicity satisfies our motivating
if all cycles are non-negative, the feasible assign- goal: a condition on f that involves only the prop-
ment is as follows: set pa to the length of the erties of f, without existential price qualifiers.
shortest path from a to some arbitrary fixed node However, it is quite complex. k could be large,
a A. and a “shorter” condition would have been nicer.
With the above theorem, we can easily state a “Weak monotonicity (W-MON) is exactly that:
condition for implementability:
Definition 5 (Weak Monotonicity) A social
Definition 4 (Cycle Monotonicity) Social choice function f satisfies W-MON if for every
choice function f satisfies cycle monotonicity player i, every vi, and every vi, v0i V i with
if for every player i, vi Vi, some integer f(vi, vi) ¼ a and f v0i , vi ¼ b, v0i ðbÞ
k j Aj, and v1i , . . . , vki V i vi ðbÞ v0i ðaÞ vi ðaÞ.
k h
X i
In other words, ifthe outcome changes from a to
vij a j vij a jþ1 0 b when i changes her type from vi to v0i then i’s
j¼1 value for b has increased at least as i’s value for a in
the transition vi to v0i . W-MON is equivalent to
where a j ¼ f vij , vi for 1 j k, and cycle monotonicity with k ¼ 2, or, alternatively,
ak + 1 ¼ a 1. to the requirement of no negative 2-cycles. Hence it
is necessary for truthfulness. As it turns out, it is
Theorem 9 f satisffies cycle monotonicity if and also a sufficient condition on many domains. Very
only if there are no negative cycles. recently, Monderer [19] shows that weak monoto-
nicity must imply cycle monotonicity if and only if
Corollary 1 A social choice function f is the closure of the domain of valuations is convex.
dominant-strategy implementable if and only if it Thus, for such domains, it is enough to look at the
satisffies cycle monotonicity. more simple condition of weak monotonicity.
f ðvÞ arg max x A Sni¼1 ki vi ðxÞ þ Cx makespan minimization. This goal aims to con-
struct a balanced allocation, in order to minimize
Roberts [25] shows that, if jA j 3 and Vi ¼ ℜA the completion time of the last task. Such an
for all i, then f is dominant-strategy implementable if allocation can also be viewed as being a more
and only if it is an affine maximizer. “fair” allocation, in the sense of Rawls’ maxmin
However, most interesting domains are restricted fairness criteria. Because of the strategic nature of
in some meaningful way, and for this wide interme- the workers, we wish to design a truthful mecha-
diary range of domains the current knowledge is nism. While VCG is truthful, its outcome may be
rather scarce. One impossibility result that extends far from optimal, as demonstrated above. Nisan
the result of Roberts to a restricted multi-dimensional and Ronen [23], who have first studied this prob-
case is given by Lavi et al. [18], who study multi- lem in the context of mechanism design, observed
item auctions. In a multiitem auction, one seller (the that VCG provides only an m-approximation” to
mechanism designer) wants to allocate items to the optimal makespan, meaning that VCG may
players (i.e. an alternative is an allocation of the sometimes produce a makespan that is m times
items to the players). Lavi et al. [18] shows that larger than the optimal makespan. More impor-
every social choice function for multi-item auctions, tantly, they have shown that no truthful determin-
that additionally satisfy four other social choice prop- istic mechanism can obtain an approximation
erties, must be an affine maximizer. ratio better than 2. To date, the question of closing
Before concluding the discussion on this gap between m and 2 remains open.
dominant-strategy implementation, we demon- Archer and Tardos [1], on the other hand, con-
strate the necessity for non-welfare-maximizers sidered a natural restriction this domain, that
by considering the following “scheduling makes it single-dimensional, and showed with
domain”. A designer wishes to assign n tasks/ this they can construct many possibilities (for
jobs to m workers, where worker i needs tij time example, a truthful optimal mechanism). Thus,
units to complete job task j, and incurs a cost of tij here too we see the contrast between single-
for its processing time (one dollar per time unit). dimensionality and multi-dimensionality. Lavi
Importantly, this cost is private information of the and Swamy [17] suggest a multi-dimensional spe-
worker, and workers are assumed to be strategic, cial case, and give a truthfu12-approximation for
each one selfishly trying to minimize its own cost. the special case where the processing time of each
The load of worker i is the sum of costs of the jobs job is known to be either “low” or “high” This
assigned to her, and the maximal load over all special case keeps the multi-dimensionality of the
workers (in a given schedule) is termed the domain. The construction of this result does not
“makespan” of the schedule. The welfare maxi- rely on explicit prices, but rather uses the cycle-
mizing social goal would put each task on the monotonicity condition described above, to con-
most efficient worker (for that task), which may struct a monotone allocation rule.
result in a very high makespan. For example,
consider a setting with two workers and n tasks.
The first worker incurs a cost of 1 for every task, Budget Balancedness and Bayesian
and the second worker incurs a cost of l +e for Mechanism Design
every task. The social welfare is the minus of the
sum of the costs of the two workers, and the VCG The previous sections portray a concrete picture
mechanism will therefore assign all tasks to the of the advantages and the disadvantages of the
first worker. This is a very highly unbalanced solution concept of truthfulness in dominant strat-
allocation, which takes twice the time that the egies. On the one hand, this is a strong and con-
workers optimally need in order to finish all vincing concept, which admits many positive
tasks (roughly splitting the work among then). results. However, there are several problems to
Thus, one may wish to consider a social goal all these results, that cannot be solved by a truthful
different from welfare maximization, namely mechanism. Among these, the budget-imbalance
328 Mechanism Design
problem was briefly mentioned, and this section Evi ½vi ð f ðvi , vi ÞÞ pi ðvi , vi Þ
looks again at this problem, as a motivation to the Evi vi ðf ðv0i , vi pi v0i , vi
definition of the Bayesian Nash solution concept.
To recall the budget-imbalance problem of the
VCG mechanism, let us consider a specific input to In other words, Bayesian incentive compatibil-
the Clarke mechanism from section “Quasi-Linear ity requires that a player will maximize her
Utilities and the VCG Mechanism”: suppose the expected utility by declaring her true type. An
cost of the project is $100, and there are alternative formulation is that truth-fulness in a
102 players, each values the project by$l. It is a Bayesian incentive compatible mechanism should
simple exercise to check that the Clarke mecha- be a “Bayesian-Nash equilibrium” (where the for-
nism will indeed choose to perform the project, and mal equilibrium definition naturally follows the
that each player will pay a price of zero (since the above definition). This is an “exinterim” equilib-
project would have been conducted even if a single rium: the type of the player is already known to
player is removed). Thus, the mechanism designer her, and the averaging is over the types of the
does not cover the project’s cost. As described others. A weaker equilibrium notion would be an
above, this problem, for this specific domain, can “ex-ante” notion, where the player should decide
be fixed by considering the cost-sharing mecha- on a strategy before knowing her own type, and so
nism discussed in section “The Importance of the the averaging is done over her own types as well.
Domain’s Dimensionality”. However, this mecha- A stronger notion would be an “ex-post” notion,
nism may sometimes choose not to perform the where no-averaging is done at all, and the above
project although the society as a whole will benefit inequality is required for every realization of the
from performing it (i.e. it is not “socially effi- types of the other players. It can be shown that this
cient”), and, even more importantly, it is a solution stronger ex-post condition is equivalent to the
only for the concrete domain of a public project. Is requirement of dominant-strategy incentive com-
there a general mechanism (in the sense that VCG patibility. As a Bayesian-Nash equilibrium only
is general) that is both socially efficient and considers the average over all possible realiza-
budget-balanced? In this section we describe such tions, it is clearly a weaker requirement than
a mechanism, that was independently discovered dominant-strategy implementability.
by d’Aspremont and Ge’rardVaret [10] and by We will demonstrate the usefulness of this
Arrow [3]. Its incentive compatibility will not be weaker notion by describing a general mechanism
in dominant strategies. Instead, it is assumed that that is both ex-post socially efficient and ex-post
player types are drawn i.i. d. from some fixed and budget balanced, and is Bayesian incentive-
known cumulative distribution function F (the compatible. Define,
assumption that the types are drawn from the
same distribution is not important, and is made " #
X
here only for the ease of notation; the assumption xi ðvi Þ ¼ Evi v j ð f ðvi , vi ÞÞ
that types are not correlated is important and can- j6¼i
not be removed in general). The solution concept
ofa Bayesian-Nash equilibrium is a natural exten- The “budget-balanced” (BB) mechanism asks
sion of the regular Nash equilibrium concept, for a the players to report their types, and then chooses
setting in which the distribution F is known to all the welfare-maximizing allocation according to the
players (this is termed the “common-prior” reported types (as VCG does). It then charges some
assumption), and where players aim to maximize payment pi(vi, vi) ¼ xi(vi) + hi(vi), for some
the expectation of their quasi-linear utility. function hi() that will be chosen later on in a specific
way that balances the budget. But let us first verify
Definition 6 A direct mechanism M ¼ (f, p) is that the mechanism is Bayesian incentive compati-
Bayesian incentive compatible if for every player ble, regardless of the choice of the functions hi().
i, and for every vi , v0i V i, Note that, for any realization of vi, we have that,
Mechanism Design 329
X
vi ð f ðvi , vi ÞÞ þ v j ð f ðvi , vi ÞÞ Satterthwaite [22] have shown that this is impos-
j6¼i sible: there is no general mechanism that satisfies
0 P 0 the four properties (1) Bayesian incentive com-
vi f vi , vi þ v j f vi , vi
j6¼i patibility, (2) budget balancedness, (3) individual
rationality, and (4) social efficiency. The proof
as the mechanism chooses the maximal-welfare uses a simple, natural exchange setting, where
alternative for the given reports. Clearly, taking two traders (one buyer and one seller) wish to
the expectation on both sides will maintain the exchange an item. The seller has a cost c of pro-
inequality. Therefore we get: ducing the item, and the buyer obtains a value
v from receiving it. Myerson and Satterthwaite
Evi ½vi ð f ðvi , vi ÞÞ pi ðvi , vi Þ
show that there is no Bayesian incentive compat-
" #
X ible mechanism that decides to perform the
¼ Evi ½vi ð f ðvi , vi ÞÞ
þ Evi v j ð f ðvi , vi ÞÞ exchange if and only if v > c, such that Bayesian
j6¼i
incentive compatibility and individual rationality
þEvi ½hi ðvi Þ
" # are maintained, and the price that the buyer pays
X
Evi ½vi ð f ðvi , vi ÞÞ
þ Evi v j ð f ðvi , vi ÞÞ exactly equals the payment that the seller gets. In
j6¼i particular, VCG violates this last property, while
þEvi ½hi ðvi Þ
BB satisfies it, but violates individual rationality
¼ Evi vi f v0i , vi pi v0i , vi (i.e. for some realizations of the values, a buyer
may pay more than her value, or the seller may get
which proves Bayesian incentive compatibility. To less than her cost).
balance the budget, consider the specific function, Besides this disadvantage of the BB mecha-
hi(vi) ¼ 1/(n 1ij 6¼ i xj(vj). Notice that the term nism, there are also additional disadvantages that
P
xj(vj) appears (n 1) times in the sum ni¼1 hi ðvi Þ result from the underlying assumptions of the
Pn
for any j ¼ 1, . . ., n. Therefore i¼1 h ðv Þ ¼ solution concept itself. In particular, Bayesian
P P i i
1=ðn 1i nj¼1 ðn 1Þx j v j ¼ ni¼1 xi ðvi Þ: incentive compatibility entails two strong
Pn
To conclude, we have
Pn Pn i¼1 pi ðvi , vi Þ ¼ assumptions about the characteristics of the
i¼1 hi ðvi Þ i¼1 xi ðvi Þ ¼ 0 , and the budget players. First, it assumes that players are risk-
balancedness follows. neutral, i.e. care only about maximizing the
It is worth noting that such an exercise cannot expectation of their profit (value minus price).
be employed for the VCG mechanism, as there the Thus, when players dislike risk, for example,
“parallel” xi() term should depend on the entire and prefer to decrease the variance of the out-
vector of declarations, not only on i’s own decla- come, even on the expense of lowering the
rations. This is the exact point where the averag- achieved expected profit, the rational of the
ing of the others’ valuations is crucial. Bayesian-Nash equilibrium concept breaks
In addition to the difference in the solution down. Second, the assumption of a common-
concept, one other important advantage of VCG, prior, i.e. that all players agree on the same under-
in comparison with the BB mechanism, is the fact lying distribution, seems strong and somewhat
that VCG (with the Clarke payments) is ex-post unrealistic. Often, players have different estima-
“individually rational if a player declares her true tions about the underlying statistical characteris-
valuation, it is guaranteed that she will not pay tics of the environment, and this concept does not
more than her value, no matter what the others handle this well. Note that the solution concept of
will declare. Here, on the contrary, there is no dominant-strategies does not suffer from any of
reason why this should be true, in general. Can these problems, which strengthens even more its
the solution concept of Bayesian incentive com- importance. Unfortunately, the classical econom-
patibility be used to construct a general budget- ics literature mainly ignores these disadvantages
balanced and individually rational mechanism? In and problems. A well-known exception is the
an important and influencing result, Myerson and critique known as Wilson’s critique [28], who
330 Mechanism Design
raises the above-mentioned problems and argues in favor of "detail-free" mechanisms. Recently, this critique has gained more popularity, and detail-free solution concepts are being re-examined. For some examples, see [5, 6, 11].

Interdependent Valuations

Up to now, this entry has described "private value" models, i.e. models where the valuation (or the preference relation) of a player does not depend on the types of the other players. There are many settings in which this assumption is unrealistic, and a more suitable assumption is that the valuation of a specific player is affected by the valuations of the other players. This last statement may entail two interpretations. The first is that the distribution over the valuations of a specific player is correlated with the distribution over the valuations of the other players, and, thus, knowing a player's actual valuation gives partial knowledge about the valuations of the other players. This first interpretation is still termed a private value model (but with correlated values instead of independent values), since once the player becomes aware of the actual realization of her valuation, she completely and fully knows her values for the different outcomes.

In contrast, with interdependent valuations, the actual valuation of a player depends on the actual valuations of the other players. Thus, a player does not fully know her own valuation. She only partially knows it, and can determine her full valuation only if given the others' valuations as well. A classic example is a setting where a seller sells an oil field. The oil, of course, is not seen on the ground surface, and the only way to exactly determine how much oil is there (and, by this, determine the actual worth of the field) is to extract it. Before buying the field, though, the potential buyers are only allowed to make preliminary tests, and by this to determine an estimate of the value of the field, which is not completely accurate. If all the buyers that are interested in the field have the same technical capabilities, it seems reasonable to assume that the true value of the field is the average over all the estimates obtained by the different oil companies. Intuitively, a player that participates in an auction mechanism that determines who will buy the field, and at what price, has to act somehow as if she knows the value of the field, although she doesn't. Clearly, this creates different complications. Such a model is very natural in auction settings, and indeed the entry on auctions handles the subject of interdependent valuations more broadly. Since this issue is also very relevant to general mechanism design theory, we describe here one specific, rather general result for mechanisms with interdependent valuations, to exemplify the definitions and the techniques being employed.

In the formal model of interdependent valuations, player i receives a signal si ∈ Si, which may be multi-dimensional. Her valuation for a specific alternative a ∈ A is a function of the signals s1, . . ., sn, i.e. vi : A × S1 × ⋯ × Sn → ℝ. The case where vi(a, s1, . . ., sn) = vj(a, s1, . . ., sn) for all players i, j and all a, s1, . . ., sn is termed the "common value" case, as the actual values of all players are identical, and only their signals are different (as in the oil field example). The other extreme is when i's valuation depends only on i's signal, i.e. vi(a, s1, . . ., sn) = vi(a, si), which is a return to the private value case. The entire range in general is termed the case of interdependent valuations. All the results described in the previous sections fail when we move to interdependent valuations. For example, in the VCG mechanism, a player is required to report her valuation function, which is not fully known to her in the interdependent valuation case. It turns out that the straightforward modification of reporting the players' signals does not maintain the truthfulness property, and, in fact, some strong impossibilities exist (Jehiel et al. [15]). However, interdependent valuations may also enable possibilities, and the classic result of Cremer and McLean [9] will be described here to exemplify this. This result shows how to use the interdependencies in order to increase the revenue of the mechanism designer, so that the entire surplus of the players can be extracted. Cremer and McLean [9] study an auction setting where there is one item for sale. n bidders have interdependent
values for the item, and it is assumed that the signal that each player receives is single-dimensional, i.e. each player receives a single real number as her signal. The valuation functions are assumed to be known to the mechanism designer, so that the only private information of the players is their signals. It is also assumed that the valuation functions are monotonically non-decreasing in the signals. For simplicity, it is assumed here that the signal space is discretized to be Si = {0, Δ, 2Δ, . . .}. The last (and crucial) assumption is that the valuation functions satisfy the "single-crossing" property: if vi(si, s−i) ≥ vj(si, s−i) then vi(si + Δ, s−i) ≥ vj(si + Δ, s−i). This says that i's signal affects i's own value (weakly) more than it affects the value of any other player. This last assumption is strong, but in some sense necessary, as it is possible to construct interdependent valuation functions (that violate single-crossing) for which no truthful mechanism can be efficient (i.e. allocate the item to the player with the highest value).

Consider the following CM mechanism for this problem: each player reports her signal, and the player with the highest value (note that this may be different than the player with the highest signal) receives the object. In order to determine her payment, define the "threshold signal" Ti(s−i) of any player i to be the minimal signal that will enable her to win (given the signals of the other players), i.e.

Ti(s−i) = min { ŝi ∈ Si : vi(ŝi, s−i) ≥ max_{j≠i} vj(ŝi, s−i) }.

The payment of the winner, i, is her value if her signal were Ti(s−i), i.e. Pi(s−i) = vi(Ti(s−i), s−i). Clearly, if all players report their true signals, then the player with the highest value receives the item. Truthful reporting is also an ex-post Nash equilibrium, which means the following: if all other players report the true signal (no matter what that is), then it is a best response for i to report her true signal as well.

To verify that truthfulness is indeed an ex-post Nash equilibrium, notice first that each player has a price for winning which does not depend on her declaration. Now, truthful reporting will ensure winning (given that the others are truthful as well) if and only if the true value of the player is higher than her price (i.e. iff winning will yield a positive utility). Thus, when a player "wants to win", truthful reporting will do that, and when a player "wants to lose", truthful reporting will do that as well, and so truthfulness will always maximize the player's utility.

The notion of an ex-post equilibrium is stronger than Bayesian-Nash equilibrium since, here, even after the signals are revealed no player regrets her declaration (while in Bayesian-Nash equilibrium, since only the expected utility is maximized, there are some realizations for which a player can deviate and gain). On the other hand, ex-post equilibrium is weaker than dominant strategies, in which truthfulness is the best strategy no matter what the others choose to declare, while here truthfulness is a best response only if the others are truthful as well.

As seen above, both for the VCG mechanism and for the BB mechanism, adding a "constant" to the prices (i.e. setting P̃i(s−i) = Pi(s−i) + hi(s−i)) maintains the strategic properties of the mechanism, since the function hi(·) does not depend on the declaration of player i. The correlation in the values can help the mechanism designer extract more payments from the players, as follows. Consider the matrix that describes the conditional probability of a specific tuple of signals of the other players, given i's own signal. There is a row for every signal si of i, a column for every tuple of signals s−i of the other players, and the cell (si, s−i) contains the conditional probability Pr(s−i | si). In the private value case, the signals of the players are not correlated, hence the matrix has rank one (all rows are identical). As the correlation between the signals "increases", the rank increases, and we consider here the case when the matrix has full row rank. Let qi(si, s−i) be an indicator of the event that i is the winner when the signals are (si, s−i). The expected surplus of player i in the CM mechanism is

Ui(si) = Σ_{s−i} Pr(s−i | si) · ( qi(si, s−i) · vi(si, s−i) − Pi(s−i) ).

(Pi(s−i) is defined to be zero whenever i is not a winner.) Now find "constants" hi(s−i) such that, for every si, Σ_{s−i} hi(s−i) Pr(s−i | si) = Ui(si). Note that such an hi(·) function exists: we have a system of linear equations, where the variables are the function values hi(s−i) for all possible tuples s−i, and the known quantities are the probabilities and the expected
surpluses. Since the coefficient matrix of probabilities has full row rank, a solution exists. It is now not hard to verify that, with prices P̃i(·), the expected utility of a truthful player is zero.

As mentioned above, truthfulness is still an ex-post equilibrium of this mechanism. It is not ex-post individually rational, though, but rather only ex-ante, since a player pays her expected surplus even if the actual signals cause her to lose. Thus, this mechanism can be considered a fair lottery. Also note that the crucial property was the correlation between the values; the interdependence assumption was not important.

Future Directions

As surveyed here, the last three decades have seen the theory of mechanism design being developed in many different directions. The common thread of all settings is the requirement to implement some social goal in the presence of incomplete information: the social designer does not know the players' preferences over the different outcomes. We have seen several alternative assumptions about the structure of players' preferences, the different equilibrium solution concepts that are suitable for the different cases, and several positive examples of elegant solutions. We have also discussed some impossibilities, demonstrating that some attractive definitions may turn out to be almost powerless. One relatively new research direction in mechanism design is the analysis of new models for the emerging Internet economy, and the development of new alternative solution concepts that better suit this setting. A very recent example is the new model of "dynamic mechanism design", where the parameters of the problem (e.g. the number of players, or their types) vary over time. Such settings become more and more important as the economic environment becomes more dynamic, for example due to the growing importance of electronic markets. Examples of such models include the works by Lavi and Nisan [16] in the context of computer science models, and by Athey and Segal [4] in a more classical economic context, among many other works that study such dynamic settings.

The Internet environment also strengthens the question marks posed on the solution concept of Bayesian incentive compatibility, which was the most common solution concept in the mechanism design literature in the 1980s and throughout the 1990s, due to the accompanying assumption of a common prior. Such an assumption seems problematic in general, and in particular in an environment like the Internet, which brings together players from many different parts of the world. It seems that the research community agrees more and more that detail-free solution concepts should be sought. A description of the more recent solution concepts is beyond the scope of this entry; the interested reader is referred, for example, to the papers [5, 6, 11].

Another aspect of mechanism design that is largely ignored in the classic research is the computational feasibility of the mechanisms being suggested. This question is not just a technicality: some classic mechanisms imply heavy computational and communicational requirements that scale exponentially as the number of players increases, making them completely infeasible for even moderate numbers of players. The computer science community has begun looking at the design of computationally efficient mechanisms, and the recent book by Nisan et al. [24] contains several surveys on the subject.

Bibliography

1. Archer A, Tardos E (2001) Truthful mechanisms for one-parameter agents. In: Proceedings of the 42nd Annual Symposium on Foundations of Computer Science, FOCS'01, Las Vegas. IEEE Computer Society
2. Arrow K (1951) Social choice and individual values. Wiley, New York
3. Arrow K (1979) The property rights doctrine and demand revelation under incomplete information. In: Boskin M (ed) Economics and human welfare. Academic Press, New York
4. Athey S, Segal I (2007) Designing dynamic mechanisms. Am Econ Rev 97(2):131–136
5. Babaioff M, Lavi R, Pavlov E (2006) Single-value combinatorial auctions and implementation in undominated strategies. In: Proceedings of the 17th Symposium on Discrete Algorithms, SODA, Miami. ACM Press
6. Bergemann D, Morris S (2005) Robust mechanism design. Econometrica 73:1771–1813
7. Bikhchandani S, Chatterjee S, Lavi R, Mu'alem A, Nisan N, Sen A (2006) Weak monotonicity characterizes deterministic dominant-strategy implementation. Econometrica 74(4):1109–1132
8. Clarke E (1971) Multipart pricing of public goods. Public Choice 8:17–33
9. Cremer J, McLean R (1985) Optimal selling strategies under uncertainty for a discriminating monopolist when demands are interdependent. Econometrica 53:345–361
10. d'Aspremont C, Gérard-Varet L (1979) Incentives and incomplete information. J Public Econ 11:25–45
11. Dekel E, Wolinsky A (2003) Rationalizable outcomes of large private-value first-price discrete auctions. Games Econ Behav 43(2):175–188
12. Gibbard A (1973) Manipulation of voting schemes: a general result. Econometrica 41(4):587–601
13. Groves T (1973) Incentives in teams. Econometrica 41(4):617–631
14. Gui H, Muller R, Vohra RV (2004) Characterizing dominant strategy mechanisms with multi-dimensional types. Working paper, unpublished
15. Jehiel P, Meyer-ter-Vehn M, Moldovanu B, Zame WR (2006) The limits of ex-post implementation. Econometrica 74(3):585–610
16. Lavi R, Nisan N (2004) Competitive analysis of incentive compatible on-line auctions. Theor Comput Sci 310:159–180
17. Lavi R, Swamy C (2007) Truthful mechanism design for multidimensional scheduling. In: Proceedings of the 8th ACM Conference on Electronic Commerce, EC'07, San Diego. ACM Press
18. Lavi R, Mu'alem A, Nisan N (2003) Towards a characterization of truthful combinatorial auctions. In: Proceedings of the 44th Annual Symposium on Foundations of Computer Science, FOCS'03, Cambridge. IEEE Computer Society
19. Monderer D (2007) Monotonicity and implementability. Working paper, unpublished
20. Moulin H (1999) Incremental cost sharing: characterization by coalition strategy-proofness. Soc Choice Welf 16:279–320
21. Myerson R (1981) Optimal auction design. Math Oper Res 6:58–73
22. Myerson R, Satterthwaite M (1983) Efficient mechanisms for bilateral trading. J Econ Theor 29:265–281
23. Nisan N, Ronen A (2001) Algorithmic mechanism design. Games Econ Behav 35:166–196
24. Nisan N, Roughgarden T, Tardos E, Vazirani VV (eds) (2007) Algorithmic game theory. Cambridge University Press, New York
25. Roberts K (1979) The characterization of implementable choice rules. In: Laffont JJ (ed) Aggregation and revelation of preferences. North-Holland, Amsterdam, pp 321–349
26. Rochet JC (1987) A necessary and sufficient condition for rationalizability in a quasilinear context. J Math Econ 16:191–200
27. Satterthwaite M (1975) Strategy-proofness and Arrow's conditions: existence and correspondence theorems for voting procedures and social welfare functions. J Econ Theor 10:187–217
28. Vickrey W (1961) Counterspeculation, auctions, and competitive sealed tenders. J Financ 16:8–37
29. Wilson R (1987) Game-theoretic analyses of trading processes. In: Bewley T (ed) Advances in economic theory: fifth world congress. Cambridge University Press, New York, pp 33–70
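The full-surplus-extraction step in the Cremer–McLean construction described above reduces to solving a system of linear equations. The following minimal sketch, which is not part of the original entry, illustrates that step for a single bidder under a small, purely illustrative set of numbers: given an assumed full-row-rank matrix of conditional probabilities Pr(s−i | si) and the bidder's expected surplus Ui(si) in the CM mechanism, it solves for the extra charges hi(s−i).

```python
import numpy as np

# Illustrative (assumed) primitives for one bidder i.
# Rows: i's own signals s_i; columns: signal tuples s_{-i} of the other bidders.
# Each row is a conditional distribution Pr(s_{-i} | s_i); the matrix has full row rank,
# which is the correlation condition used in the Cremer-McLean construction.
P = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
])

# Assumed expected surplus U_i(s_i) of bidder i in the CM mechanism, one value per own signal.
U = np.array([0.8, 1.1, 1.5])

# Find "constants" h_i(s_{-i}) with  sum_{s_{-i}} h_i(s_{-i}) Pr(s_{-i}|s_i) = U_i(s_i)  for every s_i.
# Full row rank guarantees a solution (here P is square, so the solution is unique).
h, *_ = np.linalg.lstsq(P, U, rcond=None)

# Charging the extra payment h_i(s_{-i}) on top of the CM price leaves a truthful bidder
# with zero expected utility for every own signal: U_i(s_i) - E[h_i(s_{-i}) | s_i] = 0.
residual_utility = U - P @ h
print("h_i(s_-i):", np.round(h, 4))
print("expected utility after the extra charge:", np.round(residual_utility, 6))
```

Because hi depends only on the others' reports, adding it does not change bidder i's incentives, which is exactly why the construction preserves the ex-post equilibrium while extracting the entire expected surplus.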
Auctions

Martin Pesendorfer
Department of Economics, London School of Economics and Political Science, London, UK

Article Outline

Glossary
Introduction
Second-Price Auction
First-Price Sealed-Bid Auction
Comparing Auction Outcomes
Empirics of Auctions
Winner's Curse
Collusive Bidding
Concluding Remarks
Bibliography

Glossary

All-pay first-price auction Bidders submit sealed bids. The high bidder wins the item. In case of a tie, the auctioneer randomly assigns the item to one of the high bidders with equal probability. All bidders (including losing bidders) pay their bid.
Bayesian Nash equilibrium A Bayesian Nash equilibrium is a collection of bidding strategies so that (i) no bidder has an incentive to deviate and (ii) beliefs are consistent with the underlying informational assumptions.
Bidding strategy A bidding strategy for a buyer is a mapping from the buyer's signal into bid prices.
Dutch auction Price falls until one bidder presses her button. That bidder gets the object at the current price. Losers pay nothing.
English auction Bidders call out successively higher prices until one bidder remains. The item is allocated to the last remaining bidder at the price at which the second last bidder dropped out.
First-price auction Bidders submit sealed bids. The high bidder wins the item and pays her bid. In case of a tie, the auctioneer randomly assigns the item to one of the high bidders with equal probability. Losers pay nothing.
Second-price auction Bidders submit sealed bids. The high bidder wins the item and pays the second highest bid. In case of a tie, the auctioneer randomly assigns the item to one of the high bidders with equal probability. Losers pay nothing.

Introduction

Auctions have been a common selling form throughout history; see Cassady (1967) for an account. Roman legions sold their plunder at auction. Slave auctions were held throughout medieval times. Art auctions have been taking place for the last 300 years, with Christie's and Sotheby's being two well-known auction houses. Real estate, treasury bills, flowers, livestock, even large corporations are sold at auction. Government procurement follows specific regulations and rules which give rise to an auction rule. The sale of mineral extraction rights and spectrum licenses are good sources of governmental revenues. eBay has become a successful marketplace with the arrival of the Internet.

This entry surveys contributions of the auction literature. It is a selected account from an economist's perspective; see McAfee and McMillan (1987), Klemperer (1999), Krishna (2002), Hong and Paarsch (2006), and Hortacsu and McAdams (2016) for related surveys. Auctions are encountered in many settings, but specific rules and procedures may differ. Broadly speaking, we can distinguish single-item versus multi-item, sealed-bid versus open-outcry, and single-round versus multi-round auctions. The nature of the rules and format may depend on the items at hand but will also affect the behavior of bidders and what revenues the seller may get. Popular single-item auction rules include:
• English open-outcry auction in which bidders call out successively higher prices until one bidder remains. The item is allocated to the last remaining bidder at the price at which the second last bidder dropped out. Sometimes these are referred to as "hammer auctions," which are commonly used by Sotheby's and Christie's.
• Second-price sealed-bid auction in which bidders submit sealed bids. The high bidder wins the item and pays the second highest bid. In case of a tie, the auctioneer randomly assigns the item to one of the high bidders with equal probability. Losers pay nothing.
• Dutch or descending-price auction, the opposite of the English auction. Price falls until one bidder presses her button. That bidder gets the object at the current price. Losers pay nothing. This auction format is used to sell flowers in Holland.
• First-price sealed-bid auction, in which bidders submit sealed bids. The high bidder wins the item and pays her bid. In case of a tie, the auctioneer randomly assigns the item to one of the high bidders with equal probability. Losers pay nothing. The first-price auction format is commonly used for governmental procurement.
• All-pay first-price sealed-bid auction, in which bidders submit sealed bids. The high bidder wins the item. In case of a tie, the auctioneer randomly assigns the item to one of the high bidders with equal probability. All bidders (including losing bidders) pay their bid.

The seller of the item selects the auction format before bidding starts. The seller may additionally decide how much information to reveal about the item and may announce a reserve price, which is a minimum price at which the seller is willing to sell. If the seller knows what buyers are willing to pay, then the seller can post a selling price equal to the high willingness to pay. The resulting allocation extracts all the rent and gives the item to the buyer that values it most. When the seller does not know the willingness to pay, then this posting price scheme may perform poorly. An auction then seems a good choice, as the seller may achieve higher revenues when using an auction than any posting price scheme, as was shown by Myerson (1981). An auction enables the seller to learn about buyers' willingness to pay and to extract as much rent as possible.

The informational environment, that is, how much buyers know about their own willingness to pay, can vary. Environments range from buyers knowing their value of the object precisely to having only a rough idea. Let's describe some informational structures. Suppose there are N buyers. Let yi ∈ [0, A] denote buyer i's signal, drawn identically and independently from a cumulative distribution function F with probability density function f. Formally, the joint probability density function of signals equals the product of the marginals, f(y1, . . ., yN) = Π_i f(yi), and this is common knowledge. The independence assumption is for illustration purposes only. It can be relaxed, leading to a correlated information environment; see Milgrom and Weber (1982) for affiliation, which is a particular correlation structure among signals. Heterogeneity between buyers can be incorporated by allowing signal distributions to differ across bidders, F1, . . ., FN. Following standard incomplete information terminology, we assess information at the interim stage in which buyer i has learned her own signal yi but has not learned the value of competitors' signals yj, j ≠ i. (In information economics, the term interim is used to distinguish from ex ante, in which types are not known yet, and ex post, in which everyone knows everyone else's type.)

Private values arise when the signal equals the value of the object to buyer i, vi = yi. Private values refer to a situation in which buyers know precisely their own value. An example may be a construction contract in which firms know their own opportunity costs of undertaking the project. However, a firm may be only vaguely informed about competing firms' costs.

(Pure) common values arise when the value of the object is determined by the average signal, v = (1/N) Σ_j yj. This environment differs from private values in two respects: first, as buyer i only observes one signal yi out of a total of N signals, she knows only a little bit about the true value and,
second, all bidders assign the same value v. An example may be an oil field auction. Each bidder conducts their own study of how much oil there is and comes up with an estimate yi. The true value of the oil field will be some average across the bidders' noisy signals and is the same for all bidders. A second example is the wallet game, in which two bidders compete for the joint value of their wallets. In that case the total value equals the sum of signals, v = y1 + y2, or twice the average signal.

Interdependent values are a mixture between private and common values. With interdependent values, the value of the object to buyer i is given by vi = a·yi + (1 − a)·(1/(N − 1)) Σ_{j≠i} yj with 0 < a < 1. When the parameter a = 1, this formula reduces to vi = yi, the case of private values. On the other hand, when the parameter a = 1/N, this formula becomes the pure common value case.

Most of the following analysis will focus on the case of (independent) private values.

Assumption Buyers know their values privately, vi = yi for all i.

The reader interested in the more advanced topic of interdependent valuations is referred to Krishna (2002) for a nice introductory exposition. We shall illustrate on occasion what may happen under alternative informational assumptions.

The seller may have information that is useful to bidders. For example, on eBay, the seller may decide how accurately to describe the object. Or a used car owner may know very well the pros and cons of the car. What should the seller do? Reveal the information prior to bidding, or conceal it? The seller may also impose a reserve price. Should the seller impose a reserve price? At what level? We shall return to these questions in the section entitled "Comparing Auction Outcomes" after having studied buyers' behavior in standard auction formats.

The sections entitled "Second-Price Auction" and "English Auction" examine optimal bidding strategies and Bayesian Nash equilibria for standard auction formats. The payoff bidder i receives will depend on the auction rule, the equilibrium played, and attitudes toward risk. Attitudes toward risk matter as bidders face lotteries, winning the auction or not. We shall assume risk neutrality, in which bidder i's payoff equals the expected value of the lottery. We shall comment later on the extension in which bidder risk aversion is introduced.

The strategy space and equilibrium concept are shared across the following sections. A bidding strategy for buyer i is a mapping from signals into bid prices, bi : [0, A] → ℝ. A Bayesian Nash equilibrium is a collection of bidding strategies, (bi)_{i=1}^N, so that (i) no bidder has an incentive to deviate and (ii) beliefs are consistent with the underlying informational assumptions. By a Bayesian Nash equilibrium, we mean a stable resting point in which every bidder adopts a strategy that maximizes her payoff.

The entry is organized as follows: We shall begin by studying bidder behavior under specific auction rules, including the second-price and first-price auction. We then compare bidders' payoffs and the auctioneer's revenues across distinct auction formats. We describe key issues for empirical work on auctions, the winner's curse phenomenon, and issues concerning collusive behavior at auction.

Second-Price Auction

The second-price auction format has been advocated in Vickrey (1961). It has the rule that the high bidder wins the item and pays the second highest bid. The payoff for a winning buyer i is vi − b(2), where b(2) is the second highest bid. The payoff to a losing bidder is zero.

What is an optimal bidding strategy in a second-price auction? Let's consider an example. Suppose your value is 60; what should you bid? You could bid your value. Is it optimal to bid your value? The answer is yes. Suppose another bidder bids 70. Do you regret? No, as you would lose money if you outbid the other bidder. Suppose the other bidders' high bid is 40. Do you regret? No. So, bidding your value is indeed optimal. It is a Nash equilibrium.

This example generalizes and leads us to the following result.

Theorem 1 Bidding the true value, bi(vi) = vi, is a Bayesian Nash equilibrium in weakly dominant strategies.
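The numerical example above (a value of 60 facing rival bids of 70 or 40) extends to a quick Monte Carlo check of the weak dominance claim in Theorem 1. The sketch below is illustrative and not part of the original entry; the rival bid distribution and the candidate deviations are assumptions.

```python
import random

def second_price_payoff(value, my_bid, rival_bids):
    """Payoff in a sealed-bid second-price auction (ties ignored for simplicity)."""
    top_rival = max(rival_bids)
    if my_bid > top_rival:
        return value - top_rival  # the winner pays the second highest bid
    return 0.0                    # losers pay nothing

random.seed(0)
value = 60.0
candidate_bids = [40.0, 60.0, 80.0]   # underbid, truthful bid, overbid
n_draws, n_rivals = 100_000, 3

avg = {b: 0.0 for b in candidate_bids}
for _ in range(n_draws):
    rivals = [random.uniform(0.0, 100.0) for _ in range(n_rivals)]  # assumed rival bids
    for b in candidate_bids:
        avg[b] += second_price_payoff(value, b, rivals) / n_draws

for b in candidate_bids:
    print(f"bid {b:5.1f}: average payoff {avg[b]:6.3f}")
# Truthful bidding (60) should do at least as well as either deviation,
# reflecting that it is weakly dominant.
```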
Proof Let the seller's reserve price be denoted by R. Suppose buyer i's valuation is below the reserve price, vi < R. It is (weakly) optimal for buyer i to bid vi. The only way that buyer i can win the item is if buyer i bids more than R, but in case of winning, she pays at least R and makes a loss, vi − R < 0. Next, suppose buyer i's valuation is above the reserve price, vi ≥ R. Following the strategy implies that if buyer i wins, she pays the second highest bid b(2) < vi and makes a profit of vi − b(2). Consider a deviation: (a) Suppose buyer i bids more than her valuation, bi > vi. For b(2) ≤ vi, she pays b(2) and gets the same payoff, but for bi > b(2) > vi, she makes a loss. (b) Suppose buyer i bids less than her valuation, bi < vi. For b(2) < bi, she wins, pays b(2), and gets the same payoff, but for bi < b(2) < vi, she does not win the auction, gets a payoff of zero, and thus forgoes the positive profit vi − b(2) she would have earned by bidding truthfully.

The theorem characterizes an equilibrium which has the feature that all bidders bid their true value; bidders bid "sincerely." An interesting feature of the second-price auction is that sincere bidding is optimal irrespective of what bidding strategies other bidders adopt.

Vickrey (1961) is the classic auction paper that has emphasized these features of second-price auctions and compares the second-price auction to a first-price sealed-bid auction. Vickrey has shown that the above equilibrium is in fact efficient. By efficiency we mean that the bidder who values the item the most gets it.

We can also assess the revenues to the seller. The expected revenues (for the seller) of the Vickrey auction equal the expected second highest valuation.

Are there other equilibria in second-price auctions? The answer is yes. Suppose there is no reserve price. Consider the following "pooling equilibrium" in which bidder 1 bids b1 = A and all other bidders bid bi = 0. This is in fact a Bayesian Nash equilibrium. Nobody can benefit from deviating. Notice though that this is not a dominant strategy equilibrium. Moreover, the outcome is not efficient. We shall focus on the dominant strategy equilibrium when comparing auction outcomes across auction formats and return to the pooling equilibrium in the section describing empirical work.

Next, we shall illustrate that the equilibrium in the English auction shares features with the above equilibria.

English Auction
We consider a continuous-price version of the English auction in which the price increases continuously, without any bidding jumps, and does so until only one bidder remains. As the price increases, bidders drop out irrevocably. When only one bidder remains, the item is allocated to the last remaining bidder at the price at which the second last bidder dropped out.

Consider a bidding strategy in the English auction in which bidder i stays in until the price reaches her value vi. If everyone adopts this strategy, does this constitute an equilibrium? Yes, for the same reason as given in the above proof. In fact there is a strategic equivalence between an English auction and a second-price auction with independent private information. Think of an agent that bids on behalf of bidder i. The agent would receive a number to submit in a second-price auction and a dropout value in the English auction. By strategic equivalence we mean that the number should be identical in both auction formats.

Notice though that the strategic equivalence breaks down with interdependent values. In that case, we need to take into account that bidders form their valuation estimate based on all available information. In a second-price sealed-bid auction, the only available information is a bidder's private signal. In an English auction, bidders learn something as the price increases. For instance, when an opponent drops out of the auction, something can be inferred about that bidder's private signal, which may influence the valuation estimate. Thus, as the price increases, bidders will update their bidding strategy to take the additional information into account.

Next, we shall consider bidding behavior at a first-price auction.

First-Price Sealed-Bid Auction

In a first-price auction, bidders submit sealed bids. The high bidder wins the item and pays her
bid. In case of a tie, the auctioneer randomly assigns the item to one of the high bidders with equal probability. Losers pay nothing.

We maintain our assumption of bidder risk neutrality. If bidder i wins the item, her payoff is vi − bi, while she makes a payoff of zero if she loses. Ignoring the issue of ties for the moment, let Pr(bi > bj, for all j) denote the winning probability. It is the probability that bidder i submits the high bid.

Bidder i's (interim) expected payoff Ui will depend on the valuation vi and the bid bi submitted. For a first-price auction, the (interim) expected payoff equals

Ui(vi, bi) = [vi − bi] Pr(bi > bj, for all j)   (1)

Recall that a strategy is a mapping bi : [0, A] → ℝ. We wish to find an equilibrium. One way to proceed is to invoke calculus and work through the first-order conditions. Another way to proceed is to impose assumptions on the equilibrium and use those in the derivation. At the end, we then have to verify that it is indeed an equilibrium. This is the approach we shall adopt.

We restrict attention to bidding strategies which are differentiable, strictly monotone increasing, and symmetric. Thus, there exists a differentiable, strictly monotone increasing function b : [0, A] → ℝ so that

bi(vi) = b(vi) for all i and vi ∈ [0, A]   (2)

The restriction allows us to simplify the problem. The winning probability when other bidders use the strategy b(vj) is given by Pr(bi > bj, for all j ≠ i) = F(b^(−1)(bi))^(N−1). To see this, consider the following equivalent expressions:

Pr(bi > bj, for all j ≠ i)
= Pr(bi > b(v1), . . ., bi > b(vN))
= Pr(b^(−1)(bi) > v1) ⋯ Pr(b^(−1)(bi) > vN)
= F(b^(−1)(bi))^(N−1)

The first line writes out what bidder i's winning probability is when other bidders follow the bidding strategy b(.). The second equation uses the independence assumption and the strict monotonicity property, which implies that the inverse function b^(−1) exists and is strictly monotone. The final equation uses the definition of the cumulative distribution function F, which evaluated at the number b^(−1)(bi) gives the probability that a buyer's valuation is less than or equal to that number. In total this probability arises (N − 1) times as there are (N − 1) opponents. The (interim) expected payoff thus becomes

Ui(vi, bi) = [vi − bi] F(b^(−1)(bi))^(N−1)   (3)

In a Bayesian Nash equilibrium, it must be that the strategy b(vi) is optimal. Observe that b(0) = 0 since the valuation is zero. Observe also that any bid greater than b(A) will win for sure. Thus, possible deviation bids must be contained in the range [0, b(A)]. Since the bid strategy b(.) is strictly monotone, we can express this range with b(w) and w ∈ [0, A]. If bidder i with valuation x = vi bids b(w) rather than b(x), her payoff is

Ui(x, b(w)) = [x − b(w)] F(w)^(N−1)

Taking the derivative with respect to w yields

∂Ui(x, b(w))/∂w = [x − b(w)] ∂F(w)^(N−1)/∂w − b′(w) F(w)^(N−1)

For bidding b(x) to be optimal, this derivative must be zero when evaluated at w = x:

x ∂F(x)^(N−1)/∂x = b(x) ∂F(x)^(N−1)/∂x + b′(x) F(x)^(N−1)

Integrating both sides with respect to x:

∫_0^vi x [∂F(x)^(N−1)/∂x] dx = b(vi) F(vi)^(N−1) − b(0) F(0)^(N−1)

Since b(0) = 0 and F(0) = 0, we have an explicit solution for the bid function
b(vi) = [ ∫_0^vi x ∂F(x)^(N−1)/∂x dx ] / F(vi)^(N−1)

Observe that this strategy is indeed differentiable and strictly monotone. It is thus an equilibrium. The right-hand side is the conditional expected value of the random variable v(2) with probability density function ∂F(x)^(N−1)/∂x, conditional on v(2) being less than the valuation vi. This leads us to the following result for first-price auctions.

Theorem 2 (First-Price Auction Equilibrium) The Bayesian Nash equilibrium bid function in the first-price auction is

b(vi) = E[ v(2) | v(2) < vi ]

The equilibrium bid function has the feature that bidder i marks her valuation vi down. Bidder i bids the expected value of the high-valuation competitor conditional on her competitors' valuations being less than her own.

Example Let's consider a simple two-bidder case with F the uniform distribution on [0, 1]. The equilibrium bid function becomes

b(vi) = (∫_0^vi x dx) / vi = vi/2

What is vi/2? It equals the expected valuation of your competitor conditional on your competitor's valuation being less than your own, E[v(2) | v(2) < vi].

Observe that the first-price auction is efficient. The bidder with the high valuation wins the item. The reason is that the equilibrium bid function is identical across bidders and strictly monotone increasing.

The equilibrium derivation was based on symmetric, strictly monotone bid functions. The question arises whether there are other equilibria. The answer is no. Maskin and Riley (2003) have shown that the above bidding strategy is the only equilibrium in first-price auctions.

Richer informational environments have been studied by a number of authors. Milgrom and Weber (1982) is the classic reference for equilibria in standard auction formats with symmetric bidders with interdependent values and affiliated signals, encompassing both the common value and private value models. Bergemann et al. (2017) examine bidding implications in first-price auctions when the information structures specifying bidders' information about their own and others' values are not restricted.

We next illustrate the Bayesian Nash equilibrium in two variants of the first-price auction: (i) one in which all bidders pay their bid and (ii) the Dutch auction.

All-Pay First-Price Auction
In an all-pay first-price sealed-bid auction, bidders submit sealed bids. The high bidder wins the item. In case of a tie, the auctioneer randomly assigns the item to one of the high bidders with equal probability. All bidders (including losing bidders) pay their bid.

Bidder i's (interim) expected payoff Ui will depend on her valuation vi and her submitted bid bi. For an all-pay first-price auction, the (interim) expected payoff equals

Ui(vi, bi) = vi Pr(bi > bj, for all j) − bi   (4)

Recall that a strategy is a mapping bi : [0, A] → ℝ. We wish to find an equilibrium. We follow a similar approach as in the first-price auction, which leads to a differential equation. Suppose bidders use a strictly increasing bid function b(.). As before, this implies that Pr(bi > bj, for all j) = F(b^(−1)(bi))^(N−1). If bidder i with valuation x = vi uses b(w) rather than b(x), her payoff is

Ui(x, b(w)) = x F(w)^(N−1) − b(w)

Taking the derivative with respect to w yields

∂Ui(x, b(w))/∂w = x ∂F(w)^(N−1)/∂w − b′(w)

For bidding b(x) to be optimal, this derivative must be zero when evaluated at w = x:
x [∂F(w)^(N−1)/∂w]|_{w=x} = b′(x)

Integrating both sides with respect to x:

∫_0^vi x [∂F(w)^(N−1)/∂w]|_{w=x} dx = ∫_0^vi b′(x) dx

The left-hand side can be integrated by parts, which yields

[ x F(x)^(N−1) ]_0^vi − ∫_0^vi F(x)^(N−1) dx = b(vi) − b(0)

Since b(0) = 0, we have an explicit solution for the bid function

b(vi) = vi F(vi)^(N−1) − ∫_0^vi F(x)^(N−1) dx

This leads us to the following result for all-pay first-price auctions.

Theorem 3 (All-Pay First-Price Auction Equilibrium) The Bayesian Nash equilibrium bid function in the all-pay first-price auction is b(vi) = vi F(vi)^(N−1) − ∫_0^vi F(x)^(N−1) dx.

Observe that the all-pay first-price auction is efficient. The bidder with the high valuation wins the item. The reason is that the equilibrium bid function is identical across bidders and strictly monotone increasing.

Comparing Auction Outcomes

From an economic perspective, there are three key dimensions in which auction formats can be compared, based on (i) revenues to the seller, (ii) rents to buyers, and (iii) efficiency. We begin by comparing the first-price and second-price auction outcomes under the independent private value framework with risk-neutral buyers. Then we explore how the auction comparison looks when we depart from this set of assumptions.

A central result in the auction literature is the equivalence theorem expressed in terms of expected revenues, expected utilities, and efficiency; see Vickrey (1961), Riley and Samuelson (1981), and Myerson (1981) in increasing generality. We shall consider the comparison of first-price and second-price auctions only. The result has been extended to a wider class of auctions. The following theorem is based on the dominant strategy equilibrium in the second-price auction characterized in Theorem 1. We shall return to the pooling equilibrium in second-price auctions later on, in the section on empirics of auctions.

Theorem 4 (Equivalence) Consider the symmetric independent private value framework with risk-neutral buyers. The following properties hold:
(a) The expected revenue to the seller is the same in the first-price and second-price auction.
(b) The item goes to the buyer who values it the most in the first-price and second-price auction.
(c) The expected utility to the buyer is the same in the first-price and second-price auction.

The logic is as follows. (a) In the second-price auction the winner pays the realized second-highest valuation, while in the first-price auction she bids the expected second-highest valuation. In expectations, the realization will equal the expected value. (b) In the standard auction formats, the high-valuation bidder wins the item. The reason is that the bid functions are identical across bidders and strictly monotone. (c) From part (b), the allocation is the same in both a second-price auction and a first-price auction. In both cases, the high-valuation buyer wins. Part (a) shows that the expected revenues are the same. Therefore, expected utility must be the same.

Notice that parts (a) and (c) are in terms of expectations. Revenue, or utility, realizations need not be the same, as the price paid may differ across auction formats.

How robust is the above theorem to departures from the assumptions? We shall see that the result is very fragile. If any of the assumptions is modified, the equivalence result breaks down. We shall discuss some contributions in this literature.

Suppose buyers are risk averse, instead of risk neutral, while maintaining all other assumptions. In a second-price auction, the (weakly dominant strategy) equilibrium is not affected. It remains an equilibrium for bidders to bid their value. The equilibrium construction and proof of Theorem 1 remain valid in this case. Next consider a first-price auction. Maskin and Riley (1984) show that the equilibrium changes. Bidders bid more aggressively. The intuition is that bidders face the following trade-off. Bidding higher increases the chances of winning but comes at a utility loss of the higher bid. The first effect is not affected by attitudes toward risk, but the second is valued less when bidders are risk averse instead of risk neutral. In terms of the revenue ranking, this means a seller is better off with a first-price auction, while buyers prefer the second-price auction; see Matthews (1987).

Suppose bidders are asymmetric, that is, bidders draw their valuations from distinct probability distribution functions, while maintaining all other assumptions. Maskin and Riley (2000) have shown that the revenue (and utility) ranking can go either way. There are parametric examples of valuation distributions in which the first-price auction does better for the seller and examples in which it does worse. In terms of efficiency, the second-price auction equilibrium remains efficient, while the first-price auction equilibrium is no longer efficient.

Third, consider interdependent valuations. The classic paper analyzing bidding equilibria in this case, also permitting that signals are correlated, is Milgrom and Weber (1982). They show that with affiliated interdependent valuations, the English auction performs best in terms of revenues, followed by the second-price and then the first-price auction.

Myerson (1981) characterizes the revenue-maximizing auction. This pioneering paper develops a new approach, in which the auction rule is the choice variable. Myerson studies auction rules from a mechanism design perspective in which buyers announce their valuations and the mechanism determines the allocation and transfer payments. Myerson shows that with independent private values, both first- and second-price auctions are optimal, but not if bidders' private values are asymmetric or correlated.

So far, we have considered the choice of auction format. Within an auction format, the seller can fine-tune the auction outcome. The seller may decide a minimum bid level below which bids are rejected, whether to charge bidder participation fees, or how much information about the object to make available to bidders. The minimum bid level or reserve price is easily understood with a single bidder. In the absence of a reserve price, the bidder would acquire the item at a price of zero. With a reserve price, the seller can force the bidder to pay a positive price at the cost of sometimes not selling the item. The optimal reserve is achieved at the point where the marginal benefit of increasing the reserve price becomes zero. Myerson (1981) gives an intuitive interpretation of the optimal reserve price formula. It is the valuation where information revelation costs (or incentive costs) equal the benefits from information.

The precision of information available to bidders can be influenced by either the auctioneer or the buyers. For example, in a used car auction, the seller can decide whether bidders may inspect or even test-drive the car prior to the auction. Similarly, on eBay, the seller can decide on the
informativeness of the item description. In oil auctions bidders may decide how much money to spend on geological studies. Persico (2000) considers an affiliated-values environment and shows that bidder incentives to acquire information differ across auctions, with the marginal benefit of additional information being higher in a first-price than in a second-price auction, which may overturn Milgrom and Weber's (1982) revenue ranking result. Bergemann and Pesendorfer (2007) study the joint decision problem when the seller controls both the information and the auction rule in a private value setting. They show that increased information has the benefit of enhancing efficiency but comes at an information revelation cost. The seller's optimal policy has to balance these two elements. With few bidders, providing little information is optimal, while when the number of bidders increases, the efficiency motives dominate the information revelation cost element.

This section has examined some optimal auction design questions. Next, we shall consider empirics of auctions.

Empirics of Auctions

Bid data are available for many auctions and allow researchers to study bidders' behavior. The empirical literature has focused on two central questions: first, how to measure or quantify the underlying informational distribution from bid data and, second, how to design and assess the optimal auction for a market at hand. The first question, in terms of econometrics, is about the identification and inference of parameters determining the distribution of information. The second question is motivated by the fragility of the revenue equivalence theorem. Which elements, bidder asymmetry, risk aversion, and common versus private values, are key drivers? An answer to the second will tell us what auction rule is best used in practice, a market design question.

This section illustrates some empirical issues based on a hypothetical and stylized data set. We shall ignore bidder heterogeneity, auction heterogeneity, and covariates. We shall comment on these extensions later on. We shall start with first-price auctions and then consider second-price (or English) auction data.

Before proceeding, let us raise one central issue for empirical work which concerns the well-known problem of selection bias. The theoretical analysis and equilibrium characterization for specific auction rules assumes a known number of "potential" bidders. In fact, not all of these potential bidders may submit a bid. For example, a reserve price or bidder participation cost may reduce bidder participation. Empirically, this poses a problem as we only observe the "actual" bidders and need to infer the set of "potential" bidders. Put differently, the number of observed bids may not be an accurate picture of the degree of competition. One way to proceed is to estimate the potential number of bidders by using the maximum number of observed bids across auctions or a subset of auctions. Such an extremum estimator has nice asymptotic properties. Another way is to model this selection explicitly; see Li and Zheng (2009) and Athey et al. (2011). We shall ignore this issue in this exposition and assume that the actual number of bidders equals the potential number of bidders.

Empirics of First-Price Auctions
Consider the following assumptions about the data-generating process, consisting of a cross section of first-price sealed-bid auctions.

Assumption The data-generating process for the bid data {bi^t}_{i=1}^N is equilibrium bidding for independent private-value first-price auctions in which each auction t, t = 1, . . ., T, had (i) an identical object and (ii) a fixed (and known) number of identical bidders N.

Equilibrium bidding means that bidders follow the unique bidding strategy characterized in Theorem 2. Independent private values mean that a bidder i's valuation equals her signal, vi = yi, and is drawn identically and independently from a distribution F. The fixed number of bidders means that we do not have to worry about the distinction between "actual" and "potential" bidders.
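To make this data-generating assumption concrete, the following sketch (illustrative, not from the original entry) simulates such a cross section of auctions with an assumed uniform F on [0, 1], using the equilibrium bid function of Theorem 2, which in the uniform case reduces to b(v) = (N − 1)v/N.

```python
import numpy as np

rng = np.random.default_rng(1)

N = 4        # fixed, known number of identical bidders per auction
T = 500      # number of auctions in the cross section

def equilibrium_bid(v, n):
    # First-price equilibrium bid with uniform[0,1] values:
    # b(v) = E[v_(2) | v_(2) < v] = (n - 1) / n * v
    return (n - 1) / n * v

values = rng.uniform(0.0, 1.0, size=(T, N))   # private values v_i^t, iid draws from F
bids = equilibrium_bid(values, N)             # simulated observed bid data {b_i^t}

print("average bid:", round(bids.mean(), 3))                    # about (N-1)/N * 0.5
print("average winning bid:", round(bids.max(axis=1).mean(), 3))
```

Simulated bid data of this kind can be fed into the estimation approach described next, which only uses the observed bids and the number of bidders.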
The empirical question is how to estimate the cdf F from bid data. Distinct estimation methods exist. Our description shall focus on a popular and commonly used inference approach. This approach looks at the optimal bid choice vis-à-vis the empirical distribution of opponent bids.

Following Guerre et al. (2000), the problem can be formulated based on H(b), the probability distribution function of bids b. Bidder i's problem of finding a bid that maximizes the expected payoff in a first-price auction under risk neutrality, private values, and when bids are independently distributed, can be written as:

max_{bi} [vi − bi] H(bi)^(N−1).

The first-order condition is

−H(bi)^(N−1) + [vi − bi] (N − 1) H(bi)^(N−2) H′(bi) = 0,

which can be rewritten to obtain an explicit expression for the valuation vi:

vi = bi + H(bi) / [ (N − 1) H′(bi) ].   (5)

Equation 5 is the inverse of the theoretical bid function characterized in the section entitled "First-Price Sealed-Bid Auction". It tells us which valuation rationalizes the observed bid. The right-hand side elements are the bid bi, the number of bidders N, the cdf H, and the pdf H′. The pdf and cdf can be estimated from the bid data {bi^t}_{i,t} by using a suitable estimator. For example, the cdf H can be consistently estimated with the empirical cdf Ĥ(x) = (1/(TN)) Σ_{i,t} 1(bi^t ≤ x), where 1(.) is the indicator function. The indicator function equals one if its argument holds, and zero otherwise.

Guerre et al. (2000) advocate a nonparametric estimator in which first the cdf and pdf of bids are estimated by using kernel estimators, and then, in the second step, (pseudo) valuations and their distribution are inferred. In practice this approach has gained a lot of popularity. One reason is its simplicity. A second reason is that it extends readily in various directions, including bidder heterogeneity, different informational distributions, different auction rules, multi-unit auctions, and even sequential auctions. In practice, researchers tend to use parametric approaches to estimate the distributions to allow for covariates to enter; see Hong and Paarsch (2006).

Empirics of Second-Price Auctions
Suppose the data-generating process is a second-price auction instead of a first-price auction.

Assumption The data-generating process for the bid data {bi^t}_{i=1}^N is equilibrium bidding for independent private-value second-price auctions in which each auction t, t = 1, . . ., T, had (i) an identical object and (ii) a fixed number of identical bidders N.

We have seen earlier that there can be multiple equilibria in second-price auctions. One equilibrium, in which one bidder bids high and all other bidders bid low, which is called a "pooling" equilibrium, has the feature that multiple valuations rationalize a bid. If this equilibrium arises in the data, then it may not be possible to infer the details of the distribution of valuations. However, bounds on the support of valuations could be inferred. On the other hand, if the dominant strategy equilibrium is played, in which a bid equals the value, then inference of the valuation distribution is straightforward. In this case, the distribution of valuations can be estimated by using the empirical cdf of bids

F̂(x) = (1/(TN)) Σ_{i,t} 1(bi^t ≤ x).

The dominant strategy equilibrium allows the econometrician to readily infer the underlying valuations and distribution of valuations from the observed bids.

In practice the researcher may not know the type of equilibrium played. Furthermore, different equilibria may be played in the cross section of auctions. How to deal with inference in this case has been an ongoing research area not only in the empirical auction literature but in economics in general; see Tamer (2003).

Winner's Curse

In addition to using field data, economists also use laboratory experiments to study market outcomes.
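Before turning to the laboratory evidence, note that the second-price estimator F̂ described above is a one-liner in code. The sketch below is a purely illustrative check under assumed uniform values, not part of the original entry: under the dominant strategy equilibrium bids coincide with valuations, so the empirical cdf of bids recovers F directly.

```python
import numpy as np

rng = np.random.default_rng(4)
N, T = 5, 1000

values = rng.uniform(0.0, 1.0, size=(T, N))  # private values, iid draws from F = uniform[0,1]
bids = values.copy()                         # dominant strategy equilibrium: bid equals value

def F_hat(x, bid_data):
    # Empirical cdf of bids: (1/(TN)) * sum of indicators 1(b_i^t <= x)
    return float(np.mean(bid_data <= x))

for x in (0.25, 0.5, 0.75):
    print(f"F_hat({x}) = {F_hat(x, bids):.3f}   (true F({x}) = {x})")
```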
Bazerman and Samuelson (1983) conducted a first-price sealed-bid auction experiment with MBA students at Boston University. The number of student bidders varied between 4 and 26 across classes, and in each class, a jar of 800 pennies was offered for sale. The value of the jar was unknown to students. Students were asked to provide their bid and best estimate of the jar value. The average value estimate equaled $5.13, which is $2.87 below the true value. The average winning bid was $10.01, which amounts to an average loss of $2.01, with losses occurring in over half of all the auctions. The evidence suggests a "winner's curse."

The curse can be explained by using a (pure) common value environment in which bidder i's signal yi is bidder i's unbiased estimate of the value v, E[yi | v] = v. (A framework that satisfies this assumption is when signals are noisy estimates of the true jar value, yi = v + ei with E[ei] = 0, i.i.d.) If bidders use a monotone increasing bidding strategy, then the auction winner will be the bidder with the high signal, max_i{yi}. The curse arises as the high-signal bidder in fact overestimated the true value, E[max_i{yi} | v] > max_i E[yi | v] = v. The result follows since max is a convex function and from Jensen's inequality, which says that for a convex function f, it must be that E f(y) > f(E y). Thus, winning confers bad news in the sense that the bidder learns ex post that she was the high-signal bidder.

How can we interpret the winner's curse in terms of equilibrium bidding? The section entitled "First-Price Sealed-Bid Auction" has shown that bidders will shade their bid down in a private value setting. Now, with common values, bidders will shade their bid down even further in anticipation of the bad news effect. In particular, bidders will calculate the value of winning based on having the high signal for the object, E[v | yi, yi ≥ yj for all j].

Evidence from field data about a winner's curse is mixed. Porter (1995) summarizes his joint work with Hendricks on field data for common value auctions. They analyze the US offshore mineral rights program in which the right to extract oil off the US coast has been awarded in the form of a first-price sealed-bid auction since the 1950s.

Table 1 reports selected summary statistics for newly explored areas, Wildcat tracts; see Porter (1995, Table II). All dollar figures are in millions of 1972 dollars. Standard deviations are in parentheses. To the extent that the value of the oil field is the same for all bidders, say because of competitive raw oil prices, the informational environment can be viewed as common values.

Interestingly, there is substantial uncertainty about the price of these oil fields. The amount overpaid by the winning bidder, (b(1) − b(2))/b(2), or "money left on the table," averages 44%, or about 2.67 million of the average price of 6.07 million dollars. The uncertainty remains substantial even when the number of bidders equals ten or more and amounts to 30% of the final price paid.

A winner's curse implies a positive correlation between the price paid and the number of bids. Indeed, the data exhibit a positive correlation, as is evident in row 1. With one bidder, the average price paid, b(1), equals 1.50 million dollars, while with ten or more bidders, the price paid increases to 21.8 million. The price paid increases as the number of bidders increases. This positive correlation can alternatively be attributable to endogenous bidder participation decisions. For example, high-valued tracts may attract more bidders. To consider this alternative hypothesis, Table 1 reports what fraction of tracts were drilled and also productive (oil was found). Conditional on being drilled and oil being found, the last row reports the discounted revenues (ignoring drilling costs). Table 1 shows that tracts of higher value attract more bidders.

Table 2 reports a crude assessment of net returns calculated from Table 1. Net returns are defined as discounted revenues times the probability of drilling times the probability of being productive, minus the winning bid. This calculation ignores drilling costs. Table 2 shows that net returns are positive throughout. There is no evidence that winning bidders make a loss in these oil field auctions. The evidence appears to reject the winner's curse.

Investigating bidding strategies further, Porter reports an interesting (ex post) best response test which is a precursor to the (interim) best response formula developed by Guerre et al. (2000).
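The interim best response formula (Eq. 5) can be put to work on simulated first-price bid data. The sketch below is illustrative and not the authors' code: it generates equilibrium bids under the uniform-value assumptions used earlier, estimates the bid distribution with an empirical cdf and a basic Gaussian kernel density, and inverts each bid into a pseudo-valuation in the spirit of Guerre et al. (2000). The bandwidth and sample sizes are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 4, 500
values = rng.uniform(0.0, 1.0, size=(T, N))
bids = (N - 1) / N * values          # first-price equilibrium bids under uniform[0,1] values
b = bids.ravel()

def H_hat(x):
    # Empirical cdf of bids: share of observed bids at or below each point in x
    return np.mean(b[None, :] <= x[:, None], axis=1)

def h_hat(x, bandwidth=0.02):
    # Gaussian kernel density estimate of the bid density H'
    z = (x[:, None] - b[None, :]) / bandwidth
    return np.mean(np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi), axis=1) / bandwidth

# Invert Eq. 5:  v = b + H(b) / ((N - 1) H'(b))  to obtain pseudo-valuations
pseudo_values = b + H_hat(b) / ((N - 1) * h_hat(b))

true_v = values.ravel()
print("mean absolute error of recovered valuations:",
      round(float(np.mean(np.abs(pseudo_values - true_v))), 3))
```

Apart from the usual boundary effects of kernel estimators, the recovered pseudo-valuations track the simulated private values closely, which is the logic exploited by the two-step nonparametric estimator described above.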
Consider the set of tracts on which a given firm A bidding ring has to designate a cartel bidder.
submits bids. Assume that the bids of rival firms Suppose all bidders collude, so that the bidding
and ex post returns are held fixed. Suppose the ring is all-inclusive. Ideally, the ring would like to
vector of bids submitted by the firm in question is send only one bidder to the seller’s auction and
varied proportionally. If all of the firm’s bids are ask all other ring members to refrain from bidding
increased, it will win more tracts but earn less per or to submit phony bids. How can the ring deter-
tract. In that way the optimal bid proportion that mine their designated cartel bidder? Graham and
maximizes ex post returns can be calculated. Por- Marshall (1987) point out that a pre-auction
ter reports that few firms did not behave optimally knockout can achieve this. Suppose the colluding
and overbid. The calculation is ex post and does ring holds an English auction prior to the seller’s
not take into account the uncertainty bidders were auction and shares the proceeds periodically in
facing at the time of bid submission. equal shares. The winner of the pre-auction
So far our analysis has assumed that bidders behave competitively. Next, we shall describe some issues that arise when bidders collude.

Collusive Bidding

Collusive bidding is illegal in many settings, but the temptation exists and has led to many bid-rigging cases pursued by antitrust authorities. Exceptions include joint bidding in OCS auctions, which is legal at least among some bidders as described in Porter (1995), and subcontracting, which is legal in some procurement auctions. We shall discuss some issues relating to collusion among bidders. Collusion may also arise between the auctioneer and one or more bidders, but we shall leave that aside.

A bidding ring has to designate a cartel bidder. Suppose all bidders collude, so that the bidding ring is all-inclusive. Ideally, the ring would like to send only one bidder to the seller's auction and ask all other ring members to refrain from bidding or to submit phony bids. How can the ring determine its designated cartel bidder? Graham and Marshall (1987) point out that a pre-auction knockout can achieve this. Suppose the colluding ring holds an English auction prior to the seller's auction and shares the proceeds periodically in equal parts. The winner of the pre-auction knockout gets the right to be the only (serious) bidder at the seller's auction and pays the difference between the second highest bid and the seller's reserve price, b2 – R. Theorem 1 shows that there exists an equilibrium in which bidders bid their value. Thus, the high-value bidder wins the knockout auction and pays the second highest bid minus the seller's reserve price (provided the value is above the reserve price).
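A minimal sketch of the Graham and Marshall arrangement, under made-up valuations, is the following: ring members bid their values in the knockout, the winner becomes the sole serious bidder at the seller's auction, and the amount b2 – R is paid into the ring's pot for the periodic equal division described above.

```python
def pre_auction_knockout(valuations, reserve):
    """Illustrative all-inclusive ring in the spirit of Graham and Marshall (1987).

    Ring members bid their values in an English knockout; the winner becomes the
    sole serious bidder at the seller's auction and pays b2 - R into the ring pot.
    Returns (winner index, price paid to the seller, payment pooled by the ring).
    """
    order = sorted(range(len(valuations)), key=lambda i: valuations[i], reverse=True)
    winner, runner_up = order[0], order[1]
    if valuations[winner] < reserve:
        return None, 0.0, 0.0                 # no sale: even the ring's best value is below R
    price_to_seller = reserve                 # only one serious bid, so the reserve price binds
    knockout_payment = max(valuations[runner_up] - reserve, 0.0)   # b2 - R
    return winner, price_to_seller, knockout_payment

# Example: ring values 5, 8, 12 and a reserve price of 4.
print(pre_auction_knockout([5.0, 8.0, 12.0], reserve=4.0))
# A competitive English auction would give the seller roughly the second value, 8;
# with the ring the seller receives only the reserve, 4, and the ring pockets the rest.
```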
Observe that the collusive outcome achieves an efficient allocation. The high-value bidder wins. The rent distribution is now shifted relative to the competitive outcome, with a larger share of the rent going to bidders instead of the seller. What could the auctioneer do in response? Well, anticipating that a ring is in operation, the seller can increase the reserve price. The optimal level would be based on the expectation that the seller faces a single cartel bidder whose valuation is the highest valuation among the ring members.

The pre-auction knockout requires periodic division of spoils. Such side payments may leave a paper trail, which increases the risk of detection by antitrust authorities. To minimize the risk of detection, the ring may refrain from using side payments altogether. Instead the ring may use some other scheme or mechanism. Let q_i(v̂) denote the probability that bidder i is the designated cartel bidder when ring members announce a valuation profile v̂ = (v̂_1, ..., v̂_N) to the ring mechanism. For bidders to tell the truth, it has to be that the expected probability of being the designated cartel bidder is independent of the own announcement v̂_i. Otherwise, bidder i would announce the valuation that achieves the highest probability. In turn this implies that the ring mechanism must assign items irrespective of the valuations. One scheme that achieves this is the "phases of the moon" allocation scheme, as used by companies involved in the great electrical conspiracy during the 1960s. Notice though that such a scheme is not efficient, as the bidder with the high valuation is not necessarily selected. Thus, the collusive spoils will not be as high as when side payments are available.
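The independence requirement on q_i(v̂) can be illustrated with a toy comparison (both schemes below are hypothetical stand-ins): under a rotation scheme in the spirit of "phases of the moon", the probability of being designated does not depend on the announcement, so truth-telling is costless; under a scheme that designates the highest announcement without side payments, that probability rises with the report and members have an incentive to exaggerate.

```python
import itertools

def rotation_scheme(announcements, round_number):
    """Designation ignores announcements entirely: rotate by calendar round."""
    return round_number % len(announcements)

def highest_announcement_scheme(announcements, round_number):
    """Designation goes to the highest announced valuation (no side payments)."""
    return max(range(len(announcements)), key=lambda i: announcements[i])

def designation_probability(scheme, i, my_report, others_reports, n_rounds=4):
    """Average frequency with which bidder i is designated, over rounds and others' reports."""
    count = total = 0
    for r in range(n_rounds):
        for others in others_reports:
            profile = list(others)
            profile.insert(i, my_report)
            count += (scheme(profile, r) == i)
            total += 1
    return count / total

others = list(itertools.product([1, 2, 3], repeat=2))   # possible reports of the other two members
for report in (1, 2, 3):
    print(report,
          designation_probability(rotation_scheme, 0, report, others),
          designation_probability(highest_announcement_scheme, 0, report, others))
# Under rotation the probability is flat in bidder 0's report; under the
# highest-announcement scheme it rises with the report, so truth-telling breaks down.
```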
The empirical literature on collusion in auctions is small. Two questions have been the focus: first, how can collusive behavior be detected based on bid data and, second, how cartels behave in practice. Porter and Zona (1993) propose a statistical test procedure to determine whether a subset of bidders colluded or not. The test is applied to highway procurement auctions. Pesendorfer (2000) shows that cartels may adopt distinct collusive schemes in practice. Pesendorfer studies school milk auctions and finds that in one regional market, Florida, cartel firms appear to use side payments, while in another regional market, Texas, cartel firms refrained from using side payments.

Concluding Remarks

Since the seminal papers by Vickrey (1961) and Milgrom and Weber (1982), research on auctions has created a large body of literature. This entry examined well-established results including competitive bidding behavior in standard auction formats, revenue and utility comparison across auction formats, empirics of auctions, winner's curse, and collusion. The setup involved a one-shot single-unit auction using a standard auction format.

Richer settings involving more goods, sold sequentially or simultaneously, and/or more involved market rules have already received attention in the economic literature and will attract more interest in the future. Market and mechanism design has become a successful area in economics from a theoretical, empirical, and practical perspective. It is to be expected that this will continue to be a fruitful research area in the future.

Bibliography

Athey S, Levin J, Seira E (2011) Comparing open and sealed bid auctions: evidence from timber auctions. Q J Econ 126:207–257
Bazerman MH, Samuelson WF (1983) I won the auction but don't want the prize. J Confl Resolut 27(4):618–634
Bergemann D, Pesendorfer M (2007) Information structures in optimal auctions. J Econ Theory 137:580–609
Bergemann D, Brooks B, Morris S (2017) First-price auctions with general information structures: implications for bidding and revenue. Econometrica 85(1):107–143
Cassady R (1967) Auctions and auctioneering. University of California Press, Berkeley
Graham DA, Marshall RC (1987) Collusive bidder behavior at single-object second-price and English auctions. J Polit Econ 95(6):1217–1239
Guerre E, Perrigne I, Vuong Q (2000) Optimal nonparametric estimation of first-price auctions. Econometrica 68(3):525–574
Hong H, Paarsch HH (2006) An introduction to the structural econometrics of auction data. MIT Press, Cambridge
Hortacsu A, McAdams D (2016) Empirical work on auctions of multiple objects. J Econ Lit (forthcoming)
Klemperer P (1999) Auction theory: a guide to the literature. J Econ Surv 13(3):227–286
Krishna V (2002) Auction theory. Academic, San Diego
Li T, Zheng X (2009) Entry and competition effects in first-price auctions: theory and evidence from procurement auctions. Rev Econ Stud 76(4):1397–1429
Maskin E, Riley J (1984) Optimal auctions with risk averse buyers. Econometrica 52(6):1473–1518
Maskin E, Riley J (2000) Asymmetric auctions. Rev Econ Stud 67:413–438
Maskin E, Riley J (2003) Uniqueness of equilibrium in sealed high-bid auctions. Games Econ Behav 45:395–409
Matthews S (1987) Comparing auctions for risk averse buyers: a buyer's point of view. Econometrica 55(3):633–646
McAfee P, McMillan J (1987) Auctions and bidding. J Econ Lit 25:699–738
Milgrom PR, Weber RJ (1982) A theory of auctions and competitive bidding. Econometrica 50:1089–1122
Myerson RB (1981) Optimal auction design. Math Oper Res 6:58–73
Persico N (2000) Information acquisition in auctions. Econometrica 68(1):135–148
Pesendorfer M (2000) A study of collusion in first-price auctions. Rev Econ Stud 67(3):381–411
Porter RH (1995) The role of information in U.S. offshore oil and gas lease auctions. Econometrica 63(1):1–27
Porter RH, Zona JD (1993) Detection of bid rigging in procurement auctions. J Polit Econ 101(3):518–538
Riley JG, Samuelson WF (1981) Optimal auctions. Am Econ Rev 71:381–392
Tamer E (2003) Incomplete simultaneous discrete response model with multiple equilibria. Rev Econ Stud 70(1):147–165
Vickrey W (1961) Counterspeculation, auctions, and competitive sealed tenders. J Financ 16(1):8–37
Wilson R (1977) A bidding model of perfect competition. Rev Econ Stud 44:511–518
State of the world Description of all information
Implementation Theory possessed by all agents.
Type of an agent All the information possessed
Luis C. Corchón by this agent. It may refer to the preferences of
Departamento de Economía, Universidad this agent and/or to the knowledge of this agent
Carlos III, Madrid, Spain of the preferences of other agents.
Definition
Article Outline
Implementation theory studies which social
Glossary objectives (i.e., social choice rules) are compatible
Definition with the incentives of the agents (i.e., are
Introduction implementable). In other words, it is the system-
Brief History of Implementation Theory atic study of the social goals that can be achieved
The Main Concepts when agents behave strategically.
The Main Insights
Unsolved Issues and Further Research
Answers to the Questions Introduction
Bibliography
Dear colleague;
Glossary I wrote this survey with you in mind. You are
an economist doing research who would like to
Equilibrium concept A mapping (or a collection know why implementation is important. And by
of them) from the set of states of the world into this I do not mean why some people won the
allocations yielded by equilibrium messages. Nobel Prize working in this area. I mean, what
This equilibrium is a game-theoretical notion are the deep insights found by implementation
of how agents behave, e.g., Nash equilibrium, theory and what applications are delivered by
Bayesian equilibrium, dominant strategies, etc. these tools. I propose a simple game: try to answer
Implementable social choice rule in an equilib- the following questions. If you cannot answer
rium concept (e.g., Nash equilibrium) A them, but you think they are important, read the
social choice rule is implementable in an equi- survey. At the end of this survey, I will give you
librium concept (e.g., Nash equilibrium) if the answers. I will also tell you why I like imple-
there is a mechanism such that for each state mentation theory so much!
of the world, the allocations prescribed by the
social choice rule and those yielded by the 1. Why are agents price-takers? Is price-taking
equilibrium concept coincide. possible in economies with a finite number of
Mechanism A list of message spaces and an agents?
outcome function mapping messages into allo- 2. Suppose two firms wish to merge. They claim
cations. It represents the communication and that the merger will bring large cost reductions
decision aspects of the organization. but some people fear that the firms just want to
Social choice rule A correspondence mapping the avoid competition. What would be your advice?
set of states of the world in the set of allocations. 3. How should a monopoly be regulated when
It represents the social objectives that the society regulators do not know the cost function or
or its representatives want to achieve. the demand function of the monopolist?
4. How should it be determined whether or not a prescribed to them (or who will provide and pre-
public facility – a road, a bridge, and a serve capital in a system where the private property
stadium – should be constructed and who of such items is forbidden?)? Samuelson (1954)
should pay for it? voiced identical concern about the Lindahl solution
5. Is justice possible in this world? Can we rec- to allocate public goods: “It is in the selfish interest
oncile justice and self-interest? of each person to give false signals.” This concern
6. Can an uninformed planner achieve better allo- gave rise later on to the golden rule of incentives – as
cations than those produced by completely stated by Roger Myerson (1985): “An organization
informed agents in an unregulated market? must give its members the correct incentives to share
7. In competitive ice skating, the highest and information and act appropriately.” Earlier, it had
lowest marks awarded by judges are discarded aroused the interest of Leonid Hurwicz, the father of
and the remaining are averaged. Do you think implementation theory, in economic systems other
that this procedure eliminates incentives to than the market. In any case, it was clear that an
manipulate votes? important ingredient was missing in the theory of
8. What kind of policies would you advocate to economic systems. This element was that not all
fight global warming? information needed for resource allocation was
transmitted by prices: Some vital items have to be
The answers to these questions are found in transmitted by agents.
section “Answers to Questions.” The rest of this Several proposals arose to fill the gap: on the
paper goes as follows. Section “Brief History of one hand, models of markets under asymmetric
Implementation Theory” is a historical introduc- information, Vickrey (1961), Akerlof (1970),
tion that can be skipped. Section “The Main Con- Spence (1973), and Rothchild and Stiglitz (1976),
cepts” explains the basic model. Section “The and on the other hand, models of public interven-
Main Insights” explains the main results. tion, like optimal taxation, Mirless (1971), and
Section “Unsolved Issues and Further Research” mechanisms for allocating public goods, Clarke
offers some thoughts about the future direction of (1971) and Groves (1973), with the so-called
the topic. principal-agent models somewhere in the middle.
The keyword was “truthful revelation” or “incen-
tive compatibility”: Truthful revelation of informa-
Brief History of Implementation Theory tion must be an equilibrium strategy, either a
dominant strategy, as in Clarke and Groves, or a
From, at least Adam Smith on, we have assumed Bayesian equilibrium as in Arrow (1977) and
that agents are motivated by self-interest. We also D’Aspremont and Gerard-Varet (1979).
assumed that agents interact in a market economy A motivation for this procedure was provided by
where prices match supply and demand. This tradi- the “revelation principle,” Gibbard (1973),
tion crystallized in the Arrow-Debreu-McKenzie Myerson (1979), Dasgupta et al. (1979), and Harris
model of general equilibrium in the 1950s. But it and Townsend (1981): If a mechanism yields cer-
was quickly discovered that this model had impor- tain allocations in equilibrium, telling the truth
tant pitfalls other than focusing on a narrow class of about one’s characteristics must be an equilibrium
economic systems: On the one hand, an extra agent as well (however, telling the truth may not be an
was needed to set prices, the auctioneer. On the other equilibrium in the original mechanism you might
hand, agents follow rules, i.e., to take prices as have to use an equivalent direct mechanism). This
given, which are not necessarily consistent with result is of utmost importance and it will be thor-
self-interest. An identical question had arisen earlier oughly considered in section “The Main Con-
when Taylor (1929) and Lange (1936–1937), fol- cepts.” However, it was somehow misread as
lowing Barone (1908), proposed a market socialism, “there is no loss of generality in focussing on
where socialist managers maximize profits: Why incentive compatibility.” But what the revelation
would socialist managers choose output in the way principle asserts is that truthful revelation is one of
the, possibly, many equilibria. It does not say that with social objectives, the third revolves around
truthful revelation is the only equilibrium. As we the notion of a mechanism, and the last defines the
will see in some cases, it is a particularly unsatis- equilibrium concepts that we will use here.
factory way of selecting equilibria.
The paper by Hurwicz (1959), popularized by The Environment
Reiter (1977), presented a formal structure for the Let I = {1, . . . , n} be the set of agents. Let yi be
study of economic mechanisms which has been the type of i. This includes all the information in
followed by all subsequent papers. Maskin the hands of i. Let Yi be agent i’s type set. The set
(1999), whose first version circulated in 1977, is Y ∏ni¼1 Yi is the set of states of the world.
credited as the first paper where the problem of For each state of the world, we have a feasible
multiple equilibria was addressed as a part of the set A(y) and a preference profile R(y) =
model and not as an afterthought; see the report of (R1(y), . . . , Rn(y)). Ri(y) is a complete, reflex-
the Nobel Prize Committee (2007). Maskin studied ive, and transitive binary relation on A(y). Ii(y)
implementation in Nash equilibrium (see “Glos- denotes the corresponding indifference relation.
sary”). Later his results were generalized to Bayes- Set A [y YA(y). Let a=
ian equilibrium by Postlewaite and Schmeidler (a1, a2. . ., an) A be an allocation, also written
(1986) and Palfrey and Srivastava (1987, 1989). (ai, ai), where ai (a1, a2, . . . , ai 1, ai + 1,
Finally, Moulin (1979) studied dominance . . . , an).
solvability and Moore and Repullo (1988) sub- The standard model of an exchange economy
game perfect equilibrium. The century closed is a special case of this model: y is an economy.
with several characterizations on what can be Xi(y) ℜk is the consumption set of
implemented in other equilibrium concepts: i. wi(y) intXi(y) are the endowments in the
Moore and Repullo (1990) in Nash equilibrium, hands of i. The preferences of i are defined on
Palfrey and Srivastava (1991) in undominated Xi(y). The set of allocations A(y) is defined as
Nash equilibrium, Jackson (1991) in Bayesian (
X n
equilibrium, Dutta and Sen (1991a) in strong
AðyÞ ¼ a a þ wij ðyÞ 0, j ¼ 1, 2, . . . , k,
equilibrium, and Sjöström (1993) in trembling i¼1 ij
hand equilibria. With all these papers in mind,
the basic aspects of implementation theory are ðai1 , ai2 , . . . , aik Þ Xi ðyÞ, 8i I:g:
now well understood.
The interested reader may complement the A special case of an exchange economy is
previous account with the surveys by Maskin bilateral trading: Here there are two agents, the
and Sjöström (2002) and Serrano (2004) which seller and the buyer. The seller has a unit of an
cover the basic results and by Baliga and Sjöström indivisible good and both agents are endowed
(2007) for new developments including experi- with an infinitely divisible good (“money”). Pref-
ments. See also Maskin (1985), Moore (1992), erences are representable by linear utility func-
Corchón (1996), Jackson (2001), and Palfrey tions. The type of each agent, also called her
(2002). Several important applications of imple- valuation, is the marginal rate of substitution
mentation theory are not surveyed here: auctions, between both goods. Finally, the set of types is a
see Krishna (2002); contract theory, see Lafont closed interval of the real line.
and Martimort (2001); matching, see Roth Another example is the social choice model
(2008); and moral hazard, see Ma et al. (1988). where the set of states of the world is the Cartesian
product of individual type sets, Y ¼ ∏ni¼1 Yi. The
set of feasible allocations is constant. The prefer-
The Main Concepts ences of each agent only depend on her type, for
all y ϵ Y, Ri(y) = Ri(yi) all i ϵ I.
We divide this section into four subsections: The The model of public goods is a hybrid of the
first describes the environment, the second deals social choice and the exchange economy models.
For a subset of goods, say 1 , 2 , . . . , l, agents defective cars is less than the price of reliable cars.
receive the same bundle (these are the public But perhaps we may design ways in which the
goods). For goods l + 1 , . . . , k, agents can messages sent by different agents are checked one
consume possibly different bundles. against the other. We may also design ways in
which agents send information by indirect
Social Objectives means, say by raising flags, making gestures,
Implementation begins by asking what allocations and so on and so forth. This is the idea behind
we want to achieve. In this sense, implementation the concept of a mechanism (also called a
theory reverses the usual procedure, namely, fix a game form).
mechanism and see what the outcomes are. The Formally, a mechanism is a pair (M, g) where
theory is rather agnostic as to who is behind we: It M ∏n1 Mi is the message space and g : M ! A
could be a democratic society, it could be a dicta- is the outcome function. Mi denotes agent i’s
tor, it could be a benevolent planner, message space with typical element mi. In some
etc. Formally, a correspondence F : Y ↠ A such cases, i.e., when goods are indivisible, the out-
that F(y) A(y) for all y Y will be called a come function maps M into the set of lotteries on
social choice rule (SCR). Under risk or uncer- A, denoted by ℒA. In this case, the outcome func-
tainty, allocations are state-dependent (recall the tion yields the probability of obtaining an object.
concept of contingent commodities in general Let m = (m1, . . .mn) M, be a list of messages,
equilibrium). Thus, an allocation is a single- also written (mi, mi) where mi is a list of all
valued function f : Y ! A. The notion of a SCR messages except those sent by i.
is replaced by that of a social choice set (SCS) Another interpretation of a mechanism, more
defined as a collection of functions mapping Y in tune with decentralized systems, is that mes-
into A. Examples of SCR are the Pareto rule, sages describe contracts among agents, and the
which maps every state into the set of Pareto outcome function is a legal system that converts
efficient allocations for this state, the Walrasian contracts into allocations.
SCR which maps every economy in the set of If feasible sets are state-dependent, we have a
allocations that are a Walrasian equilibrium for problem: Suppose that at y we want to achieve
this economy, etc. allocation a A(y). So there must be a message,
If states of the world were contractible, i.e., if say m such that g(m) = a. But what if there is
they could be written in an enforceable contract another state, say y0 for which a 2 = A(y0)? In this
0
specifying the allocations in each state, SCR or case, g(m) 2 = A(y ). In other words, since mecha-
SCS would be directly achieved, assuming that nisms are not state-dependent, they may yield
those not complying could be punished harshly. unfeasible allocations. We will postpone the dis-
Unfortunately, states of the world are a description cussion of this problem until section “Unsolved
of preferences and productive capabilities, being Issues and Further Research.” For the time being,
those difficult to describe and therefore easy to let us assume that feasible sets are not state-
manipulate. Thus, we have to find another method dependent.
to reach the desired allocations.
Mechanisms Equilibrium
If the information necessary to judge the desirabil- Since the messages sent by agents are tied to their
ity of allocations is in the hands of agents, it seems incentives, it is clear that we have to use an equi-
that the only way of retrieving this information is librium concept borrowed from game theory.
by asking them. But, of course, agents cannot be Thus, given y Y, a mechanism (M, g) induces
trusted to reveal truthfully their information a game in normal form (M, g, y). There are many
because they might lose by doing so. Thus, the “solutions” to what would constitute an equilib-
owner of a defective car will think twice about rium. Let us begin by considering the notion of a
revealing the true state of the car if the price of Nash equilibrium.
Definition 1 A message profile m M is a (s1, . . . , sn) also written as (si, si). For simplic-
Nash equilibrium for (M, g, y) if, for all i I g ity, the next definition assumes that type sets are
ðm ÞRi ðyÞg mi , mi for all mi Mi. finite.
Let NE(M, g, y) be the set of allocations
yielded by all Nash equilibria of (M, g, y). We Definition 3 A Bayesian equilibrium (BE) for
now ask, given a SCR, what mechanism, if any, (M, g, R()) is a s* such that for all i, y Y,
would produce outcomes identical to the SCR? In and mi Mi,
this sense, the mechanism is the variable of our X
analysis, i.e., the mechanism “solves” the equa- qðyi j yi ÞV i ðgðs ðyÞÞ, yÞ
tion NE(M, g, y) = F(y), for all y Y. yi Yi
X
Formally, qðyi j yi ÞV i g mi , si ðyl iÞ , y
yi Yi
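Written out in the notation of this section, with q(y_{-i} | y_i) the conditional distribution over the other agents' types and V_i the payoff function, the displayed condition in Definition 3 is the interim best-response requirement:

```latex
\sum_{y_{-i}\in Y_{-i}} q(y_{-i}\mid y_i)\, V_i\bigl(g(s^{*}(y)),\,y\bigr)
\;\ge\;
\sum_{y_{-i}\in Y_{-i}} q(y_{-i}\mid y_i)\, V_i\bigl(g(m_i, s^{*}_{-i}(y_{-i})),\,y\bigr)
\qquad \text{for all } i,\; y\in Y,\; m_i\in M_i .
```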
The Revelation Principle and Its Theorem 1 (T. 1 in the sequel) can be explained
Consequences in terms of a mediator, i.e., somebody to whom
The definition of a mechanism is extremely abstract. you say “who you are” and who chooses the
No conditions have been imposed on what might strategy that maximizes your payoffs on your
constitute a message space or an outcome function. behalf. Would you try to fool such a person? If
And since implementation theory considers the you do so, you are fooling yourself because the
mechanism the variable to be found, this is an mediator would choose a strategy that is not the
unhappy situation: We are asked to find something best for you. Thus, the best thing for you to do is to
whose characteristics we do not know! Fortunately, tell the truth (providing an unexpected backing to
the revelation principle comes to the rescue by stat- the aphorism “honesty is the best policy!”).
ing a necessary condition for implementation: If a Consider now the following results, due to
single-valued SCR, which we will call a social Hurwicz (1972) (who proved it for the case of
choice function (SCF), is implementable, there is a n = 2) and to Gibbard (1973) and Satterthwaite
revelation mechanism for which telling the truth is an (1975), respectively:
equilibrium. A r evelation mechanism (associated
with a SCF) is a mechanism in which the message Theorem 2 In exchange economy environments,
space for each agent is her set of types and the there is no SCF such that:
outcome function is the SCF. We say that a SCF is
truthfully implementable or incentive compatible if 1. It is truthfully implementable in dominant
truth-telling is a Bayesian equilibrium (or a dominant strategies.
strategy) of the direct mechanism associated with 2. It selects individually rational allocations.
it. The following result formally states the revelation 3. It selects efficient allocations.
principle: 4. Its domain includes all economies with convex
and continuous preferences.
Theorem 1 If f is a Bayesian (resp. dominant
strategy) implementable SCF, f is incentive Theorem 20 In social choice environments, there
compatible. is no SCF such that:
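On finite domains the revelation mechanism of Theorem 1 can be checked by brute force. The sketch below (function and argument names are illustrative, not from the literature) builds the direct mechanism associated with an SCF f and tests whether truth-telling is a dominant strategy, i.e., whether f is truthfully implementable in dominant strategies.

```python
from itertools import product

def truth_is_dominant(f, type_sets, utility):
    """Direct mechanism for the SCF f: each agent announces a type, the outcome is
    f(announcements). Returns True if truth-telling is a dominant strategy, i.e.
    u_i(f(t_i, m_-i), t_i) >= u_i(f(m_i, m_-i), t_i) for every agent i, true type t_i,
    misreport m_i, and every profile m_-i of the other agents' announcements.

    f         : function from a type profile (tuple) to an allocation
    type_sets : list of finite type sets, one per agent
    utility   : utility(i, allocation, true_type_i) -> float
    """
    n = len(type_sets)
    for i in range(n):
        others = [type_sets[j] for j in range(n) if j != i]
        for t_i in type_sets[i]:                       # agent i's true type
            for m_i in type_sets[i]:                   # a possible misreport
                for m_others in product(*others):
                    profile_truth = m_others[:i] + (t_i,) + m_others[i:]
                    profile_lie = m_others[:i] + (m_i,) + m_others[i:]
                    if utility(i, f(profile_lie), t_i) > utility(i, f(profile_truth), t_i):
                        return False
    return True
```

With f chosen as a second-price auction rule and quasi-linear utilities, for instance, the check returns True, in line with the dominant-strategy truthfulness of that format.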
incompatible for any economy where utility func- equilibrium. The following result, due to Myerson
tions are quasi-linear, strictly concave, and differ- and Satterthwaite (1983), answers this question:
entiable and fulfill a very mild regularity
condition. These results show that Vickrey- Theorem 200 In the bilateral trading environ-
Clarke-Groves mechanisms fail to achieve effi- ment, there is no SCF such that:
cient allocations in general (Vickrey-Clarke-
Groves mechanisms are revelation mechanisms 1. It is truthfully implementable in Bayesian
that work in public good economies where utility equilibrium.
functions are quasi-linear in “money.” The out- 2. It selects individually rational allocations once
come function selects the level of public good that agents learn their types.
maximizes the sum of utilities announced by 3. It selects ex post efficient allocations.
agents, and the money received by each individual 4. Its domain includes all linear utility functions
is the sum of the utility functions announced by all with independent types distributed with posi-
other agents. For an exposition of these mecha- tive density, and the sets of types have a non-
nisms, see Green and Laffont (1979). The most empty intersection.
general domain in which they achieve efficient
allocations is in Tian (1996)). Proof (Sketch; see Krishna and Perry 1997 for
A proof of T. 2 can be found in Serizawa details) By the revenue equivalence theorem (see
(2002). Simple proofs of T. 20 can be found in Klemperer 1999, Appendix A), all mechanisms
Barberá (1983), Benoit (2000), and Sen (2001). fulfilling conditions (2) and (3) above raise iden-
T. 1 and 2–20 imply that there is no mechanism tical revenue. So it is sufficient to consider the
implementing an efficient and individually rational Vickrey-Clarke-Groves which, as we remarked
(resp. non-dictatorial) SCF in dominant strategies before, is not efficient.
when the domain of the SCF is large enough. In Again the weakening of any condition in T. 200
other words, the revelation principle implies that may produce positive results (Williams (1999),
the restriction to mechanisms where agents Table 1, presents an illuminating discussion of
announce their own characteristic is not important this issue). For instance, suppose seller valuations
when considering negative results. Thus, the reve- are 1 or 3, and buyer valuations are 0 or 2. The
lation principle is an appropriate tool for producing mechanism fixes the price at 1.5 and a sale occurs
negative results. But we will see that to rely entirely when the valuation of the buyer is larger than the
on this principle when trying to implement a SCF valuation of the seller. This mechanism imple-
may yield disastrous results. ments truthfully a SCF satisfying (2) and
Choosing wisely the domain, Barberà et al. (3) above. Unfortunately, it does not work when
(2010, 2016) have shown that many salient alloca- valuations are drawn from a common interval
tion rules are not only incentive compatible but also with positive densities.
group incentive compatible (i.e., truth is a dominant But unlike T. 2–20 , there are robust examples of
strategy for any coordinated deviation for any group SCF truthfully implementable in Bayesian equi-
of players) in a variety of allocation problems like librium when conditions (2) or (4) are relaxed.
voting, matching, fair division, cost sharing, house Also, inefficiency converges to zero very quickly
allocation, and auctions. The conditions used by when the number of agents increases (see Gresik
these authors are an example that the conflict and Satterthwaite 1989). This is because the equi-
between incentive compatibility and efficiency is librium concept is now weaker and we are
solvable as long as we are prepared to restrict the approaching a land where incentive compatibility
scope of applicability of the mechanism. has no bite, as we will see in T. 3 below.
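The discrete example given above for T. 2'' can be verified mechanically. The following sketch (illustrative only) encodes the posted price of 1.5 with seller values in {1, 3} and buyer values in {0, 2}, and checks that truth-telling is a dominant strategy, that no type ever trades at a loss, and that trade occurs exactly when the buyer's value exceeds the seller's.

```python
PRICE = 1.5
SELLER_TYPES, BUYER_TYPES = (1.0, 3.0), (0.0, 2.0)

def outcome(seller_report, buyer_report):
    """Posted-price rule: trade at 1.5 iff the reported buyer value exceeds the reported seller value."""
    trade = buyer_report > seller_report
    return trade, (PRICE if trade else 0.0)

def payoffs(seller_value, buyer_value, seller_report, buyer_report):
    trade, price = outcome(seller_report, buyer_report)
    seller_gain = (price - seller_value) if trade else 0.0
    buyer_gain = (buyer_value - price) if trade else 0.0
    return seller_gain, buyer_gain

# Dominant-strategy truthfulness: truth is weakly best against every opponent report.
for s in SELLER_TYPES:
    for b_report in BUYER_TYPES:
        truth = payoffs(s, 0.0, s, b_report)[0]      # the buyer's true value is irrelevant to the seller
        assert all(payoffs(s, 0.0, s_lie, b_report)[0] <= truth for s_lie in SELLER_TYPES)
for b in BUYER_TYPES:
    for s_report in SELLER_TYPES:
        truth = payoffs(0.0, b, s_report, b)[1]
        assert all(payoffs(0.0, b, s_report, b_lie)[1] <= truth for b_lie in BUYER_TYPES)

# Individual rationality and ex post efficiency under truth-telling.
for s in SELLER_TYPES:
    for b in BUYER_TYPES:
        sg, bg = payoffs(s, b, s, b)
        assert sg >= 0.0 and bg >= 0.0
        assert outcome(s, b)[0] == (b > s)           # trade exactly when it is efficient
print("posted price 1.5: truthful in dominant strategies, IR, and ex post efficient on this domain")
```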
A natural question to ask is what happens with First, d’Aspremont and Gerard-Varet (1975,
the above impossibility results when we weaken 1979) and Arrow (1979) showed that conditions
the requirement of implementation in dominant (1)-(3)-(4) are compatible with individual ratio-
strategies to that of implementation in Bayesian nality before agents learn their types in the domain
of public goods with quasi-linear utility functions. this allocation would be zero consumption for
They proposed the “expected externality mecha- everybody. Now we have the following result
nism” in which each agent is charged the expected (Repullo 1986; Matsushima 1988):
externality she creates on the remaining players.
Later on, Myerson (1981) and Makowski and
Mezzetti (1993) presented incentive compatible Theorem 3 If n = 2 and W holds, any SCF is
SCF yielding ex post efficient and individually truthfully implementable in Nash equilibrium. If
rational allocations in the domain of exchange n > 2, any SCF is truthfully implementable in
economies with quasi-linear preferences and Nash equilibrium.
more than two buyers. In Myerson (1981), agents
have correlated valuations. Buyers are charged Proof When n = 2, consider the following out-
even if they do not obtain the object or they may come function: g(y0, y0) = f(y0) 8 y0 Y,
receive money and no object or even receive the g(y , y ) = z for all y0 6¼ y00. Clearly, truth is an
0 00
object plus some money. Makoswki and Mezzetti equilibrium. When n > 2, consider the following
(1993) assume no correlation and that the highest outcome function: If m is such that n 1 agents
possible valuation for a buyer is larger than the announce state y0 , then g(m) = f(y0). Otherwise,
seller’s highest possible valuation. They consider g() is arbitrary. Clearly, truth is an equilibrium as
a family of mechanisms, called second price auc- well in this case.
tion with seller (SPAWS), in which the highest The first thing to notice is the difference
bidder obtains the object, the seller receives the between the cases of two and more than two indi-
first bid, and the winning buyer pays the second viduals. We will have more to say about this in the
price. These mechanisms not only induce truthful next section. The second is that the construction in
behavior and yield ex post efficient and individu- Theorem 3 produces a large number of equilibria
ally rational allocations: For any other mechanism and that there seems to be no good reason for
with these properties, we can find a SPAWS mech- individuals to coordinate in the truthful equilibria.
anism yielding the same allocation. For instance, suppose workers can be either fit
Suppose now that information is nonexclusive or unfit. When a profit-maximizing firm asks its
in the sense that the type of each player can be employees about their characteristics, and all
inferred from the knowledge of all the other workers are fit, a unanimous announcement such
players’ type. Intuition suggests that in this case, as “we are all unfit” is an equilibrium. If fit
incentive compatibility has no bite whatsoever workers are required hard work and unfit workers
(i.e., T200 does not apply) since the behavior of are asked to light work, do you think it is reason-
each player can be “policed” by the remaining able that workers coordinate in the truthful equi-
players. In order to prove this, we will concentrate librium? A more elaborate example was produced
on an extreme, but illuminating, case of non- by Postlewaite and Schmeidler (1986): There are
exclusive information, namely, Nash equilibrium. three agents. The first agent has no information
In this framework, since information is complete, and agents 2 and 3 are perfectly informed. The
a direct mechanism is one where each agent ranking of agent 1 over alternatives is the opposite
announces a state of the world. of agents 2 and 3 who share the same preferences.
Consider the following assumption: The SCF is the top alternative of agent 1 in each
state. It is intuitively clear that besides the truthful
• (W) ∃ z A such that 8y Y, 8a A, equilibria, there is another untruthful equilibrium
aRi(y)z, 8i I. where both informed agents lie, and they are
strictly better off than under truthful behavior.
This assumption will be called “universally Again, coordination in the truthful equilibrium
worst outcome” because it postulates the exis- seems very unlikely. Thus, we have to recognize
tence of an allocation which is unanimously that we have a problem here. The next section will
deemed as the worst. In an exchange economy, tell you how we can solve it.
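The construction in the proof of Theorem 3 is short enough to write down and to confront with its multiplicity problem. The sketch below (an illustration for n > 2 on a finite state set; names are hypothetical) implements the outcome function and enumerates the pure-strategy Nash equilibria of the induced complete-information game.

```python
from itertools import product

def g(messages, f, default):
    """Outcome function of T. 3 for n > 2: if at least n-1 agents announce the same
    state y', the outcome is f(y'); otherwise an arbitrary default allocation."""
    for y in set(messages):
        if sum(m == y for m in messages) >= len(messages) - 1:
            return f(y)
    return default

def pure_nash_equilibria(states, n, f, default, utility, true_state):
    """All pure-strategy Nash equilibria of (M, g, true_state) with M_i = set of states."""
    equilibria = []
    for profile in product(states, repeat=n):
        def u(i, prof):
            return utility(i, g(list(prof), f, default), true_state)
        stable = all(
            u(i, profile) >= u(i, profile[:i] + (dev,) + profile[i + 1:])
            for i in range(n) for dev in states
        )
        if stable:
            equilibria.append(profile)
    return equilibria
```

Run on a two-state version of the workers example above, the output contains the truthful profile together with the unanimous misreport, which is exactly the coordination problem discussed in the text.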
Summing up, what do we learn from the results strategies, clearly, if all preference orderings are
in this section? strict, implementation and truthful implementation
becomes identical; see Dasgupta et al. (1979), Cor-
1. When looking for an implementable SCF, a ollary 4.1.4 (Laffont and Maskin 1982 present
useful first test is whether this SCF yields other conditions under which this result holds.
incentives for the agents to tell the truth; see See Repullo (1985) for the case where implemen-
T. 1. But this test is incomplete because of the tation and truthful implementation in dominant
existence of equilibria other than the truthful strategies do not coincide). For the ease of exposi-
one; see T. 3. These untruthful equilibria some- tion, we consider next Nash equilibria.
times sound more plausible than the It turns out that the key to this issue in the case
truthful one. of Nash equilibrium is the following monotonicity
2. All impossibility theorems – T. 2–20 –200 – have property, sometimes called Maskin monotonicity
the same structure: truthful implementation, because Maskin (1977) established its central rel-
individual rationality/non-dictatorship, effi- evance to implementation:
ciency/large range of the SCR, and large
domain. Usually, in social choice environ- • (M) A SCR F is monotonic if
ments, conditions 2 and 3 are weaker than in
economic environments but the condition on
the domain is stronger. fa FðyÞ, aRi ðyÞb ! aRi ðy0 Þb 8i I g
3. The classic story of the market making possi-
! a Fðy0 Þ:
ble efficient allocation of resources under pri-
vate information has to be revised. Private
Monotonicity says that if an allocation is cho-
information in many cases precludes the exis-
sen in state y and this allocation does not fall in
tence of any mechanism achieving efficient
anybody’s ranking in state y0 , this allocation must
and individually rational allocations under
also be chosen in y0 . We will also speak of a
informational decentralization; see T. 2–20 –200 .
“monotonic transformation of preferences at y00
4. The same remarks apply to naive applications
when the requirement aRi(y)b ! aRi(y0)b 8 i
of the Coase theorem where agents are sup-
I is satisfied. This requirement simply says that the
posed to achieve Pareto efficient allocations
set of preferred allocations shrinks when we go
just because they have contractual freedom
from y to y0 .
(ditto about bargaining theory). In the parlance
Monotonicity looks like a not unreasonable
of Coase, private information is an important
property, even though, as we will see in a moment,
transaction cost.
there are cases in which it is incompatible with
5. When mechanisms with adequate properties
other very desirable properties. In any case, the
exist, like those proposed by Arrow,
importance of monotonicity comes from the fact
d’Aspremont and Gerard-Varet, Myerson, and
that it is a necessary condition for implementation
Makowski and Mezzetti, they are not of the
in Nash equilibrium, as proved by Maskin (1977).
kind that we see in the streets. Careful design is
needed. These mechanisms are tailored to spe-
cific assumptions on valuations; thus, their Theorem 4 If a SCR is implementable in Nash
range of applicability may be limited. equilibrium, it is monotonic.
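Condition (M) can be checked directly on finite examples. The sketch below (a hypothetical encoding: strict rankings as lists, the SCR as a mapping from states to sets of alternatives) tests Maskin monotonicity and can be used to reproduce the voting-rule failures discussed below.

```python
def weakly_prefers(ranking, a, b):
    """True if a is ranked at least as high as b (rankings list the best alternative first, no ties)."""
    return ranking.index(a) <= ranking.index(b)

def is_maskin_monotonic(F, preferences, alternatives):
    """Condition (M): if a is chosen at y and a does not fall in anybody's ranking
    relative to any alternative when moving from y to y', then a must be chosen at y'.

    F           : dict state -> set of chosen alternatives
    preferences : dict state -> list of rankings, one per agent (best first)
    """
    for y, chosen in F.items():
        for y_prime in F:
            for a in chosen:
                no_fall = all(
                    weakly_prefers(preferences[y_prime][i], a, b)
                    for i, ranking in enumerate(preferences[y])
                    for b in alternatives
                    if weakly_prefers(ranking, a, b)
                )
                if no_fall and a not in F[y_prime]:
                    return False
    return True
```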
Let us now discuss the concept of monotonic- implementable, it satisfies M. Now since xRi(yL)
ity. First, the bad news. Popular concepts in vot- b ! xRi(y)b 8 i I, by M, x F(y).
ing, like plurality, Borda scoring, and majority Thus, under weak conditions, Walrasian allo-
rule, are not monotonic, neither is the Pareto cor- cation is always in the set of those selected by a
respondence; see Palfrey and Srivastava (1991), monotonic SCR. And these allocations may fail to
p. 484. Even the venerable Walrasian correspon- satisfy properties of fairness or justice as pointed
dence is not monotonic! The failure of the Pareto out by the critics of the market. Under stronger
and the Walrasian SCR to be monotonic can be assumptions, the converse is also true, i.e., only
amended: If preferences are strictly increasing in Walrasian allocations can be selected by a Nash-
all goods, the Pareto SCR is monotonic in eco- implementable SCR, Hurwicz (1979). Also, T. 5
nomic environments. The constrained Walrasian has the following unpleasant implication:
SCR – in which consumers maximize with respect
Theorem 6 There is no SCF in exchange econo-
to the budget constraint and the availability of
mies such that:
resources – is also monotonic. More serious is a
result due to Hurwicz (1979) that uses two weak
1. It is Nash implementable.
conditions on a SCR defined in the domain of
2. It selects individually rational allocations.
exchange economies:
3. ND holds.
4. It is defined on all exchange economies.
• (L) The domain of F contains all preferences
representable by linear utility functions.
Proof T. 5 implies that any Walrasian allocation
• (ND) If a F(y) and aIi(y)b 8 i I, then
belongs to the allocations selected by F. Since
b F(y).
Walrasian equilibrium is not unique for some
economies in the domain, hence the result.
The first condition is a rather modest require-
T. 6 has a counterpart in social choice domains,
ment on the richness of the domain of F. The
Muller and Satterthwaite (1977).
second is a non-discrimination property which
says that if everybody considers two allocations
to be indifferent and one allocation belongs to the Theorem 60 There is no SCF in a social choice
SCR, then it must be the other. Now we have the domain such that:
following:
1. It is monotonic.
Theorem 5 Let F be a SCR satisfying L and ND 2. It is not dictatorial.
and such that: 3. Its range is A with #A > 2.
4. It is defined on all possible preferences.
1. It is Nash implementable.
2. It selects individually rational allocations. An implication of T. 6–60 is that single-valued
SCR are still problematic. But the consideration of
Then, if x is a Walrasian allocation at y, multivalued SCR brings a new problem: the exis-
x F(y). tence of several Nash equilibria. For instance, if a,
b F(y) with a and b being efficient allocations,
Proof (Sketch; see Thomson 1985 for details) agents play a kind of “battle of the sexes” game
Take an economy y. Let x be a Walrasian alloca- with no clear results. Moreover, the Nash equilib-
tion for y. Consider a new economy, called yL, rium in mixed strategies may yield allocations
where the marginal rates of substitution among outside F(y) (the concern about mixed strategy
goods are constant and equal to a vector of equilibria was first raised by Jackson 1992).
Walrasian prices. By individual rationality, Now let us come to the good news. Firstly, the
F must select an allocation which is indifferent ND condition, which is essential for T. 5 to hold, is
to x. By ND, x F(yL). Since F is Nash not as harmless as it appears to be. For instance, it
is not satisfied by the envy-free SCR; see Thom- Rule 2 (one dissident). If there is only one agent
son (1987) for a discussion. Secondly, there are whose message is different from the rest, this
perfectly reasonable SCR which are monotonic: agent can choose any allocation that leaves her
We have already encountered the constrained worse off, according to her preference as
Walrasian SCR. Also any SCR selecting interior announced by others.
allocations in ℒA when preferences are von Rule 3 (any other case). a g(m) if a was
Neumann-Morgenstern is monotonic. In the announced by the agent who announced the
domain of exchange economies with strictly highest integer (ties are broken by an
increasing preferences, the core and the envy- arbitrary rule).
free SCR are also monotonic. In domains where
indifference curves only cross once – the single- Let us show that such a mechanism imple-
crossing condition – monotonicity vacuously ments any SCR with the required conditions.
holds. So monotonicity, restrictive as it is, is Clearly, if the true state is e
y, mi ¼ e
y, a, 1 with
worth a try. But before this, let us introduce a
new assumption: aF e y is a Nash equilibrium since no agent can
gain by saying otherwise, so Condition 1 in the
• (NVP) A SCR f satisfies no veto power if definition of Nash implementation holds. Let us
8y Y, now prove that Condition 2 there also holds. Sup-
pose we have a Nash equilibrium in Rule 1. Could
it be an “untruthful” equilibrium? If so, we have
faRi ðyÞb, 8b A, for at least n 1 agentsg two cases. Either the announced preferences are a
! a Fð y Þ monotonic transformation of preferences at e y, in
which case, M implies that the announced alloca-
In other words, if there is an allocation which is tion is also optimal at e
y. If they are not, there is an
top-ranked by, at least, n 1 agents, NVP agent who can profitably deviate. Clearly, if equi-
demands that this allocation belongs to the SCR. librium occurs in Rule 2, with, say, agent i as the
This sounds like a reasonable property for large n. dissident, any agent other than i can drive the
Also in exchange economies with strictly increas- mechanism to Rule 3, so it must be that all these
ing preferences and more than two agents, NVP is
agents are obtaining their most preferred alloca-
vacuously satisfied because there is no top alloca- tion, which by NVP belongs to F e y . An equi-
tion for n 1 agents. librium in Rule 3 implies that all agents are
The following positive result, a relief after so obtaining their most preferred allocation which,
many negative results, was stated and proved by again by NVP, belongs to F e y .
Maskin (1977), although his proof was The interpretation of the mechanism given in
incomplete: the proof of T. 7 is that if everybody agrees on the
state and the allocation is what the planner wants,
Theorem 7 If a SCR satisfies M and NVP is Nash this allocation is selected. If there is a dissident
implementable when n > 2. (a term due to Danilov 1992), she can make her
case by choosing an allocation (a “test alloca-
Proof (Sketch) Consider the following mecha- tion”) in her lower contour set, as announced by
nism. Mi Y
A
ℕ where ℕ is the set of others. Finally, with more than one dissident, it is
natural numbers. The outcome function has three the jungle! Any agent can obtain her most pre-
parts: ferred allocation by the choice of an integer. Typ-
ically, there is no equilibrium in this part of the
Rule 1 (unanimity). If m is such that all agents mechanism. Notice that (M) is just used to elimi-
announce the same state of the world, y, the nate unwanted equilibria.
same allocation a with a F(y) and the same The mechanism is an “augmented” revelation
integer, then g(m) = a. mechanism (a term due to Mookherjee and
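For concreteness, here is one rendering of the canonical mechanism's outcome function (a sketch: the lower-contour-set test of Rule 2 is supplied by the user, and the handling of cases the proof sketch leaves implicit, such as unanimity on an allocation outside F(y), follows one standard completion and is an assumption here). Messages are triples (state, allocation, integer) as in the proof.

```python
def canonical_outcome(messages, F, worse_off):
    """Outcome function of the canonical mechanism in T. 7, for n > 2.

    messages  : list of triples (state y, allocation a, integer z), one per agent
                (states and allocations must be hashable for the set tests below)
    F         : the SCR, F(y) = set of allocations chosen at y
    worse_off : worse_off(i, a, b, y) -> True if agent i finds a no better than b
                under the preferences of state y (lower contour set test)
    """
    n = len(messages)

    # Rule 1 (unanimity): same state, same allocation in F(state), same integer.
    if len(set(messages)) == 1 and messages[0][1] in F(messages[0][0]):
        return messages[0][1]

    # Rule 2 (one dissident): all agents but one send the same (y, a, z) with a in F(y).
    for i in range(n):
        rest = messages[:i] + messages[i + 1:]
        if len(set(rest)) == 1 and messages[i] != rest[0]:
            y_cons, a_cons, _ = rest[0]
            if a_cons in F(y_cons):
                _, a_i, _ = messages[i]
                # The dissident gets her announced allocation only if it leaves her
                # worse off than the consensus allocation under the announced preferences.
                return a_i if worse_off(i, a_i, a_cons, y_cons) else a_cons

    # Rule 3 (any other case): the agent announcing the highest integer wins
    # (ties broken here by lowest index, an arbitrary rule).
    winner = max(range(n), key=lambda i: messages[i][2])
    return messages[winner][1]
```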
Reichelstein 1990), where the announcement of the monotonic. Thus, allocations in the boundary
state is complemented with the announcement of an can be arbitrarily approximated by allocations in
allocation – this can be avoided if the SCR is single the interior.
valued – and an integer. The final proof of T. 7 was A more satisfying approach was introduced by
done independently by Williams (1986), Repullo Moore and Repullo (1988) by introducing sub-
(1987), Saijo (1988), and McKelvey (1989). game perfection as the solution concept. It is not
The case of two agents is more complicated possible to explain fully this approach here
because when an agent deviates from a common because it would take us too far; in particular,
announcement and becomes a dissident, she con- the notion of a mechanism must be generalized
verts the other agent into another dissident! As in to “stage mechanism.” Instead, we give a result
T. 3, W does the job, i.e., any SCR satisfying M, that conveys the force of subgame perfect imple-
NVP, and W is Nash implementable; see Moore mentation. It refers to public good economies with
and Repullo (1990) and Dutta and Sen (1991b) for quasi-linear utility functions – where under dom-
a full characterization. Again, the cases of two inant strategies the set of economies with ineffi-
agents and more than two agents are different. In cient outcomes is large- and with two
some areas of mathematics, such as statistics and individuals – where Nash implementability is
differential equations, the cases of two dimen- harder to obtain.
sions and more than two dimensions are also Suppose that utility functions read Ui =
different. The relationship of these with the find- V(y, yi) + mi where y Y ℜ, yi Yi, with
ings of implementation is not yet fully explored; #Yi < 1 and mi ℜ, i = 1 , 2. The set of allo-
see Saari (1987). cations {(y, m1, m2) Y
ℜ2/m1 + m2 o}
Under asymmetric information, M is where o are the endowments of “money.”
substituted by a – rather ugly – Bayesian monoto- Moore and Repullo (1988) proved the following:
nicity (BM) condition which is a generalization of
M to these environments. BM is again necessary Theorem 8 Any SCF is implementable in sub-
and, in conjunction with some technical condi- game perfect equilibrium in the domain of econ-
tions plus incentive compatibility, sufficient for omies explained above.
implementation in BE. The interested reader can Moore and Repullo proved that many SCR
do no better than to read the account of these which could not be implemented in Nash equilib-
matters in Palfrey (2002). It must be remarked rium can be implemented in subgame perfect
that many well-known SCR – including Arrow- equilibrium. This is because subgames can be
Debreu contingent commodities and some efficient designed to kill unwanted equilibria without
SCR – do not satisfy BM and thus cannot be using monotonicity. Their result was improved
implemented in BE. However, the rational expec- upon by Abreu and Sen (1991). The problem
tations equilibria and the (interim) envy-free SCR with this approach is that the concept of subgame
satisfy BM; see Palfrey and Srivastava (1987). perfection is problematic because it requires that,
T. 7 was the first positive finding of implemen- no matter what has happened in past, in the
tation theory. And it prompted researchers to be remaining subgame, players are rational, even if
more ambitious: Can we implement without this subgame was attained because some players
monotonicity? An interesting observation, due to made irrational choices.
Matsushima (1988) and Abreu and Sen (1991), is The Moore-Repullo result was not only impor-
that if agents have preferences representable by tant by itself but it opened the way to the consid-
von Neumann-Morgenstern utility functions, any eration of other equilibrium concepts that allow
SCR can be “virtually implemented” in the sense very permissive results. For instance, Palfrey and
that the set of allocations yielded by Nash equi- Srivastava (1991) proved the following result:
libria is arbitrarily close to the set of desired allo-
cations. This is because, as we saw before, any Theorem 80 Any SCR satisfying NVP is
SCR mapping in the interior of ℒA is implementable in undominated Nash equilibrium.
At this point, it seemed that by invoking the choice. These constructions eliminate
adequate refinement of Nash equilibrium, any SCR unwanted equilibria, which, as we saw before,
could be implemented. But the implementing is the problem with Nash implementation.
mechanisms were getting weird and some people Jackson illustrates his point by showing that
were beginning to get suspicious. Why and how is under no restrictions on mechanisms, any SCR
discussed in the next section. can be implemented in undominated strategies,
Summing up the results obtained here, we have a weak solution concept. Then, he requires that
the following: the mechanism be bounded in the following
sense: Whenever a strategy mi is dominated,
1. (Maskin) Monotonicity is a necessary and, in there is another strategy dominating mi and
many cases, sufficient condition for implemen- which is undominated. He shows that imple-
tation in Nash equilibrium; see T. 4 and 7. mentation in undominated strategies with
Similar results are obtained with Bayesian bounded mechanisms results many times in
monotonicity in Bayesian equilibrium. incentive compatibility, which as we saw in
2. The monotonicity requirements are not harm- section “The Main Concepts” is a hard require-
less. Many solution concepts do not satisfy ment. This shows the bite of the boundedness
it. Even worse, monotonicity has some unpal- assumption. However, in the case of imple-
atable consequences; see T. 5–6. mentation with undominated Nash equilib-
3. Monotonicity can be avoided by considering rium, the boundedness assumption has little
stage games or refinements of Nash equilib- impact; see Jackson et al. (1994) and Sjöström
rium. Practically, any reasonable SCR can be (1994). The first of these papers introduced a
implemented in this way; see T. 8–80 . related requirement, the best response prop-
erty: For every strategy played by the other
The Limits of Design agents, each agent has a best response.
So far, we have assumed that there are no limits to 2. Natural Mechanisms. Given that we have run
what the designer can do. She can pick up any so far from the kind of mechanism we are used
mechanism with no restrictions on its shape. This to, it seems reasonable to ask what can be
procedure, indeed, pushes the possibilities of implemented by mechanisms that resemble
design to the limit. But by doing this, we have real-life mechanisms. These mechanisms
learnt a good deal about the limitations of the must be simple too because simplicity is an
theory of implementation. It is fair to say that important characteristic in practice. Let us call
today the consensus is that there are some extra them natural mechanisms. Dutta et al. (1995)
properties which should be considered when consider mechanisms in which messages are
designing an implementing mechanism. We prices and quantities and thus resemble market
review here five approaches to this question: mechanisms. Their approach was refined by
Saijo et al. (1996) who demanded the best
1. Game-Theoretical Concerns. Jackson (1992) response property as well. They showed that
was the first to point out that some mechanisms several well-known SCR, such as the
had unusual features from the point of view of (constrained) Walrasian, are implementable in
game theory: Some subgames have no Nash Nash equilibrium. Beviá et al. (2003) showed
equilibrium. Message spaces, which in the that in Bertrand-like market games, the
corresponding game become strategy spaces, Walrasian SCR is implementable in Nash and
are unbounded or open. Thus, in the integer strong equilibrium, showing that the fear of
game considered in T. 7, if agents eliminate coalitions destabilizing market outcomes is, at
dominated strategies, each integer is domi- least, partially unwarranted. Sjöström (1996)
nated by the next highest one, and no integer considered quantity mechanisms, reminiscent
is undominated: Those agents who eliminate of those used by Soviet planners, with negative
dominated strategies are unable to make a results about what these mechanisms can
achieve. In public good economies, Corchón this approach is that the list of reasonable
and Wilkie (1996) and Peleg (1996) introduced constraints on allocations may be large. The
a market mechanism implementing Lindahl second possibility drives us to model imple-
allocations in Nash and strong equilibrium. mentation as a signaling game where the plan-
The mechanism works because Lindahl prices ner receives signals – messages – from the
have to add up to the marginal cost. If an agent agents, updates her beliefs, and then chooses
pretends to free ride, she decreases the quantity an allocation which maximizes her expected
of the public good. Here, contrary to utility (Baliga et al. 1997). Again, some SCR
Samuelson’s dictum, it is in the selfish interest that are Nash implementable are not
of each person to give true signals. Pérez- implementable in this framework. However,
Castrillo and Wettstein (2002) offered a bidding in this case, there are SCR that are not Nash
mechanism that implements efficient alloca- implementable but are implementable in this
tions when choosing between a finite numbers framework. This is because the model takes a
of public projects. They also applied these ideas basic assumption of game theory to the limit,
to extend the Shapley value to more general namely, that agents know the strategies of
environments, Macho-Stadler et al. (2007). other players. In this case, the planner knows
3. Credibility. Another implicit assumption is if a report on agents’ types is truthful or not
that once the mechanism is in place, there is before the allocation is delivered.
no way to stop it. Thus, if for some m, g(m) is a 4. Renegotiation. Another strong assumption is
“universally worst outcome,” the planner has that the mechanism prescribes actions that can-
to deliver this allocation even if she is trying not be changed by agents. This contradicts expe-
to implement a Pareto efficient allocation. Is riences such as black markets where agents trade
this a credible procedure? In many cases, if the on the existing goods (Hammond 1987). A way
planner is a real person, it seems that she of modeling this is to assume that agents are able
would do her best to avoid g(m)! Here we to renegotiate some allocations (Maskin and
have two possibilities: either we identify addi- Moore 1999. Renegotiation in a different con-
tional constraints on the planner that look text was considered by Rubinstein and Wolinsky
reasonable or we jump to model the planner (1992)). Assuming that agents have complete
as a full-fledged player. The first road leads us information, this is formalized by means of the
to identify a subset of allocations of A, say X, concept of a reversion function. This function,
which can never be used by the mechanism. say r, maps each allocation and each state of the
For instance, in Chakravorty et al. (2006) X, is world into a new allocation, i.e., r : A
Y !
the set of allocations that are never selected by A. The reversion function induces new prefer-
the SCR for some state of the world, i.e., ences, called reverted preferences (this is the
X = {a A/ ∄ y Y, a F(y)}. The “translation principle” in Maskin and Moore
motivation for this definition is that it hardly (1999)). Notice that reverted preferences are
seems credible that the planner can choose an state-dependent even if preferences are not. For-
allocation that is never intended to be mally, given a reversion function r, the reversion
implemented. Redefining the allocation set of R(y), denoted by Rr(y), is defined as aRri ðyÞb
as A A\X, the definitions of a mechanism , r ða, yÞRi ðyÞr ðb, yÞ, 8a, b A, 8i I . Given
and an implementable SCR can be easily a reversion function r, we can interpret that
translated in this framework. However, agents’ preferences are the reverted preferences.
depending on the domain, SCR that are mono- Then, all definitions given before can be adapted
tonic when defined on A are no longer mono- to this case. Again, SCR that were monotonic
tonic when defined on A, for instance, the there are not so in this framework and vice versa.
(constrained) Walrasian SCR. Thus, these See Jackson and Palfrey (2001) for applications.
SCR cannot be implemented when the planner An extension to the case where there are several
can only use allocations in A. A weakness of renegotiation functions is given by Amoros
Unsolved Issues and Further Research

1. Implementation with State-Dependent Feasible Sets. A motivation of implementation theory was to study the possibility of socialism. However, all the results presented in this survey refer to environments where the feasible set is given, a far cry from any kind of planning procedure. In fact, there are only a handful of papers dealing with implementation when the feasible set is unknown: Postlewaite (1979) and Sertel and Sanver (1999) studied manipulation of endowments. Hurwicz et al. (1995) studied implementation assuming that endowments/production possibilities can be hidden or destroyed but never exaggerated. Instead of a mechanism, we have a collection of state-dependent mechanisms, each meant for one economy. After the mechanism is played, production capabilities are shown, e.g., endowments are put on the table. This idea was worked out in a series of papers by Hong on private good economies, Hong (1998), and by Tian on public good economies, Tian and Li (1995). Serrano and Vohra (1997) worked out implementation of the core and Dagan et al. (1999) of taxation methods. And that is all, folks! Why has such an important issue been almost neglected? My explanation is that the proposed mechanisms are difficult to understand. Another approach has been tried by Corchón and Triossi (2011), where a reversion function takes care of restoring feasibility when messages lead to unfeasible allocations. The approach is tractable and simpler but relies on the black box of the renegotiation function.

2. Sociological Factors/Bounded Rationality. So far, all the solution concepts describing the behavior of agents are game theoretical. In recent years, we have seen a host of equilibrium concepts based on "irrational" agents. It would be interesting to see what SCR can be implemented with these forms of behavior. Eliaz (2002) considers "fault-tolerant" implementation, where a subset of players ("faulty players") fail to achieve their optimal strategies. Under complete information, no veto power and a strong form of monotonicity are sufficient for implementation when the number of faulty players is less than n/2 − 1, n > 2. Matsushima (2008) and Dutta and Sen (2012) show that a small preference for honesty is sufficient to knock down unwanted equilibria. Corchón and Herrero (2004) show that "decent behavior" can also be used to dispense with monotonicity.

3. Dynamic Implementation. The theory presented here is static, but there are some papers dealing with implementation in dynamic setups. We mention a few: Freixas et al. (1985) studied the "ratchet effect," where firms underproduce for fear of being asked to do too much in the future. Kalai and Ledyard (1988) showed that if the planner is sufficiently patient, every SCR is dominant strategy implementable. Burguet (1990/1994) showed that the revelation principle does not hold when outcomes are chosen in several periods. Candel (2004) proved a revelation principle in a model where a public good is produced in two periods. Cabrales (1999) and Sandholm (2007) studied implementation in an evolutionary setting. A related topic is that of complexity; see Conitzer and Sandholm (2002). Lee and Sabourian (2011) studied implementation with infinitely lived agents whose preferences are determined randomly in each period. A SCR is repeated-implementable in Nash equilibrium if there exists a sequence of (possibly history-dependent) mechanisms such that (i) its equilibrium set is nonempty and (ii) every equilibrium outcome corresponds to the desired social choice at every possible history of past play and realizations of uncertainty. They show that, essentially, a SCR is repeated-implementable if and only if it is efficient.

4. Robustness Under Incomplete Information. When designing a mechanism, sometimes the planner does not know the structure of information. In this case, a mechanism must be implemented regardless of the structure of information, i.e., priors of agents, type spaces, etc. Corchón and Ortuño-Ortín (1995) approached the problem by assuming that the economy is composed of "islands" and that there is complete information inside each island. A mechanism robustly implements a SCR if it does it in BE for every possible prior (compatible with the island assumption) and in uniform Nash equilibrium. The latter requires that an equilibrium strategy for an agent must be the best reply to what other agents in the island play and to any possible message sent by agents outside the island when they follow their equilibrium strategies (D'Aspremont and Gerard-Varet 1979). They showed that any SCR satisfying M and NVP is robustly implementable (a later contribution by Yamato 1994 showed that robust and Nash implementation coincide in this framework). The same concern has been approached in a series of papers by Bergemann and Morris (see, e.g., 2005), where they ask SCR to be implemented whatever the players' beliefs and higher-order beliefs about other players' types. Artemov et al. (2007) require implementation for the payoff type space and the space of first-order beliefs about other agents' payoff types. They obtain very permissive results.

In a different vein, Koray (2005) has argued that, since priors are not contractible, the regulator needs to be regulated in order to stop her from manipulating the priors. He shows that the outcomes of this game vary over a wide spectrum. Again, the need for prior-free implementation is clear.
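In small finite examples, many of the claims discussed above boil down to enumerating the pure-strategy Nash equilibria of a mechanism and comparing their outcomes with the SCR, state by state. The Python sketch below performs that brute-force check; the two-agent message spaces, the outcome function g, the states, the utilities, and the SCR F are hypothetical placeholders, not a mechanism taken from the papers cited here.

from itertools import product

# Brute-force check of Nash implementation for a tiny, hypothetical mechanism.
M = [("m1", "m2"), ("m1", "m2")]              # message space of each agent

def g(profile):                               # outcome function of the mechanism
    return "x" if profile[0] == profile[1] else "z"

states = ["y1", "y2"]
utility = {                                   # utility[state][agent][outcome]
    "y1": [{"x": 1, "z": 0}, {"x": 1, "z": 0}],
    "y2": [{"x": 0, "z": 1}, {"x": 0, "z": 1}],
}
F = {"y1": {"x"}, "y2": {"z"}}                # a placeholder social choice rule

def nash_outcomes(state):
    # outcomes of all pure-strategy Nash equilibria of (M, g) at this state
    outcomes = set()
    for profile in product(*M):
        def is_best_reply(i):
            current = utility[state][i][g(profile)]
            for deviation in M[i]:
                alternative = list(profile)
                alternative[i] = deviation
                if utility[state][i][g(tuple(alternative))] > current:
                    return False
            return True
        if all(is_best_reply(i) for i in range(len(M))):
            outcomes.add(g(profile))
    return outcomes

for state in states:
    print(state, "equilibrium outcomes:", nash_outcomes(state),
          "F(state):", F[state],
          "implemented:", nash_outcomes(state) == F[state])

The same loop, run over coalitional deviations instead of unilateral ones, gives a strong-equilibrium check and hence a crude test of double implementation in small examples.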
Answers to the Questions

1. Yes. We already saw in 4.3, Point 2, that "Bertrand-like" mechanisms implement the constrained Walrasian SCR in Nash and strong equilibrium. But this is not all: Schmeidler (1980) exploited the connection between price-taking, which underlies Walrasian equilibrium, and "strategy-taking," which underlies Nash and strong equilibrium, and obtained double implementation by a mechanism which does not resemble the market. Implementation of the Lindahl SCR by an abstract mechanism was obtained by Walker (1981), building on previous papers by Groves and Ledyard and Hurwicz. Unfortunately, these positive results turn negative when we consider Arrow-Debreu contingent commodities, Chattopadhyay et al. (2000) and Serrano and Vohra (2001).

2. A merger affects social welfare in two ways: positively, from cost savings, and negatively, from restricting competition. The first effect is uncertain, and by now, I do not have to convince you that we should take with utmost caution all announcements made by firms concerning cost savings. Corchón and Faulí-Oller (2004) show that under a condition that is fulfilled in several standard IO models, the SCR that maximize social surplus can be implemented by a dominance-solvable mechanism with budget balance (a toy illustration of dominance solvability is sketched after these answers).

3. There is a very simple mechanism which attains maximum surplus, Loeb and Magath (1979). But in this mechanism, the monopolist receives all the surplus and the demand function must be known by the planner. These points were worked out by subsequent contributions from Baron and Myerson, Lewis and Sappington, Sibley, and others.

4. By now, the reader should know the difficulties of implementing efficient public decisions. When information is exclusive, this is impossible, even though an approximately efficient decision can be obtained when the number of agents is large. When information is complete, we have seen several examples of mechanisms implementing efficient outcomes.

5. There is no difference between implementing market and fair outcomes. Both have to pass the same tests, i.e., incentive compatibility, monotonicity, and simplicity/credibility of design. In exchange economies, Thomson (2005) presents a simple and elegant mechanism that implements envy-free allocations in Nash equilibrium. In cooperative production, Corchón and Puy (2002) presented a family of mechanisms that implement in Nash equilibrium any efficient SCR where the distribution of rewards is a continuous function of efforts.

6. Yes! An uninformed planner can set up a mechanism that yields efficient outcomes in circumstances where the market yields inefficient allocations, i.e., under externalities or public goods; see Point 5 in section "The Limits of Design" above. All we need is nonexclusive information and that the SCR be monotonic; the latter requirement can be skipped under refinements of Nash equilibrium.

7. Not completely. Suppose complete information among three or more judges and that they all perceive the same quality of a given performance. Clearly, truth is an equilibrium, because if all judges minus you tell the truth, you cannot change the outcome by saying something different. Unfortunately, any unanimous announcement is also an equilibrium for the same reason. Thus, we are in a situation akin to T. 3. Fortunately, if the preferences of the judges fulfill certain restrictions, full implementation of the true ranking of ice skaters is possible, because monotonicity and no veto power hold so T. 7 applies, Amorós et al. (2002). Amorós (2009) found necessary and sufficient conditions on the preferences of the judges for the socially optimal ranking to be Nash implementable. Later on, Amorós (2016) presented a natural mechanism to pick out only the deserving winner (not a complete ranking of the participants). If judges have differential information, the truth is no longer implementable, as suggested by T. 200. See Gerardi et al. (2005) for further insights and references on this problem.

8. ????? Do you think that we have all the answers? This is just economics!!
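Answer 2 above appeals to a dominance-solvable mechanism. As a purely illustrative aid, the following Python sketch runs iterated elimination of strictly dominated strategies on a made-up 2x2 game (it is not the Corchón and Faulí-Oller mechanism) and reports whether a unique strategy profile survives.

# Iterated elimination of strictly dominated strategies in a finite
# two-player game; the payoff matrix is a hypothetical example.

payoffs = {                                   # (row strategy, column strategy)
    ("T", "L"): (3, 3), ("T", "R"): (1, 2),   # values: (row payoff, column payoff)
    ("B", "L"): (2, 1), ("B", "R"): (0, 0),
}
strategies = [["T", "B"], ["L", "R"]]         # player 0 (rows), player 1 (columns)

def payoff(player, own, other):
    profile = (own, other) if player == 0 else (other, own)
    return payoffs[profile][player]

def strictly_dominated(player, s, remaining):
    # is s strictly dominated by some other remaining strategy of `player`
    # against every remaining strategy of the opponent?
    opponent = 1 - player
    return any(
        all(payoff(player, alt, t) > payoff(player, s, t)
            for t in remaining[opponent])
        for alt in remaining[player] if alt != s
    )

remaining = [list(s) for s in strategies]
changed = True
while changed:
    changed = False
    for p in (0, 1):
        for s in list(remaining[p]):
            if len(remaining[p]) > 1 and strictly_dominated(p, s, remaining):
                remaining[p].remove(s)
                changed = True

print("surviving strategies:", remaining)
print("dominance solvable:", all(len(r) == 1 for r in remaining))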
Finally, I will tell you why I like implementation theory so much. Firstly, the implementation model solves the problems of the general equilibrium model mentioned in section "Brief History of Implementation Theory," namely, (1) it models a general economic system, (2) all variables are endogenously determined by the interaction of agents, and (3) agents' incentives are carefully modeled and are taken fully into account. Secondly, the theory is not based on assumptions like convexity or continuity/differentiability which, no matter how much we are used to them, are very stringent. By the way, a beautiful paper by Laffont and Maskin (referenced in their 1982 survey) developed incentive compatibility in a differentiable framework.

Acknowledgments I am grateful to Pablo Amorós, Claude d'Aspremont, Carmen Beviá, Luis Cabral, Eric Maskin, Bernardo Moreno, Carlos Pimienta, Socorro Puy, Tomas Sjöström, William Thomson, Matteo Triossi, Galina Zudenkova, and an anonymous referee for helpful suggestions and to the Spanish Ministry of Education for
financial support under grant SEJ2005-06167. I also thank Benoit J-P (2000) The Gibbard-Satterthwaite theorem: a
the Department of Economics, Stern School of Business, simple proof. Econ Lett 69:319–322
NYU, for their hospitality while writing the first draft of Bergemann D, Morris S (2005) Robust mechanism design.
this survey. This survey was dedicated to Leo Hurwicz to Econometrica 73:1771–1813
celebrate his Nobel Prize and to the memory of those who Beviá C, Corchón L (1995) On the generic impossibility of
contributed to the area and are no longer with us: Louis- truthful behavior. Economic Theory 6:365–371
André Gerard-Varet, Jean-Jacques Laffont, Richard Beviá C, Corchón L, Wilkie (2003) Implementation of the
McKelvey, and Murat Sertel. Walrasian correspondence by market games. Rev Econ
Des 7:429–442
Burguet R (1990) Revelation in informational dynamic
settings. Econ Lett 33:237–239. Corrigendum (1994),
Bibliography 44:451–452
Cabrales A (1999) Adaptive dynamics and the implemen-
Abreu D, Sen A (1990) Virtual implementation in Nash tation problem with complete information. J Econ The-
equilibrium. Econometrica 59:997–1021 ory 86:159–184
Abreu D, Sen A (1991) Subgame perfect implementation: a Candel F (2004) Dynamic provision of public goods. Eco-
necessary and almost sufficient condition. J Econ The- nomic Theory 23:621–641
ory 50:285–299 Chakravorty B, Corchón L, Wilkie S (2006) Credible
Akerlof G The market for lemons: qualitative uncertainty implementation. Game Econ Behav 57:18–36
and the market mechanism. Q J Econ 84:488–500 Chattopadhyay S, Corchón L, Naeve J (2000) Contingent
Alcalde J, Corchón L, Moreno B (1999) Pigouvian taxes: a commodities and implementation. Econ Lett 68:293–298
strategic approach. J Public Econ Theory 1(2):271–281 Clarke E (1971) Multipart pricing of public goods. Public
Amoros P (2004) Nash implementation and uncertain rene- Choice 19–33
gotiation. Game Econ Behav 49:424–434 Conitzer V, Sandholm T (2002) Complexity of mechanism
Amorós P (2009) Eliciting socially optimal rankings from design. In: Proceedings of the 18th annual conference
unfair jurors. J Econ Theory 144:1211–1226 on uncertainty in artificial intelligence (UAI-02),
Amorós P (2016) Subgame perfect implementation of the Edmonton
deserving winner of a competition with natural mech- Corchón L (1996) The theory of implementation of
anisms. Math Soc Sci 83:44–57 socially optimal decisions in economics. St. Martin’s
Amoros P, Corchón L, Moreno B (2002) The scholarship Press, New York
assignment problem. Game Econ Behav 38:1–18 Corchón L, Faulí-Oller R (2004) To merge or not to merge:
Arrow K (1977) The property rights doctrine and demand that is the question. Rev Econ Des 9:11–30
revelation under incomplete information. Technical Corchón L, Herrero C (2004) A decent proposal. Span
report No. 243, IMSSS. Stanford University, Aug 1977 Econ Rev 6(2):107–125
Artemov G, Kunimoto T, Serrano R (2007) Robust virtual Corchón L, Ortuño-Ortín I (1995) Robust implementation
implementation with incomplete information: towards under alternative information structures. Econ Des
a reinterpretation of the Wilson doctrine. W.P. 2007-06. 1:159–171
Brown University Corchón L, Puy S (2002) Existence and Nash implemen-
Baliga S, Sjöström T (2007) Mechanism design: recent tation of efficient sharing rules for a commonly owned
developments. In: Blume L, Durlauf S (eds) The new technology. Soc Choice Welf 19:369–379
palgrave dictionary of economics, 2nd edn Corchón L, Triossi M (2011) Implementation with renego-
Baliga S, Corchón L, Sjöström T (1997) The theory of tiation when preferences and feasible sets are state
implementation when the planner is a player. J Econ dependent. Soc Choice Welf 36(2):179–198
Theory 77:15–33 Corchón L, Wilkie S (1996) Doubly implementing the ratio
Barberá S (1983) Strategy-proofness and pivotal voters: a correspondence by a market mechanism. Rev Econ Des
direct proof of the Gibbard-Satterthwaite theorem. Int 2:325–337
Econ Rev 24:413–417 d’Aspremont C, Gerard-Varet L-A (1975) Individual
Barberá S, Peleg B (1990) Strategy-proof voting schemes incentives and collective efficiency for an externality
with continuous preferences. Soc Choice Welf 7:31–38 game with incomplete information. CORE DP 7519
Barberá S, Sonnenschein H, Zhou L (1991) Voting by d’Aspremont C, Gerard-Varet L-A (1979) Incentives and
committees. Econometrica 59(3):595–609 incomplete information. J Public Econ 11:25–45
Barberà S, Berga D, Moreno B (2010) Individual versus Dagan N, Serrano R, Volij O (1999) Feasible implementa-
group strategy-proofness: when do they coincide? tion of taxation methods. Rev Econ Des 4:57–72
J Econ Theory 145:1648–1674 Danilov V (1992) Implementation via Nash equilibria.
Barberà S, Berga D, Moreno B (2016) Group strategy- Econometrica 60(1):43–56
proofness in private good economies. Am Econ Rev Dasgupta P, Hammond P, Maskin E (1979) The implemen-
106(4):1073–1099 tation of social choice rules: some results on incentive
Barone E (1908) The ministry of production in a collectiv- compatibility. Rev Econ Stud 46:185–216
ist state. Translated from the italian and reprinted in Dutta B, Sen A (1991a) Implementation under strong equi-
F. A. von Hayek Collectivist Economic Planning, librium. A complete characterization. J Math Econ
Routledge and Keegan, 1935 20:49–67
Dutta B, Sen A (1991b) A necessary and sufficient condi- Jackson MO, Palfrey T, Srivastava S (1994) Undominated
tion for two-person Nash implementation. Rev Econ Nash implementation in bounded mechanisms. Game
Stud 58:121–128 Econ Behav 6:474–501
Dutta B, Sen A (2012) Nash implementation with partially Kalai E, Ledyard J (1988) Repeated implementation.
honest individuals. Game Econ Behav 74(1):154–169 J Econ Theory 83:308–317
Dutta B, Sen A, Vohra R (1995) Nash implementation Klemperer P (1999) Auction theory, a guide to the litera-
through elementary mechanisms in economic environ- ture. J Econ Surv 13:227–268
ments. Econ Des 1:173–204 Koray S (2005) The need of regulating a Bayesian regula-
Eliaz K (2002) Fault tolerant implementation. Rev Econ tor. J Regul Econ 28:5–21
Stud 69:589–610 Krishna V (2002) Auction theory. Academic, San Diego
Freixas X, Guesnerie R, Tirole J (1985) The ratchet effect. Krishna V, Perry M (1997) Efficient mechanism design.
Rev Econ Stud 52:173–191 Unpublished paper, Penn State University
Gerardi D, McLean R, Postlewaite A (2005) Aggregation Laffont J-J, Martimort D (2001) The theory of incentives:
of expert opinions. Cowles Foundation Discussion the principal-agent model. Princeton University Press,
Paper # 1503 Princeton
Gibbard A (1973) Manipulation of voting schemes: a gen- Laffont J-J, Maskin E (1982) The theory of incentives:
eral result. Econometrica 41:587–602 an overview. In: Hildenbrand W (ed) Advances in
Green J, Laffont J-J (1979) Incentives in public decision economic theory, 4th World Congress of the Econo-
making. North Holland, Amsterdam metric Society. Cambridge University Press,
Gresik T, Satterthwaite M (1989) The rate at which a New York
simple market converges to efficiency as the number Lange O (1936/1937) On the economic theory of social-
of traders increases. J Econ Theory 48:304–332 ism. Rev Econ Stud 4:53–71. 123–142
Groves T (1973) Incentives in teams. Econometrica 41:617–631 Ledyard J, Roberts J (1974) On the incentive problem with
Hammond P (1987) Markets as constraints: multilateral public goods. Discussion Paper 116, Centre for Math-
incentive compatibility in continuum economies. Rev ematical Studies in Economics and Management Sci-
Econ Stud 54:399–412 ence. Northwestern University
Harris M, Townsend R (1981) Resource allocation under Lee J, Sabourian H (2011) Efficient repeated implementa-
asymmetric information. Econometrica 49:33–64 tion. Econometrica 79(6):1967–1994
Harsanyi J (1967/1968) Games with incomplete informa- Loeb M, Magath W (1979) A decentralized method for
tion played by ‘Bayesian’ players. Parts I, II and III. utility regulation. J Law Econ 22:399–404
Manag Sci 14:159–182. 320–334 and 486–502 Ma A, Moore J, Turnbull S (1988) Stop agents from
Hong L (1998) Feasible Bayesian implementation with state cheating. J Econ Theory 46:355–372
dependent feasible sets. J Econ Theory 80:201–221 Macho-Stadler I, Pérez-Castrillo D, Wettstein D (2007)
Hurwicz L (1959) Optimality and informational efficiency Sharing the surplus: an extension of the Shapley value
in resource allocation processes. In: Arrow KJ for environments with externalities. J Econ Theory
(ed) Mathematical methods in the social sciences. 135(1):339–356
Stanford University Press, Stanford, pp 27–46 Makowski L, Mezzetti C (1993) The possibility of efficient
Hurwicz L (1972) On informationally decentralized sys- mechanisms for trading an indivisible object. J Econ
tems. In: Radner R, McGuire CB (eds) Decision and Theory 59:451–465
organization: a volume in honor of Jacob Marshak. Maskin E (1985) The theory of implementation in Nash
North-Holland, Amsterdam, pp 297–336 equilibrium: a survey. In: Hur-wicz L, Schmeidler D,
Hurwicz L (1979) On allocations attainable through Nash Sonnenschein H (eds) Social goals and social organi-
equilibria. J Econ Theory 21:40–65 zation. Cambridge University Press, Cambridge, UK,
Hurwicz L, Walker M (1990) On the generic non- pp 173–204
optimality of dominant strategy mechanisms. Maskin E (1999) Nash equilibrium and welfare optimality.
Econometrica 58(3):683–704 Rev Econ Stud 66(1):23–38. Circulating in working
Hurwicz L, Maskin E, Postlewaite A (1995) Feasible Nash paper version since 1977
implementation of social choice rules when the Maskin E, Moore J (1999) Implementation with renegoti-
designer does not know endowments or production ation. Rev Econ Stud 66:39–56
set. In: Ledyard J (ed) The economics of informational Maskin E, Sjöström T (2002) Implementation theory,
decentralization: complexity, efficiency and stability. Chapter 5. In: Arrow KJ, Sen AK (eds) Handbook of
Kluwer, Dordrecht social choice and welfare. Elsevier
Jackson M (1991) Bayesian implementation. Matsushima H (1988) A new approach to the implementa-
Econometrica 59:461–477 tion problem. J Econ Theory 45:128–144
Jackson M (1992) Implementation in undominated strate- Matsushima H (2008) Role of honesty in full implementa-
gies: a look to bounded mechanisms. Rev Econ Stud tion. J Econ Theory 139:353–359
59:757–775 McKelvey R (1989) Game forms for Nash implementation
Jackson MO (2001) A crash course in implementation of general social choice correspondences. Soc Choice
theory. Soc Choice Welf 18:655–708 Welf 6:139–156
Jackson MO, Palfrey T (2001) Voluntary implementation. Mirless J (1971) An exploration in the theory of optimum
J Econ Theory 98:1–25 income taxation. Rev Econ Stud 38:175–208
Mookherjee D, Reichelstein S (1990) Implementation via Repullo R (1987) A simple proof of Maskin theorem on
augmented revelation mechanisms. Rev Econ Stud Nash implementation. Soc Choice Welf 4:39–41
57:453–475 Roth AE (2008) What have we learned from market
Moore J (1992). Implementation, contracts and renegotia- design? Econ J 118:285–310
tion in environments with complete information. In: Rothchild M, Stiglitz J (1976) Equilibrium in competitive
Laffont J-J (ed) Advances in economic theory, 4th insurance markets: an essay on the economics of imper-
World Congress of the Econometric Society, fect information. Q J Econ 90:629–650
vol I. Cambridge University Press, Cambridge, UK, Rubinstein A, Wolinsky A (1992) Renegotiation-proof
pp 182–282 implementation and time preferences. Am Econ Rev
Moore J, Repullo R (1988) Subgame perfect implementa- 82:600–614
tion. Econometrica 56:1191–1220 Saari D (1987) The source of some paradoxes from social
Moore J, Repullo R (1990) Nash implementation: a full choice and probability. J Econ Theory 41:1–22
characterization. Econometrica 58:1083–1089 Saijo T (1988) Strategy space reduction in Maskin’s theo-
Moulin H (1979) Dominance solvable voting schemes. rem. Econometrica 56:693–700
Econometrica 47(6):1337–1351 Saijo T (1991) Incentive compatibility and individual ratio-
Muller E, Satterthwaite M (1977) The equivalence of nality in public good economies. J Econ Theory
strong positive association and strategy-proofness. 55:103–112
J Econ Theory 14:412–418 Saijo T, Tatamitani Y, Yamato T (1996) Toward natural
Myerson R (1979) Incentive compatibility and the implementation. Int Econ Rev 37(4):949–980
bargaining problem. Econometrica 47:61–73 Saijo T, Sjöström T, Yamato T (2007) Secure implementa-
Myerson R (1981) Optimal auction design. Math Oper Res tion. Theor Econ 2:203–229
6:58–73 Samuelson PA (1954) The pure theory of public expendi-
Myerson R (1985) Bayesian equilibrium and incentive ture. Rev Econ Stat 36:387–389
compatibility: an introduction, Chapter 8. In: Sandholm W (2007) Pigouvian pricing and stochastic evo-
Hurwicz L, Schmeidler D, Sonnenschein H (eds) Social lutionary implementation. J Econ Theory 132:367–382
goals and social organization. Cambridge University Satterthwaite (1975) Strategy-proofness and arrows condi-
Press, Cambridge, UK tions: existence and correspondence theorems for vot-
Myerson R, Satterthwaite MA (1983) Efficient mecha- ing procedures and social choice functions. J Econ
nisms for bilateral trading. J Econ Theory 29:265–281 Theory 10:187–217
Palfrey TR (2002) Implementation theory. In: Aumann RJ, Schmeidler D (1980) Walrasian analysis via strategic out-
Hart S (eds) Handbook of game theory with economic come functions. Econometrica 48:1585–1593
applications, vol III. Elsevier Science, New York, Sen A (2001) Another direct proof of the Gibbard-
pp 2271–2326 Satterthwaite theorem. Econ Lett 70:381–385
Palfrey T, Srivastava S (1987) On Bayesian implementable Serizawa S (2002) Inefficiency of strategy-proof rules for
allocations. Rev Econ Stud 54:193–208 pure exchange economies. J Econ Theory 106:219–241
Palfrey T, Srivastava S (1989) Implementation with incom- Serizawa S, Weymark J (2003) Efficient strategy-proof
plete information in exchange economies. Econometrica exchange and minimum consumption guarantees.
57:115–134 J Econ Theory 109:246–263
Palfrey T, Srivastava S (1991) Nash implementation using Serrano R (2004) The theory of implementation of social
undominated strategies. Econometrica 59:479–501 choice rules. SIAM Rev 46:377–414
Peleg B (1996) Double implementation of the Lindahl Serrano R, Vohra R (1997) Non cooperative implementa-
equilibrium by a continuous mechanism. Rev Econ tion of the core. Soc Choice Welf 14:513–525
Des 2(1):311–324 Serrano R, Vohra R (2001) Some limitation of Bayesian
Pérez-Castrillo D, Wettstein D (2002) Choosing wisely: a virtual implementation. Econometrica 69:785–792
multi-bidding approach. Am Econ Rev 92:1577–1587 Sertel M, Samver R (1999) Equilibrium outcomes of
Postlewaite A. Manipulation via endowments. Rev Econ Lindahl-endowment pretension games. Eur J Polit
Stud 46:255–262 Econ 15:149–162
Postlewaite A, Schmeidler D (1986) Implementation in Shin S, Suh S-C (1997) Double implementation by a sim-
differential information economies. J Econ Theory ple game form in the commons problem. J Econ Theory
39:14–33 77:205213
Prize Committee of the Royal Swedish Academy of Sci- Sjöström T (1993) Implementation in perfect equilibria.
ences (2007) Mechanism design theory. Royal Swedish Soc Choice Welf 10:97–106
Academy of Sciences, Stockholm Sjöström T (1994) Implementation in undominated Nash
Reiter S (1977) Information and performance in the (new) equilibrium without integer games. Game Econ Behav
welfare economics. Am Econ Rev 67:226–234 6:502–511
Repullo R (1986) On the revelation principle under com- Sjöström T (1996) Implementation by demand mecha-
plete and incomplete information. In: Binmore K, nisms. Econ Des 1:343–354
Dasgupta P (eds) Economics organizations as games. Spence M (1973) Job market signalling. Q J Econ
Basil Blackwell, Oxford 87:355–374
Taylor FM (1929) The guidance of production in a socialist Vickrey W (1961) Counterspeculation, auctions and com-
state. Am Econ Rev 19:1–8 petitive sealed tenders. J Financ 16:8–37
Thomson W (1985) Manipulation and implementation in Walker M (1981) A simple incentive compatible scheme
economics. Unpublished manuscript, Rochester for attaining Lindahl allocations. Econometrica
Thomson W (1987) The vulnerability to manipulative 49:65–73
behavior of economic mechanisms designed to select Williams S (1986) Realization of Nash implementation:
equitable and efficient outcomes, Chapter 14. In: two aspects of mechanism design. Econometrica
Groves T, Radner R, Reiter S (eds) Information, incen- 54:139–151
tives and economic mechanisms. University of Minne- Williams S (1999) A characterization of efficient, Bayesian
sota Press, pp 375–396 incentive compatible mechanisms. Economic Theory
Thomson W (1996) Concepts of implementation. Jpn Econ 14:155–180
Rev 47:133–143 Yamato T (1993) Double implementation in Nash and
Thomson W (2005) Divide and permute. Game Econ undominated Nash equilibria. J Econ Theory
Behav 52:186–200 59(2):311–323
Tian G (1996) On the existence of optimal truth-dominant Yamato T (1994) Equivalence of Nash implementability
mechanisms. Econ Lett 53:17–24 and robust implementability with incomplete informa-
Tian G, Li Q (1995) On Nash-implementation in the pres- tion. Soc Choice Welf 11:289–303
ence of withholding. Game Econ Behav 9:222–233
Agent f compares two allowable sets by compar-
Two-Sided Matching Models ing the values of these sets. This concept is
similarly defined for w W. If the agents have
Marilda Sotomayor1,2 and Ömer Özak3 additively separable preferences, we can think
1
Department of Economics, University of Sao that if a partnership (f,w) F W is formed,
Paulo, Sao Paulo, Brazil then the partners participate in some joint activity
2
EPGE Brazilian School of Economics and that generates a payoff afw for player f and bfw for
Finance, Sao Paulo, Brazil player w. These numbers are fixed, i.e., they are
3
Department of Economics, Brown University, not negotiable. If the preferences of the agents are
Providence, USA additively separable, then they are responsive.
The converse is not true (see Kraft et al. 1959).
Allowable set of partners for f F with quota
Article Outline r( f ) is a family of elements of F[W with
k distinct W-agents, 0 k r ð f Þ, and r( f )
Glossary k repetitions of f.
Definition of the Subject Continuous two-sided matching model In this
Basic Definitions model, the structure of preferences is given by
A Brief Historical Account utility functions which are continuous in some
Gale-Shapley Algorithm with the Colleges money variable which varies continuously in the
Proposing to the Applicants set of real numbers. A particular case is obtained
Gale-Shapley Algorithm with the Applicants when agents place a monetary value on each
Proposing to the Colleges possible partner or on each possible set of partners.
Introduction Discrete two-sided matching model In the dis-
Discrete Two-Sided Matching Models crete two-sided matching models, agents have
Continuous Two-Sided Matching Model With preferences over allowable sets of partners. The
Additively Separable Utility Functions allowable sets of partners for f of the type {w,
Hybrid One-to-One Matching Model f,. . .,f } are identified with the individual agent
Incentives w Wand the allowable set of partners { f. . .f } is
Future Directions identified with f. Under this identification, agent
Bibliography w is acceptable to agent f if and only if f likes
w as well as himself/herself/itself. Similar defini-
tions and identifications apply to an agent w W.
Glossary These preferences are transitive and complete, so
they can be represented by ordered lists of pref-
Achievable mate for agent y in a discrete two- erences. The model can then be described by (F,
sided matching model is any y’s of partner W,P,r,s), where P is the profile of preferences and
under some stable matching. r and s are the arrays of quotas for the F-agents
Additively separable preferences in a discrete and W-agents, respectively.
two-sided matching model with sides F and W. Feasible assignment for a two-sided matching
Agent f F has additively separable preferences model with sides F and W is an m n matrix
if he/she/it assigns a nonnegative number afw to x ¼ (xfw) whose entries are zeros or ones
X
each w W and assigns the value vðAÞ ¼ such that x f w sðwÞ for all w W and
f
X X
to each allowable set A of partners for f. x fw r ð f Þ for all f F. We say that
w Aa fw w
xfw ¼ 1 if f and w form a partnership and xfw ¼ say that a player that does not enter any part-
0 otherwise. A feasible assignment nership is unmatched. Agents compare two
x corresponds to a matching m which matches matchings by comparing the two allowable
f to w if and only if xfw ¼ 1. Thus, if sets of partners they obtain.
X
x f w ¼ 0 , then w is unassigned at x or, Maximin preferences in a discrete two-sided
f matching model with sides F and W. Agent
equivalently, unmatched at m, and if f F with a quota of r( f ) has a maximin prefer-
X
x f w ¼ 0 , then f is likewise unassigned at ence relation over allowable sets of partners if
w whenever two allowable sets C and C0 contained
x or, equivalently, unmatched at m. in W, such that f prefers C0 to C and no w in C is
F-optimal stable matching (respectively, pay- unacceptable to f, then a) all of C0 are acceptable
off) for a discrete (respectively, continuous) to f and b) if | C | ¼ r( f ), then the least preferred
two-sided matching model is the stable worker in C0 C is preferred by f to the least
matching (respectively, payoff) which is preferred worker in C C0. Similarly we define
weakly preferred by every agent in F. Similarly maximin preference for w W.
we define the W-optimal stable matching Choice set of f F from A W(Chf (A)) in a
(respectively, payoff). discrete two-sided matching model with sides
Hybrid two-sided matching model is a unifica- F and W. Let B ¼ {A0 | A0 is an allowable set of
tion of the discrete and the continuous models. It partners for f and A0\W is contained in A}. Then,
is obtained by allowing the agents of both mar- A0 Chf (A) if and only if A0 B and f likes A0 at
kets to trade with each other in the same market. least as well as A00, for all A00 B. Similarly we
Lattice property A set L endowed with a partial define Chw(A) for w W and A F.
order relation has the lattice property if sup Outcome For the discrete two-sided matching
{x,y} x_y and inf{x,y} x^y are in L, for all models, the outcome is a matching or at least
x,x0 L. The lattice is complete if all its subsets corresponds to a matching; for the continuous
have a supremum and an infimum (see two-sided matching models, the outcome spec-
Birkhoff 1973). ifies a payoff for each agent and a matching.
Manipulable mechanism A mechanism h is Pareto-optimal matching A feasible matching
manipulable or it is not strategy proof if in m is Pareto optimal if there is no feasible
some revelation game induced by h, stating the matching which is weakly preferred to m by
true preferences is not a dominant strategy for at all players and it is strictly preferred by at least
least one player. A mechanism h is collectively one of them.
manipulable if in some revelation game induced Quota of an agent Quota of an agent in a two-
by h, there is a coalition whose members can be sided matching model is the maximum number
better off by misrepresenting its preferences. of partnerships an agent is allowed to form.
Matching mechanism For the discrete two- When every participant can form one partner-
sided matching models, a matching mecha- ship at most, the matching model is called one
nism is a function h whose range is the set of to one. If only the players of one of the sides can
all possible inputs X ¼ (F,W,P,r,s) and whose form more than one partnership, the matching
output h(X) is a matching for X. model is said to be many to one. Otherwise the
Matching m in a two-sided matching model with matching model is many to many.
sides F and W is a function that maps every r(f)-separable preference in a discrete two-
agent into an allowable set of partners for sided matching model with sides F and W.
him/her/it, such that f is in m(w) if and only if Agent f F with a quota of r( f ) has an r( f )-
w is in m( f ), for every ( f,w) F W. If we separable preference relation over allowable
relax this condition, the function is called a sets of partners if whenever A ¼ B[{w}\{f }
pre-matching.A matching describes the set of with w=2B and f B, then f prefers A to B if and
partnerships of the type ( f,w), ( f,f ), or (w,w), only if f prefers w to f. Similarly we define s(w)-
with f F and w W, formed by the agents. We separable preference for w W.
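The glossary's definition of a feasible assignment translates directly into a short check. The Python sketch below is a minimal illustration under invented data (the firms, workers, quotas, and the 0-1 matrix are placeholders): it verifies that every entry of x is 0 or 1, that each column sum respects the worker quota s(w), and that each row sum respects the firm quota r(f).

# Minimal check of the feasibility conditions for an assignment matrix
# x = (x_fw) with entries 0 or 1:  sum_f x_fw <= s(w)  and  sum_w x_fw <= r(f).
# The data below are hypothetical, chosen only to illustrate the definition.

firms = ["f1", "f2"]
workers = ["w1", "w2", "w3"]
r = {"f1": 2, "f2": 1}            # quotas of the firms
s = {"w1": 1, "w2": 1, "w3": 1}   # quotas of the workers

# x[i][j] = 1 iff firm i and worker j form a partnership
x = [
    [1, 1, 0],   # f1 hires w1 and w2
    [0, 0, 1],   # f2 hires w3
]

def is_feasible(x):
    rows_ok = all(sum(x[i]) <= r[f] for i, f in enumerate(firms))
    cols_ok = all(sum(x[i][j] for i in range(len(firms))) <= s[w]
                  for j, w in enumerate(workers))
    entries_ok = all(v in (0, 1) for row in x for v in row)
    return rows_ok and cols_ok and entries_ok

print("feasible assignment:", is_feasible(x))

The matching that corresponds to x is then read off from the entries equal to 1, as in the glossary entry above.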
colleges. Thus, in this algorithm, each hospital applies to its quota of students. The confirmation of this fact was obtained by David Gale in 1975. In a letter from December 8, 1975, in response to a letter from Gale to the NRMP, Elliott Peranson, consultant to NRMP, responsible for the technical operation of the matching program, says the following:

However, I might point out that the NRMP algorithm in fact uses the inverse procedure and produces the unique "college optimal" assignment rather than this "student optimal" assignment. This procedure more closely parallels the actual admissions process where a matching algorithm is not used. In this case students apply to all hospitals they would consider (not just their first choice), each hospital then selects the most desirable students, up to its quota limit, from amongst all applicants, then the "bid-for" students reject all but the most desirable offer, and so on.

Hence, the proof that the NRMP was yielding a stable matching, which was the optimal stable matching for the colleges, is that such an outcome is always obtained by Gale and Shapley's algorithm with the colleges proposing.

The discovery that the two algorithms were mathematically equivalent was first spread orally and later reported in Gale and Sotomayor (1983) with an equivalent description of the NRMP algorithm (Roth (1984c) also presents the NRMP algorithm). This was the first application of the matching theory of which we have knowledge.

The algorithms proposed by Gale and Shapley are described below. Their description is quoted from Gale and Sotomayor (1983).

Gale-Shapley Algorithm with the Colleges Proposing to the Applicants

Each hospital H tentatively admits its quota qH, consisting of the top qH applicants on its list. Applicants who are tentatively admitted to more than one hospital tentatively accept the one they prefer. Their names are then removed from the lists of all other hospitals which have tentatively admitted them. This gives the first tentative matching. Hospitals which now fall short of their quota again admit tentatively until either their quota is again filled or they have exhausted their list. Admitted applicants again reject all but their favorite hospital, giving the second tentative matching, etc. The algorithm terminates when, after some tentative matching, no hospitals can admit any more applicants either because their quota is full or they have exhausted their list. The tentative matching then becomes permanent.

Gale-Shapley Algorithm with the Applicants Proposing to the Colleges

Each applicant petitions for admission to his/her favorite hospital. In general, some hospitals will have more petitioners than allowed by their quota. Such oversubscribed hospitals now reject the lowest petitioners on their preference list so as to come within their quota. This is the first tentative matching. Next, rejected applicants petition for admission to their second favorite hospital and again oversubscribed hospitals reject the overflow, etc. The algorithm terminates when every applicant is tentatively admitted or has been rejected by every hospital on his/her list.

The fact that the matching produced by the NRMP algorithm is stable stands as one of the most important applications of game theory to economics. For about fifty years, the allocation procedures used to assign interns to hospitals in the United States produced unstable matchings. Unsuccessful procedures were often proposed by the Association of American Medical Colleges. This sequence of events culminated in a centralized mechanism that employed the NRMP algorithm. Such a centralized mechanism lasted for almost fifty years, suggesting that interns and hospitals had reached an equilibrium. And the paper written by Gale and Shapley corroborated that the game-theoretical predictions were, once more, correct.

Before the publication of Gale and Sotomayor (1983), a few, but important, contributions were made to the theory of two-sided matching markets. The famous 1972 paper by Shapley and Shubik establishes the assignment game via the introduction of money, as a continuous variable, into the marriage model. The book Mariages Stables by Knuth was published in 1976. In this volume, the proof,
attributed to Conway, that the set of stable manipulability theorem attracted the authors'
matchings for the marriage model is a lattice is interest toward a fruitful line of investigation
presented. The assignment game was generalized concerning the incentives facing the agents when
by Kelso and Crawford (1982) to a model where an allocation mechanism is employed. Algorithms
the utilities satisfy some gross substitute condition. have been developed for this purpose for several
Another generalization, which considers continuous matching models. The games induced by such
utility functions, non-necessarily linear, was pre- mechanisms are played noncooperatively by the
sented in Demange and Gale (1985). agents, and in general, their self-enforcing agree-
Nevertheless, among the contributions of this ments lead to a stable outcome. In these cases, a
period, one of them caused considerable impact. noncooperative implementation of the set of sta-
This was the non-manipulability theorem by Dubins ble outcomes is provided. Analyzing the strategic
and Freedman (1981). In a stable revelation mech- behavior of the agents in such games has been an
anism, for every profile of preferences that can be important subject of research of several authors in
selected by the agents, some algorithm that yields a an attempt to get precise answers to the strategic
stable matching is used. These authors prove that the questions raised. In this direction, we can cite
revelation mechanism that produces the optimal Roth (1982; Roth 1984a), Gale and Sotomayor
stable matching for a given side of the marriage (1985), Perez-Castrillo and Sotomayor (2002),
market is not collectively manipulable by the agents Sotomayor (2004a, b, c), Kamecke (1989), Kara
of that side. Also, this non-manipulability result and Sönmez (1996, 1997), Alcalde (1996), and
holds for the college admission market when the Alcalde et al. (1998), among others.
mechanism yields the student-optimal stable Over all these years, the popularity of the
matching. An analog to this result was proved in matching theory has spread among mathemati-
Demange and Gale (1985) for the continuous model cians and economists, mainly due to the publica-
through a key lemma that became known in the tion in 1990 of the first edition of the book Two-
literature as the blocking lemma. The main chal- Sided Matchings: A Study in Game Theoretic
lenge that motivated Gale and Sotomayor (1983) Modeling and Analysis, by Roth and Sotomayor,
was to prove the discrete version of the blocking which attempts a comprehensive survey of the
lemma, which allowed to prove the non- main results on the two-sided matching theory
manipulability theorem in just three lines. Two sim- until that date (an extensive bibliography can
ple and short proofs (one with the use of the algo- also be found in http://kuznets.fas.harvard.edu/
rithm and the other one without the use of the ~aroth/bib.html#matchbib).
algorithm) of Dubins and Freedman’s theorem The stable matching problem has been gener-
were presented as an alternative to the original alized to several two-sided matching models,
proof by the authors which was about twenty which have been widely modeled and analyzed
pages long. An example in Dubins and Freedman under both cooperative and noncooperative game-
(1981), where some woman can be better off by theoretic approaches. Through these models, a
falsifying her preference list when the man-optimal variety of markets has become better understood,
stable matching is to be employed, motivated Gale which has considerably contributed to their orga-
and Sotomayor (1985). This paper proves that such nization. The deferred acceptance algorithm of
a mechanism can almost always be manipulated by Gale and Shapley and adaptations of it have
the women and then treats the strategic possibilities been applied in the reorganization of admissions
for these agents in the corresponding strategic game. processes of many two-sided matching markets.
Another paper of this period was Roth (1982), And the design of these mechanisms has also
which proves, via an example, that any rule for raised new theoretical questions (in this connec-
selecting a stable matching is manipulable (either tion, see, e.g., Balinski and Sönmez (1999), Ergin
by some man or some woman). and Sönmez (2006), and Pathak and Sönmez
The existence theorem of manipulability by the (2006), Abdulkadiroglu and Sönmez (2003), and
women, the impossibility theorem, and the non- Bardella and Sotomayor (2006)).
they can be represented by ordered lists of prefer- workers; worker w1 may take, at most, one job
ences. Thus, the individual preference relation of and prefers f1 to f2; worker w2 may work and
firm f can be represented by an ordered list of wants to work for both firms. If the agents can
preferences P(f) on the set W[{ f }; the individual communicate with each other, the outcome that
preference relation of worker w can be represented we expect to observe in this market is obvious: f1
by an ordered list of preferences P(w), on the set hires both workers and f2 hires only worker w2. Of
F[{w}. The array of these preferences will be course, this outcome is in the strong core. Since f1
denoted by P. Then, an agent w is acceptable to has a quota of two and w1 prefers f1 to f2, we
an agent f if w f f . Similarly, an agent f is accept- cannot expect to observe the strong corewise-
able to an agent w if f w w. stable outcome where f1 hires only w2 and f2
If an agent may form more than one partner- hires both workers. That is, both outcomes are in
ship, then his/her/its preferences are not restricted the strong core, but only the first one is expected
to the individual potential partners. That is, agents to occur. Our explanation for this is that only the
have preferences over any allowable set of part- first outcome is setwise stable.
ners. Given two allowable sets of partners, A and
B, for agent y F[W, we write A > yB to mean On the other hand, the pairwise stability con-
y prefers A to B and A yB to mean y likes A at cept is not a refinement of the core for the discrete
least as well as B. In order to fix ideas, we will many-to-many matching models. See Example 2.
assume that these preferences are responsive to
the agents’ individual preferences and are not Example 2 (Sotomayor 1999b) (A pairwise-
necessarily strict. stable matching which is not in the core) Con-
The rules of the market are that any firm and sider the following labor market with a set of firms
worker pair may sign an employment contract with F ¼ {f1, f2, f3, f4} and a set of workers W ¼ {w1,w2,
each other if they both agree; any firm may choose to w3,w4}, where each firm can hire two workers and
keep one or more of its positions unfilled, and any each worker can work for two firms. If firm fi hires
worker may choose not to fill his or her quota of jobs worker wj, then fi gets the profit aij and wj gets the
if he/she wishes to do so. salary bij . The pairs of numbers (aij,bij) are given in
When the quota of every agent is one, we have Table 1.
the marriage model. In this case, an allowable set
of partners for any agent is a singleton. Therefore, Consider the matching m where f1 and f2 are
every agent only has preferences over individuals. matched to {w3,w4} and f3 and f4 are matched to
If only the agents of one of the sides are allo- {w1,w2} (the payoffs of each matched pair are pre-
wed to have a quota greater than one, then we sented in boldface). This matching is pairwise stable.
have the so-called college admission model with In fact, f3 and f4 do not belong to any pair which
responsive preferences. causes instability, because they are matched to their
In both models, it is a matter of verification that two best choices: w1 and w2; ( f1,w1) and ( f1,w2) do
the sets of setwise-stable matchings, pairwise- not cause instabilities since f1 is the worst choice for
stable matchings, and strong corewise-stable w1 and w2 is the worst choice for f1; ( f2,w1) and
matchings coincide. For the many-to-many case, ( f2,w2) do not cause instabilities since w1 is the worst
this is not always true. The strong corewise sta- choice for f2 and f2 is the worst choice for w2.
bility concept is not a natural solution concept for Two-Sided Matching Models, Table 1 Payoff matrix
this model as Example 1 shows. of Example 2. Each row represents a firm and each column
a worker. The values in the cell represent the payoffs to the
Example 1 (Sotomayor 1999b) (The corewise corresponding firm (first value) and worker (second value)
stability concept is not a natural solution con- 10,1 1,10 4,10 2,10
cept for the many-to-many case) Consider two 1,10 10,1 4,4 2,4
firms f1 and f2 and two workers w1 and w2. Each 10,4 4,4 2,2 1,2
firm may employ and wants to employ both 10,2 4,2 2,1 1,1
Nevertheless, f1 and f2 prefer {w1,w2} to {w3,w4} remaining players are matched to their best
and w1 and w2 prefer { f1, f2} to { f3, f4}. Hence this choices. However, if A contains one player of the
matching is not in the core, since it is blocked by the set {f1, f2, w1,w4}, then A must contain all four
coalition {f1, f2, w1, w2}. players. In fact, if f1 A, then f1 must form a new
Example 3 shows that setwise stability is a partnership with w1, so w1 A. If w1 A, then w1
strictly stronger requirement than pairwise stability must form a new partnership with f2, so f2 A. If
plus strong corewise stability. It presents a situation f2 A, then f2 must form a new partnership with w4,
in which the set of setwise-stable matchings is a so w4 A. Finally, if w4 A, then w4 must form a
proper subset of the intersection of the strong core new partnership with f1, so f1 A. Thus, if m0
with the set of pairwise-stable matchings. weakly dominates m via A, then f1, f2, w1, and w4
are in A and f1 and f2 form new partnerships with w1
Example 3 (Sotomayor 1999b) (A strong and w4. Nevertheless, f1 must keep his partnership
corewise-stable matching which is pairwise sta- with w2, his best choice. Then w2 must be in A, so
ble and is not setwise stable) Consider the fol- she cannot be worse off and so f5 must also be in A.
lowing labor market with a set of firms F ¼ {f1, f2, But f5 requires the partnership with w4, who has
f3, f4, f5, f6} and a set of workers W ¼ {w1,w2,w3, quota of 2 and has already filled her quota with f1
w4,w5,w6,w7}, where r1 ¼ 3, r2 ¼ r5 ¼ 2, r3 ¼ r4 ¼ and f2. Hence f5 is worse off at m0 than at m and then
r6 ¼ 1, s1 ¼ s2 ¼ s4 ¼ 2, and s3 ¼ s5 ¼ s6 ¼ s7 ¼ m0 cannot weakly dominate m via A.
1. If firm fi hires worker wj, then fi gets profit aij The matching m is clearly pairwise stable. Nev-
and the worker gets a salary bij . The pairs of ertheless, the coalition {f1,f2,w1,w4} causes an
numbers (aij,bij) are given in Table 2. instability in m. In fact, if f1 is matched to {w1,
w2,w4} and f2 is matched to {w1,w4}, then f1 gets
Let m be the matching given by 28 and the rest of the players in the coalition get
11 instead of 6. Hence, the matching m is not
mð f 1 Þ ¼ fw2 , w3 , w7 g, mð f 2 Þ ¼ fw5 , w6 g, setwise stable.
mð f 3 Þ ¼ mð f 4 Þ ¼ fw1 g, mð f 5 Þ ¼ fw2 , w4 g, and The question is then to know if, given any two
mð f 6 Þ ¼ fw4 g: sets of agents with their respective preferences
and quotas, one can always find a setwise-stable
The associated payoffs are shown in bold in matching. The answer is affirmative for the one-
Table 2. This matching is in the strong core. In to-one matching model and for the many-to-one
fact, if there is a matching m0 which weakly dom- matching models with substitutable preferences.
inates m via some coalition A, then, under m0, no The existence of setwise-stable matchings for the
player in A is worse off and at least one player in marriage model was first proved by Gale and
A is better off. Furthermore, matching m0 must Shapley (1962). Sotomayor (1996a) also provides
match all members of A among themselves. By a simple proof of the existence of stable matchings
inspection, we can see that the only players that for the marriage model that connects stability with
can be better off are f1, f2, w1, and w4, for all a broader notion of stability with respect to
unmatched agents. Gale and Shapley construct a
deferred acceptance algorithm described below
Two-Sided Matching Models, Table 2 Payoff matrix
of Examples 3 and 4. Each row represents a firm and each
and prove that it yields a stable matching in a
column a worker. The values in the cell represent the finite number of steps.
payoffs to the corresponding firm (first value) and worker The deferred acceptance algorithm with the
(second value) men making the proposals. Each man begins by
        w1      w2      w3      w4      w5      w6      w7
f1    13,1   14,10    4,10    1,10     0,0     0,0    3,10
f2    1,10     0,0     0,0    10,1    4,10    2,10     0,0
f3    10,4     0,0     0,0     0,0     0,0     0,0     0,0
f4    10,2     0,0     0,0     0,0     0,0     0,0     0,0
f5     0,0     9,9     0,0    10,4     0,0     0,0     0,0
f6     0,0     0,0     0,0    10,2     0,0     0,0     0,0
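The deferred acceptance algorithm with the men making the proposals, described verbally in the surrounding text, can be rendered in a few lines of Python. The sketch below is a schematic implementation for the one-to-one (marriage) case with strict preference lists over acceptable partners only; the three-men, three-women example at the bottom is invented for illustration.

# Deferred acceptance, men proposing, for the marriage (one-to-one) model.
# Preference lists are strict and contain only acceptable partners.

def deferred_acceptance(men_prefs, women_prefs):
    # Returns a dict woman -> man; men left unmatched do not appear as values.
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_proposal = {m: 0 for m in men_prefs}        # index of next woman to try
    engaged_to = {}                                  # woman -> currently held man
    free_men = list(men_prefs)

    while free_men:
        m = free_men.pop()
        prefs = men_prefs[m]
        if next_proposal[m] >= len(prefs):
            continue                                 # m has exhausted his list
        w = prefs[next_proposal[m]]
        next_proposal[m] += 1
        if m not in rank[w]:                         # m is unacceptable to w
            free_men.append(m)
        elif w not in engaged_to:                    # w keeps her best proposal so far
            engaged_to[w] = m
        elif rank[w][m] < rank[w][engaged_to[w]]:    # w prefers m: swap
            free_men.append(engaged_to[w])
            engaged_to[w] = m
        else:                                        # w rejects m
            free_men.append(m)
    return engaged_to

# Hypothetical example with three men and three women.
men = {"m1": ["w1", "w2", "w3"], "m2": ["w1", "w3", "w2"], "m3": ["w2", "w1", "w3"]}
women = {"w1": ["m2", "m1", "m3"], "w2": ["m1", "m3", "m2"], "w3": ["m1", "m2", "m3"]}
print(deferred_acceptance(men, women))

Adapting the sketch to the college admission or hospital-intern case amounts to letting each proposal-receiving agent hold her best q proposals instead of a single one.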
who was rejected at the previous step proposes to matchings may be empty. Gale and Shapley (1962)
his next choice (his most preferred woman among present an example of a one-sided matching model
those who have not rejected him), as long as there that does not have any stable matchings. This
remains an acceptable woman to whom he has not model was called by these authors the “roommate
yet proposed (if at one step a man has been problem.” In this example, there are four agents {a,
rejected by all of his acceptable women, he issues b,c,d}, such that a’s first choice is b, b’s first choice
no further proposals). The algorithm terminates is c, c’s first choice is a, and d is the last choice of
after any step in which no man is rejected, because all the other agents. Of course, if some agent is
then every man is either engaged to some woman unmatched, then there will be two unmatched
or has been rejected by every woman on his list of agents and they will destabilize the matching. If
acceptable women. Women who did not receive every one is matched, the agent who is matched to
any acceptable proposals, and men rejected by all d will form a blocking pair with the agent who lists
women acceptable to them remain single. him at the head of his list. Therefore, there is no
To see that the matching yielded by this algo- stable matching in this example.
rithm is stable, first observe that no agent has an A small amount of literature has grown around
unacceptable partner. In addition, if there is some the issues of finding conditions in which the set of
man f and woman w not matched to each other and stable matchings is nonempty for the roommate
such that f prefers w to his current partner, then problem and the performance of algorithms that
woman w is acceptable to man f and so he must can produce them when they exist (see Abeledo
have proposed to her at some step of the algo- and Isaak (1991), Irving (1985), Tan (1991),
rithm. But then he must have been rejected by w in Chung (2000), and Sotomayor (2005)).
favor of someone she prefers to f. Therefore, w is If preferences are not substitutable, Example 2.7
matched to a man whom she prefers to f by the of Roth and Sotomayor (1990) shows that setwise-
transitiveness of the preferences and so f and w do stable matchings may not exist in the many-to-
not destabilize the matching. one case.
For the college admission model with respon- Even in the simplest case of preferences repre-
sive preferences, Gale and Shapley defined a sentable by additively separable utility functions,
related marriage model in which each college is setwise-stable matchings may not exist for the
replicated a number of times equal to its quota, so many-to-many case. See the example below.
that in the related model, every agent has a quota
of one. If f1,. . .,fr(f) are the r(f) copies of college f, Example 4 (Sotomayor 1999b) (Nonexistence
then each of these fi ’s has preferences over indi- of stable matchings) Consider again the
viduals that are identical with those of f. Each matching model of Example 2. We are going to
student’s preference list is changed by replacing show that the set of setwise-stable matchings is
f, wherever it appears on his/her list, by the string empty. First, observe that f3 prefers {w1,w2} to any
f1,. . .,fr(f) in that order of preference. Therefore, other set of players and f3 is the second choice for
the stable matchings of the related marriage mar- w1 and w2; w3 prefers {f1,f2} to any other set of
ket are in natural one-to-one correspondence with players and w3 is the second choice for f1 and f2.
the stable matchings of the college admission Then, in any stable matching m, f3 must be
market. By using the existence theorem for the matched to w1 and w2, while w3 must be matched
marriage model, we obtain the corresponding to f1 and f2. Separate the cases by considering the
result for the college admission model. possibilities for the second partner of w1, under a
The existence proof for the many-to-one case supposed stable matching m:
with strict and substitutable preferences was first
given by Kelso and Crawford (1982) through a
variant of the deferred-acceptance algorithm. 1. (w1 is matched to {f2,f3}). Then f2 is not
If the market does not have two sides or the matched to w4 and we have that {f2,w4} causes
many-to-one matching model does not have sub- an instability in the matching, since f2 prefers
stitutable preferences, then the set of setwise-stable w4 to w1 and f2 is the second choice for q4.
382 Two-Sided Matching Models
2. (w1 is matched to {f3,f4}). Then the following strongly substitutable preferences, while the
possibilities occur: other side has substitutable preferences, then the
(a) (w2 is matched to {f3,f4}.) Then {f1,f2,w1, set of setwise stable matchings coincides with the
w2} causes an instability in the matching. set of pairwise-stable matchings.
This matching is pairwise stable, but it is Konishi and Ünver (2006) give conditions on
not in the core. the preferences of the agents in a many-to-many
(b) (w2 is matched to {f3,f1}.) Then {f1,w4} causes matching market under which a pairwise-stable
an instability in the matching, since f1 is the matching cannot be quasi-dominated by a
first choice for w4 and f1 prefers w4 to w2. pairwise-unstable matching via a collation.
(c) (w2 is matched to {f3,f2} or {f3}.) Then {f4, Eeckhout (2000), under the assumption of
w2} causes an instability in both cases, strict preferences and that every man (woman) is
since w2 is the second choice for f4 and acceptable to every woman (man), presents a suf-
w2 prefers f4 to f2 and prefers f4 to have an ficient condition for uniqueness of the stable
unfilled position. matchings in the marriage market. The condition
3. (w1 is matched to {f1,f3} or {f3}.) Then {f4,w1} on preferences is simple: for every f i F ¼
causes an instability in both cases, since w1 is f f 1 , f 2 , . . . , f m g, wi > f i wk for all k > i and
the first choice for f4 and w1 prefers f4 to f1 and for every w j W ¼ fw1 , w2 , . . . , wn g, f i >wj f k
prefers f4 to have an unfilled position. for all k > i.
One line of investigation that has been developed
Hence, there are no stable matchings in this in the theory of two-sided matchings concerns the
example. mathematical structure of the set of stable matchings,
Pairwise-stable matchings always exist when the because it captures fundamental differences and sim-
preferences are substitutable. When preferences are ilarities between the several kinds of models. For the
strict, Roth (1984b) presents an algorithm that finds marriage model and the college admission model
a pairwise-stable matching for a many-to-many with responsive preferences, assuming that the pref-
matching model with substitutable preferences. erences over individuals are strict, the set of setwise-
Sotomayor (1999b) provides a simple and non- stable (stable for short) matchings have the follow-
constructive proof of the existence of pairwise- ing characteristic properties:
stable matchings for the general discrete many-to-
many matching model with substitutable and not 1. Let m and m0 be stable matchings. Then m
necessarily strict preferences. Martínez et al. Fm0 if and only if m0 Wm.
(2004) construct an algorithm, which allows finding That is, there exists an opposition of inter-
the whole set of pairwise-stable matchings, when ests between the two sides of the market along
they exist, for the many-to-one matching model. the whole set of stable matchings.
Authors have looked for sufficient conditions on 2. Every agent is matched to the same number of
the preferences of the agents for the existence of mates under every stable matching.
setwise-stable matchings in the many-to-many Consequently, if an agent is unmatched
cases. Sotomayor (2004b) proves that if the prefer- under some stable matching, then he/she/it is
ences of the firms satisfy the maximin property, then unmatched under any other stable matching.
the set of pairwise-stable matchings coincides with When preferences are strict, there are two
the set of setwise-stable matchings. An example in natural partial orders on the set of all stable
that paper shows that the set of setwise stable matchings. The partial order F is defined as
matchings may be empty if this condition is not follows: m Fm0 if mð f Þ f m0 ð f Þ for all
satisfied. It is assumed there that the preferences are f F. The partial order W is analogously
responsive and it is conjectured that the result above defined. The fact that these partial orders are
extends to the case of substitutable preferences. well defined follows from A1.
Echenique and Oviedo (2006) also address this 3. The set of stable matchings has the algebraic
problem with a different condition. They show structure of a complete lattice under the partial
that if agents on one side of the market have orders F and W.
Two-Sided Matching Models 383
The lattice property means the following: If m Property A2 was proved by Gale and
and m0 are two stable matchings, then some Sotomayor (1983) for both models. For the college
workers (respectively, firms) will get a preferable admission model with responsive preferences,
set of mates under m than under m0 and others will Roth (Roth 1986) added that if a college does not
be better off under m0 than under m. The lattice fill its quota at some stable matching, then it has
property implies that there is then a stable the same set of mates at every stable matching. The
matching which gives each agent the most prefer- restriction of property A2 to the marriage model
able of the two sets of partners and also one which where every pair of partners is mutually acceptable
gives each of them the least preferred set of part- was proved by McVitie and Wilson (1970).
ners. That is, if m and m0 are stable matchings, the For the many-to-one case with substitutable pref-
lattice property implies that the functions l, u, Z, erences, Martínez et al. (2001) presents an example
and t are stable matchings, where l ¼ m_Fm0 is in which there are agents who are unmatched under
defined by l(f ) ¼ max{m(f ),m0(f )} and l(w) ¼ some stable matching and are matched under another
min{m(w),m0(w)}, the function Z ¼ m_Wm0 is anal- one. By introducing quotas in the model with substi-
ogously defined, and the function u ¼ m^Fm0 is tutable preferences of Roth and Sotomayor (1990),
defined by u( f ) ¼ min{m( f ),m0(f)} and u(w) ¼ these authors prove that if the preferences of the
max{m(w),m0(w)}. Analogously we define colleges are strict, substitutable, and r(f) separable
t ¼ m^Wm0 (notice that m_Fm0 is the same as for every college f, then property A2 holds. Further-
m^Wm0 and m_Wm0 is the same as m^Fm0). more Roth’s result mentioned above also applies.
The fact that the lattice is complete implies the The lattice property of the set of stable
existence and uniqueness of a maximal element matchings for the marriage model is attributed
and a minimal element in the set of stable payoffs, by Knuth (1976) to Conway. The existence of
with respect to the partial order that is being the optimal stable matchings for each side of the
considered. Thus, there exists one and only one marriage market and the college admission market
stable matching mF and one and only one stable with responsive preferences was first proved in
matching mW such that mF Fm and mW Wm Gale and Shapley (1962) by using the deferred
for all stable matchings m. Property A1 then acceptance procedure. The idea of their elegant
implies that m WmF and m FmW. That is: proof is to show that a proposer is never rejected
by an achievable mate, so he/she/it ends up with
1. There is an F-optimal stable matching mF with his/hers/its best achievable mate.
the property that for any stable matching The lattice property for the college admission
m, mF Fm and m WmF; there is a W-opti- model with responsive preferences was obtained
mal stable matching m W with symmetrical in Roth and Sotomayor (1990). To show that the
properties. functions l, u, Z, and t above are well defined,
these authors used the following theorem from
Property A1 was first proved by Knuth (1976) Roth and Sotomayor (1989): If colleges and stu-
for the marriage model. The result for the college dents have strict preferences over individuals,
admission model with responsive preferences was then colleges have strict preferences over those
proved in Roth and Sotomayor (1990) by making groups of students that they may be assigned at
use of the following proposition of Roth and stable matchings. That is, if m1 and m2 are stable
Sotomayor (1989): Suppose colleges and students matchings, then a college f is indifferent between
have strict individual preferences, and let m1 and m1(f ) and m2(f ) only if m1( f ) ¼ m2( f ).
m2 be stable matchings for the college admission This result is an immediate consequence of the
model such that m1(f) 6¼ m2(f). Let m1* and m2* be proposition mentioned above, due to the responsive-
the stable matchings corresponding to m1 and m2 ness of the preferences. Therefore, if m1 and m2 are
in the related marriage model. If m1 * (fi)> f m two stable matchings, then f prefers m1(f) to m2(f) if
2 *
and only if the r(f) most preferred students by f in the
(fi) for some position fi of f then m1 f j
set formed by the union of m1(f) and m2(f) are those
f m2 f j for all positions fj of f. ones in m1(f).
384 Two-Sided Matching Models
For the many-to-many matching market with T-map is equal to the set of stable matchings. By
strict and substitutable preferences, Blair (1988) making successive iterations of the T-map,
proved that the set of pairwise-stable matchings starting from some specific pre-matching, until a
(not necessarily setwise stable) has the lattice struc- fixed point is reached, this map can be used to find
ture under some partial order relation that is not all the stable matchings, as long as they exist. This
defined by the preferences of the agents. The def- procedure is called the T algorithm. These authors
inition of the partial order F uses that if m1 and m2 show that as long as the strong core is nonempty,
are pairwise-stable matchings, then m1 f m2 if the T algorithm always converges, and if the
and only if Chf(m1(f)[m2(f)) ¼ m1(f), for every strong core is empty, it cycles. They present an
agent f F. Similarly the partial order W is example of a situation in which the preferences
defined. As remarked above, the partial order are not substitutable and the T algorithm finds
defined by Blair coincides with the one defined strong core allocations, but the algorithm with
by the preferences of the players in the college firms proposing according to their preference
admission model with responsive preferences. lists over allowable sets of workers does not do
Adachi (2000) introduces a map, which is called so. Finally, they give a bound on the computa-
a T-map, defined over the set of pre-matchings, in tional complexity of the T algorithm and show
order to show that the set of stable matchings is a how it can be used to calculate both the supremum
nonempty lattice in the marriage market under and infimum under Blair’s partial order, which,
strict preferences. Adachi defines the T-map as under non-substitutable preferences, might not be
follows: given a pre-matching m,T(m(f)) is f’s easily computed. Furthermore, under substitut-
most preferred worker in fw Wjf wmðwÞg [ ability, the set of pre-matchings endowed with
ð f Þ for all f F, and similarly, T(m(w)) is w’s most the partial order defined by Blair is again a com-
preferred firm in f f Fjw f mð f Þg [ ðwÞ for plete lattice and the T-map is isotone, so Tarski’s
all w W. Clearly, any fixed point of the T-map is theorem implies that the strong core is a nonempty
a matching, and Adachi showed it has to be stable. lattice under the partial order introduced in
Using the partial order F defined by the agents’ Blair (1988).
preferences, he showed that the set of pre- For the many-to-many model, the set of fixed-
matchings endowed with this partial order is a points of the T-map studied by Echenique and
complete lattice and the T-map is an isotone func- Oviedo (2006) is shown to be, under substitutabil-
tion (order preserving). Thus, Tarski’s fixed point ity, equal to the set of pairwise-stable matchings
theorem implies that the set of fixed points of the and a superset of the set of setwise-stable
T-map, which is the set of stable matchings, is a matchings. Furthermore, the set of pre-matchings
nonempty complete lattice. endowed with Blair’s partial order is again a com-
plete lattice. Then Tarski’s theorem applies and
Theorem 1 (Tarski’s Theorem (Tarski 1955)) the set of pairwise-stable matchings is a nonempty
Let E be a complete lattice with respect to some complete lattice. If both sides of the market satisfy
partial order , and let f be an isotone function the strong substitutability property, then the set of
from E to E. Then the set of fixed points of f is setwise stable matchings is a complete lattice,
nonempty and is itself a complete lattice with both for Blair’s partial order and for the partial
respect to the partial order . order defined by the agents’ preferences.
Martínez et al. (2004) propose an algorithm
In this same vein, Echenique and Oviedo which allows them to calculate the whole set of
(2004, 2006) extend Adachi’s (2000) approach pairwise-stable matchings under substitutability.
and the T-map in order to analyze the many-to- Echenique and Yenmez (2007) study the col-
one and many-to-many models, respectively. lege admission problem when students have pref-
Again, any fixed point of the T-map is a matching. erences over colleagues. Using the T-map, they
Echenique and Oviedo (2004) show that for the construct an algorithm, which finds all the core
many-to-one model, the set of fixed points of the allocations, as long as the core is nonempty. In a
Two-Sided Matching Models 385
similar setup, Pycia (2007) finds necessary and Ostrovsky (2008) generalizes the model pre-
sufficient conditions for the existence of stable sented in Hatfield and Milgrom (2005) to a
matchings. Dutta and Massó (1997) studied a K-sided discrete many-to-many matching model.
many-to-one version of the model of Echenique A set of contracts, which allows the production
and Yenmez (2007). They showed that under cer- and consumption of some goods, is called a net-
tain conditions on preferences, the core is work. This author considers supply networks
nonempty. where goods are sold and transformed through
Hatfield and Milgrom (2005) present a general many stages, starting from the suppliers of initial
many-to-one model in which, the central concept inputs, going through different intermediaries
is that of a contract, which allows a different until they reach the final consumer. He generalizes
formalization of a matching. In their model, the concept of pairwise stability to this setting and
there are a finite set of firms, a finite set of workers calls it chain stability, which requires that there
and a finite set of wages offered by firms. Each does not exist a chain of contracts such that all
contract c specifies the pair (f,w) involved and the members of this chain are better off. A chain of
wage the worker w gets from firm f, so that the set contracts specifies a sequence of agents, each of
of contracts is C ¼ F W WP (where WP is the whom is the seller in one contract and the buyer in
set of wages offered by the firms). Clearly, if each the next one. Under certain conditions on the
firm offers a unique wage level to all workers, preferences of the agents, he proves the existence
their model is a college admission model. Agents of chain stable networks and, by using fixed point
have preferences over the contracts in which they methods, he shows that the set of chain stable
could be involved. In this model, a feasible allo- networks is a nonempty lattice. Furthermore, he
cation is a set of contracts C0 C in which each proves that there exists a consumer optimal net-
worker w appears at most in one contract c C0 work and an initial supplier optimal network, sim-
and each firm f appears in at most r(f) contracts ilar to the F-optimal and W-optimal matchings in
c1,. . .,cr(f) C0 and such that for each agent other models. Finally, for the case in which each
a F[W, we have that Cha(Ca) ¼ Ca, where C a agent can be the seller (buyer) in at most one
is the set of contracts in C0 in which agent a contract, he shows that the set of chain stable
appears. According to Hatfield and Milgrom, a networks is equal to the core.
feasible allocation C0 is stable if there does not Crawford (2008) proposes to allow offers in
exist an alternative feasible allocation which is the NRMP mechanism to include salaries and
strictly preferred by some firm f and weakly pre- demonstrates how the resulting market can gener-
ferred by all of the workers it hires. Making use of ate stable outcomes, which might Pareto dominate
Tarski’s fixed point theorem, they prove that if the ones in the current form of the NRMP. This
preferences are strict and satisfy substitutability model can be seen as an application of the Hatfield
over the set of contracts, then the set of stable and Milgrom (2005) paper.
allocations is a nonempty lattice. They introduce Another line of investigation that has grown in
the condition of the law of aggregate demand on the last decade concerns the special case of the
preferences, which requires
that forall allowable
college admission model in which colleges have
sets X,Y, if X Y, then Ch f ðY Þ Ch f ðXÞ. By
fixed preferences, known nowadays as the school
assuming that firms’ preferences satisfy substitut- choice model. The seminal paper is Sotomayor
ability and the law of aggregate demand, they (1996b) which was motivated by the admission
prove that some characteristic results on the struc- market of economists to graduate centers of eco-
ture of the set of stable allocations and analyze the nomics in Brazil. The students take some tests and
incentives facing the agents when a mechanism each institution places weights on each of these
which produces the W-optimal stable allocation is tests in order to rank the students according to the
adopted. They show that under this mechanism, it weighted average of the tests. See Ergin and
is a dominant strategy for the workers to state their Sönmez (2006), Pathak and Sönmez (2006), and
true preferences. Balinski and Sönmez (1999).
386 Two-Sided Matching Models
The main properties that characterize the set of ðu , wÞ Qðu, wÞ for all stable payoffs (u,w).
stable outcomes of the assignment game of That is:
Shapley and Shubik are the following:
1. There is a B-optimal stable payoff ðu , w Þ
1. Let (u,w) be some stable payoff. Then m is an
with the property that for any stable payoff
optimal matching if and only if it is compatible
ðu, wÞ, u u and w w; there is a Q-opti-
with (u,w).
mal stable payoff ðu , wÞ with symmetrical
properties.
This result means that the set of stable payoffs
2. The set of stable payoffs equals the core and
is the same under every optimal matching. Then
the set of competitive equilibrium payoffs.
we can concentrate on the payoffs of the agents
rather than on the underlying matching.
Excluding property B3 which follows from
property 1 of Demange and Gale (1985), all the
1. Let (u,w) and (u0,w0) be stable payoffs. Then
other properties were first proved in Shapley and
u u0 if and only if w0 w.
Shubik (1972).
The general quota case is a version of the
That is, there exists an opposition of interests
model studied in Crawford and Knoer (1981)
between the two sides of the market along the
and was first presented in Sotomayor (1992) in
whole set of stable payoffs.
the context of a labor market of firms and workers.
Under this approach, the number r(b) is the max-
1. If an agent is unmatched under some stable
imum number of workers firm b can hire, the
outcome, then he/she gets a zero payoff under
number s(q) is the maximum number of jobs
any other stable outcome.
worker q can take, and the number v bq is the
productivity of worker q in firm b. The natural
This means, for example, that if a worker is cooperative solution concept is that of setwise
unemployed under some stable outcome, then this stability which is shown to be equivalent to the
worker will get a zero salary under any other concept of pairwise stability. Then, the feasible
stable outcome. outcome (u,w;m) is setwise stable, if ub ð min Þ þ
wq ð min Þ vbq for all pairs (b,q) with q= 2m(b),
1. The set of stable payoffs forms a convex and where ub(min) is the smallest individual payoff of
compact lattice under the partial orders B firm b and wq(min) is the smallest individual pay-
and Q. off of worker q.
The existence of setwise-stable payoffs for this
The partial order B on the set of stable model was proved in Sotomayor (1992, 1999a)
payoffs is defined as follows: ðu, wÞ Bðu0 , w0 Þ through the use of linear programming.
if ub u0b for all b B. Property B2 implies that Another interpretation of this model considers
wq w0q for all q Q, so this partial order is well a buyer-seller market: B is a set of buyers and Q is
defined. The partial order Q is symmetrically a set of sellers. Buyers are interested in sets of
defined. Then, (u,w)_B(u0,w0) ¼ (max{u,u0},min objects owned by different sellers and each seller
{w,w0}) and (u,w)^B(u0,w0) ¼ (min{u,u0},max{w, owns a set of identical objects. The number r(b) is
w0}). the number of objects buyer b is allowed to
The lattice property implies that there exist a acquire, the number s(q) is the number of identical
maximal element and a minimal element in the set objects seller q owns, and the number v bq is the
of stable payoffs. The fact that the lattice is com- amount of money buyer b considers to pay for an
plete implies the uniqueness of these extreme object of seller q. We say that v bq is the value of
points. Thus, there exists one and only one stable object q (object owned by seller q) to buyer b. An
payoff ðu , w Þ and one and only one stable artificial null object, 0, owned by the dummy
payoff ðu wÞ such that ðu , w Þ Bðu, wÞ and seller, whose value is zero to all buyers and
388 Two-Sided Matching Models
whose price is always zero, is introduced for tech- (1982) formulated a discrete and a continuous
nical convenience. Under this approach, a buyer many-to-one matching model where the functions
will be assigned to an allowable set of objects at a v({b}[A) satisfy the gross substitute condition
feasible allocation, meaning that he/she is and are not necessarily additively separable. In
matched to the set of sellers who own the objects this model, the core, the set of setwise-stable
in the given set. X payoffs, and the set of competitive equilibrium
Given a price vector p R + s, with s sðqÞ, payoffs coincide and are nonempty. These authors
qQ prove, through an example, that without this con-
the preferences of buyers over objects are dition, the core may be empty.
completely described by the numbers vbq’s: For A consequence of the competitive equilibrium
any two allowable sets of objects S and S0, buyer concept for the many-to-many case with addi-
b prefers S to S0 at prices p if his/her total payoff tively separable utility functions is that sellers do
when he/she buys S is greater than his/her total not discriminate buyers under a competitive equi-
payoff when he/she buys S0. He/she is indifferent librium payoff, as they might do under a stable
between these two sets if he/she gets the same total outcome. The competitive equilibrium payoffs for
payoff with both sets. Usually, given the prices of this model are characterized as the setwise-stable
the objects, buyers demand their favorite allowable payoffs where every seller has identical individual
sets of objects at those prices. The set of such payoffs. It is interesting to point out that if the
allowable sets is called the demand set of buyer identical objects are owned by different sellers,
b at prices p. An equilibrium is reached if every they need not be sold at the same price unless the
buyer is assigned to an allowable set of objects of two sellers have the same number of objects and
her demand set, every seller with a positive price the selling price is the minimum competitive equi-
sells all of his items, and the number of objects in librium price (Sotomayor 2007a).
the market is enough to meet the demand of all A stable (respectively, competitive equilib-
buyers. The solution concept that captures this rium) payoff is called a B-optimal stable
intuitive idea of equilibrium is that of competitive (respectively, competitive equilibrium) payoff if
equilibrium payoff defined in Sotomayor (2007a) every agent in B weakly prefers it to any other
as an extension of the concept of competitive equi- stable (respectively, competitive equilibrium)
librium price for the assignment game given in payoff. That is, the B-optimal stable
Demange et al. (1986). Formally, (u,p;m * ) is a (respectively, competitive equilibrium) payoff
competitive equilibrium outcome if (i) it is feasi- gives to each agent in B the maximum total payoff
ble; (ii) m * is a feasible allocation such that, if m * among all stable (respectively, competitive equi-
(b) ¼ S, then S is in the demand set of b at prices librium) payoffs. Similarly we define a Q-optimal
p for all b B; and (iii) pq ¼ 0 if object q is left stable (respectively, competitive equilibrium)
unsold. payoff.
If (u,p;m* ) is a competitive equilibrium out- The existence and uniqueness of the B-optimal
come, we say that (u,p) is a competitive equilib- and of the Q-optimal stable payoffs are proved in
rium payoff, (p,m*) is a competitive equilibrium Sotomayor (1999a) by showing that the set of
and p is a competitive equilibrium price or an stable payoffs is a lattice under two conveniently
equilibrium price for short. defined partial orders. This result runs into the
One characteristic of the additively separable difficulties of defining a partial order relation in
utility function is that if a buyer demands a set A of the set of stable payoffs, due to the fact that, on the
objects at prices p and some of these objects have one hand, the arrays of individual payoffs are
their prices raised, then the buyer will continue to unordered sets of numbers indexed according to
want to buy the objects in A whose prices were not the current matching and, on the other hand, the
changed. That is, the function v({b}[A) over all agents’ preferences do not define a partial order
allowable sets A of partners for b satisfies the relation, since they violate the antisymmetric
gross substitute condition. Kelso and Crawford property.
Two-Sided Matching Models 389
To solve this problem, Sotomayor (1999a) is greater than b’s total payoff under the second
defines a partnership (b,q) to be nonessential if it outcome if and only if q’s total payoff under the
occurs in some but not all optimal matchings and second outcome is greater than q’s total payoff
essential if it occurs in all optimal matchings. under the first outcome. From property B3, if a
Then, two matchings differ only by their nones- seller has some unsold object under a stable out-
sential partnerships. According to Theorem 1 of come, then one of his/her individual payoffs will be
that paper, (i) in every stable outcome, a player zero under any other such outcome.
gets the same payoff in any nonessential partner- Even though the preferences of the players do
ship; furthermore this payoff is less than or equal not define the partial orders B and Q , the
to any other payoff the player gets under the same property stated in B4 is of interest because the two
outcome; (ii) given a stable outcome (u,w;m) and a extreme points of the lattice have an important
different optimal matching m0, we can reindex the meaning for the model. The extreme points of
u bq ’s and w bq ’s according to m0 and still get a the lattice are precisely the B-optimal and the
stable outcome. Q-optimal stable payoffs. Also every buyer
Therefore, the array of individual payoffs of a weakly prefers any stable payoff to the Q-optimal
player can be represented by a vector in a Euclid- stable payoff and any seller weakly prefers any
ean space whose dimension is the quota of the stable payoff to the B-optimal stable payoff.
given player. The first coordinates are the payoffs Indeed, the set of competitive equilibrium pay-
that the player gets from his essential partners offs is a sublattice of the set of stable payoffs. This
(if any), following some given ordering. The connection is given by the following theorem of
remaining coordinates (if any) are equal to a num- Sotomayor (2007a) that states that the set of com-
ber which represents the payoff the player gets petitive equilibrium payoffs is contained in the set
from all his nonessential partners. This represen- of stable payoffs and is a nonempty and complete
tation is clearly independent of the matching, so lattice under the partial order B (respectively,
any optimal matching is compatible with a stable Q) whose supremum (respectively, infimum) is
payoff. Hence, by ordering the players in B optimal and whose infimum (respectively,
B (respectively, Q), we can immerse the stable supremum) is Q optimal.
payoffs of these players in a Euclidean space, The idea of the proof is that the set of compet-
whose dimension is the sum of the quotas of all itive equilibrium payoffs can be obtained by
players in B (respectively, Q). Then, the natural “shrinking” the set of stable payoffs through the
partial order relation of this Euclidean space application of a convenient isotone (order preserv-
induces the partial order relation B ing) map f whose fixed points are exactly the
(respectively, Q ) in the set of stable payoffs. competitive equilibrium payoffs. The desired
We say that ðu, wÞ Bðu0 , w0 Þ if the vector of result is concluded via the algebraic fixed point
individual payoffs of any buyer, under (u,w), is theorem due to Alfred Tarski (1955).
greater than or equal to his/her vector of individ- It is also proved in Sotomayor (2007a) that the
ual payoffs under (u0,w0). Similarly we define B-optimal stable payoff is a fixed point of f, so the
ðu, wÞ Qðu0 , w0 Þ. B-optimal stable payoff is the B-optimal compet-
The main results of Sotomayor (1999a) are itive equilibrium payoff.
that, under the vectorial representation of the sta- As for property B6, Sotomayor (2003a) shows
ble payoffs, properties B1, B2, B3, B4, and B5 that the core coincides with the set of pairwise-stable
hold for the general many-to-many case. payoffs in the many-to-one case where sellers have a
An implication of property B2 is the conflict of quota of one. Since a seller only owns one object,
interests that exists between the two sides of the then he cannot discriminate the buyers, so the core
market with respect to two comparable stable pay- coincides with the set of competitive equilibrium
offs. That is, if payoffs (u,w) and (u0,w0) are stable payoffs in this model. Thus, the set of competitive
and comparable, then for all (b,q) B Q, we equilibrium payoffs is a lattice in this model. The
have that b’s total payoff under the first outcome same result is reached in Gül and Stacchetti (1999)
390 Two-Sided Matching Models
for the many-to-one case in which the utilities satisfy two, so it has one unfilled position. It happens that
the gross substitute condition. b0 can pay more than two to q. Thus, if agents can
However, in the general quota case under addi- communicate with each other and behave cooper-
tively separable utilities, the core may be bigger atively, this outcome will not occur, because
than the set of stable payoffs, which in its turn worker q will not accept to receive only two
contains and may contain properly the set of com- from firm b, since she knows that she can get
petitive equilibrium payoffs, as it is illustrated in more than two by working with firm b0. Hence,
the example below from Sotomayor (2007a). This this outcome cannot be a cooperative equilibrium.
example also shows that the core may not be a Observe that 2 ¼ ub00 + wbq < vb0q ¼ 3, so this
lattice and, the polarization of interests, observed outcome is not stable. On the other hand, it is in
in the sets of stable payoffs and of competitive the core. In fact, if there is a blocking coalition,
equilibrium payoffs, does not always carry over to then it must contain {b0,q}. These agents cannot
the core payoffs: The best core payoff for the increase their total payoffs by themselves; b0
buyers is not necessarily the worst core payoff needs to hire both workers. However {b0,q,q0}
for the sellers. does not block the outcome, because q0 is worse
off by taking only one job. Nevertheless, the coa-
Example 5 (Sotomayor 2007a) Consider the fol- lition of all agents does not block the outcome,
lowing situation. The B players will be called since b loses worker q, so it will be worse off.
firms and the Q players will be called workers. Now, consider the outcome (u0,w0;m), where
There are two firms, b and b0, and two workers ubq0 ¼ 0, ubq00 ¼ 1, u0b0 q0 ¼ 1, ub000 ¼ 0; wbq0 ¼ 3,
q and q0. Each firm may employ and wants to wbq00 ¼ 1, and w0b0 q0 ¼ 2. Firm b0 cannot offer more
employ both workers; worker q may take, at than three to worker q, so the structure of the out-
most, one job and worker q0 may work and come cannot be ruptured. Then, although both out-
wants to work for both firms. The first row of comes (u,w;m) and (u0,w0;m) are corewise stable,
matrix v is (3,2) and the second one is (3,3). only the second one can be expected to occur, so
only this outcome is a cooperative equilibrium. Our
There are two optimal matchings: m and m0, explanation for this is that only (u0,w0;m) is stable.
where m(b) ¼ {q,q0}, m(b0) ¼ {q0,0} and m0(b0) ¼ The connection between the core, the set of
{q,q0}, m0(b) ¼ {q0,0}. The core is described by the stable payoffs, and the set of competitive payoffs,
set of individual payoffs (u,w) whose total payoffs exhibited in this example, can be better under-
(U,W), satisfy the following system of inequal- stood via Fig. 1. Now, if the reader prefers, sets
ities: 0 U b 2, 0 U b0 3; W q þ W q0 3, B and Q are better interpreted as being the set of
W q0 W q 2, 1 W q 3. It is not hard to see buyers and the set of sellers, respectively.
that the outcome (u,w) is stable if and only if seller In Fig. 1, C(W) is the set of seller’s total pay-
q always gets payoff wq ¼ 3 and seller q0 gets offs, which can be derived from any core payoff.
individual payoffs wbq0 [0,2] and wb0q0 [0,3]; The segment OP0 is the set of the seller’s total
the individual payoffs of buyers b and b0 are payoffs which can be derived from any stable
given by (ubq ¼ 0, ubq0 ¼ 2 wbq0) and (ub0q0 ¼ payoff. That is, (Wq,Wq0) OP0 if and only if
3 wb0q0, ub0 ¼ 0), respectively. there is a stable outcome (u,w;m) such that Wq ¼
To see that corewise stability is not adequate to wbq and W q0 ¼ wbq0 þ wb0 q0 . The segment OP is
define the cooperative equilibrium for this market, the set of seller’s total payoffs, which can be
let (u,w;m) be such that ubq ¼ 1, ubq0 ¼ 1, ub0q0 ¼ 1, derived from any competitive equilibrium price.
ub00 ¼ 0; wbq ¼ 2, wbq0 ¼ 1, wb0q0 ¼ 2. That is, firm That is, (Wq,Wq0) OP if and only if there is a
b hires workers q and q0 obtains from each one of competitive equilibrium price p such that Wq ¼ pq
them a profit of one and pays two to q and one to and Wq0 ¼ pq0 + pq0.
q0; firm b0 hires worker q0 at a salary of two and We can see that C(W) is bigger than OP0 which,
obtains a profit of one. Observe that b0 has quota of in its turn, is bigger than OP. The point (2,3)
Two-Sided Matching Models 391
game to a many-to-many matching model in the Therefore, the concepts of setwise stability and
context of firms and workers in which the quotas corewise stability are equivalent.
of the agents are not the number of partnerships This model is motivated by the fact that, in
they are allowed to form. Instead, they are given practice, a wide range of real-world matching mar-
by the units of labor time they can supply or kets are neither completely discrete nor completely
employ. Bikhchandani and Mamer (1997) analyze continuous. In the United States, for example, new
the existence of market clearing prices in an law school graduates may enter the market for
exchange economy in which agents have associate positions in private law firms, which
interdependent values over several indivisible negotiate salaries, or they may seek employment
objects. Although an agent can be both a buyer as law clerks to federal circuit court judges, which
and a seller, such an exchange economy can be are civil service positions with predetermined fixed
transformed into a many-to-one matching market salaries. In the market for academic positions and
where each seller owns only one object and professors, for example, the American universities
buyers want to buy a bundle of objects, and can compete with each other in terms of salaries, while
be viewed as an extension of the assignment the French public universities offer a preset and
game. See also Demange (1982), Leonard fixed salary. In Brazil, new professors may enter
(1983), Perez-Castrillo and Sotomayor (2002), the market for permanent positions (with preset
Sotomayor (2003c), Thompson (1980) and and fixed salaries) in federal universities, or they
Kaneko (1982). may seek employment in private universities,
which do not offer such positions but compensate
the entrants with better and negotiable salaries.
Eriksson and Karlander (2000) present an
Hybrid One-to-One Matching Model algorithm to find a stable outcome under the
assumption that the numbers apq, bpq, and cpq are
The hybrid one-to-one matching model is the name integer numbers of some unit. A nonconstructive
given in the literature to a unified model due to proof of the existence result without imposing any
Eriksson and Karlander (2000) and inspired in the restriction is provided in Sotomayor (2000a). In
unification proposed in Roth and Sotomayor this paper, it is proved that, under the assumption
(1996). Agents from the marriage market and the that the core, C, is equal to the strong core, C*, the
assignment market are put together so that they can main properties that characterize the stable pay-
trade with each other in the same market. We can offs of the marriage and of the assignment models
interpret the hybrid model as being a labor market carry over to the hybrid model. That is, for the
of firms and workers: P is the set of firms and Q is hybrid model:
the set of workers. There are two classes of agents
in each set: rigid agents and flexible agents. For 1. Let (u,w) and (u0,w0) be stable payoffs for the
each pair (p,q) P Q, there is a number cpq hybrid model. If C ¼ C*, then u u0 if and
representing the productivity of the pair. If a firm only if w0 w.
p hires worker q and both agents are flexible, then 2. If C ¼ C * and an agent is unmatched under
the number cpq is allocated into salary vq for the some stable outcome then he/she gets a zero
worker and profit up for the firm as a result of a payoff under any other stable outcome.
negotiation process. If one of the agents is rigid, 3. If C ¼ C*, then the set of stable payoffs forms a
then the payoffs of the agents are preset and fixed complete lattice under the partial orders P
and are part of the job description. In this case, the and Q.
profit of p and the salary of q will be apq and bpq, 4. If C ¼ C* then there is a P-optimal stable
respectively. The definitions of feasible outcome, payoff ðu w Þ with the property that for any
corewise-stable outcomes, and setwise-stable out- stable payoff ðu, wÞ, u u and w w; there
comes are straightforward extensions from the is a Q-optimal stable payoff ðu , wÞ with
respective concepts for the non-hybrid models. symmetrical properties.
Two-Sided Matching Models 393
Sotomayor (2007b) studies the hybrid model quasi-optimal stable payoffs for workers caused by
without imposing the assumption that the core is the entrance of rigid firms into the assignment mar-
equal to the strong core. Instead she assumes that the ket or by the entrance of flexible firms into the
preferences of the rigid agents, as well as the pref- marriage model. The results of that paper can be
erences of the flexible agents over rigid agents, are summarized as follows: Whether agents are allo-
strict. It is shown there that the core of the hybrid cated according to a quasi-optimal stable payoff for
model has a nonstandard algebraic structure given firms or according to a quasi-optimal stable payoff
by the disjoint union of complete lattices endowed for workers, it will always be the case that if flexible
with the properties above. The extreme points of the firms enter the rigid market, no rigid firm will be
lattices of the core partition are called quasi-optimal made better off and no worker will be made worse
stable payoffs for firms and quasi-optimal stable off; if rigid firms enter the flexible market, no flexible
payoffs for workers. When the workers are always firm will be made better off and no worker will be
flexible, then the marriage market is obtained when made worse off.
the flexible firms leave the hybrid market and the Comparative static results of adding agents
assignment game is obtained when the rigid firms from the same side to the marriage market or to
leave the hybrid market. the assignment market have been obtained in the
Each subset of the core partition of the hybrid literature under the assumption that the agents are
model is obtained as follows. For any matching m allocated according to the optimal stable outcome
which is compatible with some stable payoff, for firms or according to the optimal stable out-
decompose the market participants into two dis- come for workers. However, in the approach con-
joint subsets. One subset contains all rigid firms sidered in Sotomayor (2007b):
and their mates at m and the other one contains all
flexible firms, their mates at m, and the unmatched 1. The firms that are added are different from the
workers. Now, fix such a partition of the agents. firms, which are already in the market. For
The desired subset C(m) of the core partition is example, in the marriage market, where utility
formed with the core payoffs (u,w;m0) such that all is non-transferable, the comparative static adds
agents in the first set are matched among them- firms with flexible wages who can transfer
selves under m0 and all agents in the second set are utility.
matched among themselves under m0. 2. The points which are compared belong to cores
Clearly, as the rigid firms exit the hybrid mar- with quite distinct algebraic structures.
ket, the core partition for the corresponding 3. There may exist several quasi-optimal stable
assignment market is reduced to only one set, outcomes for firms and several quasi-optimal
since any stable matching is compatible with any stable outcomes for workers in the hybrid
core payoff by property B1. An analogous result model. Despite the multiplicity of these out-
holds as the flexible firms leave the hybrid market, comes, all of them reveal the same kind of
due to the fact that the matched agents in the comparative static effects.
marriage market are the same at every stable
matching, which is implied by property A2. Therefore, the result above has no parallel in
Therefore, as all flexible firms leave the hybrid the non-hybrid models.
market or as all rigid firms leave the hybrid mar- It is argued in Sotomayor (2007b) that if the
ket, the restriction of the algebraic structure to the resulting core partition is not reduced to only one
core of the resulting non-hybrid market is that of a set when, say, the flexible firms leave the hybrid
complete lattice. Then, the extreme points of the model, the comparative statics may be meaning-
resulting lattice are exactly the firm-optimal and less. This happens, for example, if we define a set
the worker-optimal stable payoffs. of the core partition as the set of all stable payoffs
This algebraic structure was used in Sotomayor compatible with some given matching. Then each
(2007b) to investigate the comparative effects on the lattice of the core partition for the marriage market
quasi-optimal stable payoffs for firms and on the has only one stable matching, which is both the
394 Two-Sided Matching Models
supremum and the infimum of the lattice. Of non-manipulability and, for the games induced
course, the distinctions between, say, the best by the mechanism, the existence of strategic equi-
stable payoff for workers of some lattice of the libria and the implementability of the set of stable
core partition of the hybrid market and an arbi- matchings via such equilibria. For the marriage
trary core point of the marriage model cannot be model with strict, the equilibrium analysis of a
attributed to the entrance of the flexible firms into game induced by a stable matching mechanism
the marriage market. leads to the following results:
Results of comparative statics were originally
obtained by Gale and Sotomayor (1983) for the 1. (Impossibility theorem) (Roth and Sotomayor,
marriage model and the college admission model: 1990) When any stable mechanism is applied
If agents from the same side of the market enter to a marriage market in which preferences are
the market, then no agent from this side is better strict and there is more than one stable
off and no agent of the opposite side is worse off, if matching, then at least one agent can profit-
any of the two optimal stable matchings prevails. ably misrepresent his/her preferences, assum-
A similar result was proved by Demange and Gale ing the others tell the truth (this agent can
(1985) for a continuous one-to-one matching misrepresent in such a way as to be matched
model that includes the assignment game. to his/her most preferred achievable mate
For the assignment game, Shapley (1962) under the true preferences at every stable
showed that the optimal stable payoff for an agent matching under the mis-stated preferences).
weakly decreases when another agent is added to 2. (Limits on successful manipulation) (Demange
the same side and weakly increases when another et al. 1987). Let P be the true preferences (not
agent is added to the other side. Still with regard to necessarily strict) of the agents, and let P0
the assignment game, Mo (1988) showed that if the differ from P in that some coalition C of men
incoming worker is allocated to some firm in some and women misstate their preferences. Then
stable outcome for the new market, there is a set of there is no matching m, stable for P0, which is
agents such that every firm is better off and every preferred to every stable matching under the
worker is worse off in the new market than in the true preferences P by all members of C.
previous one. A symmetric result holds when the
incoming agent is a firm. An analogous result is A corollary of this result is due to Dubins
demonstrated by Roth and Sotomayor (1990) for and Freedman (1981) which states that the man-
the marriage market. optimal stable matching mechanism is non-
For the many-to-one matching markets with sub- manipulable, individually and collectively, by
stitutable preferences, Kelso and Crawford (1982) the men:
showed that, within the context of (flexible) firms
and workers, the addition of one or more firms to the 1. (Gale and Sotomayor 1985) When all prefer-
market weakly improves the workers’ payoffs and ences are strict, let m be any stable matching
the addition of one or more workers weakly for (F,W,P) . Suppose each woman w in m(F)
improves the firms’ payoffs, under the firm-optimal chooses the strategy of listing only m(w) on her
stable allocation. Similar conclusions were obtained stated preference list of acceptable men (and
by Crawford (1991) for a many-to-many matching each man states his true preferences). This is a
model with strict and substitutable preferences, by Nash equilibrium in the game induced by the
comparing pairwise-stable outcomes instead of man-optimal stable matching mechanism (and
setwise-stable outcomes. m is the matching that results).
2. (Roth 1984a) Suppose each man chooses his
dominant strategy and states his true preferences
Incentives and the women choose any set of strategies
(preference lists) P0(w) that form a Nash equilib-
The strategic questions that emerge when a stable rium for the revelation game induced by the
revelation mechanism is adopted concerns its man-optimal stable mechanism. Then the
Two-Sided Matching Models 395
corresponding man-optimal stable matching for preferences, any stable matching mechanism imple-
(F,W,P0) is one of the stable matchings for (F,W,P). ments the student-optimal stable matching via
3. (Gale and Sotomayor Gale and Sotomayor strong equilibrium in the strong sense and Nash
1985) Suppose each man chooses his dominant equilibrium in the strong sense.
strategy and states his true preferences and the Ergin and Sönmez (2006) and Pathak and
women truncate their true preferences at the Sönmez (2006) analyze the Boston mechanism,
mate they get under the woman-optimal stable which is used to assign students to schools in
mechanism. This profile of preferences is a many cities in the United States and show that
strong equilibrium for the women in the game students’ parents do not have incentives to report
induced by the man-optimal stable mechanism preferences truthfully.
(and the woman-optimal stable matching Ma (2002) analyzes the strategic behavior of
under the true preferences is the matching both students and colleges in the college admis-
that results). sion model with responsive preferences. This
author proves that the set of stable matchings is
Results (3) and (4) imply that the man-optimal implemented by any stable mechanism via
stable mechanism implements the core correspon- rematching proof equilibrium and strong equilib-
dence via Nash equilibria. rium in truncation strategies at the match point.
For the college admission model with respon- The implementability of the set of stable
sive and strict preferences, the theorem of Dubins matchings through stable and non-necessarily sta-
and Freedman implies that the student-optimal ble mechanisms has also been investigated by sev-
stable mechanism is non-manipulable individu- eral authors. Alcalde (1996) presents a mechanism
ally and collectively by the students. Roth for the marriage market closely related to the algo-
(1985b) shows through an example that the col- rithm of Gale and Shapley, which implements the
lege-optimal stable mechanism is manipulable by core correspondence in undominated equilibria.
the colleges due to the fact that the colleges may Kara and Sönmez (1996) analyze the problem of
have a quota greater than one. implementation in the college admission market.
Sotomayor (1998, 2000, 2007c) analyzes the They show that the set of stable matchings is
strategic behavior of the students in a school choice implementable in Nash equilibrium. Nevertheless,
model where participants have strict preferences no subset of the core is Nash implementable.
over individuals. This paper proves that the col- Romero-Medina (1998) studies the mechanism
lege-optimal stable mechanism implements the set employed by the Spanish universities to distribute
of stable matchings via the Nash equilibrium con- the students to colleges, which can produce unsta-
cept. When some other stable mechanism is used, an ble matchings for the stated preferences. However,
example shows that the strategic behavior of the when students play in equilibrium, only stable
students may lead to unstable matchings under the allocations are reached. Sotomayor (2003b) inves-
true preferences. A sufficient condition for the sta- tigates a mechanism for the marriage model which
bility of the Nash equilibrium outcome is then pro- is not designed for producing stable matchings.
ved to be that the set of stable matchings for the Here also the equilibrium outcomes are stable
Nash equilibrium profile is a singleton. A random matchings under the true preferences.
stable matching mechanism is proposed and the For the discrete many-to-one matching model
Nash equilibrium concept ex ante is shown to be with responsive preferences, Alcalde and Romero-
equivalent to the Nash equilibrium concept ex post Medina (2000) analyze the following mechanism:
of the game induced by such a mechanism. This Firms announce a set of workers they want to hire.
refinement of the Nash equilibrium concept is called Then each worker selects the firm she wants to
Nash equilibrium in the strong sense. Under this work for. This paper proves that such a mechanism
equilibrium concept, any stable matching mecha- implements the set of stable matchings in subgame
nism (and in particular the random stable matching perfect Nash equilibrium. For the many-to-many
mechanism) implements the set of stable matchings. case, Sotomayor (2004b) shows that this result
Also, if the students only play truncations of the true does not carry over. This paper proves that
396 Two-Sided Matching Models
subgame perfect Nash equilibria always exist, while strong equilibria may not exist. The subgame perfect Nash equilibrium outcomes are precisely the pairwise-stable matchings, which may be out of the core when the preferences of the agents in one of the sides are not maximin. Under this condition, the equilibrium outcomes are the setwise-stable matchings and every subgame perfect Nash equilibrium is a strong equilibrium.

By assuming non-strict preferences, Abdulkadiroğlu et al. (2006) show that no mechanism (stable or not, and Pareto optimal or not) which is better for the students than the student-proposing deferred acceptance algorithm with tie breaking can be strategy proof.

Sönmez (1999) analyzes a model which he calls the generalized indivisible allocation problem and which includes the roommate and the marriage markets. He looks for conditions which explain the differences in the strategy-proofness results that have been obtained in the literature. He shows how some of the results in the literature can be seen as corollaries of his results.

Ehlers and Massó (2004) study Bayesian Nash equilibria of stable mechanisms (such as the NRMP) in matching markets under incomplete information. They show that truth-telling is an equilibrium of the Bayesian revelation game induced by a common belief and a stable mechanism if and only if all the profiles in the support of the common belief have singleton cores.

For the continuous matching models, the idea is to use competitive equilibrium as an allocation mechanism to produce outcomes with the desirable properties of fairness and efficiency. It involves having agents specify their supply and demand functions. The competitive equilibria are then calculated and allocations are made accordingly. Demange (1982) and Leonard (1983) considered the assignment game of Shapley and Shubik and, independently, proved that the allocation mechanism that yields the minimum competitive equilibrium prices is individually non-manipulable by the buyers. Demange and Gale (1985) consider a one-to-one matching model in which the utilities are continuous functions in the money variable and not necessarily additively separable. These authors prove a sort of non-manipulability theorem which states that, if the mechanism which produces the buyer-optimal stable payoff is adopted, then no coalition of buyers, by falsifying demands, can achieve, only through the mechanism, higher payoffs to all of its members. We added "only through the mechanism" because this model allows monetary transfers within any coalition. As in the marriage model, this result is an immediate consequence of a more general theorem due to Sotomayor (1986), which states the following: Let (u′, w′; m′) be any stable outcome for the market M′, where B′ ∪ Q′ is the set of agents who misrepresent their utility functions. Let (u*, w*) be the true payoff under (u′, w′; m′). Then there exists a stable payoff (u, w) for the original market such that u_b ≥ u*_b for at least one b in B′ or w_q ≥ w*_q for at least one q in Q′.

Demange and Gale (1985) also address the strategic behavior of the sellers when the mechanism produces the buyer-optimal stable payoff. These authors show that, by specifying their supply functions appropriately, the sellers can force, by strong Nash equilibrium strategies, the payoff to be given by the maximum rather than the minimum equilibrium price. Under the assumption that the sellers only manipulate their reservation prices, if a profile of strategies does not give the maximum equilibrium price allocation, then either some seller is using a dominated strategy or the strategy profile is not a Nash equilibrium. For this model, Sotomayor (1986) proves that the outcome produced by a Nash equilibrium strategy profile is stable for the original market.

Sotomayor (2004a) considers, for the assignment game of Shapley and Shubik, the strategic games induced by a class of market clearing price mechanisms. In these procedures, buyers and sellers reveal their demand and supply functions in different stages, and a competitive equilibrium is produced by the mechanism. For each vector of reservation prices selected by the sellers, the buyers play the subgame that then starts and can force the buyer-optimal stable payoff through Nash equilibrium strategies. However, sellers can reverse this outcome by forcing the subgame perfect equilibrium allocation to be the seller-optimal stable payoff for the original market.

Kamecke (1989) and Pérez-Castrillo and Sotomayor (2002) consider the assignment game of Shapley and Shubik. The former paper presents two mechanisms for this market. In the first one,
agents act simultaneously. In the second game, the strategies are chosen sequentially. These mechanisms implement the social choice correspondences that yield the core and the optimal stable payoff for the sellers, respectively. The second paper analyzes a sequential mechanism which implements the social choice correspondence that yields the optimal stable payoff for the sellers.

Future Directions

In this section, we present some directions for future investigations and some open problems which have intrigued matching theorists.
1. The discrete two-sided matching models with not necessarily strict preferences have been explored very little in the literature. In the discrete models under strict preferences and in the continuous models, due to the fact that there are no weak blocking pairs, the set of Pareto-stable outcomes coincides with the set of setwise-stable outcomes. However, under weak preferences, setwise-stable matchings may not be Pareto optimal. Sotomayor (2008) proposes that in this case the Pareto-stability concept, which requires that the matching be stable and Pareto optimal, should be considered the natural solution concept. The justification for the Pareto-stability concept relies on the argument that, in a decentralized setting where agents freely get together in groups, recontracts between pairs of agents already allocated according to a stable matching, leading to a (weak) Pareto improvement of the original matching, should be allowed. Thus, weak blockings can upset a matching once they come from the grand coalition. We think that the study of the discrete two-sided matching models with not necessarily strict preferences, and the search for algorithms to produce the Pareto-stable matchings, is a new and interesting line of investigation.
2. Consider the hybrid model where no worker is rigid. If rigid firms enter the flexible market or flexible firms enter the rigid market, then no firm gains and no worker loses if a quasi-optimal stable outcome for one of the sides always prevails. Suppose now that some rigid firm becomes flexible or some flexible firm becomes rigid. What kind of comparative static effect is caused by this change in the market?
3. One line of investigation not yet explored in the literature concerns the incentives faced by the agents in the hybrid model when some stable allocation mechanism is used.
4. Consider the discrete many-to-many matching market with substitutable preferences where a matching m is feasible if Ch_y(m(y)) = m(y) for every agent y. Is the core always nonempty for this model?
5. Consider the assignment game of Shapley and Shubik in the context of buyers and sellers. Consider a sealed-bid auction in which the buyers select a monetary value for each of the items. The auctioneer then chooses a competitive equilibrium price vector for the profile of selected values, according to some preset probability distribution. The investigation of the buyers' strategic behavior is of theoretical interest.
6. We know that the core of the many-to-many assignment model of Sotomayor (1992), in which the agents negotiate in blocks, is not a lattice. However, a problem that is still open is to know whether the optimal stable payoffs for each side of the market always exist.

Bibliography

Abdulkadiroglu A, Sönmez T (2003) School choice: a mechanism design approach. Am Econ Rev 93(3):729–747
Abdulkadiroglu A, Pathak P, Roth A, Sönmez T (2006) Changing the Boston school choice mechanism: strategy-proofness as equal access. Working paper. Boston College and Harvard University, Boston
Abeledo H, Isaak G (1991) A characterization of graphs which assure the existence of stable matchings. Math Soc Sci 22(1):93–96
Adachi H (2000) On a characterization of stable matchings. Econom Lett 68(1):43–49
Alcalde J (1996) Implementation of stable solutions to marriage problems. J Econom Theory 69(1):240–254
Alcalde J, Romero-Medina A (2000) Simple mechanisms to implement the core of college admissions problems. Games Econom Behav 31(2):294–302
Alcalde J, Pérez-Castrillo D, Romero-Medina A (1998) Hiring procedures to implement stable allocations. J Econom Theory 82(2):469–480
Balinski M, Sönmez T (1999) A tale of two mechanisms: student placement. J Econom Theory 84(1):73–94
Bardella F, Sotomayor M (2006) Redesign and analysis of an Gale D, Sotomayor M (1983, 1985) Some remarks on the
admission market to the graduate centers of economics in stable matching problem. Discret Appl Math 11:223–232
Brazil: a natural experiment in market organization, work- Gale D, Sotomayor M (1985) Ms. Machiavelli and the
ing paper. Universidade de São Paulo, São Paulo stable matching problem. Am Math Mon 92(4):261–268
Bikhchandani S, Mamer J (1997) Competitive equilibrium Gül F, Stacchetti E (1999) Walrasian equilibrium with
in an exchange economy with indivisibilities. gross substitutes. J Econom Theory 87(1):95–124
J Econom Theory 74(2):385–413 Gül F, Stacchetti E (2000) The english auction with differ-
Birkhoff G (1973) Lattice theory, vol v. 25. Colloquium entiated commodities. J Econom Theory 92(1):66–95
publications, American Mathematical Society, Gusfield D (1988) The structure of the stable roommate
Providence problem: efficient representation and enumeration of all
Blair C (1988) The lattice structure of the Set of stable stable assignments. SIAM J Comput 17:742–769
matchings with multiple partners. Math Oper Res Hatfield J, Milgrom P (2005) Matching with contracts. Am
13(4):619–628 Econ Rev 95(4):913–935
Chung K (2000) On the existence of stable roommate Irving R (1985) An efficient algorithm for the stable room-
matchings. Games Econom Behav 33(2):206–230 mates problem. J Algorithms 6:577–595
Crawford V (1991) Comparative statics in matching mar- Kamecke U (1989) Non-cooperative matching games. Int
kets. J Econom Theory 54(2):389–400 J Game Theor 18(4):423–431
Crawford V (2008) The flexible-salary match: a proposal to Kaneko M (1982) The central assignment game and the
increase the salary flexibility of the national resident assignment markets. J Math Econom 10(2–3):205–232
matching program. J Econom Behav Organ Kara T, Sönmez T (1996) Nash implementation of
66(2):149–160 matching rules. J Econom Theory 68(2):425–439
Crawford V, Knoer E (1981) Job matching with heteroge- Kara T, Sönmez T (1997) Implementation of college
neous firms and workers. Econometrica 49(2):437–450 admission rules. J Econom Theory 9(2):197–218
Demange G (1982) Strategyproofness in the assignment Kelso A Jr, Crawford V (1982) Job matching, coalition forma-
market game, working paper. Ecole Polytechnique, tion, and gross substitutes. Econometrica 50(6):1483–1504
Laboratoire D’Econometrie, Paris Kesten O (2004) Student placement to public schools in the
Demange G, Gale D (1985) The strategy structure of two- US: two new solutions, working paper. University of
sided matching markets. Econometrica 53(4):873–888 Rochester, Rochester
Demange G, Gale D, Sotomayor M (1986) Multi-item Kesten O (2006) On two competing mechanisms for
auctions. J Political Econom 94(4):863–872 priority-based allocation problems. J Econom Theory
Demange G, Gale D, Sotomayor M (1987) A further note 127(1):155–171
on the stable matching problem. Discrete Appl Math Knuth D (1976) Marriage stables. Les Presses de
16(3):217–222 l’Université de Montréal, Montréal
Dubins L, Freedman D (1981) Machiavelli and the Gale- Konishi H, Ünver MU (2006) Credible group stability in
Shapley algorithm. Am Math Mon 88(7):485–494 many-to-many matching problems. J Econom Theory
Dutta B, Massó J (1997) Stability of matchings when 127(1):57–80
individuals have preferences over colleagues. Kraft C, Pratt J, Seidenberg A (1959) Intuitive probability
J Econom Theory 75(2):464–475 on finite sets. Ann Math Stat 30(2):408–419
Echenique F, Oviedo J (2004) Core many-to-one Leonard H (1983) Elicitation of honest preferences for the
matchings by fixed-point methods. J Econom Theory assignment of individuals to positions. J Political Econ
115(2):358–376 91(3):461–479
Echenique F, Oviedo J (2006) A theory of stability in Ma J (2002) Stable matchings and the small core in Nash
many-to-many matching markets. Theor Econ equilibrium in the college admissions problem. Rev
1(2):233–273 Econ Des 7(2):117–134
Echenique F, Yenmez M (2007) A solution to matching Martínez R, Massó J, Neme A, Oviedo J (2001) On the
with preferences over colleagues. Games Econom lattice structure of the set of stable matchings for a
Behav 59(1):46–71 many-to-one model. Optimization 50(5):439–457
Eeckhout J (2000) On the uniqueness of stable marriage Martínez R, Massó J, Neme A, Oviedo J (2004) An algo-
matchings. Econom Lett 69(1):1–8 rithm to compute the full set of many-to-many stable
Ehlers L, Massó J (2004) Incomplete information and matchings. Math Soc Sci 47(2):187–210
small cores in matching markets, working paper. McVitie D, Wilson L (1970) Stable marriage assignment
CREA, Barcelona for unequal sets. BIT Numer Math 10(3):295–309
Ergin H, Sönmez T (2006) Games of school choice under the Mo J (1988) Entry and structures of interest groups in
Boston mechanism. J Public Econom 90(1–2):215–237 assignment games. J Econom Theory 46(1):66–96
Eriksson K, Karlander J (2000) Stable matching in a com- Ostrovsky M (2008) Stability in supply chain networks.
mon generalization of the marriage and assignment Am Econ Rev 98(3):897–923
models. Discrete Math 217(1):135–156 Pathak P, Sönmez T (2006) Leveling the playing field: sincere
Gale D, Shapley L (1962) College admissions and the and strategic players in the Boston mechanism, working
stability of marriage. Am Math Mon 69(1):9–15 paper. Boston College and Havard University, Boston
Pérez-Castrillo D, Sotomayor M (2002) A simple selling and Sotomayor M (1998) The strategy structure of the college
buying procedure. J Econom Theory 103(2):461–474 admissions stable mechanisms. In: Annals of Jornadas
Pycia M (2007) Many-to-one matching with complemen- Latino Americanas de Teoria Econômica, San Luis,
tarities and peer effects, working paper. Penn State Argentina, 2000; First World Congress of the Game
working paper, Pennsylvania Theory Society, Bilbao, 2000; 4th Spanish Meeting,
Rochford S (1984) Symmetrically pairwise-bargained allo- Valencia, 2000; International Symposium of Mathe-
cations in an assignment market. J Econom Theory matical Programming, Atlanta, 2000; World Congress
34(2):262–281 of the Econometric Society, Seattle, 2000, http://www.
Romero-Medina A (1998) Implementation of stable solu- econ.fea.usp.br/marilda/artigos/roommates_1.doc
tions in a restricted matching market. Rev Econ Des Sotomayor M (1999a) The lattice structure of the set of
3(2):137–147 stable outcomes of the multiple partners assignment
Roth A (1982) The economics of matching: stability and game. Int J Game Theory 28(4):567–583
incentives. Math Oper Res 7(4):617–628 Sotomayor M (1999b) Three remarks on the many-to-many
Roth A (1984a) Misrepresentation and stability in the stable matching problem. Math Soc Sci 38(1):55–70
marriage problem. J Econom Theory 34(2):383–387 Sotomayor M (2000a) Existence of stable outcomes and
Roth A (1984b) Stability and polarization of interests in job the lattice property for a unified matching market. Math
matching. Econometrica 52(1):47–58 Soc Sci 39(2):119–132
Roth A (1984c) The evolution of the labor market for Sotomayor M (2000) Reaching the core through college
medical interns and residents: a case study in game admissions stable mechanisms. In: Annals of abstracts
theory. J Political Econ 92(6):991–1016 of the following congresses: International Conference
Roth A (1985a) Conflict and coincidence of interest in job on Game Theory, 2001, Stony Brook; Annals of Bra-
matching: some new results and open questions. Math zilian Meeting of Econometrics, Salvador, Brazil,
Op Res 10(3):379–389 2001; Latin American Meeting of the Econometric
Roth A (1985b) The college admissions problem is not Society, Buenos Aires, Argentina, 2001, http://www.
equivalent to the marriage problem. J Econom Theory econ.fea.usp.br/marilda/artigos/reaching_%20core_ran
36(2):277–288 dom_stable_allocation_mechanisms.pdf
Roth A (1986) On the allocation of residents to rural Sotomayor M (2002) A simultaneous descending bid auc-
hospitals: a general property of two-sided matching tion for multiple items and unitary demand. Revista
markets. Econometrica 54(2):425–427 Brasileira Economia 56:497–510
Roth A, Sotomayor M (1988) Interior points in the core of Sotomayor M (2003a) A labor market with heterogeneous
two-sided matching markets. J Econom Theory firms and workers. Int J Game Theory 31(2):269–283
45(1):85–101 Sotomayor M (2003b) Reaching the core of the marriage
Roth A, Sotomayor M (1989) The college admissions market through a non-revelation matching mechanism.
problem revisited. Econometrica 57(3):559–570 Int J Game Theory 32(2):241–251
Roth A, Sotomayor M (1990) Two-sided matching: a study Sotomayor M (2003c) Some further remark on the core
in game-theoretic modeling and analysis. In: Econo- structure of the assignment game. Math Soc Sci
metric society monographs, vol 18. Cambridge Univer- 46:261–265
sity Press, New York Sotomayor M (2004a) Buying and selling strategies in the
Roth A, Sotomayor M (1996) Stable outcomes in discrete assignment game, working paper. Universidade São
and continuous models of two-sided matching: a uni- Paulo, São Paulo
fied treatment. Braz Rev Econom 16:1–4 Sotomayor M (2004b) Implementation in the many-to-
Shapley L (1962) Complements and substitutes in the opti- many matching market. Games Econom Behav
mal assignment problem. Navals Res Logist Q 9:45–48 46(1):199–212
Shapley L, Shubik M (1972) The assignment game I: the Sotomayor M (2005) The roommate problem revisited,
core. Int J Game Theory 1(1):111–130 working paper. Universidade São Paulo, São Paulo.
Sönmez T (1999) Strategy-proofness and essentially http://www.econ.fea.usp.br/marilda/artigos/THE_ROO
single-valued cores. Econometrica 67(3):677–689 MMATE_PROBLEM_REVISITED_2007.pdf
Sotomayor M (1986) On incentives in a two-sided Sotomayor M (2006) Adjusting prices in the many-to-
matching market, working paper. Depart of Mathemat- many assignment game to yield the smallest competi-
ics, PUC/RJ, Rio de Janeiro tive equilibrium price vector, working paper.
Sotomayor M (1992) The multiple partners game in equilib- Universidade de São Paulo, São Paulo. http://www.
rium and dynamics. In: Majumdar M (ed) Essays in hon- econ.fea.usp.br/marilda/artigos/A_dynam_stable-mech
our of David Gale. Macmillan, New York, pp 322–336 an_proof_lemma_1.pdf
Sotomayor M (1996a) A non-constructive elementary Sotomayor M (2007a) Connecting the cooperative and
proof of the existence of stable marriages. Games competitive structures of the multiple partners assign-
Econom Behav 13(1):135–137 ment game. J Econom Theory 134(1):155–174
Sotomayor M (1996b) Admission mechanisms of students Sotomayor M (2007b) Core structure and comparative
to colleges. A game-theoretic modeling and analysis. statics in a hybrid matching market. Games Econom
Braz Rev Econom 16(1):25–63 Behav 60(2):357–380
Sotomayor M (2007c) The stability of the equilibrium Tan J (1991) A necessary and sufficient condition for the
outcomes in the admission games induced by stable existence of a complete stable matching. J Algorithms
matching rules. Int J Game Theory 36(3–4):621–640 12(1):154–178
Sotomayor M (2008) The pareto-stability concept is a Tarski A (1955) A lattice-theoretical fixpoint theorem and
natural solution concept for the discrete matching mar- its applications. Pacific J Math 5(2):285–309
kets with indifferences, working paper. Universidade Thompson G (1980) Computing the core of a market game.
São Paulo, São Paulo. http://www.econ.fea.usp.br/ In: Fiacco AA, Kortane K (eds) Extremal methods and
marilda/artigos/ROLE_PLAYED_SIMPLE_OUTCOM systems analysis, vol 174. Springer, New York,
ES_STABLE_COALITI_3.pdf pp 312–324
Market Design

Fuhito Kojima, Fanqi Shi and Akhil Vohra
Department of Economics, Stanford University, Stanford, CA, USA

Article Outline

Introduction
Two-Sided Matching
One-Sided Matching
Applications
Conclusion
Bibliography

Introduction

more recently by Abdulkadiroğlu and Sönmez (2013), Roth (2008a, b), Sönmez and Ünver (2009), Pathak (2015), Kojima and Troyan (2011), and Kojima (2015), and many others. Given the rich set of existing surveys, in this article we try to balance between the basic models and recent applications. We also try to differentiate by choosing several specific topics which we regard as promising for further investigation.
The rest of this paper goes as follows. Section "Two-Sided Matching" presents the standard models of two-sided matching. In section "One-Sided Matching," we describe the models of one-sided matching. Section "Applications" discusses various applications. Section "Conclusion" concludes by discussing several future research directions.
depends only on the identity of her/its partners (if we ignore the differences in fellowship and other flexible terms). By comparison, labor contracts may vary much, with wage differences as a prominent example.1 As such, a worker/firm not only cares about who he/it is matched with but also about the contracting details. Because of this additional complication, the firm-worker assignment is naturally more involved than the college admissions problem.

1 Note, however, terms may not vary substantially in some labor markets, especially in standardized entry-level markets. In such a case, the matching model in section "Basic Two-Sided Matching Model" may be appropriate.

We will begin by presenting the basic Gale and Shapley (1962) two-sided matching model in section "Basic Two-Sided Matching Model," with the college admissions problem as the leading example. In section "Matching with Contracts," we discuss the matching with contracts model (Hatfield and Milgrom 2005), with the firm-worker assignment in mind.

Basic Two-Sided Matching Model
Adopting the language of Gale and Shapley (1962) and Roth (1985), we describe the basic model in terms of colleges and students. However, we note that it can be applied to any matching model where both sides of the market have preferences and the contracting details are standardized (e.g., medical residency matching).
There is a finite set I of students and a finite set C of colleges. A student can be matched to at most one college, while each college c has capacity qc, i.e., the college can be matched to at most qc students. Each student i has a strict preference ≻i over C ∪ {∅}, where ∅ denotes the outcome in which the student is unmatched, and each college has a strict preference ≻c over sets of students 2^I. For student i, we write c1 ⪰i c2 if and only if c1 ≻i c2 or c1 = c2. Similarly, for college c, we write J1 ⪰c J2 if and only if J1 ≻c J2 or J1 = J2. Note we implicitly assume that a student/college only cares about his/its own match and that indifferences do not occur.
Throughout the current section, we assume the preference of each college c is responsive, or the relative desirability of sets of students does not depend on the composition of the current assignment of college c. More formally, ≻c is responsive if:

1. For any J ⊆ I with |J| < qc and any i ∈ I \ J, (J ∪ i) ≻c J ⟺ i ≻c ∅.
2. For any J ⊆ I with |J| < qc and any i, j ∈ I \ J, (J ∪ i) ≻c (J ∪ j) ⟺ i ≻c j. (To simplify notation, we denote a singleton set {s} as s whenever there is no confusion.)

We write the set of all strict responsive preference profiles as:

R = {(≻l)l∈I∪C | ≻c is responsive, ∀c ∈ C}.

A matching is a function m : I → C ∪ {∅}. For each c ∈ C, we define m(c) = {i ∈ I | m(i) = c}. We say that a matching m is feasible if |m(c)| ≤ qc for all c ∈ C. Simply put, in a feasible matching, each college is matched with a set of students not exceeding its capacity. For the rest of the discussion, we only look at feasible matchings and will simply refer to them as matchings. Let M be the set of all (feasible) matchings.
As mentioned at the beginning of the section, our goal is to find a systematic procedure that can help match students with colleges in a fair and efficient manner. Before proceeding, we first make precise what we mean by "a systematic procedure" and "a fair and efficient manner." Formally, a (direct) mechanism is a function that produces a (random) matching (outcome) for each preference profile, or φ : R → ΔM. For fairness and efficiency, one possible criterion is stability. Formally, a matching m is individually rational if m(i) ⪰i ∅ for all i ∈ I and i ⪰c ∅ for all i ∈ m(c), c ∈ C. A matching m is blocked by a pair (i, c) ∈ I × C if:

1. c ≻i m(i).
2. |m(c)| < qc and i ≻c ∅, or |m(c)| = qc and i ≻c j for some j ∈ m(c).

We say a matching is stable if it is individually rational and not blocked by any pair; a mechanism is stable if it produces a stable matching for any preference profile in any realization. Intuitively, a
matching is individually rational if the assignment of each student and college is acceptable (at least as good as being unmatched). A matching is not blocked by any pair if, whenever a student prefers a college to his current assignment, either the student is not acceptable to the college or the college has reached its capacity and prefers every current student to the new student (note that we have implicitly made use of the assumption that preferences are responsive). A matching is stable if both conditions are satisfied. Stability is a desirable criterion for at least two reasons: to begin with, in an unstable mechanism, a student/college or a pair may want to deviate from the proposed outcome, which is problematic if we require voluntary participation. In addition, stability implies standard concepts of efficiency and fairness.
To see this, recall a matching m is (strongly) Pareto efficient if there is no other matching n such that n(l) ⪰l m(l) for all l ∈ I ∪ C and n(l) ≻l m(l) for some l ∈ I ∪ C. Given a matching m, a student i has justified envy toward student j ∈ m(c) if c ≻i m(i) and i ≻c j. We say m is justified envy-free if it is individually rational and no student has justified envy. Roughly speaking, (strong) Pareto efficiency says the assignment of a student or a college can only be improved at the expense of another student or college. Justified envy-freeness says that if a student prefers the college of another student to his current matching, then it must be the case that the college ranks the other students higher. If we take (strong) Pareto efficiency and envy-freeness as the natural efficiency and fairness requirements, then the following proposition shows they are both implied by stability.

Proposition 1 If a matching m is stable, then it is both (strongly) Pareto efficient and envy-free.
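To make these definitions concrete, here is a small sketch (in Python; the function name and data layout are ours, not from the text) that checks individual rationality and the blocking-pair condition. Each college's responsive preference is summarized by a rank list over individual students together with its capacity, a representation that responsiveness justifies.

```python
# Minimal stability check for the basic college admissions model.
# Preference lists: lower index = more preferred; agents not listed are unacceptable.
# A matching maps each student to a college or to None (unmatched).

def is_stable(matching, student_prefs, college_prefs, capacity):
    enrolled = {c: [s for s, col in matching.items() if col == c] for c in college_prefs}

    def s_rank(s, c):
        return student_prefs[s].index(c) if c in student_prefs[s] else None

    def c_rank(c, s):
        return college_prefs[c].index(s) if s in college_prefs[c] else None

    # Individual rationality: every realized match must be mutually acceptable.
    for s, c in matching.items():
        if c is not None and (s_rank(s, c) is None or c_rank(c, s) is None):
            return False

    # Blocking pairs (i, c): i strictly prefers c to m(i), and c either has a free
    # seat or prefers i to some student it currently enrolls.
    for s in student_prefs:
        for c in college_prefs:
            if s_rank(s, c) is None or c_rank(c, s) is None:
                continue  # not mutually acceptable, cannot block
            cur = matching.get(s)
            if cur is not None and s_rank(s, cur) <= s_rank(s, c):
                continue  # s does not strictly prefer c
            if len(enrolled[c]) < capacity[c]:
                return False
            if any(c_rank(c, s) < c_rank(c, j) for j in enrolled[c]):
                return False
    return True
```

Under responsiveness, comparing a candidate student with the students a college currently enrolls is all the blocking check requires, which is what the final comparison does.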
Having established a desirable property of a matching (and thus a mechanism), one may wonder whether a stable matching can be achieved with every preference profile. Gale and Shapley (1962) gave a positive answer with the construction of the following mechanism. Since then, the mechanism and its variations have played important roles in real-life matching markets.

(Student-Proposing) Deferred Acceptance Algorithm:

• Step 1: Each student applies to his most preferred college. Each college tentatively keeps its most preferred acceptable students up to the capacity (on hold) and rejects all others.

In general, for any t = 1, 2, ...

• Step t: Each student who is not currently on hold applies to his next preferred acceptable college (he does not make an application if there is none). Each college considers all new applicants, together with the students on hold, and tentatively keeps its most preferred acceptable students up to the capacity while rejecting all others.

The algorithm terminates when there is no new application. (Clearly, it terminates in a finite number of steps because the numbers of students and colleges are both finite.)

Theorem 1 (Theorem 1 in Gale and Shapley 1962) For any (strict, responsive) preference profile, the (student-proposing) deferred acceptance algorithm gives a stable matching. In other words, it is a stable mechanism.

Given the student-proposing deferred acceptance algorithm, one may naturally imagine a corresponding college-proposing deferred acceptance algorithm. Such a mechanism does exist and is also stable. In fact, the set of stable matchings is not necessarily a singleton, and the student-proposing and college-proposing deferred acceptance algorithms can give rise to different stable matchings. Given the (potential) multiplicity of stable matchings, one may wonder which one to implement in practice. The following proposition suggests the answer depends on our evaluation of the relative welfare of the two sides of the market.

Proposition 2 (Theorem 2* in Roth 1985)

1. There exists a student-optimal stable matching, i.e., a stable matching that every student likes at least as well as any other stable matching. Moreover, the student-proposing deferred acceptance algorithm always yields the student-optimal stable matching.
2. There exists a college-optimal stable matching, i.e., a stable matching that every college likes at least as well as any other stable matching. Moreover, the college-proposing deferred acceptance algorithm always yields the college-optimal stable matching.
3. The student-optimal stable matching is the least preferred stable matching for each college. Likewise, the college-optimal stable matching is the least preferred stable matching for each student.

Roughly speaking, Proposition 2 says that despite the competition among students/colleges, there is considerable coincidence of interests on either side of the market if we restrict to stable matchings. By comparison, the interests of the two sides of the market are not always aligned. In fact, they are almost opposite if we restrict our attention to stable matchings.
Another important concern for mechanisms is incentives. Apart from ease of implementation, an incentive compatible mechanism induces students and colleges to reveal their preferences truthfully. Such a property is necessary for our fairness and efficiency criteria to be well-grounded.2 We say a mechanism is strategy proof if it is a weakly dominant strategy for every player to (always) report his/her true preferences. With this definition, we have the following result for incentive properties of the student-proposing deferred acceptance algorithm (and any stable mechanism).

Theorem 2 (Theorem 5* in Roth 1985) The student-proposing deferred acceptance algorithm is strategy proof for students. However (when colleges have responsive preferences), no stable mechanism is strategy proof for colleges.

Simply put, Theorem 2 suggests that the student-proposing deferred acceptance algorithm is "safe" to play for students: it is always optimal for students to simply report their true preferences. The rough intuition is that, because of "deferred" acceptance, a student stands to lose nothing by applying to his most preferred (remaining) college in each step. Furthermore, even though it is not always in the colleges' interests to report truthfully, the problem is not confined to this particular mechanism but applies to stable mechanisms in general. In other words, incentive compatibility (on the college side) is not compatible with stability.
To see how colleges may gain by misreporting under the student-proposing deferred acceptance algorithm, consider the following example:

Example 1 There are two students i1 and i2 and two colleges c1 and c2. Each college has capacity 1 (qc1 = qc2 = 1), and the preferences are as follows: (When denoting an agent's preference, we list her acceptable choices in order of her preference. Similar notation is used throughout the paper.)

i1 : c1, c2    i2 : c2, c1
c1 : i2, i1    c2 : i1, i2

It is easy to see that the outcome of the student-proposing deferred acceptance algorithm under the true preferences is:

i1    i2
c1    c2

Now suppose college 1 misreports by stating that only student 2 is acceptable, while all other agents continue to report their true preferences. Then, the outcome of the student-proposing deferred acceptance algorithm under this preference profile is:

i1    i2
c2    c1
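For illustration, here is a minimal sketch of the student-proposing deferred acceptance algorithm (Python; names and data layout are ours), run on Example 1: it reproduces the truthful outcome above and, when college c1 misreports, the second outcome.

```python
# Student-proposing deferred acceptance. Preference lists are ordered most
# preferred first; unlisted partners are unacceptable.

def deferred_acceptance(student_prefs, college_prefs, capacity):
    next_choice = {s: 0 for s in student_prefs}      # index of next college to try
    held = {c: [] for c in college_prefs}            # tentative acceptances per college
    free = list(student_prefs)                       # students currently not on hold
    while free:
        s = free.pop()
        if next_choice[s] >= len(student_prefs[s]):
            continue                                 # list exhausted: s stays unmatched
        c = student_prefs[s][next_choice[s]]
        next_choice[s] += 1
        if s not in college_prefs[c]:                # s unacceptable to c: try next college
            free.append(s)
            continue
        held[c].append(s)
        held[c].sort(key=college_prefs[c].index)     # keep best applicants first
        if len(held[c]) > capacity[c]:
            free.append(held[c].pop())               # reject the worst applicant
    matching = {s: None for s in student_prefs}
    for c, kept in held.items():
        for s in kept:
            matching[s] = c
    return matching

# Example 1, truthful report: the student-optimal stable matching.
students = {"i1": ["c1", "c2"], "i2": ["c2", "c1"]}
colleges = {"c1": ["i2", "i1"], "c2": ["i1", "i2"]}
print(deferred_acceptance(students, colleges, {"c1": 1, "c2": 1}))
# -> {'i1': 'c1', 'i2': 'c2'}

# c1 misreports that only i2 is acceptable and ends up with its preferred student.
print(deferred_acceptance(students, {"c1": ["i2"], "c2": ["i1", "i2"]},
                          {"c1": 1, "c2": 1}))
# -> {'i1': 'c2', 'i2': 'c1'}
```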
(2005). It is a generalization of the basic model described in section "Basic Two-Sided Matching Model." It also incorporates, as special cases, some other matching/auction models in the existing literature. The primary example we have in mind is labor market matching, and so we describe the model in terms of workers and firms. However, it can be applied in any setting where contract terms play an important role.
There is a finite set I of workers, a finite set F of firms, and a finite set of contracts X. Each contract x ∈ X is bilateral, so it is associated with one worker x_I ∈ I and one firm x_F ∈ F. For instance, in the labor market matching model, a contract specifies a firm, a worker, and a wage. So we have X = I × F × W, where W is a (finite) set of possible wages.
Each worker i can sign at most one contract, and her preferences over possible contracts (plus the outcome in which she signs no contract, i.e., the empty set ∅, which we sometimes refer to as the null contract) are described by the strict total order ≻i. We say a contract x is acceptable for worker i if x ≻i ∅. With workers' preferences well defined, we know a worker's choice when faced with a set of contracts. Formally, given a set of contracts X′ ⊆ X, define worker i's chosen set Ci(X′) of contracts as follows:

Ci(X′) = ∅ if {x ∈ X′ | x_I = i, x ≻i ∅} = ∅, and Ci(X′) = max_{≻i} {x ∈ X′ | x_I = i} otherwise.

In other words, a worker's chosen set is simply the most preferred acceptable contract from those that are available. If none is acceptable, then her choice is the null contract (note that a worker's chosen set is either a singleton or the empty set). On the other hand, each firm f can sign multiple contracts, so its preferences are over sets of contracts. Let firm f's preference be described by ≻f, a strict total order over 2^X. Given a set of contracts X′ ⊆ X, we can similarly define firm f's chosen set Cf(X′). Notice that a firm can sign at most one contract with any given worker, so for all f ∈ F, X′ ⊆ X, and x, x′ ∈ Cf(X′), if x ≠ x′, then x_I ≠ x′_I.
Given X′ ⊆ X and the chosen set of each worker, we can define the contracts chosen by the worker side as CI(X′) = ∪_{i∈I} Ci(X′). The remaining offers in X′ are the set rejected by the worker side: RI(X′) = X′ \ CI(X′). Similarly, the chosen and rejected sets of the firm side are denoted by CF(X′) = ∪_{f∈F} Cf(X′) and RF(X′) = X′ \ CF(X′).
If we imagine some sort of stable matching similar to the one in the basic model, the chosen and rejected sets will prove useful. Intuitively, given a starting set of contracts, the rejected sets (by either side) cannot be in any stable allocation, so we know, at the very least, that a stable allocation is a fixed point of the "chosen set (by either side)" operator. Moreover, we will need to require that there is no coalition of workers and firms who prefer to sign contracts among themselves rather than follow the prescribed allocation.
To make this precise, we present the following definition of a stable allocation: a set of contracts X′ ⊆ X is a stable allocation if:

1. CI(X′) = CF(X′) = X′.
2. There exists no firm f and set of contracts X″ ≠ Cf(X′) such that X″ = Cf(X′ ∪ X″) ⊆ CI(X′ ∪ X″).

Intuitively, the first requirement corresponds to "individual rationality." The second requirement says there cannot be an alternative set of contracts that a particular firm and its matched workers all prefer, which is similar to the "no blocking" condition we discussed in the basic model.
In order to guarantee the existence of a stable allocation, we need an additional restriction on the preferences of firms: substitutability. Elements of a set of contracts X are substitutes for firm f if, for all X′ ⊆ X″ ⊆ X, we have Rf(X′) ⊆ Rf(X″). In words, the restriction says that if a particular firm is faced with a larger choice set, it has to reject (weakly) more contracts.
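To illustrate these choice operators, the sketch below (Python; the representation of firm preferences by a rank list over contracts plus a capacity is our simplifying assumption, not the general preferences of the model) computes C_i and C_f and brute-force checks the rejection-monotonicity condition that defines substitutability on small instances.

```python
from itertools import chain, combinations

# Contracts are tuples (worker, firm, wage). Worker preferences: rank list of
# contracts, most preferred first. Firms here choose greedily from a rank list
# over contracts, at most one contract per worker, up to a capacity; this
# assumed preference class satisfies substitutability.

def worker_choice(i, offers, worker_prefs):
    mine = [x for x in offers if x[0] == i and x in worker_prefs[i]]
    return min(mine, key=worker_prefs[i].index) if mine else None

def firm_choice(f, offers, firm_rank, capacity):
    mine = sorted((x for x in offers if x[1] == f and x in firm_rank[f]),
                  key=firm_rank[f].index)
    chosen, taken = [], set()
    for x in mine:
        if x[0] not in taken and len(chosen) < capacity[f]:
            chosen.append(x)
            taken.add(x[0])
    return set(chosen)

def rejected_by_firms(offers, firms, firm_rank, capacity):
    chosen = set().union(*(firm_choice(f, offers, firm_rank, capacity) for f in firms))
    return set(offers) - chosen

def is_substitutable(contracts, firms, firm_rank, capacity):
    """Check: X' subset of X'' implies R_F(X') subset of R_F(X'')."""
    subsets = list(chain.from_iterable(combinations(contracts, k)
                                       for k in range(len(contracts) + 1)))
    for small in subsets:
        for big in subsets:
            if set(small) <= set(big) and not (
                    rejected_by_firms(small, firms, firm_rank, capacity)
                    <= rejected_by_firms(big, firms, firm_rank, capacity)):
                return False
    return True
```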
Now we are ready to present the existence result, based on an iterative algorithm. As it turns out, the iteration we shall apply is on the product set X × X instead of the set of all contracts X. Similar to most iteration procedures, we hope to obtain monotonicity. For monotonicity to be well defined, a partial order on X × X is needed. We define it as follows:
given X_I, X′_I, X_F, X′_F ⊆ X, (X_I, X_F) ⊒ (X′_I, X′_F) if and only if X_I ⊇ X′_I and X_F ⊆ X′_F.
With the order ⊒, and given a starting set (X_I, X_F), we can define the generalized deferred acceptance algorithm as the iterated application of the function F : X × X → X × X, defined by:

F1(X′) = X − R_F(X′)
F2(X′) = X − R_I(X′)
F(X_I, X_F) = (F1(X_F), F2(F1(X_F))).
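A rough sketch of the resulting fixed-point iteration (Python; the rejection operators R_F and R_I are passed in as functions, for instance built from the choice functions sketched earlier, and convergence relies on the substitutability assumption):

```python
# Iterate the operator F on (X_I, X_F) until a fixed point is reached.
# R_F and R_I are the firm-side and worker-side rejection operators;
# X is the full set of contracts.

def generalized_da(X, R_F, R_I, start="workers"):
    # (X, empty) yields the worker-optimal stable allocation,
    # (empty, X) the firm-optimal one.
    X_I, X_F = (set(X), set()) if start == "workers" else (set(), set(X))
    while True:
        new_I = set(X) - R_F(X_F)          # F1(X_F)
        new_F = set(X) - R_I(new_I)        # F2(F1(X_F))
        if (new_I, new_F) == (X_I, X_F):
            return X_I & X_F               # stable allocation at the fixed point
        X_I, X_F = new_I, new_F
```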
With the generalized deferred acceptance algorithm in hand, the following theorem and proposition tell us how they can direct us to find stable allocations and shed light on the trade-off in welfare between the workers and the firms.

Theorem 3 (Theorem 3 in Hatfield and Milgrom 2005) Suppose elements of X are substitutes for all the firms. Then,

student-proposing deferred acceptance algorithm (where students first propose to their most preferred colleges). Accordingly, we land at the worker-optimal stable allocation. Similarly, the generalized deferred acceptance algorithm starting with the contract tuple (∅, X) gives us the firm-optimal stable allocation. For Proposition 3, the first part, why X_I ∩ X_F is stable at these fixed points, needs some additional reasoning (which we shall not give here), but the second part incorporates the insight from the basic model: if we focus on stable allocations, there is little conflict of interests among agents on the same side of the market, but there is substantial conflict of interests between the worker side and the firm side.
Similar to the basic model, incentives are also an important concern in the matching with contracts model, but we shall omit it here due to space constraints.
than or equal to one) of agents and houses is removed in each step, so the mechanism must terminate in a finite number of steps.)

Theorem 4 (Theorem in Shapley and Scarf 1974 and Theorem 2 in Roth and Postlewaite 1977) For any strict preference profile, the top trading cycle algorithm gives the unique strong core allocation.

Intuitively, the top trading cycle gives a strong core allocation roughly because (given the cycles removed earlier) the group of agents removed in each step receive their best available houses.
As in two-sided markets, another important concern in house exchange is incentives. After all, it is only with truthful revelation that the strong core requirement has important welfare implications. Fortunately, the following theorem says the top trading cycle algorithm has good incentive properties.

Theorem 5 (Theorem in Roth 1982) The top trading cycle mechanism is strategy proof.

To obtain intuition for Theorem 5, recall that truthful revelation is a weakly dominant strategy for students in the student-proposing deferred acceptance algorithm. The insight here is somewhat similar: a house will not leave the market unless its (initial) owner gets her most preferred (remaining) house. Therefore, an agent will not "lose" a house unless she cannot get it anyway. Hence, she may as well report her most preferred (remaining) house in each step.
Given the relative complexity of the top trading cycle algorithm, we end the section with an example:

Example 2 There are four house owners i1, i2, i3, i4, with their respective (initial) houses h1, h2, h3, h4. The preferences of the house owners are as follows:

i1 : h2, h3, h4, h1    i2 : h2, h1, h3, h4
i3 : h2, h1, h3, h4    i4 : h2, h1, h3, h4

Step 1: All agents point to h2 and the houses point to their respective owners. There is one cycle: i2 is assigned h2.
Step 2: i1 points to h3, i3 and i4 point to h1, and the three remaining houses point to their respective owners. There are two cycles: i1 is assigned h3 and i3 is assigned h1.
Step 3: i4 points to h4, which points back to i4. There is one cycle: i4 is assigned h4.

It follows that the outcome of the top trading cycle algorithm is:

i1    i2    i3    i4
h3    h2    h1    h4
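A compact sketch of the top trading cycle procedure for this house exchange model (Python; names are ours). It removes one cycle at a time, which yields the same final assignment as removing all cycles simultaneously, and it reproduces the outcome of Example 2.

```python
# Top trading cycles for the Shapley-Scarf house exchange model.
# prefs[i]: agent i's rank list over houses; owner[i]: i's initial house.

def top_trading_cycles(prefs, owner):
    assignment, active = {}, set(prefs)
    while active:
        house_owner = {owner[i]: i for i in active}
        # Each active agent points to her most preferred remaining house.
        points_to = {i: next(h for h in prefs[i] if h in house_owner) for i in active}
        # Walk agent -> pointed house -> its owner until an agent repeats: a cycle.
        seen, i = [], next(iter(active))
        while i not in seen:
            seen.append(i)
            i = house_owner[points_to[i]]
        cycle = seen[seen.index(i):]
        for j in cycle:                 # everyone on the cycle trades
            assignment[j] = points_to[j]
        active -= set(cycle)
    return assignment

prefs = {"i1": ["h2", "h3", "h4", "h1"], "i2": ["h2", "h1", "h3", "h4"],
         "i3": ["h2", "h1", "h3", "h4"], "i4": ["h2", "h1", "h3", "h4"]}
owner = {"i1": "h1", "i2": "h2", "i3": "h3", "i4": "h4"}
print(top_trading_cycles(prefs, owner))
# -> i2: h2, i1: h3, i3: h1, i4: h4, as in Example 2
```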
House Allocation with No Existing Owner
In a pure house allocation problem, no agent has a house to begin with, and there are a number of houses to be allocated. A similar problem was first studied by Hylland and Zeckhauser (1979). The model we present here is a simplified version of theirs.
There is a finite set H of empty houses to be allocated among a finite set I of agents. Each agent i demands exactly one house and has a strict preference ≻i over H. (We assume all houses are acceptable.) It may be the case that |H| > |I|, |H| < |I|, or |H| = |I|, so the houses may be in oversupply, in undersupply, or just in balance with the number of agents. Similar to the house exchange problem, we write h1 ⪰i h2 if and only if h1 ≻i h2 or h1 = h2. Implicit in the assumption is that an agent only cares about her own assignment and that indifferences between houses do not occur. We write the set of all strict preference profiles as R = {(≻i)i∈I}. A matching is a one-to-one function m : I → H ∪ {∅} (as the numbers of houses and agents need not balance, in general a matching here is not bijective and cannot be equivalently represented as a permutation on {1, 2, . . . , |I|}). Let M be the set of all matchings. A (direct) mechanism is a function φ : R → ΔM.
Given the lack of existing owners, the primary criterion here is (strong) Pareto efficiency. Nevertheless, randomization may be particularly useful in the current setup if there are fairness concerns. Once we introduce randomization, at least two versions of (strong) Pareto efficiency arise.
A mechanism is ex ante Pareto efficient if its assignment of lotteries is Pareto efficient relative to agents' preferences over lotteries. By comparison, a mechanism is ex post Pareto efficient if its final allocation is Pareto efficient given any strict preference profile. It can be readily shown that ex ante Pareto efficiency implies ex post Pareto efficiency but not vice versa.3

3 Ex ante Pareto efficiency implies ex post Pareto efficiency because, if any final allocation resulting from a lottery is not ex post Pareto efficient, then the lottery can be improved by replacing that particular allocation with a more efficient one, implying that the lottery is not ex ante Pareto efficient.

Before introducing a desirable algorithm, one more definition is needed: a (rank) ordering is a permutation of I, or a one-to-one correspondence s : {1, 2, . . . , |I|} → I. The following mechanism and its variations are widely used in real-life house allocation problems:

Serial Dictatorship Algorithm:

• Step 0: Fix a rank ordering s.
• Step 1: Assign s(1) her most preferred house.

In general, for any t = 1, 2, ...

• Step t: Assign s(t) her most preferred remaining house.

The algorithm terminates when there is no agent or house left. If there are still agents left, then they are not assigned a house. (Given that each step reduces the numbers of agents and houses both by 1, the algorithm must terminate in a finite number of steps.)
Intuitively, the mechanism works as if an agent is the dictator when it is her turn to choose. At that time, she picks her most preferred house out of those available (note that the agent does not care about the allocation of any other agent). As mentioned earlier, randomization is often introduced when implementing the mechanism in practice. This can be done by modifying Step 0 as follows:

• Step 0′: Pick a rank ordering uniformly at random from the set of all rank orderings.

The resulting mechanism is called random serial dictatorship. The following theorem gives the desirable properties of random serial dictatorship.

Theorem 6 (Variation of Lemma 1 in Abdulkadiroğlu and Sönmez 1998) The random serial dictatorship algorithm is ex post Pareto efficient. Moreover, two agents with the same preferences receive the same random allocation as each other.

In other words, the random serial dictatorship mechanism has decent fairness and efficiency properties. Intuitively, ex post efficiency is achieved because an agent is made as well off as possible given the allocation of the agents in earlier steps. (This is true in every realization, hence "ex post" with randomization.) The second part of this theorem describes a fairness property of this mechanism, and it follows immediately from the uniform randomization over rank orderings used in Step 0′ of the algorithm.
The following theorem says that random serial dictatorship also has good incentive properties.

Theorem 7 The random serial dictatorship mechanism is strategy proof.

The main insight of Theorem 7 is that each agent is essentially the dictator when it comes to her turn, so she cannot gain by misreporting. Unfortunately, even random serial dictatorship is not without its own problems. For one, the mechanism is not ex ante Pareto efficient. Some research has been done to tackle the problem. Nevertheless, it has been found that ex ante Pareto efficiency and fairness (as defined in Theorem 6) are incompatible with strategy proofness. (See Bogomolnaia and Moulin (2001).) Partly because of this, random serial dictatorship is still probably among the most popular mechanisms when it comes to real-life object allocation.
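A minimal sketch of (random) serial dictatorship (Python; names and the tiny example data are ours): the deterministic core is a single pass over the rank ordering, and the random version draws that ordering uniformly, as in Step 0′.

```python
import random

# Serial dictatorship: each agent, in the order given by `ordering`, takes her
# most preferred house among those still available. Agents may remain unmatched
# if houses run out; houses may remain unassigned if agents run out.

def serial_dictatorship(ordering, prefs, houses):
    remaining = set(houses)
    matching = {}
    for agent in ordering:
        available = [h for h in prefs[agent] if h in remaining]
        if available:
            matching[agent] = available[0]     # most preferred remaining house
            remaining.remove(available[0])
        else:
            matching[agent] = None
    return matching

def random_serial_dictatorship(prefs, houses, rng=random):
    ordering = list(prefs)
    rng.shuffle(ordering)                      # uniform random rank ordering (Step 0')
    return serial_dictatorship(ordering, prefs, houses)

# Tiny hypothetical example: two agents with identical preferences receive each
# house with equal probability across draws, in line with Theorem 6.
prefs = {"i1": ["h1", "h2"], "i2": ["h1", "h2"]}
print(random_serial_dictatorship(prefs, ["h1", "h2"]))
```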
House Allocation with Existing Owners
Given the discussions of sections "House Exchange" and "House Allocation with No Existing Owner," one may imagine a situation where existing house owners and new entrants coexist. A few more specific real-life examples are college dorm allocations and office
assignment. The problem was first studied by agent points to her most preferred house left.
Abdulkadiroğlu and Sönmez (1999). Each remaining occupied house points to its
There are a finite set of houses H and a finite set of owner, and each available house points to the
agents I. Of all the houses in H, a subset HO is remaining agent with the highest priority (s(j),
currently occupied, each belonging to a distinct where j is the smallest among the remaining
member of the existing house owners IE I (so | agents). Remove all agents and houses in a
HO| = |IE|). The remaining houses HV = H HO cycle (at least one cycle exists). For any agent
are currently vacant and can be freely allocated. The removed, assign her the house she points to.
remaining agents IN = I IE are new entrants and
do not have a house. Each agent i I demands The algorithm terminates when there is no
exactly one house and has a strict preference i agent or house left. If there are still agents left,
over H. (We assume for simplicity that all houses then they are not assigned a house. Since each step
are acceptable for all the agents.) We write h1 i h2 if reduces the number of agents and houses both by
and only if h1 i h2 or h1 = h2. Implicit in the at least 1, the algorithm must terminate in a finite
assumption is that an agent only cares about her number of steps.
own assignment and that indifferences between You Request My House-I Get Your Turn
houses do not occur. We write the set of all strict (YRMH-IGYT) Algorithm:
preference profiles as R = {(i)i I}. A matching
is a one-to-one function m : I ! H [ ∅ (Similar to • Step 0: Fix a rank ordering s.
the pure house allocation problem, a matching here • Step 1: Agent s(1) points to her most preferred
need not be bijective.) Let M be the set of all house. If the house she points to is vacant
matchings. A (direct) mechanism is a function (in HV) or her own house, she is assigned the
’ : R ! DM. house she points to. Otherwise, modify s so
Given the presence of both existing house that the owner is at the top of the list (the other
owners and new entrants, one possible desirable relative orderings unchanged) and proceed to
criterion of a matching is Pareto efficiency. In light the next step.
of the top trading cycle and serial dictatorship algo-
rithms of previous sections, we have the following In general, for any t = 1, 2, ...
two generalizations as natural candidates. Indeed,
Abdulkadiroğlu and Sönmez (1999) show for any • Step t: The remaining agent with the highest
preference profile; outcomes from these two mech- priority (s(j), where j is the smallest among the
anisms coincide and satisfy Pareto efficiency. remaining agents) points to her most preferred
(Generalized) Top Trading Cycle house. If the house she points to is currently
Algorithm: vacant (which may or may not be in HV) or her
own house, she is assigned the house she points
• Step 0: Fix a rank ordering s. to. If the house she points to is occupied by
• Step 1: Define the set of available houses to be another remaining agent, modify s so that the
the vacant houses (HV). Each agent points to owner is at the top of the list (the other relative
her most preferred house. Each occupied house orderings unchanged). At this point, if a loop
points to its owner, and each available house forms (no house is a assigned in the process
points to s(1). Remove all agents and houses in where the rank ordering is back to an earlier
a cycle (at least one cycle exists). For any agent one), every agent is assigned the house she points
removed, assign her the house she points to. to. Otherwise, proceed to the next step.
In general, for any t = 1, 2, ... The algorithm terminates when there is no agent
or house left. If there are still agents left, then they
• Step t: Update the set of available houses to be are not assigned a house. (It can be shown the
the current vacant houses. Each remaining algorithm terminates in a finite number of steps.)
Market Design 411
Intuitively, the (generalized) top trading cycle Generalized Top Trading Cycles:
algorithm is a direct generalization of top trading
cycles in section “House Exchange,” with all Step 1: i1 points h3, and the remaining agents point
remaining vacant houses pointing to the to h1. h1 points to i1, h2 points to i2, and h3
remaining agent with the highest priority. On the points to i3. There are two cycles: i1 is assigned
other hand, YRMH-IGYT is a direct generaliza- h3 and i3 is assigned h1.
tion of serial dictatorship in section “House Allo- Step 2: i2 and i4 both point to h2 and h2 points to i2.
cation with No Existing Owner,” with the added There is a one cycle: i2 is assigned h2.
twist that the owner is granted the opportunity to
choose before her house is gone. It follows that the outcome of the generalized
top trading cycle algorithm is:
Theorem 8 (Theorem 3 in Abdulkadiroğlu and
Sönmez 1999) Given a rank ordering s and for i1 i2 i3 i4
any (strict) preference profile, the YRMH-IGYT h3 h2 h1 ∅
algorithm yields the same matching as the gener-
YRMH-IGYT:
alized top trading cycle algorithm.
Step 1: i3 points to h1, which is currently occu-
Theorem 9 (Propositions 1 and 2 in
pied. The ranking ordering s is modified to
Abdulkadiroğlu and Sönmez 1999) Given a
(1, 3, 4, 2).
rank ordering s and for any (strict) preference
Step 2: i1 points to h3, which is currently vacant
profile, the matching given by YRMH-IGYT and
and i1 is assigned h3.
generalized top trading cycle algorithms is
Step 3: i3 points to h1, which is currently vacant
strongly Pareto efficient.
and i3 is assigned h1.
Moreover, the following theorem reveals that
Step 4: i4 points to h2, which is currently occu-
incentives do not pose a problem either.
pied. The ranking ordering s is modified to
(1, 3, 2, 4).
Theorem 10 (Theorem 1 in Abdulkadiroğlu
Step 5: i2 points to h2, which is her own house and
and Sönmez (1999) For any rank ordering s,
i2 is assigned h2.
both the YRMH-IGYT and the generalized top
trading cycle algorithms are strategy proof. It follows that the outcome of the YRMH-
Given the close relationships, the intuitions of IGYT algorithm is also:
Theorems 9 and 10 are very similar to the coun-
terparts of top trading cycle and serial dictatorship i1 i2 i3 i4
algorithms (Theorems 4 through 7). h3 h2 h1 ∅
To illustrate the two algorithms and the main
insights of Theorem 8, we conclude the section Applications
with an example.
The theories described in the previous sections
Example 3 There are four agents and three have found applications in a wide variety of
houses. Agents i1 and i2 are current house owners, areas. While we are not able to survey all of
with their respective houses h1 and h2. Agents i3 them extensively, we have selected some of the
and i4 are new entrants. House h3 is currently most prominent examples in order to highlight
available. There preferences of the agents are as how the theory can be utilized. We begin by
follows: discussing the following topics:
2. Kidney Exchange (most closely related to the stability is not guaranteed. Take the following
model in section “House Allocation with example (from Roth 1984):
Existing Owners”)
3. School Choice (most closely related to the Example 4 There are four medical school gradu-
models in sections “Basic Two-Sided ates i1, i2, i3, and i4 and four hospitals h1, h2, h3, and h4
Matching Model” and “House Allocation each with capacity 1 qh1 ¼ qh2 ¼ qh3 ¼ qh4 ¼ 1Þ.
with Existing Owners”) (i1, i2) and (i3, i4) are couples with preferences over
ordered pairs of hospitals. The exact preferences of
Then, building on an understanding of the the couples and the hospitals are as follows:
problems encountered when applying the tools
to the situations above, we transition to a rela- ði1 , i2 Þ : ðh1 , h2 Þ, ðh4 , h1 Þ, ðh4 , h3 Þ, ðh4 , h2 Þ, ðh1 , h4 Þ,
tively new area of matching theory called ðh1 , h3 Þ, ðh3 , h4 Þ, ðh3 , h1 Þ, ðh3 , h2 Þ, ðh2 , h3 Þ, ðh2 , h4 Þ,
“matching with constraints.” ðh2 , h1 Þ
ði3 , i4 Þ : ðh4 , h2 Þ, ðh4 , h3 Þ, ðh4 , h1 Þ, ðh3 , h1 Þ,
ðh3 , h1 Þ, ðh3 , h2 Þ, ðh3 , h4 Þ, ðh2 , h1 Þ, ðh2 , h3 Þ, ðh1 , h2 Þ,
Medical Residency Matching ðh1 , h4 Þ, ðh1 , h3 Þ
Among the most common application of two- h1 : i4 , i2 , i1 , i3 h2 : i4 , i3 , i2 , i1
sided matching algorithms is the medical resi- h3 : i2 , i3 , i1 , i4 h4 : i2 , i4 , i1 , i3
dency programs. In 2016, roughly 43,000 medical
It is straightforward, if tedious, to check that no
school graduates registered for the National Res-
stable matching exists in this example. Given that
ident Match Program (NRMP), where students are
an increasing number of medical students marry
matched to teaching hospitals through a variant of
other medical students, it would seem then that
a deferred acceptance algorithm (for more
finding a stable matching for NRMP would be
detailed statistics, one can visit http://www.nrmp.
impossible. Even determining, in a given
org/match-data/main-residency-match-data/). This
instance, whether a stable matching exists is a
service has been in operation since 1952, and its
computationally hard problem.5 As a result, to
longevity is ascribed to the fact that the matchings
replace the original method, Roth and Peranson
produced are stable (Roth 1984; Roth and
(1999) proposed a heuristic modification of the
Sotomayer 1990).
deferred acceptance algorithm in place to accom-
What makes NRMP’s matching problem com-
modate couples’ preferences. Although this algo-
plex, though, is the existence of “couples.” While
rithm is not guaranteed to always produce a match
some students apply independently and rank their
that is stable with respect to the reported prefer-
preferences accordingly, individuals who have a
ences, it has done so in almost all instances.
significant other in the residency match program
Why does the algorithm in NRMP find a stable
are allowed to apply together as couples so that
matching despite the theoretical possibility of
they can work in areas close to one another. In this
nonexistence? Kojima et al. (2013) show that in
context, stability requires there be no coalition of
a setting where applicant preferences are drawn
students and hospitals who prefer to match among
independently from a distribution, as the size of
themselves than follow the prescribed matching.4
the market increases and the proportion of couples
The presence of couples who submit joint prefer-
approaches 0, the Roth and Peranson algorithm
ence lists complicates the problem significantly as
terminates in a stable matching with high proba-
bility. Thus, one of the reasons the NRMP algo-
4
The main difference of this definition from the one in the
basic model of section “Basic Two-Sided Matching
Model” is that we consider a coalition composed of a 5
More precisely, this problem is in the class of “NP-hard”
couple of doctors and two hospitals each of which seeks problems. NP-hardness is a notion in computational com-
to match with a member of the couple. See Roth (1984) for plexity theory describing the complexity of computation,
detail. which we will not describe in detail here.
Why does the algorithm in NRMP find a stable matching despite the theoretical possibility of nonexistence? Kojima et al. (2013) show that in a setting where applicant preferences are drawn independently from a distribution, as the size of the market increases and the proportion of couples approaches 0, the Roth and Peranson algorithm terminates in a stable matching with high probability. Thus, one of the reasons the NRMP algorithm finds stable matchings in most cases may be because the size of NRMP is large, while the proportion of couples in the market is small, roughly between 5% and 10%. By contrast, Biro and Klijn (2013) and Ashlagi et al. (2014) have shown, in separate settings, that as the proportion of couples increases, this algorithm frequently fails to terminate in a stable matching. This may be important given that residency matching is not the only environment with a "couples'" issue. In other such settings, couples could make up much more of the market. (Biro and Klijn (2013) provide the example of assigning high school teachers in Hungary to majors, where almost all teachers need to be assigned to two majors; in this setting, the percentage of "couples" is nearly 100%).

Given these difficulties in the "couples'" problem, Nguyen and Vohra (2017) propose an alternative approach. They allow for perturbations of hospital capacities to find a "nearby" instance of the matching problem that is guaranteed to have a stable matching. They find that the necessary perturbations are small, especially when hospital/school/firm capacities are large. Specifically, given capacities qh for each hospital h, there is a redistribution of the slots, q′h, satisfying |qh − q′h| ≤ 2 for all hospitals h and Σh qh ≤ Σh q′h ≤ Σh qh + 4. Thus, the perturbations change the capacity of each individual hospital by at most 2 and increase the total number of positions in hospitals by no more than 4 while never decreasing it.6

6 How the authors proceed from the setup is notable as they approach the problem from a linear programming perspective. Formulating the matching problem as a linear program and applying the celebrated Scarf's lemma, they find a random matching that satisfies a notion of stability. They then use an iterative rounding method to find an actual matching (corresponding to a 0–1 solution) such that the resulting matching satisfies stability. Such rounding corresponds to the perturbation of the capacities.

The complication surrounding matching with couples turns out to be a specific instance of a more general issue economists have sought to understand: matching with complementarities. In two-sided matching markets, substitutability of agent preferences (see section "Matching with Contracts"), i.e., the lack of complementarity, is "necessary" for guaranteeing the existence of a stable matching (see Hatfield and Kojima (2008) and Sönmez and Ünver (2010) for formal statements). The existence of couples leads to a violation of substitutability because a pair of positions close to each other works as complements for the couple. Recent research by Che et al. (2017) and Azevedo and Hatfield (2017) examines matching with complementarities in large market settings with a continuum of agents. They have found positive results describing sufficient conditions for the existence of stable matchings.

Kidney Exchange
The application of matching theory to kidney exchange has been discussed often and quite thoroughly, so we will be relatively brief in our exposition. For an extensive survey, we refer the reader to Sönmez and Ünver (2011). In the kidney "market" (using the term loosely), the National Organ Transplant Act of 1984 made it illegal to buy or sell a kidney in the USA. Similar legal prohibitions are nearly universal around the globe. Thus, donation is the only viable option for kidney transplantation for most patients.

The initial foundational contribution to kidney exchange came with Roth et al. (2004). They used a variation of the Shapley-Scarf house exchange model (section "House Exchange") to represent the kidney exchange market. In their model, agents enter in pairs composed of a patient and his potential donor. Applying the top trading cycle (TTC) mechanism where potential donors substitute "houses" of the original Shapley-Scarf model, one can produce a matching between donors and patients in a Pareto-efficient and strategy-proof way.
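A compact sketch of the top trading cycle idea in this Shapley-Scarf framing may be useful: each patient is "endowed" with her donor's kidney, points to the owner of her most preferred remaining kidney, and cycles trade. The preference lists below are made up, and the routine only illustrates the mechanism described above, not any fielded exchange software.

```python
# Top trading cycles in the Shapley-Scarf framing of kidney exchange: each
# patient i arrives with her donor's kidney and ranks the kidneys by owner id.

def top_trading_cycles(prefs):
    """prefs: patient -> list of donors (by owner id), best first.
    Returns patient -> patient whose donor kidney she receives."""
    assignment, remaining = {}, set(prefs)
    while remaining:
        # each remaining patient points to the owner of her best remaining kidney
        points_to = {i: next(j for j in prefs[i] if j in remaining) for i in remaining}
        start = next(iter(remaining))
        path, seen = [start], {start}
        while points_to[path[-1]] not in seen:      # walk until a node repeats
            path.append(points_to[path[-1]])
            seen.add(path[-1])
        cycle = path[path.index(points_to[path[-1]]):]
        for i in cycle:                             # everyone in the cycle trades and leaves
            assignment[i] = points_to[i]
            remaining.discard(i)
    return assignment

prefs = {1: [2, 1, 3], 2: [1, 3, 2], 3: [1, 2, 3]}
print(sorted(top_trading_cycles(prefs).items()))
# [(1, 2), (2, 1), (3, 3)]: patients 1 and 2 swap donors, patient 3 keeps her own
```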
The way economists model kidney exchange has progressed as we now know many ways in which the assumptions in the original 2004 paper do not seem to be the best representation of the real kidney market. As economists have advanced into the area of matching under general constraints and dynamic matching, they have attempted to employ other mechanisms different from TTC. For instance, because all transplantations in any kidney exchange need to be carried out simultaneously, long cycles that could be conducted using the TTC mechanism might not be feasible in practice. Roth et al. (2005) provided strategy-proof, constrained-efficient mechanisms of kidney exchange where only pairwise exchanges are permitted. They showed that finding a constrained-efficient matching in their model relates to the cardinality matching problem discussed in the graph theory literature (in addition, the 2005 paper assumed that each patient is indifferent among all kidneys that are compatible to her, based on certain medical evidence). In a 2007 paper, these same authors showed that under certain conditions on kidney supply and demand levels that could normally be expected, full efficiency can be extracted by using exchanges that involve no more than four pairs.

In the papers discussed above, agents and the market itself are static. What if the exchange pool changes over time? Should we conduct exchanges immediately, or if there is no urgency, is it more efficient to wait? These issues are not addressed formally in the aforementioned papers. Ünver (2010) tackles the question of how to conduct barter exchanges in a centralized mechanism when the agent pool evolves over time: he characterizes the efficient two-way and multi-way exchange mechanisms that maximize total exchange surplus. The study of dynamic matching environments has attracted the interest of not only economists but computer scientists and operations research specialists as well. Notable contributions include Anderson et al. (2015) and Akbarpour et al. (2016). There are still many questions left to be addressed, which makes the kidney exchange market one of the topics of greatest interest among researchers and practitioners today.

School Choice
The third prominent area to which matching theory is applied is that of school choice and student assignment policy. School choice has become one of the most important and contentious debates in modern education policy. School choice is a policy that allows parents the opportunity to choose the school their child will attend. Traditionally, children are assigned to public schools according to where they live. Wealthy parents already have school choice, because they can enroll their children in private schools or have the ability to move to a different district entirely. Supporters have argued that school choice helps lower income families by providing them the freedom to send their children to different schools within and across districts. In addition, the increased competition schools face under school choice should incentivize them to increase their quality (for a more extensive survey on the empirical and theoretical literature around school choice, see Pathak (2011)). Since it is not possible to assign each student to her top choice school, a central issue in school choice is the design of a student assignment mechanism. One of the first to rigorously and formally tackle this issue with matching theory is Abdulkadiroğlu and Sönmez (2003). The model they propose, which has been regarded as the canonical model, consists of a set of students and schools where:

1. Each student i has a preference relation ≻i over the schools.
2. Each school c has capacity qc and priority ordering ≻c over the students.
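The canonical model above is easy to encode directly. The following sketch, with an invented three-student, two-school instance, stores preferences, capacities, and priorities and checks a candidate matching for blocking pairs, i.e., for violations of stability in the sense just defined.

```python
# A toy encoding of the canonical school choice model: student preferences,
# school capacities and priorities, and a blocking-pair check. The instance is
# made up purely for illustration.

students = {"i1": ["s1", "s2"], "i2": ["s1", "s2"], "i3": ["s2", "s1"]}  # best first
capacity = {"s1": 1, "s2": 2}
priority = {"s1": ["i1", "i2", "i3"], "s2": ["i2", "i1", "i3"]}          # highest priority first

def blocking_pairs(matching):
    """matching: student -> school (or None). Returns the student-school pairs
    that would rather match with each other than respect the given matching."""
    pairs = []
    enrolled = {s: [i for i, m in matching.items() if m == s] for s in capacity}
    for i, prefs in students.items():
        current = matching[i]
        better = prefs if current is None else prefs[: prefs.index(current)]
        for s in better:                            # schools i strictly prefers to her match
            rank = priority[s].index
            if len(enrolled[s]) < capacity[s] or any(rank(i) < rank(j) for j in enrolled[s]):
                pairs.append((i, s))                # empty seat, or a lower-priority student enrolled
    return pairs

print(blocking_pairs({"i1": "s1", "i2": "s2", "i3": "s2"}))   # [] -> stable
print(blocking_pairs({"i1": "s2", "i2": "s2", "i3": "s1"}))   # [('i1', 's1'), ('i2', 's1')]
```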
One of the reasons we refer to the school's ordering as a priority ordering is because in some school choice programs, orderings are given exogenously (e.g., mandated by law). Pathak (2011) describes a variety of orderings in different districts as follows: "In Boston's school choice plan, for instance, elementary school applicants obtain walk-zone priority if they reside within 1 mile of the school. In other districts, schools construct an ordering of students, as in two-sided problems. In Chicago, for instance, students applying for admissions to selective high schools take an admissions test." (Later, Boston's school choice plan implemented a reform which eliminated the use of walk-zone priority). When evaluating a matching in this setting, two notions are of primary interest: Pareto efficiency and stability. Stability is defined in the standard manner as in section "Basic Two-Sided Matching Model," while Pareto efficiency only considers students' allocations and does not take into account the schools' priority ordering (as the literature has grown and evolved, generalizations of the notion of stability have been discussed, which we will examine in section "Matching with Constraints").

Abdulkadiroğlu and Sönmez (2003) compare three mechanisms: the student-proposing deferred acceptance algorithm, an adaptation of the top trading cycle mechanism (referred to as TTC in this section), and the Boston mechanism.

As the deferred acceptance mechanism is familiar to the reader by now, we describe the other two mechanisms. We start with the Boston mechanism, which, as its name suggests, was in use in school choice programs in the city of Boston (before being replaced by the deferred acceptance algorithm):

• Step 0: Each school orders students by priority block.7 Within each block, students are ordered via a lottery system.
• Step 1: In this step, only the first choices of the students are considered. For each school, consider the students who have listed it as their first choice, and assign seats of the school to these students one at a time following their priority order until either there is no seat left or there is no student left who has listed it as his first choice.

In general, for any t = 1, 2, ...

• Step t: In this step, only the tth choices of the students are considered. For each school, consider the students who have listed it as their tth choice, and assign seats of the school to these students one at a time following their priority order until either there is no seat left or there is no student left who has listed it as his tth choice.

In Boston, the Boston mechanism was originally implemented in July 1999 but was abandoned in 2005. One of the central reasons it was abandoned is that it is not strategy proof for students, i.e., families have an incentive to strategically misreport their preferences. Variations of this mechanism, however, are common in many other school districts.

7 In Boston, first priority consisted of students who lived in a proximal neighborhood and had a sibling that attended the school. The second tier consisted of students with a sibling at the school. Third priority is of the students who live in the "relevant" area. Finally, the remaining students are grouped within the last priority block.
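A short sketch of the Boston (immediate acceptance) procedure just described, run on the same style of toy instance as before; the data and helper names are illustrative only. Note how seats assigned in round t are never released in later rounds, which is the source of the incentive problem mentioned above.

```python
# A compact sketch of the Boston mechanism: in round t only t-th choices are
# considered, and assignments made in a round are final. Toy data below.

def boston(students, capacity, priority):
    seats = dict(capacity)
    matching = {i: None for i in students}
    rounds = max(len(p) for p in students.values())
    for t in range(rounds):                          # round t looks at t-th choices only
        for s in priority:
            applicants = [i for i in priority[s]
                          if matching[i] is None
                          and len(students[i]) > t and students[i][t] == s]
            for i in applicants:                     # assign in priority order while seats remain
                if seats[s] == 0:
                    break
                matching[i] = s
                seats[s] -= 1
    return matching

students = {"i1": ["s1", "s2"], "i2": ["s1", "s2"], "i3": ["s2", "s1"]}
capacity = {"s1": 1, "s2": 2}
priority = {"s1": ["i1", "i2", "i3"], "s2": ["i2", "i1", "i3"]}
print(boston(students, capacity, priority))
# {'i1': 's1', 'i2': 's2', 'i3': 's2'}
```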
The next procedure presented by Abdulkadiroğlu and Sönmez (2003) is the TTC mechanism, which is implemented in the following manner (the description of the mechanism is taken directly from Abdulkadiroğlu and Sönmez (2003)):

• Step 1: Assign a counter for each school which keeps track of how many seats are still available at the school. Initially set the counters equal to the capacities of the schools. Each student points to her favorite school under her announced preferences. Each school points to the student who has the highest priority for the school. Since the number of students and schools are finite, there is at least one cycle. Moreover, each school can be part of at most one cycle. Similarly, each student can be part of at most one cycle. Every student in a cycle is assigned a seat at the school she points to and is removed. The counter of each school in a cycle is reduced by one, and if it reduces to zero, the school is also removed. The counters of the schools not in a cycle remain the same.

In general, for any t = 1, 2, ...

• Step t: Each remaining student points to her favorite school among the remaining schools, and each remaining school points to the student with highest priority among the remaining students. There is at least one cycle. Every student in a cycle is assigned a seat at the school that she points to and is removed. The counter of each school in a cycle is reduced by one, and if it reduces to zero, the school is also removed.
This algorithm is very similar to the top trading cycle mechanisms described in sections "House Exchange" and "House Allocation with Existing Owners," except that agents are not initially endowed with any good. In this adaptation, students are essentially swapping priority orderings with each other. Note that if every school has the same priority ordering, this mechanism reduces to serial
dictatorship where the rank ordering is determined current notions of stability and Pareto efficiency are
by the priority ranking. the most relevant measures by which to evaluate
Some of the main properties of TTC are differ- school choice mechanisms.
ent from those of the deferred acceptance mecha-
nism although both mechanisms are strategy
Matching with Constraints
proof. The student-proposing deferred acceptance
We now proceed to a discussion of a relatively new
mechanism is stable, but the resulting outcome is
area of research within matching theory and market
not necessarily Pareto efficient for students, while
design application, matching with constraints. This
the top trading cycle mechanism is not stable but
field seeks to study allocations and matching when
produces a Pareto-efficient outcome for students.
characteristics and constraints other than the com-
Whether efficiency or stability is more important
mon individual capacity limits are regarded as
is a question that may be an important determinant
desirable or required for feasibility. Schools, hos-
for the choice of the mechanism. Also, it is then
pitals, or firms (to use the language of our previous
natural to ask whether one can construct an effi-
models) may be not only worried about the obvious
cient, strategy-proof mechanism which also pro-
limit on total individuals they can accept but also
duces a stable outcome whenever it exists. Kesten
about the quantity of types of individuals that are
(2010) shows that this is impossible.
admitted. With the prevalence of affirmative action
Much work within school choice literature has
and the goal of creating a diverse student/employee
expounded on the results of Abdulkadiroğlu and
body, understanding the implementation and
Sönmez (2003). It is important, though, to address
impact of such policies is crucial. The desire for
criticisms and weakness of the model as well as
diversity ranges beyond just race and gender: in
some difficulties in application of matching the-
universities, for instance, having students all inter-
ory to the analysis of the policy.
ested in one or two academic areas is often consid-
One of the central assumptions of the above
ered disadvantageous because it may stymie the
model is that students have an exogenous pref-
intellectual growth of its student population.
erence over schools that is independent of the
Abdulkadiroğlu and Sönmez (2003) model a
other students who are assigned to the same
simple affirmative action policy of type-specific
school. This is rather problematic if the quality
quotas and propose mechanisms that satisfy the
of a school is affected by the composition of the
affirmative action constraints. Under the same type
student body (this is referred to as peer effect).
of affirmative action policy, Abdulkadiroğlu (2005)
The second issue is that the effect of a school
shows that a stable matching can be found using a
choice mechanism on school quality is exoge-
strategy-proof, student-proposing deferred accep-
nously given and fixed in the canonical model,
tance algorithm. These papers pushed affirmative
although the issue of improving schools takes a
action into mainstream matching literature, whereas
center stage of school choice debate in practice
traditional papers on affirmative action were based
(see Hatfield et al. (2016) for an analysis of this
on “classical” mechanism design theory.8 Kojima
topic). Another major difficulty is that the infor-
(2012) demonstrated various impossibility results
mation submitted by students is ordinal and
that can arise when attempting to implement affir-
does not necessarily convey information on
mative action policies in a matching environment.
preference intensities. Abdulkadiroğlu et al.
There are situations where affirmative action
(2011, 2015) and Carroll (2017) analyze this
issue theoretically. Agarwal and Somaini’s
(2016) empirical analysis on strategic reporting 8
The study of employment discrimination began in the
in school choice mechanisms highlighted the second half of the 20th century. The two main theories of
importance of further study on mechanisms discrimination are a theory based on tastes, pioneered by
Becker (1957), and a statistical theory, pushed forth by
that use the intensity of student preferences.
Phelps (1972) and Arrow (1973). Economists such as
Given these issues (and others we are not discussing Glenn Loury and Roland Fryer have further developed
here), it is still not completely clear whether the the literature around race-based affirmative action.
policies inevitably hurt every minority student under To begin with, almost all research in the
any stable matching mechanism. Furthermore, sim- existing literature defines stability under the
ilar impossibility results hold when using TTC. assumption of complete information, but this is
Hafalir et al. (2013) further expound on these phe- at best a rough approximation of reality. Liu et al.
nomena and show that the use of a “quota” versus (2014) investigate stability under incomplete
“reserve” affirmative action system can have signif- information in two-sided matching markets with
icant consequences on the resulting allocation. With transfer, while Bikhchandani (2017) studies a
minority reserves, schools give higher priority to similar concept in the no-transfer setting.
minority students up to the point that the minorities Once incomplete information is taken seri-
fill the reserves. They show that the deferred accep- ously, it is natural to consider “informational
tance algorithm with minority reserves is Pareto externality,” i.e., interdependence in valuations.
superior for students to the one with majority quotas. Chakraborty et al. (2010, 2015) study two-sided
Kamada and Kojima (2015) advance the idea by matching with interdependent values, while Che
looking at matching environments with more gen- et al. (2015) study one-sided matching with
eral distributional constraints. One example is the interdependent values. In both cases, the possibil-
Japan Residency Matching Program which imposes ity of extending desirable matching mechanisms
regional caps on the numbers of residents so as to from the standard private value models proved to
limit the concentration of residents in urban areas be severely limited. Designing satisfactory mech-
such as Tokyo. They point out that the mechanisms anisms under interdependent values is a promis-
used in that market and others with constraints suffer ing, if challenging, avenue for future research (see
from instability and inefficiency. To remedy this Hashimoto (2016) and Pakzad-Hurson (2016) for
problem, they create a modified version of the notable advances).
deferred acceptance algorithm which is strategy Another important limitation of the existing
proof for students, constrained efficient, and stable literature is that the models tend to be
in an appropriate sense. Kamada and Kojima (2016, static. Although some matching markets could
2017) and Goto et al. (2017) further explore various be approximated well by a static model (e.g.,
stability concepts and characterize environments in yearly medical residency matching or school
which stability and other desirable properties such choice), others may be better modeled as a
as strategy proofness can be guaranteed. dynamic market (e.g., day care slot assignment
There are still many issues and problems in the with arrival and departure of children and the
area that are unresolved and worth pursuing. How ongoing kidney exchange program). In addition
to address more general types of constraints, espe- to papers on dynamic kidney exchange already
cially lower-bound constraints, is still a difficult discussed, there is a burgeoning literature on
problem and being actively studied (see dynamic two-sided matching markets. Kurino
Fragiadakis and Troyan (2016) for instance). New (2009), Du and Livne (2016), Doval (2017), and
mathematical tools from discrete convex analysis Kadam and Kotowski (2017) propose concepts of
have been applied to matching with constraints dynamic stability and analyze existence under
(Kojima et al. 2016), but the use of such mathe- various assumptions on commitment technologies
matical tools may warrant further investigation. and preferences. This literature is so young that
several alternative stability concepts are being
studied, but a consensus on the appropriate defi-
Conclusion nition has not been reached yet. In the future, a
consensus on the appropriate stability definition
As indicated throughout this article, matching the- may emerge, but it is also possible that different
ory has expanded vastly since the seminal work by stability concepts are appropriate in different
Gale and Shapley (1962). Although the theory has types of dynamic markets. Reaching conclusions
advanced considerably, there are many new ques- on this and other questions awaits further
tions and issues waiting to be explored further. research.
Kojima F, Troyan P (2011) Matching and market design: an physicians and surgeons in the United Kingdom. Am
introduction to selected topics. Jpn Econ Rev 62(1):82–98 Econ Rev 81(3):415–440
Kojima F et al (2013) Matching with couples: stability and Roth AE (2008a) Deferred acceptance algorithms: history,
incentives in large markets. Q J Econ 128(4):1585–1632 theory, practice, and open questions. Int J Game Theory
Kojima F, Tamura A, Yokoo M (2016) Designing matching 36:537–569
mechanisms under constraints: an approach from dis- Roth AE (2008b) What we have learned from market
crete convex analysis. Working paper design. Econ J 118(527):285–310
Kurino M (2009) Credibility, efficiency, and stability: a Roth AE, Peranson E (1999) The redesign of the matching
theory of dynamic matching markets. Working paper market for American physicians: some engineering
Liu Q, Mailath GJ, Postlewaite A, Samuelson L (2014) aspects of economic design. Am Econ Rev
Stable matching with incomplete information. 89(4):748–780
Econometrica 82(2):541–587 Roth AE, Postlewaite A (1977) Weak versus strong dom-
Nguyen T Vohra R (2017) Near feasible stable matchings ination in a market with indivisible goods. J Math Econ
with couples. Working paper 4(2):131–137
Pakzad-Hurson B (2016) Crowdsourcing and optimal mar- Roth AE, Sotomayer MAO (1990) Two-sided matching.
ket design. Working Paper Cambridge University Press, Cambridge
Pathak PA (2011) The mechanism design approach to Roth AE, Sönmez T, Ünver MU (2004) Kidney exchange.
student assignment. Ann Rev Econ 3(1):513–536 Q J Econ 119(2):457–488
Pathak PA (2015) What really matters in designing school Roth AE, Sönmez T, Ünver MU (2005) Pairwise kidney
choice mechanisms. Advances in economics and exchange. J Econ Theory 125(2):151–188
econometrics. Cambridge University Press, Cambridge Roth AE, Sönmez T, Ünver MU (2007) Efficient kidney
Phelps ES (1972) The statistical theory of racism and exchange: coincidence of wants in markets with
sexism. Am Econ Rev 62(4):659–661 compatibility-based preferences. Am Econ Rev
Rees-Jones A (2017) Mistaken play in the deferred accep- 97(3):828–851
tance algorithm: implications for positive assortative Shapley L, Scarf H (1974) On cores and indivisibility.
matching. American Economic Review Papers and J Math Econ 1(1):23–37
Proceedings 107:225–229 (forthcoming) Sönmez T, Ünver MU (2009) Matching, allocation, and the
Roth AE (1982) Incentive compatibility in a market with exchange of discrete resources. In: Benhabib J et al
indivisible goods. Econ Lett 9(2):127–132 (eds) The handbook of social economics. Elsevier,
Roth AE (1984) The evolution of the labor market for Amsterdam
medical interns and residents: a case study in game Sönmez T, Ünver MU (2010) Course bidding at business
theory. J Polit Econ 92(6):991–1016 schools. Int Econ Rev 51(1):99–123
Roth AE (1985) The college admissions problem is not Sönmez T, Ünver MU (2011) Market design for kidney
equivalent to the marriage problem. J Econ Theory exchange. In: Neeman Z et al (eds) The handbook of
36(2):277–288 market design. Oxford University Press, Oxford
Roth AE (1991) A natural experiment in the organization Ünver MU (2010) Dynamic kidney exchange. Rev Econ
of entry-level labor markets: regional markets for new Stud 77(1):372–414
Demand revelation game It is a strategic game
Cost Sharing in Production where agents announce their maximal contri-
Economies bution strategically.
Game theory It is the branch of applied mathemat-
Maurice Koster ics and economics that studies situations where
University of Amsterdam, Amsterdam, players make decisions in an attempt to maximize
The Netherlands their returns. The essential feature is that it pro-
vides a formal modeling approach to social situ-
ations in which decision-makers interact.
Article Outline Strategic game An ordered triple G = hN,
(Ai)i N, (≾i)i Ni where
Glossary • N = {1, 2, . . ., n} is the set of players.
Definition of the Subject • Ai is the set of available actions for player i.
Introduction • ≾i is a preference relation over the set of
Cooperative Cost Games possible consequences C of action.
Noncooperative Cost Games
Continuous Cost Sharing Models
Stochastic Cost Sharing Models Definition of the Subject
Future Directions
Bibliography Throughout we will use a fixed set of agents
N = {1, 2, . . ., n} where n is a given natural
Glossary number. For subsets S, T of N, we write S T if
each element of S is contained in T; T\S denotes
Core The core of a cooperative cost game hN, ci the set of agents in T except those in S. The power
is the set of all coalitionally stable vectors of set of N is the set of all subsets of N; each coalition
cost shares. S N will be identified with the element
Cost function A cost function relates each level 1S {0, 1}N, the vector with i-th coordinate
of output of a given production technology to 1 precisely when i S. Fix a vector x ℝN
the total of minimally necessary units of input and S N. The projection of x on ℝS is denoted xS,
to generate it. It is a non-decreasing function, and xN\S is sometimes more conveniently denoted
c : X ! ℝ+, where X is the (ordered) space of xS. For any y ℝS, (xS, y) stands for the vector
outputs. z ℝN such that zi = xi if i N\S and zi = yi if
Cost sharing problem A cost sharing problem is i S. We denote x(S) = i Sxi. The vector in ℝS
an ordered pair (q, c), where q ℝNþ is a profile with all coordinates equal zero is denoted 0S.
of individual demands of a fixed and finite Other notation will be introduced when necessary.
group of agents N = {1, 2, . . ., n} and c is a This entry focuses on different approaches in
cost function. the literature through a discussion of a couple of
Cost sharing rule A cost sharing rule is a map- basic and illustrative models, each involving a
ping that assigns to each cost sharing problem single facility for the production of a finite set
under consideration a vector of nonnegative M of outputs, commonly shared by a fixed set
cost shares. N ≔ {1, 2, . . ., n} of agents. The feasible set of
Demand game It is a strategic game where outputs for the technology is identified with a set
agents place demands for output strategically. X ℝM þ . It is assumed that the users of the tech-
nology may freely dispose over any desired quan- mechanism design or implementation paradigm.
tity or level of the outputs; each agent i has some The state-of-the-art literature shows for a couple
demand xi X for output. Each profile of demands of simple but illustrative cost sharing models that
x XN is associated with its cost c(x), i.e., the one cannot push these principles too far, as there is
minimal amount of the idiosyncratic input com- often a trade-off between the degree of distribu-
modity needed to fulfill the individual demands. tive justice and economic efficiency. Then this is
This defines the cost function c : XN ! ℝ+ for the what makes choosing “the” right solution an
technology, comprising all the production external- ambiguous task, certainly without a profound
ities. A cost sharing problem is an ordered pair (x, understanding of the basic allocation principles.
c) of a demand profile x and a cost function c. The Now first some examples will be discussed.
interpretation is that x is produced, and the
resulting cost c(x) has to be shared by the collective Example 4.1 The water resource management
N. Numerous practical applications fit this general problem of the Tennessee Valley Authority (TVA)
description of a cost sharing problem. in the 1930s is a classic in the cost sharing litera-
In mathematical terms, a cost sharing problem ture. It concerns the construction of a dam in a river
is equivalent to a production sharing problem to create a reservoir, which can be used for different
where output is shared based on the profile of purposes like flood control, hydroelectric power,
inputs. However, although many concepts are irrigation, and municipal supply. Each combination
just as meaningful as they are in the cost sharing of purposes requires a certain dam height, and
context, results are not at all easily established accompanying construction costs have to be shared
using this mathematical duality. In this sense con- by the purposes. Typical for the type of problem is
sider (Leroux 2008) as a warning to the reader, that up to a certain critical height there are econo-
showing that the strategic analysis of cost sharing mies of scale as marginal costs of extra height are
solutions is quite different from surplus sharing decreasing. Afterward, marginal costs increase due
solutions. This monograph will center on cost to technological constraints. The problem here is to
sharing problems. For further reference on pro- allocate the construction costs of a specific dam
duction sharing, see Israelsen (1980), Leroux among the relevant purposes. ⊲
(2004, 2008), and Moulin and Shenker (1992).
Example 4.2 Another illustrative cost sharing
problem dating back from the early days in the
Introduction cost sharing literature (Littlechild and Owen
1973; Littlechild and Thompson 1977) deals
In many practical situations, managers or policy- with landing fee schedules at airports, so-called
makers deal with private or public enterprises with airport problems. These were often established to
multiple users. A production technology facili- cover the costs of building and maintaining the
tates its users, causing externalities that have to runways. The cost of a runway essentially
be shared. Applications are numerous, ranging depends on the size of the largest type of airplane
from environmental issues like pollution and fish- that has to be accommodated – a long runway can
ing grounds to sharing multipurpose reservoirs, be used by smaller types as well. Suppose there
road systems, communication networks, and the are m types of airplanes and that ci is the cost of
Internet. The essence in all these examples is that a constructing a landing strip suitable for type i.
manager cannot directly influence the behavior of Moreover, index the types from small to large so
the users but only indirectly by addressing the that 0 = c0 < c1 < c2. . .cm. In the above terminol-
externalities through some decentralization ogy, the technology can be described by
device. By choosing the right instrument, the X = {0, 1, 2, . . ., m}, and the cost function
manager may help to shape and control the nature c : XN ! ℝ+ is defined by c(x) = ck where
of the resulting individual and aggregate behavior. k = max {xi|i N} is the maximal service level
This is what is usually understood as the required in x. Suppose that in a given year, Nk is
the set landings of type k airplanes; then the set of edges toward . The edge left to the k-th node is
users of the runway is N = [kNk. The problem is called ek = (k 1, k), and the corresponding cost
now to apportion the full cost c(x) of the runway to is c(ek) = ck ck 1. The demand of an airplane at
the users in N, where x is the demand vector given node k is now described by the edges on the path
by xi = ‘ if i N‘. from node k to . ⊲
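The airport technology just described is simple to encode: a profile of required runway types costs as much as the largest type demanded. The cost figures in the sketch below are invented; only the structure of the cost function follows the text.

```python
# The airport cost function c(x) = c_k with k = max_i x_i, for illustrative
# runway costs c_0 < c_1 < c_2 < c_3 (the numbers are made up).

c_type = [0, 40, 65, 100]          # cost of a runway suitable for each type

def runway_cost(x):
    """x: landing -> required runway type; cost of the longest runway needed."""
    return c_type[max(x.values(), default=0)]

landings = {"a": 1, "b": 3, "c": 2, "d": 1}
print(runway_cost(landings))       # 100: type 3 is the largest type demanded
```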
Airport problems describe a wide range of cost
sharing problems, ranging from sharing the main- Example 4.5 In more general network design
tenance cost of a ditch system for irrigation pro- problems, a link facilitates a flow; for instance, in
jects (Aadland and Kolpin 1998) to sharing the telecommunication it is data flowing through the
dredging costs in harbors (Bergantino and network; in road systems it is traffic. Henriet and
Coppejans 1997). ⊲ Moulin (1996) discusses a model where a network
planner allocates the fixed cost of a network based
Example 4.3 A joint project involves a number on the individual demands being flows.
of activities for which the estimated durations and Matsubayashi et al. (2005) and Skorin-Kapov
precedence relations are known. Delay in each of and Skorin-Kapov (2005) discuss congested tele-
these components affects the period in which the communication networks, where the cost of a link
project can be realized. Then a cost sharing prob- depends on the size of the corresponding flow.
lem arises when the joint costs due to the accu- Then these positive network externalities lead to
mulated delay are shared among the individuals a concentration of flow and thus to hub-like net-
causing the delays. See Br^anzei et al. (2002). ⊲ works. Economies of scale require cooperation of
the users, and the problem now is to share the cost
Example 4.4 In many applications the produc- of these so-called hub-like networks. ⊲
tion technology is given by a network G = (V, E)
with nodes V and set of costly edges E V V
Example 4.6 As an insurance against the uncer-
and cost function c : E ! ℝ+. The demands of the
tainty of the future net worth of its constituents,
agents are now parts of the infrastructure, i.e.,
firms are often regulated to hold an amount of
subsets of E. Examples include the sharing the
riskless investments, i.e., its risk capital. Given
cost of infrastructure for supply of energy and
that returns of normal investments are higher, the
water,
difference with the riskless investments is consid-
For example, the above airport problem can be ered as a cost. The sum of the risk capitals of each
modeled as such with constituent is usually larger than the risk capital of
the firm as a whole, and the allocation problem is
V = {1, 2, . . . , m} ∪ { }, to apportion this diversification effect observed in
E = {( , 1), (1, 2), . . . , (m − 1, m)}.
risk measurements of financial portfolios. See
Denault (2001). ⊲
e1 e2 e3 em
1 2 3 m
budget-balancing condition. Central issue Secondly, there is the literature on cost sharing
addressed in the cost sharing literature is how to models where individual preferences are explicitly
determine the appropriate y. The vast majority of modeled and demands are elastic. The focus is on
the cost sharing literature is devoted to a mecha- noncooperative demand games in which the agents
nistic way of sharing joint costs; given a class of are assumed to choose their demands strategically
cost sharing problems P, a (simple) formula com- (see, e.g., Kolpin and Wilbur 2005; Moulin and
putes the vector of cost shares for each of its Shenker 1992; Watts 2002).
elements. This yields a cost sharing rule m : As an interested reader will soon find out, in
P ! ℝN where m(P) is the vector of cost shares the literature there is no shortage of plausible cost
for each P P . sharing techniques. Instead of presenting a kind of
At this point it should be clear to the reader that summary, this entry focuses on the most basic and
many formulas will see to a split of joint cost, and most interesting ones and in particular their prop-
heading for the solution to cost sharing problems is erties with respect to strategic interplay of the
therefore an ambiguous task. The least we want agents.
from a solution is that it is consistent with some
basic principles of fairness or justice and, more- Outline
over, that it creates the right incentives. Clearly, the The entry is organized as follows: section
desirability of solution varies with the context in “Cooperative Cost Games” discusses cost shar-
which it is used and so will the sense of appropri- ing problems from the perspective of coopera-
ateness. Moreover, the different parties involved in tive game theory. Basic concepts like core,
the decision-making process will typically hold Shapley value, nucleolus, and egalitarian solu-
different opinions; accountants, economists, pro- tion are treated. Section “Noncooperative Cost
duction managers, regulators, and others all are Games” introduces the basic concepts of nonco-
looking at the same institutional entity from differ- operative game theory including dominance
ent perspectives. The existing cost sharing litera- relations, preferences, and Nash equilibrium.
ture is about exploring boundaries of what can be Demand games and demand revelation games
thought of desirable features of cost sharing rules. are introduced for discrete technologies with
More important than the rules themselves are the concave cost function. This part is concluded
properties that each of them is consistent with. with two theorems, the strategic characteriza-
Instead of building a theory on single instances of tion of the Shapley value and constrained egal-
cost sharing problems, the cost sharing literature itarian solution as cost sharing solution,
discusses structural invariance properties over clas- respectively. Section “Continuous Cost Sharing
ses of problems. Here the main distinction is made Models” introduces the continuous production
on the basis of the topological properties of the model, and it consists of two parts. First the
technology, whether the cost sharing problem simple case of a production technology with
allows for a discrete or continuous formulation. homogeneous and perfectly divisible private
For each type of models, divisible or indivisible goods is treated. Prevailing cost sharing rules
goods, the state-of-the-art cost sharing literature like proportional, serial, and Shapley-Shubik
has developed into two main directions, based on are shortly introduced. We then give a well-
the way individual preferences over combinations known characterization of additive cost sharing
of cost shares and (levels of) service are treated. rules in terms of corresponding rationing
Firstly, there is a stream of research in which indi- methods and discuss the related cooperative
vidual preferences are not explicitly modeled and and strategic games. The second part is devoted
demands are considered to be inelastic. Roughly, it to the heterogeneous output model and famous
accommodates the large and vast growing axiom- solutions like Aumann-Shapley, Shapley-
atic literature (see, e.g., Moulin 2002; Sprumont Shubik, and serial rules. In section “Stochastic
and Moulin 2007) and the theory on cooperative Cost Sharing Models” the focus is on a rather
cost games (Peleg and Sudhölter 2004; Sudhölter new direction in the cost sharing literature, in
1998; Tijs and Driessen 1986; Young 1985c). which the determinants of the cost sharing
Cost Sharing in Production Economies, Fig. 2 Airport game [figure: players 1, 2, 3 on a chain of runway segments with costs 12, 8, 13]
problem are uncertain and modeled as stochastic Example 5.1 The following numerical example
variables. Two simple models are discussed in will be frequently referred to. An airport is visited
which the deterministic cost sharing model is by three airplanes in the set N = {1, 2, 3} which
generalized to a more realistic and practical one can be accommodated at cost c1 = 12, c2 = 20,
where both the outcomes of the production tech- and c3 = 33, respectively. The situation is
nology as well as the costs corresponding to depicted in Fig.2.
output levels are random variables. The final
The corresponding cost game c is determined by
section “Future Directions” is looking at future
associating each coalition S of airplanes to the min-
directions of research.
imum cost of the runway needed to accommodate
each of its members. Then the corresponding cost
game c is given by the table below. Slightly abusing
Cooperative Cost Games notation we denote c(i) to indicate c({i}), c(ij) for c
({i, j}), and so forth.
A discussion of cost sharing solutions and incen-
tives needs a proper framework wherein the
incentives are formalized. In the seminal work S ∅ 1 2 3 12 13 23 123
of von Neumann and Morgenstern (1944), the c(S) 0 12 20 33 20 33 33 33
notion of a cooperative game was introduced as
to model the interaction between actors and Note that, since we identified coalitions of
players who coordinate their strategies in order players in N with elements in 2N, we may write
to maximize joint profits. Shubik (1962) was one c to denote the cooperative cost game. By the
of the first to apply this theory in the cost sharing binary nature of the demands, the cost function
context. for the technology formally is a cooperative cost
game. For x = (1, 0, 1) the corresponding cost
Cooperative Cost Game game cx is specified by
A cooperative cost game among players in N is a
function c : 2N ! ℝ with the property that
c(∅) = 0; for non-empty sets S N, the value S ∅ 1 2 3 12 13 23 123
c(S) is interpreted as the cost that would arise cx (S) 0 12 0 33 12 33 33 33
should the individuals in S work together and
serve only their own purposes. The class of all Player 2 is a dummy player in this game; for all
cooperative cost games for N will be S N\{2}, it holds cx(S) = cx(S [ {2}). ⊲
denoted CG.
Any general class P of cost sharing problems Example 5.2 Consider the situation as depicted
can be embedded in CG as follows. For the cost in Fig. 3, where three players, each situated at a
sharing problem ðx, cÞ P among agents in N, different node, want to be connected to the special
define the stand-alone cost game cx CG by node using the indicated costly links. In order to
connect themselves to , a coalition S may use
cðxS , 0N∖S Þ if S N, S 6¼ ∅ only links with and the direct links between its
cx ðSÞ≔ ð1Þ
0 if S ¼ ∅: members and then only if the costs are paid for.
For instance, the minimum cost of connecting
So cx(S) can be interpreted as the cost of serv- player 1 in the left node to is 10, and the cost
ing only the agents in S. of connecting players 1 and 2 to is 18 – the cost
[Figs. 3 and 4: network diagrams with the edge costs used in Examples 5.2 and 5.3]
of the direct link from 2 and the indirect link larger coalitions is non-increasing, i.e., the tech-
between 1 and 2. Then the associated cost game nology exhibits positive externalities. Concave
is given by. games are also frequently found in the network
literature (see Moulin and Shenker 2001; Sharkey
S ∅ 1 2 3 12 13 23 123 1995; Koster et al. 2002; Maschler et al. 1996).
c(S) 0 10 10 10 18 20 19 27
Example 5.3 Although sub-additive, minimum
cost spanning tree games are not always concave.
Notice that in this case the network technology
Consider the following example due to Bird (1976).
exhibits positive externalities. The more players
The numbers next to the edges indicate the
want to be connected, the lower the per capita cost. ⊲
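The game of Example 5.2 is small enough to write down explicitly. The snippet below stores it as a table and checks whether a proposed division of c(N) = 27 is coalitionally stable, i.e., whether any coalition would pay less on its own; the allocation (9, 9, 9) tried first is only an illustration.

```python
# The minimum cost spanning network game of Example 5.2 and a core-style
# stability check for a proposed vector of cost shares.

from itertools import combinations

c = {(1,): 10, (2,): 10, (3,): 10, (1, 2): 18, (1, 3): 20, (2, 3): 19, (1, 2, 3): 27}

def is_stable(x):
    """x: player -> cost share, summing to c(N)."""
    assert abs(sum(x.values()) - c[(1, 2, 3)]) < 1e-9
    for size in (1, 2, 3):
        for S in combinations(sorted(x), size):
            if sum(x[i] for i in S) > c[S] + 1e-9:
                return False                     # coalition S would rather go alone
    return True

print(is_stable({1: 9, 2: 9, 3: 9}))     # True
print(is_stable({1: 10, 2: 10, 3: 7}))   # False: {1, 2} pays 20 > c({1, 2}) = 18
```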
corresponding cost. We assume a complete graph
For those applications where the cost c(S) can be
and that the invisible edges cost 4. Note that in this
determined irrespective of the actions taken by its
game, every three-player coalition is connected at
complement N\S, the interpretation of c implies sub-
cost 12, whereas c(34) = 16. Then c(1234) c(234)-
additivity, i.e., the property that for all S, T N with
= 16 12 = 4, whereas c(134) c(34) = 4. So
S \ T = ∅ implies c(S [ T) c(S) + c(T). This is,
the marginal cost of player 1 is not decreasing with
for instance, an essential feature of the technology
respect to larger coalitions (Fig. 4). ⊲
underlying natural monopolies (see, e.g., Baumol
et al. 1988; Sharkey 1982). Note that the cost
games in Examples 5.1 and 5.2 are sub-additive. Incentives in Cooperative Cost Games
This is a general property for airport games as well The objective in cooperative games is to share the
as minimum cost spanning tree games. profits or costs savings of cooperation. Similar to
Sometimes the benefits of cooperation are even the general framework, a vector of cost shares for
stronger. A game is called concave (or sub- a cost game c CG is a vector x ℝN such that
modular) if for all S, T N we have x(N) = c(N). The question is what cost share
vectors make sense if (coalitions of) players
cðS [ T Þ þ cðS \ T Þ cðSÞ þ cðT Þ: ð2Þ have the possibility to opt out thereby destroying
cooperation on a larger scale.
At first this seems a very abstract property, but In order to ensure that individual players join, a
one may show that it is equivalent with the fol- proposed allocation x should at least be individual
lowing, that rational so that xi c(i) for all i N. In that case
cðS [ figÞ cðSÞ cðT [ figÞ cðT Þ ð3Þ no player has a justified claim to reject x as pro-
posal, since going alone yields a higher cost. The
for all coalitions S T N\{i}. This means that set of all such elements is called the imputation
the marginal cost of a player i with respect to set. If, in a similar fashion, x(S) c(S) for all
S N, then x is called stable; under proposal x no The Separable Cost Remaining Benefit Solution
coalition S has a strong incentive to go alone, as it Common practice among civil engineers to allo-
is not possible to redistribute the cost shares after- cate costs of multipurpose reservoirs is the follow-
ward and make every defector better off. The core ing solution. The separable cost for each player
of a cost game c, notation core (c), consists of all (read purpose) i N is given by si = c(N) c(N
stable vectors of cost shares for c. If cooperation \{i}) and the remaining benefit by ri = c(i) si.
on a voluntary basis by the grand coalition N is The separable cost remaining benefit solution
conceived as a desirable feature, then the core and charges each player i for the separable cost si,
certainly the imputation set impose reasonable and the non-separable costs c(N) j Nsj are
conditions for reaching it. Nevertheless, the core then allocated in proportion to the remaining ben-
of a game can be empty. efits ri, leading to the formula
Call a collection B of coalitions balanced if " #
r X
i
there is a vector of positive weights ðlS ÞS B such SCRBi ðcÞ ¼ si þ P c ðN Þ sj : ð4Þ
that for all i N j N rj jN
X
lS ¼ 1: In this formula it is assumed that c is at least sub-
S B, S∍i additive to ensure that the ri’s are all positive. For the
two-player game c in Example 5.5, the solution is
A cost game c is balanced if it holds for each given by SCRBðcÞ ¼ 2 þ 2 ð10 9Þ, 7þ 12
1
balanced collection B of coalition that
ð10 9ÞÞ ¼ 2 12, 7 12 . In earlier days the solution
X was well known as “the alternate cost avoided
lS cðSÞ cðN Þ: method” or “alternative justifiable expenditure
SB
method.” For references, see Young (1985b).
It is the celebrated theorem below which char-
acterizes all games with non-empty cores. Shapley Value
One of the most popular and oldest solution con-
Theorem 5.4 Bondareva-Shapley (Bondareva cepts in the literature on cooperative games is due
1963; Shapley 1967) to Shapley (1953), named Shapley value. Roughly
The cost game c is balanced if and only if the it measures the average marginal impact of
core of c is non-empty. players. Consider an ordering of the players
s : N ! N so that s(i) indicates the i-th player
Concave cost games are balanced (see Shapley in the order. Let s(i) be the set of the first
1971). Concavity is not a necessary condition for i players according to s, so s(1) = {s(1)},
non-emptiness of the core, since minimum cost s(2) = {s(1), s(2)}, etc. The marginal cost
spanning tree games are balanced as well. share vector ms(c) ℝN is defined by
mssð1Þ ð1Þ ¼ cðsð1ÞÞ, and for i = 2, 3,. . ., n
Example 5.5 Consider the two-player game
c defined by c(12) = 10, c(1) = 3, and c(2) = 8. mssðiÞ ðcÞ ¼ cðs ðiÞÞ cðs ði 1ÞÞ: ð5Þ
Then core(c) = {(x, 10 x)|2 x 3}. Note that,
So according to ms, each player is charged
opposed to the general case, for two-player
with the increase in costs when joining the coali-
games, sub-additivity is equivalent with non-
tion of players before her. Then the Shapley value
emptiness of the core. ⊲
for c is defined as the average of all n! marginal
vectors, i.e.,
Cooperative Solutions
1X s
A solution on a subclass A of CG is a mapping FðcÞ ¼ m ðcÞ: ð6Þ
n! s
m : A ! ℝN that assigns to each c A a vector
of cost shares m(c); mi(c) stands for the charge Example 5.6 Consider the airport game in Exam-
to player i. ple 5.1. Then the marginal vectors are given by
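As a quick illustration of formulas (5) and (6), the following brute-force sketch averages the marginal vectors over all orderings; applied to the airport game of Example 5.1 it returns the Shapley value (4, 8, 21). The encoding of the game as a small Python function is our own.

```python
# Shapley value by direct enumeration of orderings (Eqs. (5)-(6)); fine for
# the three-player airport game of Example 5.1.

from itertools import permutations

def shapley_value(players, cost):
    """cost maps a frozenset of players to its stand-alone cost."""
    total = {i: 0 for i in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for i in order:
            total[i] += cost(coalition | {i}) - cost(coalition)   # marginal cost, Eq. (5)
            coalition |= {i}
    return {i: total[i] / len(orders) for i in players}            # average, Eq. (6)

airport = {1: 12, 2: 20, 3: 33}                            # stand-alone runway costs
cost = lambda S: max((airport[i] for i in S), default=0)   # a runway for the largest type serves all
print(shapley_value([1, 2, 3], cost))                      # {1: 4.0, 2: 8.0, 3: 21.0}
```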
possible coalitions increases. Then it is more contributions axiom states that the gains and/or
than reasonable that this division should not be losses by other player’s withdrawal from the coa-
penalized. In a broader context, this envisions the lition should be the same.
idea that each player in the cost game should be
credited with the merits of “uniform” technolog- Theorem 5.9 Myerson (1980)
ical advances. There is a unique solution on CG that satisfies
the balanced contributions axiom, and that is F.
Strong Monotonicity
Solution m is strongly monotonic if for any two The balanced contribution property can be
cost games c, c it holds for all i N that interpreted in a bargaining context as well. In the
cðS [ figÞ cðSÞ cðS [ figÞ for all S N\{i} game c and with solution m, a player i can object
implies mi ðcÞ mi ðcÞ. against player j to the solution m(c) when the cost
Anonymity is the classic property for solutions share for j increases when i steps out of the coop-
declaring independence of solution with respect to eration, i.e., mj(N, c) mj(N\{i}). In turn, a coun-
the name of the actors in the cost sharing problem. ter objection by player j to this objection is an
See, e.g., Albizuri and Zarzuelo (2007), Moulin assertion that player i would suffer more when
and Shenker (1992), and Pérez-Castrillo and j ends cooperation, i.e., mj(N, c) mj(N\{i})
Wettstein (2006). Formally, the definition is as mi(N, c) mi(N\{j}). The balanced contribution
follows. For a given permutation p : N ! N and property is equivalent to the requirement that each
c CG, define pc CG by pc(S) = c(p(S)) for objection is balanced by a counter objection. For
all S N. an excellent overview of ideas developed in this
spirit, see Maschler (1992).
Anonymity Another marginalistic approach is by Hart and
Solution m is anonymous if for all permutations p Mas-Colell (1989). Denote for c CG the game
of N, and all i N, mp(i)(pc) = mi(c) for all cost restricted to the players in S N by (S, c). Given a
games c. function P : CG ! ℝ which associates a real
number P (N, c) to each cost game c with player
Theorem 5.8 Young (1985b) set N, the marginal cost of a player i is defined to
The Shapley value is the unique anonymous be DiP(c) = P(N, c) P(N\{i}, c). Such a func-
and strongly monotonic solution. tion P with P(∅, c) = 0 is called potential if
i NDiP(N, c) = c(N).
Myerson (1980) introduced the balanced con-
tributions axiom for the model of nontransferable Theorem 5.10 Hart and Mas-Colell (1989)
utility games, or games without side payments (see There exists a unique potential function P, and
Shapley 1969). Within the present context of CG, a for every c CG, the resulting payoff vector DP
solution m satisfies the balanced contributions (N, c) coincides with F(c).
axiom, if for any cost game c and for any non-
empty subset S N, {i, j} S N it holds that
Egalitarian Solution
The Shapley value is one of the first solution
mi ðS, cÞ mi ðS∖fjg, cÞ
concepts proposed within the framework of coop-
¼ mj ðS, cÞ mj ðS∖fig, cÞ: ð8Þ
erative cost games, but not the most trivial. This
would be to neglect all asymmetries between the
The underlying idea is the following. Suppose players and split total costs equally between them.
that players agree on using solution m and that But as one can expect, egalitarianism in this pure
coalition S forms. Then mi(S, c) mi(S\{j}, c) is form will not lead to a stable allocation. Just
the amount player i gains or loses when S is consider the two-player game in Example 5.5
already formed and player j resigns. The balanced where pure egalitarianism would dictate the
allocation (5, 5), which violates individual ratio- solution. The original idea of constrained egalitar-
nality for player 1. ianism as in Dutta and Ray (1989) focuses on the
In order to avoid these problems, of course we Lorenz core instead of the core. It is shown that
can propose to look for the most egalitarian alloca- there is at most one such allocation that may exist
tion within the core (see Arin and Iñarra 2001; Dutta even when the core of the underlying game is
and Ray 1991). Then in this line of thinking, what is empty.
needed in Example 5.5 is a minimal transfer of cost For concave cost games c, the allocation is
2 to player 2, leading to the final allocation (3, 7) – well-defined and denoted mE(c). In particular this
the constrained egalitarian solution. Although in holds for airport games. Intriguingly, empirical
the former example it was clear what allocation to studies (Aadland and Kolpin 1998, 2004) show
choose, in general we need a tool to evaluate allo- there is a tradition in using the solution for this
cations for the degree of egalitarianism. The earlier type of problems.
mentioned papers all suggest the use of Lorenz For concave cost games c, there exists an algo-
order (see, e.g., Atkinson 1970). More precisely, rithm to compute mE(c). This method, due to Dutta
consider two vectors of cost shares x and x0 such and Ray (1989), performs the following consecu-
that x(N) = x0(N). Assume that these vectors are tive steps. First determine the maximal set S1 of
ordered in decreasing order so that x1 x2 . . . xn players minimizing the per capita cost c(S)/|S|,
and x01 x02 . . . x0n. Then x Lorenz-dominates where |S| is the size of the coalition S. Then each
x0 – read x is more egalitarian than x0 – if for all of these players in S1 pays c(S1)/ |S1|. In the next
k = 1, . . ., n 1 it holds that step, determine the maximal set S2 of players in
N\S1 minimizing c2(S)/|S|, where c2 is the cost
X
k X
k game defined by c2(S) = c(S1[S) c(S1). The
xi x0i , ð9Þ players in S2 pay c2(S2)/|S2| each. Continue in this
i¼1 i¼1
way just as long as not everybody is allocated a
cost share. Then in at most n steps, this procedure
with at least one strict inequality. That is, x is
results in an allocation of total cost, the
better for those paying the most.
constrained egalitarian solution. In short the algo-
rithm is as follows:
Example 5.11 Consider the three allocations of
cost 15 among three players x = (6, 5, 4),
x0 = (6, 6, 3), and x00 = (7, 4, 4). Firstly, • Stage 0: Initialization, put S0 ¼ ∅, x ¼ 0N ,
x Lorenz-dominates x00 since x1 ¼ 6 < 7 ¼ x001 and go to stage t = 1.
and x1 þ x2 ¼ x01 þ x02 . Secondly, x Lorenz- • Stage t: Determine
dominates x0 since x1 ¼ x01 , x1 þ x2 < x01 þ x02 .
Notice, however, that on the basis of only (9),
we cannot make any judgment what is the more

St ∈ arg min over S ≠ ∅ of [c(S ∪ S̄t−1) − c(S̄t−1)] / |S|.
egalitarian of the allocations x0 and x00 . Since we
have x01 ¼ 6 < 7 ¼ x001 but x01 þ x02 ¼ 6 þ 6 > Put St ¼ St1 [ St and for i St,
7 þ 4 ¼ x001 þ x002 . The Lorenz order is only a
partial order. ⊲ c St c St1
xi ≔ :
j St j
The constrained egalitarian solution is the set If St ¼ N, we are finished; put mE(c) = x. Else
of Lorenz-undominated allocations in the core of
a game. Due to the partial nature of the Lorenz repeat the stage with t ≔ t + 1.
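The staged procedure above can be transcribed directly. The sketch below picks, at each stage, the largest coalition with the smallest per-capita incremental cost, and reproduces mE(c) = (10, 10, 13) for the airport game of Example 5.1; the brute-force enumeration of coalitions is only meant for small examples.

```python
# A direct transcription of the staged procedure (Dutta and Ray 1989) for a
# concave cost game: repeatedly charge the largest coalition attaining the
# smallest per-capita incremental cost.

from itertools import combinations

def constrained_egalitarian(players, cost):
    """cost maps a frozenset to its stand-alone cost; assumes the game is concave."""
    shares, assigned = {}, frozenset()
    while set(players) - set(assigned):
        rest = [i for i in players if i not in assigned]
        candidates = [frozenset(S) for r in range(1, len(rest) + 1)
                      for S in combinations(rest, r)]
        per_capita = lambda S: (cost(assigned | S) - cost(assigned)) / len(S)
        best = min(per_capita(S) for S in candidates)
        S_t = max((S for S in candidates if per_capita(S) == best), key=len)
        for i in S_t:                      # everyone in S_t pays the same per-capita amount
            shares[i] = per_capita(S_t)
        assigned |= S_t
    return shares

airport = {1: 12, 2: 20, 3: 33}
cost = lambda S: max((airport[i] for i in S), default=0)
print(sorted(constrained_egalitarian([1, 2, 3], cost).items()))
# [(1, 10.0), (2, 10.0), (3, 13.0)]
```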
order, there may be more than one Lorenz-
undominated element in the core. And what if For example, this algorithm can be used to
the core is empty? The constrained egalitarian calculate the constrained egalitarian solution for
solution is obviously not a straightforward the airport game in Example 5.1. In the first step
we determine S1 = {1, 2}, together with cost dissatisfaction of S under proposal x. Arrange the
shares 10 for the players 1 and 2. Player 3 is excesses of all coalitions S 6¼ N, ∅ in decreasing
n
allocated the remaining cost in the next step; order and call the resulting vector #ðxÞ ℝ2 2 .
hence the corresponding final allocation is A vector of cost shares x will be preferred to a
mE(c) = (10, 10, 13). vector y, notation x y, whenever #(x) is smaller
than #(y) in the lexicographic order, i.e., there
Example 5.12 Consider the case where six exists i such that for i i 1, it holds
players share the cost of the following tree net- #i(x) = #i(y) and #i ðxÞ ¼ #i ðyÞ . Schmeidler
work that connects them to . The standard fixed (1969) showed that in the set of individual ratio-
tree game c for this network associates to each nal cost sharing vectors, there is a unique element
coalition of players the minimum cost of that is maximal with respect to , which is called
connecting each member to , where it may use the nucleolus. This allocation, denoted by n(c), is
all the links of the tree. This type of games is based on the idea of egalitarianism that the larg-
known to be concave; we can use the above algo- est complaints of coalitions should consistently
rithm to calculate mE(c). In the first step, we deter- be minimized. The concept gained much popu-
mine S1 = {1, 3, 4} and each herein pays 8. Then larity as a core selector, i.e., it is a one-point
in the second step, the game remains where the solution contained in the core when it is non-
edges e1, e3, and e4 connecting S1 have been paid empty. This contrasts with the constrained egal-
for. Then it easily follows that S2 = {2, 5}, so that itarian solution which might not be well-defined
players 2 and 5 pay 9 each, leaving 10 as cost and the Shapley value which may lay outside
share for player 6. Thus, we find mE(c) = (8, 9, 8, the core.
8, 9, 10) (Fig. 5). ⊲
Example 5.13 Consider in Example 5.1 the
Nucleolus excesses of the different coalitions with respect to
Given a cost game c CG the excess of a the constrained egalitarian solution mE(c) = (10,
coalition S N with respect to a vector x ℝN 10, 13) and the nucleolus n(c) = (6, 7, 20):
is defined as e(S, x) = x(S) c(S); it measures
S 1 2 3 12 13 23
E
e(S, μ (c)) -2 10 -20 0 -10 -10
5 6 e(S, ν(c)) -6 -13 -13 -7 -7 -6
12
The nucleolus of standard fixed tree games
may be calculated as a particular home-down allo-
cation, as was pointed out by Maschler et al.
(1996). ⊲
Cost Sharing in Production Economies, For standard fixed tree games and minimum
Fig. 5 Standard fixed tree cost spanning tree games, the special structure of
432 Cost Sharing in Production Economies
the technology makes it possible to calculate the theoretical rules according to Sudhölter (1998),
nucleolus in polynomial time, i.e., with a number will be most useful below.
of calculations bounded by a multiple of n2 (see
Granot and Huberman 1984). Sometimes one may
even express the nucleolus through a nice for- Noncooperative Cost Games
mula; Legros (1986) showed a class of cost shar-
ing problems for which the nucleolus equals the Formulating the cost sharing problem through a
SCRB solution. But in general calculations are cooperative cost game assumes inelastic demands
hard and involve solving a linear program with a of the players. It might well be that for some
number of inequalities which is exponential in n. player, the private merits of service do not out-
Skorin-Kapov and Skorin-Kapov (2005) suggests weigh the cost share that is calculated by the
to use the nucleolus on the cost game planner. She will try to block the payment when
corresponding to hub games. no service at no cost is a preferred outcome.
Instead of the direct comparison of excesses like Another aspect is that the technology may operate
above, the literature also discusses weighted at a sub-optimal level if benefits of delivered
excesses as to model the asymmetries of justifiable services are not taken into account. Below the
complaints within coalitions. For instance, the per focus is on a broader framework with elastic
capita nucleolus minimizes maximal excesses demands, which incorporates preferences of a
which are divided by the number of players in the player are defined over combinations of service
coalition (see Peleg and Sudhölter 2004). levels and cost shares. The theory of noncoopera-
tive games will provide a proper framework in
Cost Sharing Rules Induced by Solutions which we can discuss individual aspirations and
Most of the above numerical examples deal with efficiency of outcomes on a larger scale.
cost sharing problems which have a natural and
intuitive representation as a cost game. Then basi- Strategic Demand Games
cally on such domains of cost sharing problems, At the heart of this noncooperative theory is the
there is no difference between cost sharing rules notion of a strategic game, which models an inter-
and solutions. It may seem that the cooperative active decision-making process among a group of
solutions are restricted to this kind of situations. players whose decisions may impact the conse-
But recall that each cost sharing problem (x, c) is quences for others. Simultaneously, each player
associated its stand-alone cost game cx CG, as i independently chooses some available action ai,
in (1). Now let m be a solution on a subclass of and the so-realized action profile a = (a1, a2, . . .,
A CG and B a class of cost sharing problems (x, an) is associated with some consequence f(a).
c) for which cx A. Then a cost sharing rule m Below we will have in mind demands or offered
(Fig. 6) is defined on B through contributions as actions, and consequences are
combinations of service levels with cost shares.
Throughout we will assume that players have The literature discusses several refinements of
preferences over the different consequences of this equilibrium concept. One that will play a role
action. Moreover, such preference relation can in the games below is that of strong Nash equilib-
be expressed by a utility function ui : C ! ℝ rium due to Aumann (1959); it is a Nash equilib-
such that for z, z0 C it holds ui(z) ui(z0) if rium a in a strategic game G such that for all SN
agent i weakly prefers z0 to z. Below the set of and action profile aS, there exists a player i S
consequences for agent i N will consist of pairs such that ui aS , aN∖S ui ða Þ. This means that
(x, y) where x is the level of service and y a cost
share, so that utilities are specified through multi- a strong Nash equilibrium guarantees stability
variable functions, (x, y) 7! ui(x, y). against coordinated deviations, since within the
deviating coalition there is at least one agent who
does not strictly improve.
Preferences over Action Profiles
In turn define for each agent i and all a A,
Example 6.1 Consider the following two-player
Ui(a) = ui(f(a)); then Ui assigns to each action
strategic game with N = {1, 2}, A1 = {T, B} and
profile the utility of its consequence. We will say
A2 = {L, M, R}. Let the utilities be as in the table
that the action profile a0 is weakly preferred to
below
a by agent i if Ui(a) Ui(a0 ); Ui is called agent i’s
utility function over action profiles.
L M R
T 5,4 2,1 3,2
B 4,3 5,2 2,5
Strategic Game and Nash Equilibrium
A strategic game is an ordered triple G = hN,
Here player 1 chooses a row and player 2 a
(Ai)i N, (Ui)i Ni where
column. The numbers in the cells summarize the
individual utilities corresponding to the action
• N = {1, 2, . . ., n} is the set of players.
profiles; the first number is the utility of player
• Ai is the set of available actions for player i.
1, the second that of player 2. In this game there is
• Ui is player i’s utility function over action
a unique Nash equilibrium, which is the action
profiles.
profile (T, L). ⊲
Example 6.2 In Example 6.1 action M of player i’s utility at receiving service level qi and cost
2 is strictly dominated by L and R. Player 1 has no share xi; ui is increasing in the level of received
dominated actions. Now eliminate M from the service xi and decreasing in the allocated cost yi.
actions for player 2. Then the reduced game is Now assume a cost function c and a cost sharing
rule m. Then given a demand profile a = (a1,
a2, . . .an), the cost sharing rule determines a vec-
L R tor of cost shares m(a, c) and in return also the
T 5,4 3,2 corresponding utilities over demands Ui(a) =
B 4,3 2,5
ui(ai, mi(a, c)). Observe that agents influence
each other’s utility via the cost component. The
Notice that action L for player 1 was not
demand game for this situation is then the strate-
dominated in the original game, for the reason
gic game
that B was the better of the two actions against
M. But if M is never played, T is strictly better
u1 ðq1 , x1 Þ ¼ 8q1 x1 ,
Then O 1 is the set of all action profiles surviving the
u2 ðq2 , x2 Þ ¼ 6q2 x2 ,
successive elimination of overwhelmed actions.
This notion is due to Friedman and Shenker u3 ðq3 , x3 Þ ¼ 30q3 x3 :
(1998) and Friedman (2002). In Example 6.1 the
action M is overwhelmed by L, not by R. Moreover,
the remaining actions in O 1 are B, T, L, and R. Here qi takes values 0 (no service) or 1 (service)
and xi stands for the allocated cost. So player 1 pre-
Demand Games fers to be served at unit cost instead of not being
Strategic games in cost sharing problems arise served at zero cost, u1(0, 0) = 0 < 7 = u1(1, 1). The
when we assume that the users of the production infrastructure is seen as an excludable public good,
technology choose their demands strategically so those with demand 0 do not get access to the
and a cost sharing rule sees to an allocation of technology. Each player now actively chooses to be
the corresponding service costs. The action pro- served or not, so her action set is specified by
files Ai are simply specified by the demand spaces Ai = {0, 1}. Recall the definition of cx as in (1).
of the agents, and utilities are specified over com- Then given a profile of such actions a = (a1, a2, a3)
binations of (level of) received service and accom- and cost shares Fða, cÞ, utilities of the players in
panying cost shares. Hence utilities are defined terms of action profiles become Ui(a) = ui(ai,
over consequences of action, and ui(qi, xi) denotes Fi(a, c)), so that
Cost Sharing in Production Economies 435
determine a suitable service level on the basis of shares as in Example 5.1. Moreover assume the
these labeled contributions (see Koster et al. 2003; planner uses the Shapley cost sharing rule F as
Young 1998). in Example 6.3and that she receives the true
Many mechanisms come to mind, but in order profile of preferences from the players,
to avoid too much arbitrariness from the planner’s a = (8, 6, 30). Calculate for each coalition
side, the more sensible ones will grant the players S the net benefits at a:
some control over the outcomes. We postulate the
following properties: S ∅ 1 2 3 12 13 23 N
π(S, α) 0 -4 -14 -3 -6 8 3 11
• Voluntary participation (VP). Each agent
i can guarantee herself the welfare level
ui(0, 0) (no service, no payment) by reporting
Not surprisingly, the net benefits are the
truthfully the maximal willingness to pay,
highest for the grand coalition. But if N were
which is ai under (12).
selected by the mechanism, the corresponding
• Consumer sovereignty (CS). For each agent i,
cost shares are given by Fð1N , cÞ ¼ ð4, 8, 21Þ ,
a report yi exists so that she receives service,
and player 2 is supposed to contribute more than
irrespective of the reports by others.
she is willing to. Then the second highest net
benefits are generated by serving S = {1, 3},
Now suppose that the planner receives the mes- with cost shares Fð1S , cÞ ¼ ð6, 0, 27Þ . Then {1,
sage = a, known to her as the profile of true player 3} is the solution to (13).
characteristics. Then for economic reasons she could What happens if some of the players misrepre-
choose to serve a coalition S of players that maxi- sent their preferences, for instance, like in
mizes the net benefit at a, p(S, a) = a(S) c(S). = (13, 6, 20)? The planner determines the con-
However, problems will arise when some player i is ceived net benefits
supposed to pay more than ai, so the planner should
be more careful than that. She may want to choose a
S ∅ 1 2 3 12 13 23 N
coalition S with maximal p(S, a) such that m-
(1S, c) a holds; such set S is called efficient. But π(S, η) 0 1 -14 -13 -1 0 -7 6
in general the planner cannot tell whether the players
reported truthfully or not; what should she do then? Again, if the planner served the coalition with the
One option is that she applies the above procedure highest net benefit, N, then player 2 would refuse to
thereby naively holding each reported profile for pay. Second highest net benefit corresponds to the
the true player characteristics. In other words, she singleton S = {1}, and this player will get service
will pick a coalition that solves the following opti- under M F since 3 = 13 > 12 = c(1, 0, 0). ⊲
mization problem
Example 6.4 Consider the airport problem and Here only the empty coalition S = ∅ satisfies
utilities of players over service levels and cost the requirement mE ð1S , cÞ ¼ ð0, 0, 0Þ ð8, 6, 30Þ;
Cost Sharing in Production Economies 437
simultaneously and independently decide upon that is defined through the optimization problem
requesting service or not, and costs are shared (13) will not implement an efficient coalition of
using the rule m among those agents receiving served players, due to the extra constraint on the
service. Suppose that for each profile of utility cost shares.
functions as in (12) the resulting game G(m, c) For instance, in Example 5.1, the value of the
has a unique (strong) Nash equilibrium. Then this grand coalition at a = (8, 6, 30) is given by v-
equilibrium can be taken to define a mechanism. (N, a) = a(N) c(N) = 44 33 = 11. At the
That is, the mechanism elicits u and chooses the same profile,
the implemented outcome by mecha-
unique equilibrium outcome of the reported nism M F gives rise to a total surplus of 38–30 = 8
demand game. Then this mechanism is equivalent for the grand coalition – which is not optimal. The
with the demand revelation mechanism. Observe mechanism MðmE Þ performs even worse as it leads
that indeed
the strong equilibrium (1, 0, 1) in the to the stand-alone surplus 0; none is served.
game G F, c in Example 6.3 corresponds to This observation holds for far more general
players chosen by M F under truthful reporting. settings, and, moreover, it is a well-known result
And where no player is served in the strong equi- from implementation theory that – under non-
librium of GðmE , cÞ, none of the players is selected constant marginal cost – any strategy-proof mech-
by MðmE Þ. It is a general result in implementation anism based on full coverage of total costs will not
theory due to Dasgupta et al. (1979) that a direct always implement efficient outcomes. For the con-
mechanism constructed in this way is (group) stant marginal cost case, see Leroux (2004) and
strategy-proof provided the underlying space of Maniquet and Sprumont (1999). Then, if there is
preferences is rich. Bochet and Klaus (2007) an unavoidable loss in using demand revelation
shows that for the general result, richness in the mechanisms, can we still tell which mechanisms
sense of Maskin and Sjöström (2002) is needed are more efficient? Is it a coincidence that in the
opposed to the version in Dasgupta et al. (1979). It above examples the Shapley value performs better
is easily seen that the above sets of preferences than the egalitarian solution?
meet the requirements. To stress importance of The welfare loss due to M(m) at a profile of true
such a structural property as richness, it is instruc- preferences a is given by
tive to point at what is yet to come in section
“Uniqueness of Nash Equilibria in P1-Demand Lðm, aÞ ¼ vðN, aÞ faðSðm, aÞÞ cðSÞg: ð15Þ
Games.” Here, the strategic analysis of the demand
game induced by the proportional rule shows
For instance, with a = (8, 6, 30)
in the above
uniqueness of Nash equilibrium on the domain of
examples, we calculate L F, a ¼ 11 8 ¼ 3
preferences ℒ if only costs are convex. However,
and LðmE , aÞ ¼ 11 0 ¼ 11. An overall measure
this domain is not rich, and the direct mechanism
of quality of a cost sharing rule m in terms of
defined in the same fashion as above by the Nash
efficiency loss is defined by
equilibrium selection is not strategy-proof.
Efficiency and Strategy-Proof Cost Sharing
gðmÞ ¼ supLðm, aÞ: ð16Þ
Mechanisms a
Suppose cardinal utility for each agent, so that
intercomparison of utility is allowed. Proceeding Theorem 6.7 Moulin and Shenker (2001)
on the net benefit of a coalition, we may define its Among all mechanisms M(m) derived from
value at a by cross-monotonic cost sharing rules m, the Shapley
rule F has the
unique smallest maximal efficiency
vðN, aÞ ¼ max pðS, aÞ, ð14Þ loss, or g F < gðmÞ if m 6¼ F.
SN
where p(S, a) is the net benefit of S at a. Notice that this makes a strong case for the
A coalition S such that v(N, a) = p(S, a) is called Shapley value against the egalitarian solution.
efficient. It will be clear that a mechanism M(m) The story does however not end here.
Cost Sharing in Production Economies 439
Mutuswami (2004) considers a model where given a profile of demands, the cost associated
the valuations of the agents for the good are inde- with the joint production must be shared by the
pendent random variables, drawn from a distribu- users. Then this model generalizes the binary
tion function F satisfying the monotone hazard good model discusses so far, and it is a launch
condition. This means that the function defined by pad to the continuous framework in the next
h(x) = f(x)/(1 F(x)) is non-decreasing, where section. In this discrete good setting, Moulin
F has f as density function. It is shown that the (1999) characterizes the cost sharing rules
constrained egalitarian solution maximizes the which induce strategy-proof social choice func-
probability that all members of any given coali- tions defined by the equilibria of the
tion accept the cost shares imputed to them. More- corresponding demand game. As it turns out,
over, Mutuswami (2004) characterized the these rules are basically the sequential stand-
solution in terms of efficiency. Suppose for the alone rules according to which costs are shared
moment that the planner calculated cost share in an incremental fashion with respect to a fixed
vector x for the coalition S and that its members ordering of the agents. This means that such a
are served conditional on acceptance of the pro- rule charges the first agent for her stand-alone
posed cost shares. The probability that all mem- costs, the second for the stand-alone cost for the
bers of S accept the shares is given by first two users minus the stand-alone cost of the
P(x) = ∏i S(1 F(xi)), and if we assume that first, etc. Here the word “basically” refers to all
the support of F is (0, m), then the expected discrete cost sharing problems other than those
surplus from such an offer can be calculated as with binary goods. Then here the sufficient con-
follows: dition for strategy-proofness is that the underly-
ing cost sharing rule be cross-monotonic, which
W ð x Þ ¼ Pð x Þ admits other rules than the sequential ones – like
" ð # mE and F.
X m ui
d Fð u i Þ c ð S Þ :
i S xi
1 Fð x i Þ
ð17Þ
Continuous Cost Sharing Models
The finding of Mutuswami (2004) is that for Continuous Homogeneous Output Model, P 1
log-concave f, i.e., x 7! ln (f(x)) is concave This model deals with production technologies
(An 1998), the mechanism based on the for one single perfectly divisible output com-
constrained egalitarian solution not only maxi- modity. Moreover, we will restrict ourselves to
mizes the probability that a coalition accepts the private goods. Many ideas below have been
proposal, but it maximizes its expected surplus as studied for public goods as well; for further
well. Formally, references, see, e.g., Fleurbaey and Sprumont
(2009), Maniquet and Sprumont (2004), and
Theorem 6.8 (Mutuswami 2004) Moulin (1994).
If the profile of valuations (ui)i N are inde- The demand space of an individual is given by
pendently drawn form a common distribution X = ℝ+. The technology is described by a non-
function F with log-concave and differentiable decreasing cost function c : ℝ+ ! ℝ such that
density function f, then W ðmE ð1S , cÞÞ c(0) = 0, i.e., there are no fixed costs. Given a
W ðmð1S , cÞÞ for all cross-monotonic solutions m profile of demands x ℝNþ , costs c(x(N)) have to
and all SN. be shared. Moreover, the space of cost functions
will be restricted to those c being absolutely con-
tinuous. Examples include the differentiable and
Extension of the Model: Discrete Goods Lipschitz continuous functions. Absolute continu-
Suppose the agents consume idiosyncratic ity implies that aggregate costs for production can
goods produced in indivisible units. Then be calculated by the total of marginal costs
440 Cost Sharing in Production Economies
ðy
the agents by increasing demands such that
c ðy Þ ¼ c0 ðtÞdt:
0
x1 x2 . . . xn. The intermediate pro-
duction levels are
Denote the set of all such cost functions by C 1
and the related cost sharing problems by P 1 .
Several cost sharing rules on P 1 have been pro- y1 ¼ nx1 , y2 ¼ x1 þ ðn 1Þx2 , . . . , yk
posed in the literature. X
k1
¼ xj þ ðn k þ 1Þxk , . . . , yn ¼ xðN Þ:
Average Cost Sharing Rule j¼1
x3
x2
y 1 −→
x1 y 2 −→ +
y 3 −→ + +
1 2 3
y0 = 0, y1 = 30, y2 = 50, and y3 = 60. Then the each agent contributes to the negative externality.
cost shares are calculated as follows: It seems fairly reasonable to demand a nonnega-
tive contribution in those cases, so that none
cð y1 Þ cð y0 Þ
1 ðx,cÞ ¼
mSR ¼ 150, profits for just being there. The mainstream cost
3 sharing literature includes positivity of cost shares
cðy2 Þ cðy1 Þ into the very definition of a cost sharing rule. Here
mSR
2 ðx,c Þ ¼ m SR
1 ð x,c Þ þ
2 we will add it as a specific property:
1250 450
¼ 150 þ ¼ 550,
2 Positivity m is positive if m(x, c) 0N for all (x, c)
3 ðx,cÞ ¼ m2 ðx,cÞ þ c y
mSR c y2
SR 3
in its domain. All earlier discussed cost sharing
¼ 550 þ 550 ¼ 1100:
⊲ rules have this property, except for the decreasing
The serial rule has attracted much attention serial rule.
lately in the network literature and found its way
in fair queuing packet scheduling algorithms in The decreasing serial rule is far more intuitive
routers (Demers et al. 1990). in case of economies of scale, in the presence of a
concave cost function. The larger agents now are
Decreasing Serial Rule credited with a lower price per unit of the output
de Frutos (1998) proposes serial cost shares good. Hougaard and Thorlund-Petersen (2001)
where demands of agents are put in decreasing and Koster (2002) propose variations on the
order. Resulting is the decreasing serial rule. serial rule that coincide with the increasing
Consider a demand vector x ℝNþ such that (decreasing) serial rule in case of a convex
x1 x2 . . .xn. Define recursively the numbers (concave) cost function, meeting the positivity
y‘ for ‘ = 1, 2, . . ., n by y‘ = ‘x‘ + x‘ + 1 +
+ xn, requirement.
and put yn + 1 = 0. Then the decreasing serial rule
is defined by Marginal Pricing Rule
X n
c y‘ c y‘þ1 A popular way of pricing an output of a production
mi ðx, cÞ ¼
DSR
: ð19Þ facility is marginal cost pricing. The price of the
‘
‘¼i output good is set to cover the cost producing one
extra unit. It is frequently used in the domain of
Example 7.2 For the cost sharing problem in public services and utilities. However, a problem is
Example 7.1, calculate y1 = 90, y2 = 70, that for concave cost functions, the method leads to
y1 = 60, and then budget deficits. An adapted form of marginal cost
pricing splits these deficits equally over the agents.
cðy3 Þ cðy4 Þ 4050 0 The marginal pricing rule is defined by
mDSR
3 ðx, cÞ ¼ ¼ ¼ 1350,
3 3
2 3
cðy Þ cðy Þ
mDSR
2 ðx, cÞ ¼ mDSR
3 ðx, cÞ þ 0 1
2 i ðx, cÞ ¼ xi c ðxðN ÞÞ þ
mMP
2450 4050 n
¼ 1350 þ ¼ 550, ½cðxðN ÞÞ xðN Þc0 ðxðN ÞÞ: ð20Þ
2 1
m1 ðx, cÞ ¼ m2 ðx, cÞ þ c y c y2
DSR DSR
¼ 550 þ ð1800 2450Þ ¼ 100: Note that in case of convex cost functions,
agents can receive negative cost shares, just like
⊲ it is the case with decreasing serial cost sharing.
Notice that here the cost share of agent 1 is
negative, due to the convexity of c! This may be
considered as an undesirable feature of the cost Additive Cost Sharing and Rationing
sharing rule. Not only are costs increasing in the The above cost sharing rules for homogeneous
level of demand, in case of a convex cost function production models share the following properties:
442 Cost Sharing in Production Economies
Additivity m(x, c1 + c2) = m(x, c1) + m(x, c2) for So each monotonic rationing method relates to
all relevant cost sharing problems. This property a cost sharing rule and vice versa. In this way mP is
carries the same flavor as the homonymous prop- directly linked with the proportional rationing
erty for cost games. method, mSR to the uniform gains method, and
mSS to the random priority method. Properties of
Constant Returns m(x, c) = #x for linear cost rationing methods lead to properties of cost shar-
functions c such that c(y) = #y for all y. So if the ing rules and vice versa (Koster 2012).
agents do not cause any externality, the fixed
marginal cost is taken as a price for the good. Incentives in Cooperative Production
Theorem 7.3 Moulin and Shenker (Moulin Theorem 7.5 Koster (2007)
For any cost sharing problem (x, c), it holds
2002; Moulin and Shenker 1994)
Consider the following mappings associating core cx ¼ fmðx, cÞjm Mg.
rationing methods with cost sharing rules and
vice versa:
In particular this means that for m M , it holds
ð xðN Þ that m(x, c) core(cx) whenever c is concave,
r 7! m : mðx,cÞ ¼ c0 ðt Þdrðx,t Þ,m 7! r : rðx,t Þ ¼ mðx, Gt Þ:
0 since this implies c = c. This result appeared
earlier as a corollary to Theorem 7.3 (see Moulin
These define an isomorphism between M and 2002). Hougaard and Thorlund-Petersen (2001)
the space of all monotonic rationing methods. and Koster (2002, 2012) show nonlinear cost
Cost Sharing in Production Economies 443
sharing rules yielding core elements for concave cost agent 1 between €30 and €40, then both agents
functions as well, so additivity is only a sufficient will have profited by such merging of demand.
condition in the above statement. For average cost Example 7.7 Consider the 5-agent cost sharing ⊲
sharing, one can show more, mP(x, c) core(cx) for problems (x, c) and ðx, cÞ with x = (1, 2, 3, 0, 0),
all x precisely when the average cost c(y)/y is x ¼ ð1, 2, 1, 1, 1Þ and convex cost function
decreasing in y. cðyÞ ¼ 12 y2. ðx, cÞarises out of (x, c) if agent 3 splits
her demand over agents 4 and 5 as well. Then
Strategic Manipulation Through Reallocation of
Demands mP ðx, cÞ ¼ ð6, 12, 18, 0, 0Þ
In the cooperative production model, there are mP ðx, cÞ ¼ ð6, 12, 6, 6, 6Þ,
other ways that agents may use to manipulate the mSR ðx, cÞ ¼ ð3, 11, 22, 0, 0Þ
final allocation. In particular, note that the serial
mSR ðx, cÞ ¼ ð5, 16, 5, 5, 5Þ:
procedure gives the larger demanders an advantage
in case of positive externalities; as marginal costs
decrease, the price paid by the larger agents per unit The aggregate of average cost shares for
of output is lower than that of the smaller agents. In agents 3, 4, and 5 does not change. But notice
the other direction, larger demanders are punished that according to the serial cost shares, there is a
if costs are convex. Then as the examples below clear advantage for the agents. Instead of paying
show, this is just why the serial ideas are vulnerable 22 in the original case, now the total of payments
to misrepresentation of demands, since combining equals 15. Agent 3 may consider a transfer
demands and redistributing the output afterward between 0 and 7 to 3 and 4 for their collaboration
can be beneficial. and still be better off. In general, in case of a
convex cost function, the serial rule is vulnerable
with respect to manipulation of demands through
Example 7.6 Consider the cost function c given
splitting.
by c(y) = min{5y,60 + 2y}. Such cost functions
are part of daily life, whenever one has to decide
upon telephone or energy supply contracts: usu- ⊲
ally customers get to choose between a contract Note that in the above cases, the proportional
with high fixed cost and a low variable cost and cost sharing rule does prescribe the same cost
another with low or no fixed cost and a high shares. It is a non-manipulable rule: reshuffling
variable price. Now consider the two cost sharing of demands will not lead to different aggregate
problems (x, c) and (x0 ,c), where x = (10, 20, 30) cost shares. The rule does not discriminate
and x0 = (0, 30, 30). The cost sharing problem (x0 , between units, when a unit produced is irrelevant.
c) arises from (x, c) if agent 2 places a demand on It is actually a very special feature of the cost
behalf of agent 1 – without letting agent 3 know. sharing rule that is basically not satisfied by any
The corresponding average and serial cost shares other cost sharing rule.
are given by
The second property shows even a stronger consequences, if these are equivalent. Such utility
property than merging and splitting in that agents functions ui are called quasi-concave. An example
may redistribute the demands in any preferred way of convex preferences are those related to linear
without changing the aggregate cost shares of the utility functions of type ui(x, y) = ax y. More-
agents involved. The third property says that in over, strictly convex preferences are those with
such cases, the cost shares of the other agents do strict inequality in (22) for 0 < t < 1; the
not change. Then this makes proportional cost corresponding utility functions are strictly quasi-
sharing compelling in situations where one is not concave. Special classes of preferences are the
capable of detecting the true demand characteris- following:
tics of individuals.
• L: the class of all convex and continuous prefer-
Demand Games for P 1 ences utility functions that are non-decreasing in
Consider demand games G(m, c) as in (11), section the service component x, non-increasing in the
“Demand Games,” where now m is a cost sharing cost component y, non-locally satiated, and
rule on P 1 . These games with uncountable strat- decreasing on (x, c(x)) for x large enough. The
egy spaces are more complex than the demand latter restriction is no more than assuring that
games that we studied before. agents will not place requests for unlimited
The set of consequences for players is now amounts of the good (Fig. 8).
given by C ¼ ℝ2þ, combinations of levels of pro- • L : the class of bi-normal preferences in L .
Basically, if such a preference is represented
duction and costs (see section “Strategic Demand
by a differentiable utility function u, then the
Games”). Then an individual i’s preference rela-
slope dy/dx of the indifference contours is non-
tion is convex if for the corresponding utility func-
increasing in x and non-decreasing in y. For a
tion ui and all pairs z, z0 C it holds
concise definition, see Watts (2002). Examples
ui ðzÞ ¼ ui ðz0 Þ ) uðtz þ ð1 tÞzÞ include Cobb-Douglas utility functions and also
ui ðzÞfor all t ½0, 1: ð22Þ those of type ui(x, y) = a(x) b(y) where a and
b are concave and convex functions, respec-
This means that a weighted average of the con- tively. A typical plot of level curves of such
sequences is weakly preferred to both utility functions is in Fig.9. Note that the
Cost Sharing in 0 1
Production Economies, 1 1 2
Fig. 8 Linear, convex y
preferences, u(x, y) =
2x y. The contours 1.5
indicate indifference
curves, i.e., sets of type
{(x, y)|u(x, y) = k}, the 1
k-level curve of u
u value color scale
0.5
−0.5
0 0 -1
0 1
x
Cost Sharing in Production Economies 445
Cost Sharing in 0 1 2 3
Production Economies, 3 3
Fig. 9 Strictly convex y
preferences, uðx, yÞ ¼ 1
pffiffiffi
x e0:5y. The straight line
connecting any two points
on the same contour lies in 0.5
2 2
the lighter area – with
higher utility. Fix the
-1.5
0 0
0 1 2 3
x
approach differs from the standard literature In a Nash equilibrium x of G(mP, c), each
where agents have preferences over endow- player i gives a best response on xi , the action
ments. Here costs are “negative” endowments. profile of the other agents. That is, player
In the latter interpretation, the condition can be i chooses xi arg max t UPi t, xi . Then first-
read as that the marginal rate of substitution is order conditions imply for an interior solution
nonpositive. At equal utility, an increase of the
1 1
level of output has to be compensated by a ai x ðN Þ xi ¼ 0 ð24Þ
decrease in the level of input good. 2 2
for all i N. Then x ðN Þ ¼ 12 ða1 þ a2 þ a3 Þ and
Nash Equilibria of Demand Games in a xi ¼ 2ai 12 ða1 þ a2 þ a3 Þ.
Simple Case
Consider a production facility shared by three Serial Demand Game
agents N = {1, 2, 3} with cost function Consider the same production facility and the
cðyÞ ¼ 12 y2 . Assume that the agents have demand game G(mSR, c), corresponding to the
quasi-linear utilities in L, i.e., ui(xi, yi) = aixi yi serial rule. Then the utilities over actions are
i, yi) = aixi yi for all pairs ðxi , yi Þ ℝ2þ . given by
Below the Nash equilibrium in the serial and
proportional demand game is calculated in two USR
i ðxÞ ¼ ai xi m ðx, cÞ:
SR
ð25Þ
special cases. This numerical example is based
on Moulin and Shenker (1992).
Now suppose x is a Nash equilibrium of this
game, and assume without loss of generality that
Proportional Demand Game x1 x2 x3 . Then player 1 with the smallest
Consider the corresponding proportional demand equilibrium demand maximizes the expression
game, G(mP, c), with utility over actions given by
3 2
USR
1 ððt, x2 , x3 Þ ¼ a1 t cð3tÞ=3 ¼ a1 t 2 t
U Pi ðxÞ ¼ ai xi mP ðx, cÞ
1 at x1, from which we may conclude that a1 ¼ 3x1.
¼ ai xi xi xðN Þ: ð23Þ
2 In addition, in equilibrium, player 2, maximizes
446 Cost Sharing in Production Economies
1 1 of equilibrium in the induced cost games, relative
U SR
2 ðx1 , t, x3 Þ ¼ a2 t cð3x1 Þ þ ðcðx1 þ 2tÞ cð3x1 Þ ,
3 2 to specific domains of preferences and cost func-
tions. Below we will discuss the major findings of
for t x1 , yielding a2 ¼ x1 þ 2x2 . Finally, the
Watts (2002). These results concern a broader cost
equilibrium condition for player 3 implies a3 ¼
sharing model with the notion of a cost function as
xðN Þ. Then it is not hard to see that actually this
a differentiable function ℝ+ ! ℝ+. So in this
constitutes the serial equilibrium.
paragraph, such cost function can decrease; fixed
cost need not be 0. This change in setup is not
Comparison of Proportional and Serial crucial to the overall exposition since the charac-
Equilibria (I) terizations below are easily interpreted within the
Now let’s compare the serial and the proportional context of P 1 .
equilibrium in the following two cases:
Demand Monotonicity The mapping t 7!
ð i Þ a1 ¼ a2 ¼ a3 ¼ a mi((t, xi), c) is non-decreasing; m is strictly
ðiiÞ a1 ¼ a2 ¼ 2, a3 ¼ 4: demand monotonic if this mapping is increasing
whenever c is increasing.
Case (i): We get x ¼ 12 a, 12 a, 12 a and xi ¼
1
1 1 Smoothness The mapping x 7! m(x, c) is con-
3 a, 3 a, 3 a for all i. The resulting equilibrium
payoff vectors are given by tinuously differentiable for all continuously dif-
ferentiable c C 1 .
1 2 1 2 1 2
U P ðx Þ ¼ a , a , a and U SR ðxÞ
8 8 8
Recall that L is the domain of all binormal
1 2 1 2 1 2
¼ a , a , a : preferences.
6 6 6
Theorem 7.10 Moulin and Shenker (1992) Let c be a strictly concave continuously differ-
Let c be a strictly convex continuously differentia- entiable cost function, and let m be a regular cost
ble cost function, and let m be a regular cost sharing sharing rule. The following statements are
rule. The following statements are equivalent: equivalent:
448 Cost Sharing in Production Economies
Theorem 7.12 Moulin (1996) Denote the set of Nash equilibria in the demand
Assume agents have preferences in L . The game G(m, c) with profile of preferences U by
serial cost sharing rule is the unique continuous, NE(m, c, U). Given c, m the guaranteed (relative)
cross-monotonic, and anonymous cost sharing rule surplus of the cost sharing rule m for N is defined by
for which the Nash equilibria of the corresponding
demand games all pass the no-envy test. P
i N ui ð x i Þ
cðxðN ÞÞ
gðc, mÞ ¼ inf :
U, x NEðm, c, UÞ uðc, UÞ
Comparison of Serial and Proportional ð28Þ
Equilibria (II)
Just as in the earlier analysis in section “Efficiency Here the infimum is taken over all utility profiles
and Strategy-Proof Cost Sharing Mechanisms,” discussed above. This measure is also called the
performance of cost sharing rules can be price of anarchy of the game (see Koutsoupias and
Cost Sharing in 0 1 2
Production Economies, 3 3 4
Fig. 10 Scenario (ii). The y
indifference curves of agent
1 together with the curve 3
1 ððt, x 1Þ, cÞ.
k : t 7! mSR μsr ∗
1 ((x, a−1 ), c)
Best response ofplayer 2
u1 value color scale
0 0 -3
0 1 2
x
Cost Sharing in Production Economies 449
Papadimitriou 1999). Let C be the set of all convex implement this serial choice function by an indi-
increasing cost functions with limy ! 1c(y)/y = 1. rect mechanism. It is defined through a multi-
Then Moulin (2010) shows that for the serial and stage game which mimics the way the serial
the proportional rule, the guaranteed surplus is at Nash equilibria are calculated. It is easily seen
least 1/n. But sometimes the distinction is eminent. that here each agent has a unique dominant strat-
Define the number egy, in which demands result from optimization
of the true preferences. Then this gives rise to a
yc00 ðyÞ strategy-proof mechanism.
dðyÞ ¼ ,
c0 ðyÞ
c 0 ð 0Þ Note that the same approach cannot be used for
the proportional rule. The strategic properties of
which is a kind of elasticity. The below theorem
the proportional demand game are weaker than
shows that on certain domains of cost functions
that of the serial demand game in several aspects.
with bounded d, the serial rule prevails over the
First of all, it is not hard to find preference profiles
proportional rule. For large n the guaranteed sur-
in L leading to multiple equilibria. Whereas
plus at mSR is of order 1/ln(n), that of mAV of order
uniqueness of equilibrium can be repaired by
1/n. Write K n ¼ 1 þ 13 þ . . . þ 2n11
1 þ ln2 n ,
restricting L to L , the resulting equilibria are in
then we obtain the following.
general not strong (like the serial counterparts). In
the proportional equilibria, there is over-
Theorem 7.13 Moulin (2010) production; see, e.g., the example in section “Pro-
portional Demand Game” where a small uniform
For any convex increasing cost function c with reduction of demands yields higher utility for all
lim cðyÞ=y ¼ 1, it holds that the players. Besides, a single-valued Nash equi-
y!1
librium selection corresponds to a strategy-proof
mechanism provided the underlying domain of
• If c0 is concave and inf{d(y) | y 0} = p > 0,
preferences is rich, and L is not. Though richness
then
is not a necessary condition, the proportional
p 4 rule is not consistent with a strategy-proof
g c, mSR g c, mAV demand game.
Kn nþ3
Scale Invariance
Extensions of Cost Sharing Rules
A cost sharing rule m on P n is scale invariant if
The single-output model is connected to the multi-
(29) holds for all linear transforms f. Under a scale
output model via the homogeneous cost sharing
invariant cost sharing rule, final cost shares do not
problems. Suppose that for c C n there is a func-
change by changing the units in which the goods
tion c0 C such that c(x) = c0(x(N)) for all x. For
are measured.
instance, such functions are found if we distin-
guish between production of blue and red cars; the
Path-Generated Cost Sharing Rules
color of the car does not affect the total production
Many cost sharing rules on P n calculate the cost
costs. Essentially, a homogeneous cost sharing
shares for ðx, cÞ P n by the total of marginal
problem (x, c) may be solved as if it were in P 1 .
costs along some production path from 0 toward
If m is the compelling solution on P 1, then any cost
x. Here a path for x is a non-decreasing mapping
sharing rule on P n should determine the same
gx : ℝþ ! ℝNþ such that g(0) = 0N and there is a
solution on the class of homogeneous problems
T ℝ+ with gx(T) = x. The cost sharing rule
therein. Formally the cost sharing rule on P n
extends m on P 1 if for all homogeneous cost
generated by
x N
the path collection g ¼
g j x ℝþ is defined by
sharing problems (x, c) it holds mðx, cÞ ¼
mðx, c0 Þ . In general a cost sharing rule m on P 1 ð1
0
allows for a whole class of extensions. Below we mgi ðx, cÞ ¼ @ i cðgx ðtÞÞ gxi ðtÞ dt: ð30Þ
will focus on extensions of mSR, mP, mSS. 0
ð xi
the paths are no more than the projections of g(t)
i ðx, cÞ
mFM ¼ @ i cðgx ðtÞÞ dt ð32Þ
on the cube [0, x]. Below we will see many exam- 0
ples of (combinations of) such fixed-path
This fixed-path cost sharing rule is demand
methods.
monotonic. As far as invariance with the choice
Aumann-Shapley Rule of unit is concerned, the performance is bad as it is
The early characterizations by Billera and Heath not a scale invariant cost sharing rule.
(1982) and Mirman and Tauman (1982) on this
rule set off a vast growing literature on cost shar- Moulin-Shenker Rule
ing models with variable demands. Billera et al. This fixed-path cost sharing rule is proposed as an
(1978) suggested to use the Aumann-Shapley rule ordinal serial extension by Sprumont (1998). Sup-
to determine telephone billing rates in the context pose that the partial derivatives of c C n are
of sharing the cost of a telephone system. This bounded away from 0, i.e., there is a such that
extension of proportional cost sharing calculates @ ic(x) > a for all x ℝNþ. The Moulin-Shenker
marginal costs along the path gAS(t) = tx for t rule mMS is generated by the path gMS as solution to
[0, 1]. Then the system of ordinary differential equations
ð1
8P
mAS
i ð x, c Þ ¼ x i @ i cðtxÞ dt: ð31Þ < j N @ j cðgðtÞÞ
if gi ðtÞ < xi ,
0
g0i ðtÞ ¼ @ i cðgðtÞÞ
:
0 else:
The Aumann-Shapley rule can be interpreted
as the Shapley value of the non-atomic game ð33Þ
where each unit of the good is a player (see
Aumann and Shapley 1974). It is the uniform The interpretation of this path is that at each
average over marginal costs along all increasing moment the total expenditure for production of
paths from 0N to x. The following is a classic result extra units for the different agents is equal; if good
in the cost sharing literature: 2 is twice as expensive as good 1, then the pro-
duction device gMS will produce twice as much of
Theorem 7.14 Mirman and Tauman (1982), the good 1. The serial rule embraces the same
Billera and Heath (1982) idea – as long as an agent desires extra production,
There is only one additive, positive, and scale the corresponding incremental costs are equally
invariant cost sharing rule on P n that extends the split. This makes mMS a natural extension of the
proportional rule, and this is mAS. serial rule. Call ti the completion time of produc-
i ðtÞ < xi if t < ti and
tionfor agent i, i.e., gMS
MS
gi ti ¼ xi . Assume without loss of generality
Example 7.15 If c is positively homogeneous,
that these completion times are ordered such that
i.e., c(ay) = ac(y) for a 0 and all y ℝNþ ,
0 ¼ t0 t1 t2 . . . tn , then the
then
Moulin-Shenker rule is given by
@ic
i ðx, cÞ ¼
mAS
@xi
ðxÞ,
i.e., mAS calculates the marginal costs of the i-th X
i
c gMS t‘ c gMS t‘1
i ðx, cÞ
mMS ¼
n ‘ þ 1
:
good at the final production level x. The risk ‘¼1
measures (cost functions) as in Denault (2001)
ð34Þ
are of this kind. ⊲
Friedman-Moulin Rule Note that the path varies with the cost function
This serial extension (Friedman and Moulin 1999) and that this is the reason why mMS is a non-additive
calculates marginal costs along the diagonal path, solution. Such solutions – though intuitive – are in
i.e., gFM(t) = t1N ^ x general notoriously hard to analyze. There are two
452 Cost Sharing in Production Economies
5 γ fm
shares are given by
1 1
2 ðx , cÞ ¼
mMS
2
cðg ð5ÞÞ ¼ cð5, 10Þ
2
1 20
¼ e 1 ,
2
mMS
1 ð x , c Þ ¼ 2 ðx , cÞ þ cðg ð10ÞÞ cðg ð5ÞÞ
mMS
1 0
¼ e30 e20 þ 1 : 0 5 10
2
Good 1
For x the cost sharing rules mAS and mFM use Cost Sharing in Production Economies, Fig. 12 Paths
essentially the same symmetric path g(t) = (t, t), for mMS, mAS, mFM induced by q = (5; 10)
454 Cost Sharing in Production Economies
Then the latter expression is not monotonic Strategic Properties of Fixed-Path Rules
in x1. One may show that the combinations of Friedman (2002) shows that the fixed-path cost
properties in Theorem 7.14 are incompatible sharing rules essentially have the same strategic
with demand monotonicity. Now what kind of properties as the serial rule. The crucial result in
rules is demand monotonic? The classification of this respect is the following.
all such rules is too complex. We will restrict our
attention to the additive rules with the dummy Theorem 7.18 Friedman (2002)
property, which comprises the idea that a player Consider the demand game G(m, c) where m is
pays nothing if her good is for free: a fixed-path cost sharing rule and c Cn has
strictly increasing partial derivatives. Then the
Dummy If @ ic(y) = 0 for all y, then mi(x, c) = 0 corresponding set O1 of action profiles surviving
for all cost sharing problems ðx, cÞ Pn . the successive elimination of overwhelmed
actions consists of a unique element.
Theorem 7.17 Friedman (2004)
large projects, where time alone and changing getting service, that is, proportional to p(2 p) for
prices may lead to uncertain outcomes and costs. Ann and p2 for Bob. Then according to this cost
Below two models will be discussed, where both sharing rule, cost shares are 2 p for Ann and p for
sources of uncertainty are treated differently. Bob. In particular, this means that Ann and Bob are
In practice agents commit themselves to a cost considered equally liable toward both projects.
sharing rule ex ante, before the uncertainty is A more subtle rule is the following ex post rule,
revealed. Then ex post there may be asymmetries which considers cost shares in relation to what is
in service levels for agents and/or differences in actually realized successfully. The ex post cost
liabilities toward total costs. The two models sharing rule proposes to share the costs equally if
show that fairness allows for ex ante and ex post both Ann and Bob are served or none is served, and
interpretations. In the former sections, it was clear Ann pays if she is served but not Bob. This rule
that pinning down fairness of a solution can takes the expectation over the deterministic cost
already be an ambiguous task if only agents sharing problems after realization turns out to be
share heterogeneous characteristics which do not success or failure (and this is the property of a rule
allow for a straightforward comparison in terms of that will be referred to as independence of timing).
a complete or partial ordering. This also holds for Then the corresponding cost shares are:
the combination of heterogeneous agents in those
models with uncertainty. XP XP 2 1 1
mA , mB ¼ p þ 1 p2 , þ 2pð1 pÞð1, 0Þ
2 2
1 1
Sharing Cost of Success and Failure of Projects ¼ þ pð1 pÞ, pð1 pÞ
2 2
The first type of uncertainty in cost sharing prob-
lems is addressed in (Hougaard and Moulin The needs priority cost sharing rule deter-
2017), which draws on ideas for the deterministic mines for each realized combination of projects
case in (Hougaard and Moulin 2014). Each agent per project which of the agents’ demands are
has a binary inelastic demand for service of certain fulfilled iff the project is successful and charges
projects. Ex ante it is not clear which of the projects each of these agents equally for its cost. For
will actually work successfully and can be used to instance, in the situation at hand, after successful
satisfy the agents’ demand. The agents face the realization of only project a only for Ann, the
problem of sharing the cost of these unreliable outcome is critical (Bob is not served anyways,
(discrete) projects. Ex post it is clear what projects since b has failed), and she is charged with the
or parts of projects are functioning and which full cost 1. In the same fashion, Bob is charged
agents get full, partial, or no service. Two cost for the full cost when both projects a and b are up
sharing rules are presented which take different and running, since each is critical to Bob, but not
positions regarding ex ante and ex post properties. to Ann. If no project is a success, then costs are
This will be illustrated using the simple but illustra- split equally. Then this leads to the following
tive example as in of the authors: expected cost shares:
A , mB
mNP ¼ ℙ½:a ^ :bð1, 1Þ
NP
Example 8.1 Consider the following cost sharing
problem among agents Ann and Bob: Ann is satis- þðℙ½:a ^ b þ ℙ½a ^ :bÞð2,0Þ
fied with service of only one of the projects a or b, þℙ½a ^ b ð0,2Þ ¼ ð1 pÞ2 ð1,1Þ
denoted DA = {a _ b}, whereas Bob needs both þ2p
ð1 pÞð1,0Þ2 þ p ð0,2Þ
2
projects to succeed, i.e., DB = {a ^ b}. Suppose that ¼ 1 þ 2p 3p , 1 2p þ 3p2
each project is of cost 1 and that probability that the
probabilities that the projects are successful are Notice that Ann and Bob are charged the same
independent and equal to p. In particular, the prob- costs in case p ¼ 23. ⊲
ability that Bob gets the required service equals p2. The model and the above two cost sharing rules
A pure ex ante idea of sharing costs would be to will now be formally defined. Let A be a set of
share the cost proportional to the probabilities of risky and costly projects and c(a) is the cost of
456 Cost Sharing in Production Economies
carrying out project a. A subset X A of projects (ii) The needs priority rule is characterized by No
is realized successfully with probability p(X). Charge for No Service and Useless is
Now the set of agents N has to share the aggregate Free.
cost c(A) ≔ a Ac(a), no matter which of the Here Liable for Flexibility states that when an
projects are successfully running. agent i’s needs are easier to fulfill than j’s, and i is
A cost allocation problem under risk for agents set therefore more likely to be served, then i is con-
N is an ordered triple(Q,p, c) where Q ¼ ðN, A, DÞ sidered (weakly) more liable than j. Liable for
and where D ¼ Di i N is the profile of Single-Minded Needs demands that those agents
(heterogeneous) demands. Here Di 2A is the service having demand for a single project are assigned
set of i, meaning that i is served iff projects in Di are the highest liabilities to that project. In contrast,
successfully carried out. The ex post cost sharing rule Useless is Free states that the liability of an agent
mXP calculates equal cost shares for the agents served i toward a project a is 0 whenever there is an agent
at X, i.e., S(Q; X) = {i N | X Di}, and determines j being single-minded about a.
the expected cost shares with respect to p:
!
X
m ðQ, p, cÞ ¼
XP
pðXÞ
e½SðQ, XÞ cðAÞ: Sharing a Random Cost Under Limited
X
Liabilities
The setting of Koster and Boonen (2019) is that of
Here E[S] stands for the vector in the unit
a multi-divisional firm with a central service unit –
simplex D(N) over N such that E(S)i = 1/jSj if
to which all divisions have equal access. Running
i S and 0 otherwise; in addition we put
this shared facility is costly, and the divisions are
E[∅]i = E[N]i = 1/jNj for all i N.
charged for the full and uncertain cost. So the
Using additional notation e[S1, S2] ≔ e[S1] if
distinction with the previous model in terms of
S 6¼ ∅ and e[∅; S2] = e[S2], the needs priority
1
uncertainty is that demands are all the same, but
cost sharing rule mXP is defined as
the only stochastic variable is the cost itself. Again
mNP ðQ,p,cÞ ¼ we seek to share the ex post cost by cost sharing
0 1 rules to which the agent commit themselves, before
X X
@ pðX Þ
e S ðQ;X ÞnSðQ; X na ; SðQ; X A cðaÞ: realization of the project. New element here is that
aA X PðAÞ
the benevolent board of the firm puts bounds on the
maximal liability of a division, subject to a feasible
It should immediately be clear that a major dif- allocation of costs. The maximal liabilities of the
ference between mXP and mNP is that the latter divisions may differ due to risk capital allocations
exploits the full structure of which projects are crit- within the firm that limit the capacity to bear risk
ical for serving an agent upon realization or not. The for the divisions (see, e.g., Kamiya and Zanjani
result of Hougaard and Moulin (2017) is that a 2017; Myers and Read 2001; Zanjani 2002).
couple of structural properties (like the counterparts
of additivity and separability in the former section)
Example 8.3 For instance, consider a three-
combined with rather weak fairness criteria single
agent cost sharing problem concerning a project
out the two cost sharing rules mXP and mNP.
with a random C
Un(0, 10). Suppose that for
Theorem 8.2 (Hougaard and Moulin 2017) the three agents the vector of liabilities is given,
Among the cost sharing rules satisfying Cost i.e., L = (2, 3, 8). This means that agent 1, 2, and
Additivity, Independence of Timing, and Separa- 3, respectively, cannot be charged more than 2,
ble Across Projects 3, and 8, respectively. So these liabilities are
high enough such that even the possible realiza-
(i) The ex post service rule is characterized by No tion of the maximum cost Cmax = 10 can be
Charge for No Service, Liable for Flexibil- shared, but not in a strict egalitarian way for
ity, and Liable for Single-Minded Needs each realization. ⊲
Cost Sharing in Production Economies 457
Utility The objective of each agent is to mini- The structure of the rule c is simple, as it
mize V ðXÞ ¼ FX ½X, where X L 1 is interpreted determines a vector of transfers allocated to the
as a future cost and FX is a probability measure divisions in absence of costs, and the uniform
that may depend on the ordering of X. V is a “gains” rationing method rUG is applied to the
coherent risk measure as in Artzner et al. (1999). remaining variable component of the costs.
An example of such coherent risk measure is the Importantly there is a unique vector of so-called
expected value, but also all dual utility functions transfers t ℝN such that
as in Yaari (1987). This allows for a much wider
spectrum of utilities, especially those reflecting
cðL, C, V Þ ¼ t þ r UG ðL t, CÞ
risk averse agents.
4
ψiE (L, C, V ) →
0
γ1 γ2 γ3
-1 C→
Br^anzei R, Ferrari G, Fragnelli V, Tijs S (2002) Two Haviv M (2001) The Aumann-Shapley price mechanism for
approaches to the problem of sharing delay costs in allocating congestion costs. Oper Res Lett 29:211–215
joint projects. Ann Oper Res 109:359–374 Henriet D, Moulin H (1996) Traffic-based cost allocation
Chen Y (2003) An experimental study of serial and aver- in a network. RAND J Econ 27:332–345
age cost pricing mechanisms. J Public Econ Hougaard JL, Moulin H (2014) Sharing the cost of redun-
87:2305–2335 dant projects. Games Econ Behav 87:339–352
Clarke EH (1971) Multipart pricing of public goods. Public Hougaard JL, Moulin H (2017) Sharing the cost of risky
Choice 11:17–33 projects. Economic Theory. https://doi.org/10.1007/
Dasgupta P, Hammond P, Maskin E (1979) The imple- s00199-017-1034-3
mentation of social choice rules: some general results Hougaard JL, Thorlund-Petersen L (2000) The stand-alone
on incentive compatibility. Rev Econ Stud test and decreasing serial cost sharing. Economic The-
46:185–216 ory 16:355–362
Davis M, Maschler M (1965) The kernel of a cooperative Hougaard JL, Thorlund-Petersen L (2001) Mixed serial
game. Nav Res Logist Q 12:223–259 cost sharing. Math Soc Sci 41:51–68
de Frutos MA (1998) Decreasing serial cost sharing under Hougaard JL, Tind J (2007) Cost allocation and convex
economies of scale. J Econ Theory 79:245–275 data envelopment. Eur J Oper Res 194:939–947
Demers A, Keshav S, Shenker S (1990) Analysis and Iñarra E, Isategui JM (1993) The Shapley value and aver-
simulation of a fair queueing algorithm. age convex games. Int J Game Theory 22:13–29
J Internetworking 1:3–26 Israelsen D (1980) Collectives, communes, and incentives.
Denault M (2001) Coherent allocation of risk capital. J Comp Econ 4:99–124
J Risk 4:1 Jackson MO (2001) A crash course in implementation
Dewan S, Mendelson H (1990) User delay costs and inter- theory. Soc Choice Welf 18:655–708
nal pricing for a service facility. Manag Sci Joskow PL (1976) Contributions of the theory of marginal
36:1502–1517 cost pricing. Bell J Econ 7:197–206
Dutta B, Ray D (1989) A concept of egalitarianism under Kalai E (1977) Proportional solutions to bargaining situa-
participation constraints. Econometrica 57:615–635 tions: interpersonal utility comparisons. Econometrica
Dutta B, Ray D (1991) Constrained egalitarian allocations. 45:1623–1630
Games Econ Behav 3:403–422 Kaminski M (2000) ‘Hydrolic’ rationing. Math Soc Sci
Flam SD, Jourani A (2003) Strategic behavior and partial 40:131–155
cost sharing. Games Econ Behav 43:44–56 Kamiya S, Zanjani G (2017) Egalitarian equivalent capital
Fleurbaey M, Sprumont Y (2009) Sharing the cost of a allocation. N Am Actuar J 21:382–396
public good without subsidies. J Public Econ Theory Kolpin V, Wilbur D (2005) Bayesian serial cost sharing.
11:1–9 Math Soc Sci 49:201–220
Friedman E (2002) Strategic properties of heterogeneous Koster M (2002) Concave and convex serial cost sharing.
serial cost sharing. Math Soc Sci 44:145–154 In: Borm P, Peters H (eds) Chapters in game theory.
Friedman E (2004) Paths and consistency in additive cost Kluwer, Dordrecht
sharing. Int J Game Theory 32:501–518 Koster M (2005) Sharing variable returns of cooperation.
Friedman E, Moulin H (1999) Three methods to share joint CeNDEF working paper 05-06. University of Amster-
costs or surplus. J Econ Theory 87:275–312 dam, Amsterdam
Friedman E, Shenker S (1998) Learning and implementa- Koster M (2006) Heterogeneous cost sharing, the directional
tion on the Internet. Working paper 1998–21. Depart- serial rule. Math Methods Oper Res 64:429–444
ment of Economics, Rutgers University Koster M (2007) The Moulin-Shenker rule. Soc Choice
González-Rodríguez P, Herrero C (2004) Optimal sharing Welf 29:271–293
of surgical costs in the presence of queues. Math Koster M (2012) Consistent cost sharing, Math Meth Oper
Methods Oper Res 59:435–446 Res 75:1–28. https://doi.org/10.1007/s00186-011-0372-3
Granot D, Huberman G (1984) On the core and nucleolus Koster M, Boonen T (2019) Constrained stochastic cost
of minimum cost spanning tree games. Math Program allocation. Math Soc Sc 101:20–30
29(1984):323–347 Koster M, Tijs S, Borm P (1998) Serial cost sharing
Green J, Laffont JJ (1977) Characterization of satisfactory methods for multicommodity situations. Math Soc Sci
mechanisms for the revelation of preferences for public 36:229–242
goods. Econometrica 45:427–438 Koster M, Molina E, Sprumont Y, Tijs ST (2002) Sharing
Groves T (1973) Incentives in teams. Econometrica the cost of a network: core and core allocations. Int
41:617–663 J Game Theory 30:567–599
Haimanko O (2000) Partially symmetric values. Math Koster M, Reijnierse H, Voorneveld M (2003) Voluntary
Oper Res 25:573–590 contributions to multiple public projects. J Public Econ
Harsanyi J (1967) Games with incomplete information Theory 5:25–50
played by Bayesian players. Manag Sci 14:159–182 Koutsoupias E, Papadimitriou C (1999) Worst-case equi-
Hart S, Mas-Colell A (1989) Potential, value, and consis- libria.In: 16th annual symposiumon theoretical aspects
tency. Econometrica 57:589–614 of computer science, Trier, pp 404–413
Cost Sharing in Production Economies 461
Legros P (1986) Allocating joint costs by means of the Moulin H (2002) Axiomatic cost and surplus-sharing. In:
nucleolus. Int J Game Theory 15:109–119 Arrow KJ, Sen AK, Suzumura K (eds) Handbook of
Leroux J (2004) Strategy-proofness and efficiency are social choice and welfare. Handbooks in economics,
incompatible in production economies. Econ Lett vol 19. North-Holland Elsevier, Amsterdam,
85:335–340 pp 289–357
Leroux J (2008) Profit sharing in unique Nash equilibrium: Moulin H (2008) The price of anarchy of serial, average
characterization in the two-agent case. Games Econ and incremental cost sharing. Economic Theory
Behav 62(2):558–572 36:379–405
Littlechild SC, Owen G (1973) A simple expression for the Moulin H (2010) An efficient and almost budget balanced
Shapley value in a simple case. Manag Sci 20:370–372 cost sharing method. Games Econ Behav 70:107–131
Littlechild SC, Thompson GF (1977) Aircraft landing fees: Moulin H, Shenker S (1992) Serial cost sharing.
agame theory approach. Bell J Econ 8:186–204 Econometrica 60:1009–1037
Maniquet F, Sprumont Y (1999) Efficient strategy-proof Moulin H, Shenker S (1994) Average cost pricing versus
allocation functions in linear production economies. serial cost sharing; an axiomatic comparison. J Econ
Economic Theory 14:583–595 Theory 64:178–201
Maniquet F, Sprumont Y (2004) Fair production and allo- Moulin H, Shenker S (2001) Strategy-proof sharing of
cation of an excludable nonrival good. Econometrica submodular cost: budget balance versus efficiency.
72:627–640 Economic Theory 18:511–533
Maschler M (1990) Consistency. In: Ichiishi T, Neyman A, Moulin H, Sprumont Y (2005) On demand responsiveness
Tauman Y (eds) Game theory and applications. Aca- in additive cost sharing. J Econ Theory 125:1–35
demic, New York, pp 183–186 Moulin H, Sprumont Y (2006) Responsibility and cross-
Maschler M (1992) The bargaining set, kernel and nucle- subsidization in cost sharing. Games Econ Behav
olus. In: Aumann RJ, Hart S (eds) Handbook of game 55:152–188
theory with economic applications, vol I. North- Moulin H, Vohra R (2003) Characterization of additive
Holland, Amsterdam cost sharing methods. Econ Lett 80:399–407
Maschler M, Reijnierse H, Potters J (1996) Monotonicity Moulin H, Watts A (1997) Two versions of the tragedy of
properties of the nucleolus of standard tree games. Int the commons. Econ Des 2:399–421
J Game Theory 39:89–104 Mutuswami S (2004) Strategyproof cost sharing of a
Maskin E, Sjöström T (2002) Implementation theory. In: binary good and the egalitarian solution. Math Soc
Arrow KJ, Sen AK, Suzumura K (eds) Handbook of Sci 48:271–280
social choice and welfare, vol I. North-Holland, Myers SC, Read JA (2001) Capital allocation for insurance
Amsterdam companies. J Risk Insur 68:545–580
Matsubayashi N, Umezawa M, Masuda Y, Nishino Myerson RR (1980) Conference structures and fair alloca-
H (2005) Cost allocation problem arising in hub- tion rules. Int J Game Theory 9:169–182
spoke network systems. Eur J Oper Res 160:821–838 Myerson RR (1991) Game theory: analysis of conflict.
McLean RP, Pazgal A, Sharkey WW (2004) Potential, Harvard University Press, Cambridge, MA
consistency, and cost allocation prices. Math Oper Nash JF (1950) Equilibrium points in n-person games.
Res 29:602–623 Proc Natl Acad Sci 36:48–49
Mirman L, Tauman Y (1982) Demand compatible equita- O’Neill B (1982) A problem of rights arbitration from the
ble cost sharing prices. Math Oper Res 7:40–56 Talmud. Math Soc Sci 2:345–371
Monderer D, Shapley LS (1996) Potential games. Games Osborne MJ (2004) An introduction to game theory.
Econ Behav 14:124–143 Oxford University Press, New York
Moulin H (1987) Equal or proportional division of a sur- Osborne MJ, Rubinstein A (1994) A course in game the-
plus, and other methods. Int J Game Theory ory. MIT Press, Cambridge
16:161–186 Peleg B, Sudhölter P (2004) Introduction to the theory of
Moulin H (1994) Serial cost-sharing of an excludable cooperative games. Series C: theory and decision
public good. Rev Econ Stud 61:305–325 library series. Springer-Verlag Berlin Heidelberg
Moulin H (1995a) Cooperative microeconomics: agame- Pérez-Castrillo D, Wettstein D (2006) An ordinal Shapley
theoretic introduction. Prenctice Hall, London value for economic environments. J Econ Theory
Moulin H (1995b) On additive methods to share joint costs. 127:296–308
Jpn Econ Rev 46:303–332 Potters J, Sudhölter P (1999) Airport problems and consis-
Moulin H (1996) Cost sharing under increasing returns: a tent allocation rules. Math Soc Sci 38:83–102
comparison of simple mechanisms. Games Econ Behav Razzolini L, Reksulak M, Dorsey R (2004) An experimen-
13:225–251 tal evaluation of the serial cost sharing rule. Theor
Moulin H (1999) Incremental cost sharing: characteriza- Decis 63:283–314
tion by coalition strategy-proofness. Soc Choice Welf Ritzberger K (2002) Foundations of non-cooperative game
16:279–320 theory. Oxford University Press, Oxford
Moulin H (2000) Priority rules and other asymmetric Rosenthal RW (1973) A class of games possessing pure-
rationing methods. Econometrica 68:643 strategy Nash equilibria. J Econ Theory 2:65–67
462 Cost Sharing in Production Economies
Roth AE (1988) The Shapley value, essays in honor of Sudhölter P (1998) Axiomatizations of game theoretical
Lloyd S. Shapley. Cambridge University Press, Cam- solutions for oneoutput cost sharing problems. Games
bridge, pp 307–319 Econ Behav 24:42–71
Samet D, Tauman Y, Zang I (1984) An application of the Suijs J, Borm P, Hamers H, Koster M, Quant M (2005)
Aumann-Shapley prices for cost allocation in transpor- Communication and cooperation in public network
tation problems. Math Oper Res 9:25–42 situations. Ann Oper Res 137:117–140
Sánches SF (1997) Balanced contributions axiom in the Tauman Y (1988) The Aumann-Shapley prices: a survey.
solution of cooperative games. Games Econ Behav In: Roth A (ed) The Shapley value. Cambridge Univer-
20:161–168 sity Press, Cambridge, pp 279–304
Sandsmark M (1999) Production games under uncertainty. Thomas LC (1992) Dividing credit-card costs fairly. IMA
Comput Econ 14:237–253 J Math Appl Bus Ind 4:19–33
Schmeidler D (1969) The nucleolus of a characteristic Thomson W (1996) Consistent allocation rules. Mimeo, Eco-
function game. SIAM J Appl Math 17:1163–1170 nomics Department, University of Rochester, Rochester
Shapley LS (1953) A value for n-person games. Ann Math Thomson W (2001) On the axiomatic method and its recent
Study 28:307–317. Princeton University Press, Princeton applications to game theory and resource allocation.
Shapley LS (1967) On balanced sets and cores. Nav Res Soc Choice Welf 18:327–386
Logist Q 14:453–460 Tijs SH, Driessen TSH (1986) Game theory and cost allo-
Shapley LS (1969) Utility comparison and the theory of cation problems. Manag Sci 32:1015–1028
games. In: La decision: Aggregation et dynamique des Tijs SH, Koster M (1998) General aggregation of demand
ordres de preference. Editions du Centre National de la and cost sharing methods. Ann Oper Res 84:137–164
Recherche Scientifique, Paris, pp 251–263. Also in Timmer J, Borm P, Tijs S (2003) On three Shapley-like
Roth AE (ed) (1988) The Shapley value, essays in solutions for cooperative games with random payoffs.
honor of Lloyd S. Shapley. Cambridge University Int J Game Theory 32:595–613
Press, Cambridge, pp 307–319 van de Nouweland A, Tijs SH (1995) Cores and related
Shapley LS (1971) Cores of convex games. Int J Game solution concepts for multi-choice games. Math
Theory 1:1–26 Methods Oper Res 41:289–311
Sharkey W (1982) Suggestions for a game-theoretic von Neumann J, Morgenstern O (1944) Theory of games and
approach to public utility pricing and cost allocation. economic behavior. Princeton University Press, Princeton
Bell J Econ 13:57–68 Watts A (2002) Uniqueness of equilibrium in cost sharing
Sharkey W (1995) Network models in economics. In: Ball games. J Math Econ 37:47–70
MO et al (eds) Network routing. Handbook in opera- Weber RJ (1988) Probabilistic values for games. In: Roth
tions research and management science, vol 8. North- AE (ed) The Shapley value. Cambridge University
Holland, Amsterdam Press, Cambridge
Shubik M (1962) Incentives, decentralized control, the Yaari ME (1987) The dual theory of choice under risk.
assignment of joint cost, and internal pricing. Manag Econometrica 55:95–115
Sci 8:325–343 Yeh CH (2008) Secured lower bound, composition up, and
Skorin-Kapov D (2001) On cost allocation in hub-like minimal rights first for bankruptcy problems. J Math
networks. Ann Oper Res 106:63–78 Econ 44:925–932
Skorin-Kapov D, Skorin-Kapov J (2005) Threshold based Young HP (1985a) Producer incentives in cost allocation.
discounting network: the cost allocation provided by Econometrica 53:757–765
the nucleolus. Eur J Oper Res 166:154–159 Young HP (1985b) Monotonic solutions of cooperative
Sprumont Y (1998) Ordinal cost sharing. J Econ Theory games. Int J Game Theory 14:65–72
81:126–162 Young HP (1985c) Cost allocation: methods, principles,
Sprumont Y (2000) Coherent cost sharing. Games Econ applications. North-Holland, Amsterdam
Behav 33:126–144 Young HP (1988) Distributive justice in taxation. J Econ
Sprumont Y (2005) On the discrete version of the Aumann- Theory 44:321–335
Shapley cost-sharing method. Econometrica Young HP (1994) Cost allocation. In: Aumann RJ, Hart
73:1693–1712 S (eds) Handbook of game theory, vol II. Elsevier,
Sprumont Y, Ambec S (2002) Sharing a river. J Econ Amsterdam, pp 1193–1235
Theory 107:453–462 Young HP (1998) Cost allocation, demand revelation, and
Sprumont Y, Moulin H (2007) Fair allocation of produc- core implementation. Math Soc Sci 36:213–229
tion externatlities: recent results, Revue d’Économie Zanjani G (2002) Pricing and capital allocation in catastro-
Politique 2007/1 (117) phe insurance. J Financ Econ 65:283–305
involved in production, and so on. A large
Market Games and Clubs economy has many participants.
Asymptotic negligibility A pregame satisfies
Myrna Wooders asymptotic negligibility if vanishingly small
Department of Economics, Vanderbilt University, groups can have only negligible effects on per
Nashville, TN, USA capita payoffs.
Club A club is a group of agents or players that
forms for the purpose of carrying out come
Article Outline activity, such as providing a local public good.
Core The core of a game is the set (possibly
Glossary empty) of feasible outcomes – divisions of
Definition of the Subject the worths arising from coalition formation
Introduction among the players of the game – that cannot
Transferable Utility Games; Some Standard be improved upon by any coalition of players.
Definitions Game A (cooperative) game (in characteristic
A Market form) is defined simply as a finite set of players
Market-Game Equivalence and a function or correspondence ascribing a
Equivalence of Markets and Games with Many worth (a non-negative real number, interpreted
Players as an idealized money) to each nonempty sub-
Cores and Approximate Cores set of players, called a group or coalition
Nonemptiness and Convergence of Approximate Market games A market game is a game derived
Cores of Large Games from a market. Given a market and a group of
Shapley Values of Games with Many Players agents we can determine the total utility
Economies with Clubs (measured in money) that the group can
With a Continuum of Players achieve using only the endowments belonging
Other Related Concepts and Results to the group members, thus determining
Some Remarks on Markets and More General a game.
Classes of Economies Market A market is defined as a private goods
Conclusions and Future Directions economy in which all participants have utility
Bibliography functions that are linear in (at least) one com-
modity (money).
Glossary Payoff vector A payoff vector is a vector listing
a payoff (an amount of utility or money) for
An economy We use the term ‘economy’ to each player in the game.
describe any economic setting, including econ- Per capita boundedness A pregame satisfies per
omies with clubs, where the worth of club capita boundedness if the supremum of the
members may depend on the characteristics average worth of any possible group of players
of members of the club, economies with pure (the per capita payoff) is finite.
public goods, local public goods (public goods Pregame A pair, consisting of a set of player
subject to crowding and/or congestion), econ- types (attributes or characteristics) and a func-
omies with production where what can be pro- tion mapping finite lists of characteristics
duced and the costs of production may depend (repetitions allowed) into the real numbers. In
on the characteristics of the individuals interpretation, the pregame function ascribes a
worth to every possible finite group of players, (or disutility) from engaging in processes that
where the worth of a group depends on the lead to the eventual exchange of commodities.
numbers of players with each characteristic in The question is when are such economic struc-
the group. A pregame is used to generate tures equivalent to markets with concave utility
games with arbitrary numbers of players. functions.
Price taking equilibrium A price taking equi- This paper summarizes research showing that a
librium for a market is a set of prices, one for broad class of large economies generate balanced
each commodity, and an allocation of com- market games. The economies include, for exam-
modities to agents so that each agent can afford ple, economies with clubs where individuals may
his part of the allocation, given the value of his have memberships in multiple clubs, with indivis-
endowment. ible commodities, with nonconvexities and with
Shapley value The Shapley value of a game is non-monotonicities. The main assumption are:
feasible outcome of a game in which all players (1) that an option open to any group of players is
are assigned their expected marginal contribu- to break into smaller groups and realize the sum of
tion to a coalition when all orders of coalition the worths of these groups, that is, essential super-
formation are equally likely. additivity is satisfied and: (2) relatively small
Small group effectiveness A pregame satisfies groups of participants can realize almost all
small group effectiveness if almost all gains to gains to coalition formation.
collective activities can be realized by cooper- The equivalence of games with many players
ation only within arbitrarily small groups and markets with many participants indicates that
(coalitions) of players. relationships obtained for markets with concave
Totally balanced game A game is totally bal- utility functions and many participants will also
anced if the game and every subgame of the hold for diverse social and economic situations
game (a game with player set taken as some with many players. These relationships include:
subset of players of the initially given game) (a) equivalence of the core and the set of compet-
has a nonempty core. itive outcomes; (b) the Shapley value is contained
in the core or approximate cores; (c) the equal
treatment property holds – that is, both market
Definition of the Subject equilibrium and the core treat similar players sim-
ilarly. These results can be applied to diverse
The equivalence of markets and games concerns economic models to obtain the equivalence of
the relationship between two sorts of structures cooperative outcomes and competitive, price tak-
that appear fundamentally different – markets and ing outcomes in economies with many partici-
games. Shapley and Shubik (1969) demonstrates pants and indicate that such results hold in yet
that: (1) games derived from markets with con- more generality.
cave utility functions generate totally balanced
games where the players in the game are the
participants in the economy and (2) every totally Introduction
balanced game generates a market with concave
utility functions. A particular form of such a mar- One of the subjects that has long intrigued econ-
ket is one where the commodities are the partici- omists and game theorists is the relationship
pants themselves, a labor market for example. between games, both cooperative and noncooper-
But markets are very special structures, more ative, and economies. Seminal works making
so when it is required that utility functions be such relationships include Shubik (1959b),
concave. Participants may also get utility from Debreu and Scarf (1963), Aumann (1964),
belonging to groups, such as marriages, or clubs, Shapley and Shubik (1969, 1975) and Aumann
or productive coalitions. It may be that partici- and Shapley (1974), all connecting outcomes of
pants in an economy even derive utility price-taking behavior in large economies with
Market Games and Clubs 465
cores of games. See also Shapley and Shubik game have nonempty cores. A subgame of a game
(1977) and an ongoing stream of papers is simply a group of players S N and the worth
connecting strategic behavior to market behavior. function restricted to that group and the smaller
Our primary concern here, however, is not with groups that it contains.
the equivalence of outcomes of solution concepts Given a market any feasible assignment of
for economies, as is Debreu and Scarf (1963) or commodities to the economic participants gener-
Aumann and Dreze (1974) for example, but rather ates a total worth of each group of participants.
with equivalences of the structures of markets and The worth of a group of participants (viewed as
games. Solution concepts play some role, how- players of a game) is the maximal total utility
ever, in establishing these equivalences and in achievable by the members of the group by allo-
understanding the meaning of the equivalence of cating the commodities they own among them-
markets and games. selves. In this way a market generates a game – a
In this entry, following Shapley and Shubik set of players (the participants in the economy)
(1969), we focus on markets in which utility func- and a worth for each group of players.
tions of participants are quasi-linear, that is, the Shapley and Shubik (1969) demonstrate that
utility function u of a participant can be written as any market where all participants have concave,
u(x, x) = û(x) + x where x RLþ is a commodity monotonic increasing utility functions generates a
bundle, x R is interpreted as money and û is a totally balanced game and that any totally bal-
continuous function. Each participant in an econ- anced game generates a market, thus establishing
omy has an endowment of commodities and, an equivalence between a class of markets and
without any substantive loss of generality, it is totally balanced cooperative games. A particular
assumed that no money is initially endowed. The sort of market is canonical; one where each par-
price of money is assumed equal to one. A price ticipant in the market is endowed with one unit of
taking equilibrium for a market then consists of a a commodity, his “type”. Intuitively, one might
price vector p ℝL for the commodities. And an think of the market as one where each participant
assignment of commodities to participants such owns one unit of himself or of his labor.
that: the total amounts of commodities assigned to In the last 20 years or so there has been sub-
participants equals the total amount of commodi- stantial interest in broader classes of economies,
ties with which participants are endowed and; including those with indivisibilities, non-
given prices, each participant can afford his monotonicities, local public goods or clubs,
assignment of commodities and no participant, where the worth of a group depends not only on
subject to his budget constraint, can afford a pre- the private goods endowed to members of the
ferred commodity bundle. group but also on the characteristics of the group
We also treat games with side payments, alter- members. For example, the success of the mar-
natively called games with transferable utility or, riage of a man and a woman depends on their
in brief, TU games. Such a game consists of a characteristics and on whether their characteristics
finite set N of players and a worth function that are complementary. Similarly, the output of a
assigns to each group of players S N a real machine and a worker using the machine depends
number v(S) ℝ+, called the worth of the on the quality and capabilities of the machine and
group. In interpretation, v (S) is the total payoff how well the abilities of the worker fit with the
that a group of players can realize by cooperation. characteristics of the machine – a concert pianist
A central game-theoretic concept for the study of fits well with an high quality piano but perhaps not
games is the core. The core consists of those so well with a sewing machine. Or how well a
divisions of the maximal total worth achievable research team functions depends not only on the
by cooperation among the players in N so that members of the team but also on how well they
each group of players is assigned at least its interact. For simplicity, we shall refer to these
worth. A game is balanced if it has a nonempty economies as club economies. Such economies
core and totally balanced if all subgames of the can be modeled as cooperative games.
466 Market Games and Clubs
In this entry we discuss and summarize literature groups gain no market power from size; in other
showing that economies with many participants are words, large groups are inessential. That large
approximated by markets where all participants groups are inessential is equivalent to small
have the same concave utility function and for group effectiveness (Wooders 1992b).
which the core of the game is equivalent to the set A remarkable feature of the results discussed in
of price-taking economic equilibrium payoffs. The this essay is they are independent of any particular
research presented is primarily from Shubik and economic structure.
Wooders (1982a), Wooders (1997) and earlier
papers due to this author. For the most recent results
in this line of research we refer the reader to
Transferable Utility Games; Some
Wooders (2007, 2008a, b). We also discuss other
Standard Definitions
related works throughout the course of the entry.
The models and results are set in a broader context
Let (N, n) be a pair consisting of a finite set N,
in the conclusions.
called a player set, and a function v, called a worth
The importance of the equivalence of markets
function, from subsets of N to the real numbers ℝ
and games with many players relates to the hypoth-
with v(j) = 0. The pair (N, n) is a TU game (also
esis of perfect competition, that large numbers of
called a game with side payments). Nonempty
participants leads to price-taking behavior, or
subsets S of N are called groups (of players) and
behavior “as if” participants took prices as given.
the number of members of the group S is given by
Von Neumann and Morgenstern perceived that
|S|. Following is a simple example.
even though individuals are unable to influence
market prices and cannot benefit from strategic
behavior in large markets, large “coalitions”
Example 1 A glove game: Suppose that we can
might form. Von Neumann and Morgenstern write:
partition a player set N into two groups, say N1
It is neither certain nor probable that a mere increase and N2. In interpretation, a member of N1 is endo-
in the number of participants might lead in fine to the
wed with a right-hand (RH) glove and a member
conditions of free competition. The classical defini-
tions of free competition all involve further postulates of N2 is endowed with a left-hand (LH) glove. The
besides this number. E.g., it is clear that if certain worth of a pair of gloves is $1, and thus the worth
great groups of individuals will -for any reason of a group of players consisting of player i N1
whatsoever-act together, then the great number of and player j N2 is $1. The worth of a single
participants may not become effective; the decisive
exchanges may take place directly between large glove and hence of a one-player group is $0. The
“coalitions”, few in number and not between indi- worth of a group S N is given by v(S) = min {|S
T T
viduals, many in number acting independently. . . . N1|, |S N2|}. The pair (N, n) is a game.
Any satisfactory theory . . . will have to explain when
A payoff vector for a game (N, n) is a vector
such big coalitions will or will not be formed – i.e.,
when the large numbers of participants will become ū ℝN. We regard vectors in finite dimensional
effective and lead to more or less free competition. Euclidean space ℝT as functions from T to ℝ, and
write ūi for the ith component of ū, etc. If S T
The assumption that small groups of individ-
and ū ℝT, we shall write ūS: = (ūi i S) for the
uals cannot affect market aggregates, virtually
restriction of ū to S. We write 1 S for the element of
taken for granted by von Neumann and
ℝS all of whose coordinates are 1 (or simply 1 if
Morgenstern, lies behind the answer to the ques-
no confusion can arise.) A payoff vector u is fea-
tion they pose. The results presented in this entry
sible for a group S N if
suggest that the great number of participants will
become effective and lead to more or less free
def
X X
K
competition when small groups of participants uð SÞ ¼ ui v Sk (1)
cannot significantly affect market outcomes. iS k¼1
Since all or almost all gains to collective activities
can be captured by relatively small groups, large for some partition {S1, . . .,SK) of S.
Market Games and Clubs 467
Given e 0, a payoff vector ū ℝN is in the then it must hold that for any i N, ūi 0 and for
weak e-core of the game (N, n) if it is feasible and any i,j N, ui + uj 1. Moreover, feasibility
if there is a group of players N 0 N such that dictates that ū1 + ū2 + ū3 1. This is impossible;
thus, the core is empty.
NnN 0 Before leaving this example, let us ask whether
e (2)
jN j it would be possible to subsidize the players by
increasing the payoff to the total player set N and,
and, for all groups S N 0, by doing so, ensure that the core of the game with
a subsidy is nonempty. We leave it to the reader to
uðSÞ vðSÞ ejSj (3) verify that if v (N ) were increased to $3/2
(or more), the new game would have a
where |S| is the cardinality of the set S. nonempty core.
(It would be possible to use two different values Let (N, n) be a game and let i, j N. Then
for epsilon in expressions (2) and (3). For simplic- players i and j are substitutes if, for all groups
ity, we have chosen to take the same value for S N with i, j 2= S it holds that
epsilon in both expressions.)
A payoff vector ū is in the uniform e -core
v S [ fig ¼ vðS [ fjg :
(or simply in the e-core) if if is feasible and if
(3) holds for all groups S N. When e = 0, Let (N, n) be a game and let ū RN be a payoff
then both notions of e-cores will be called sim- vector for the game. If for all players i and j who
ply the core. are substitutes it holds that ūi = ūj then u has the
The glove game (N, n) described in Example 1 equal treatment property. Note that if there is a
has the happy feature that the core is always partition of N into T subsets, say N1, . . ., NT, where
nonempty. For the game to be of interest, we all players in each subset N t are substitutes for
will suppose that there is least one player of each each other, then we can ū by a vector u ℝT
type (that is, there is at least one player with a RH where, for each t, it holds that ut = ūi for all i Nt.
glove and one player with a LH glove). If | N1 | = |
N2| any payoff vector assigning the same share of
a dollar to each player with a LH glove and the Essential Superadditivity
remaining share of a dollar to each player with a We wish to treat games where the worth of a group
RH glove is in the core. If there are more players of players is independent of the total player set in
of one type, say | N1 | > | N2| for specificity, then which it is embedded and an option open to the
any payoff vector in the core assigns $1 to each members of a group is to partition themselves into
player of the scarce type; that is, players with a RH smaller groups; that is, we treat games that are
glove each receive 0 while players with a LH essentially superadditive. This is built into our the
glove each receive $1. definition of feasibility above, (1). An alternative
Not all games have nonempty cores, as the approach, which would still allow us to treat situ-
following example illustrates. ations where it is optimal for players to form
groups smaller than the total player set, would be
Example 2 (A simple majority game with an to assume that v is the “superadditive cover” of
empty core) Let N = {1, 2, 3} and define the some other worth function v0. Given a not-
function v as follows: necessarily-superadditive function v0, for each
group S define v (S) by:
0 if j S j¼ 1, X
v ð SÞ ¼ vðSÞ ¼ max v 0 Sk (4)
1 otherwise:
It is easy to see that the core of the game is where the maximum is taken over all partitions
empty. For if a payoff vector ū were in the core, {Sk} of S; the function v is the superadditive cover
468 Market Games and Clubs
A game determined by the pregame (T, C), achieve in any partition of the group and one
which we will typically call a game or a game way to achieve this payoff is by partitioning into
with side payments, is a pair [n; (T, C)] where n is smaller groups.
a profile. A subgame of a game [n; (T, C)] is a pair A payoff vector ū satisfies the equal-treatment
[s; (T, C)] where s is a subprofile of n. property if utq ¼ utq0 for all q, q0 {1, . . ., nt} and
With any game[n; (T, C)] we can associate a for each t = 1, . . ., T.
game (N, n) in the form introduced earlier as Let [n, (T, C)] be a game and let b be a collec-
follows: Let tion of subprofiles of n. The collection is a bal-
anced collection of subprofiles of n if there X are
N ¼ fðt, qÞ : t ¼ 1, . . . , Tand q ¼ 1, . . . , nt g positive real numbers gs for s b such that
sb
be a player set for the game. For each subset gs s ¼ n . The numbers gs are called balancing
S N define the profile of S, denoted by prof(S) weights. Given real number e 0, the game [n;
ZTþ , by its components (T, C)] is e-balanced if for every balanced collec-
tion b of subprofiles of n it holds that
def
prof ðsÞT ¼ S \ t0 , q : t0 ¼ t and q ¼ 1, . . . , nt X
C ðnÞ gs ðCðsÞ ekskÞ (7)
sb
and define
def
where the balancing weights for b are given by
vðsÞ ¼ Cðprof ðSÞÞ: gs for s b. This definition extends that of
Bondareva (1963) and Shapley (1967) to games
Then the pair (N, n) satisfies the usual defini- with player types. Roughly, a game is (e) balanced
tion of a game with side payments. For any S N, if allowing “part time” groups does not improve
define the total payoff (by more than e per player).
A game [n; (T, C)] is totally balanced if every
v ðsÞ ¼ C ðprof ðSÞÞ:
def
subgame [s; (T, C)] is balanced.
The balanced cover game generated by a game
The game (N, n *) is the superadditive cover of [n; (T, C)] is a game [n; (T, Cb)] where
(N, n).
A payoff vector for a game (N, n) is a vector 1. Cb(s) = C(s) for all s 6¼ n and
ū RN. For each nonempty subset S of N define 2. Cb(n) C(n) and Cb(n) is as small as possible
consistent with the nonemptiness of the core of
X
def
uð SÞ ¼ utq : [n; (T, Cb)].
ðt, qÞ S
From the Bondareva-Shapley Theorem it fol-
A payoff vector ū is feasible for S if lows that Cb(n) = C (n) if and only if the game
[n; (T, C)] is balanced (e-balanced, with e = 0).
uðSÞ v ðSÞ ¼ C ðprof ðSÞÞ: For later convenience, the notion of the bal-
anced cover of a pregame is introduced. Let (T, C)
If S = N we simply say that the payoff vector ū be a pregame. For each profile s, define
is feasible if X
def
Cb ðsÞ ¼ max gs CðgÞ, (8)
uðN Þ v ðN Þ ¼ C ðprof ðN ÞÞ:
b
sb
Note that our definition of feasibility is consis- where the maximum is taken over all balanced
tent with essential superadditivity; a group can collections b of subprofiles of s with weights gg
realize at least as large a total payoff as it can for g b. The pair (T, Cb) is called the balanced
Market Games and Clubs 471
cover pregame of (T, C). Since a partition of a Theorem 1 Let (T, C) be a premarket derived
profile is a balanced collection it is immediately from economic data in which all utility functions
clear that Cb(s) C (s) for every profile s. are concave. Then the pregame generated by the
premarket is totally balanced.
Premarkets
In this section, we introduce the concept of a pre- Direct Markets and Market-Game Equivalence
market and re-state results from Shapley and Shubik Shapley and Shubik (1969) introduced the notion of
(1969) in the context of pregames and premarkets. a direct market derived from a totally balanced
Let L + 1 be a number of types of commodities game. In the direct market, each player is endowed
and let {ût(y, x): t = 1, . . ., T} denote a finite with one unit of a commodity (himself) and all
number of functions, called utility functions, of players in the economy have the same utility func-
the form tion. In interpretation, we might think of this as a
labor market or as a market for productive factors,
^
u t ðy, xÞ ¼ ut ðyÞ þ x, (as in (Owen 1975), for example) where each player
owns one unit of a commodity. For games with
player types as in this essay, we take the player
where y RLþ and x R. (Such functions, in
types of the game as the commodity types of a
the literature of economics, are commonly called
market and assign all players in the market the
quasi-linear). Let {at RLþ : t = 1, . . ., T} be
same utility function, derived from the worth func-
interpreted as a set of endowments. We assume
tion of the game.
that ut(at) 0 for each t. For t = 1,. . .,T we define
def
Let (T, C) be a pregame and let [n; (T, C)] be a
ct ¼ ðut ðÞ, at Þ as a participant type and let = derived game. Let N = {(t, q): t = 1, . . ., T and
{ct: t = 1, . . ., T} be the set of participant types. q = 1, . . ., nt for each t} denote the set of players in
Observe that from the data given by ℂ we can the game where all participants {(t0, q): q = 1, . . .,
0 0 0
construct a market by specifying a set of partici- nt } are of type t for each t = 1, . . ., T. To
pants N and a function from N to ℂ assigning construct the direct market generated by a derived
endowments and utility functions – types – to game [n; (T, C)], we take the commodity space as
each participant in N. A premarket is a pair (T, ). RTþ and suppose that each participant in the market
Let (T, ) be a premarket and let s = (s1, . . ., sT) of type t is endowed with one unit of the tth
ZTþ . We interpret s as representing a group of commodity, and thus has endowment
economic participants with s t participants having 1t = (0, . . ., 0, 1, 0, . . ., 0) RTþ where “1” is
utility functions and endowments given by c t for in the tth position The total endowment of the
t = 1, . . ., T; for each t, that is, there are s t economy is then given by nt1t = n.
participants in the group with type ct. Observe For any vectors y RTþ define
that the data of a premarket gives us sufficient
def
X
data to generate a pregame. In particular, given a uðyÞ ¼ max gs CðsÞ, (9)
profile s = (s 1, . . ., sT) listing numbers of partic- sn
ipants of each of T types, define
the maximum running over all {gs 0: s ZTþ,
def
X s n} satisfying
W ðsÞ ¼ max st u t ðy t Þ X
t
gs s ¼ y: (10)
sn
where the maximum is taken over the set {yt
Rþ: t = 1, . . ., T and t st yt = tatyt}.Then the pair
L
As noted by Shapley and Shubik (1969), but
(T, W) is a pregame generated by the premarket. for our types case, it can be verified that the
The following Theorem is an extension to pre- function u is concave and one-homogeneous.
markets or a restatement of a result due to Shapley This does not depend on the balancedness of the
and Shubik (1969). game [n; (T, C)]. Indeed, one may think of u as the
472 Market Games and Clubs
“balanced cover of [n; (T, C)] extended to RTþ ”. rules out, for example situations such as econo-
Note also that u is superadditive, independent of mies with indivisible commodities. It also rules
whether the pregame (T, C) is superadditive. We out club economies; for a given club structure of
leave it to the interested reader to verify that if C the set of players – in the simplest case, a partition
were not necessarily superadditive and C* is the of the total player set into groups where collective
superadditive cover of C then it holds that maxs activities only occur within these groups – it may
n gsC(s) = max s n gsC (s).. be that utility functions are concave over the set of
Taking the utility function u as the utility func- alternatives available within each club, but utility
tion of each player (t, q) N where N is now functions need not be concave over all possible
interpreted as the set of participants in a market, we club structures. This rules out many examples; we
have generated a market, called the direct market, provide a simple one below.
denoted by [n, u; (T, C)], from the game [n;(T,C)]. To obtain the result that with many players,
Again, the following extends a result of games derived from pregames are market games,
Shapley and Shubik (1969) to pregames. we need some further assumption on pregames. If
there are many substitutes for each player, then the
Theorem 2 Let [n, u; (T, C)] denote the direct simple condition that per capita payoffs are
market generated by a game [n; (T, C)] and let [n; bounded – that is, given a pregame (T, C), that
(T, u)] denote the game derived from the direct there exists some constant K such that CksðksÞ < K for
market. Then, if [n; (T, C)] is a totally balanced all profiles s – suffices. If, however, there may be
game, it holds that [n; (T, u)] and [n; (T, C)] are ‘scarce types’, that is, players of some type
identical. (s) become negligible in the population, then a
stronger assumption of ‘small group effective-
Remark 2 If the game [n; (T, C)] and every
ness’ is required. We discuss these two conditions
subgame [s, (T, C)] has a nonempty core – that
in the next section.
is, if the game is ‘totally balanced’ then the game
[n; (T, u)] generated by the direct market is the
Small Group Effectiveness and Per Capita
initially given game [n; (T, C)]. If however the
Boundedness
game [n; (T, C)] is not totally balanced then u-
This section discusses conditions limiting gains to
(s) C(s) for all profiles s n. But, whether
group size and their relationships. This definition
or not [n; (T, C)] is totally balanced, the game [n;
was introduced in Wooders (1983), for NTU, as
(T, n)] is totally balanced and coincides with the
well as TU, games.
totally balanced cover of [n; (T, C)].
PCB A pregame (T, C) satisfies per capita
Remark 3 Another approach to the equivalence of boundedness (PCB) if
markets and games is taken by Garratt and Qin
(1997), who define a class of direct lottery markets. CðsÞ
PCB : sup isfinite (11)
While a player can participate in only one coalition, s ZTþ ks k
both ownership of coalitions and participation in
coalitions is determined randomly. Each player is or equivalently,
endowed with one unit of probability, his own
participation Players can trade their endowments C ðsÞ
sup isfinite:
at market prices. The core of the game is equivalent s ZTþ ks k
to the equilibrium of the direct market lottery.
It is known that under the apparently mild con-
Equivalence of Markets and Games with ditions of PCB and essential superadditivity, in
Many Players general games with many players of each of a finite
number of player types and a fixed distribution of
The requirement of Shapley and Shubik (1969) player types have nonempty approximate cores;
that utility functions be concave is restrictive. It Wooders (1977, 1983). (Forms of these
Market Games and Clubs 473
assumptions were subsequently also used in all other profiles s have worth of zero. In the
Shubik and Wooders (1982c, 1983b); Kaneko superadditive cover game the worth of a profile
and Wooders (1986); and Wooders (1992b, 1994) s is 0 if s 1 < 2 and otherwise is equal to s 2 plus the
among others.) Moreover, under the same condi- largest even number less than or equal to s1.
tions, approximate cores have the property that Now consider a sequence of profiles (sv)v
most players of the same type are treated approxi- where sv1 ¼ 3 and sv2 ¼ v for all n. Given e > 0,
mately equally (Wooders 1977, 2008a; see also for all sufficiently large player sets the uniform
Shubik and Wooders (1982c)). These results, how- e-core is empty. Take, for example, e = 1/4. If the
ever, either require some assumption ruling out uniform e-core were nonempty, it would have to
‘scarce types’ of players, for example, situations contain an equal-treatment payoff vector. (It is
where there are only a few players of some partic- well known and easily demonstrated that the uni-
ular type and these players can have great effects form e-core of a TU game is nonempty if and only
on total feasible payoffs. Following are two exam- if it contains an equal treatment payoff vector.
ples. The first illustrates that PCB does not control This follows from the fact that the uniform
limiting properties of the per capita payoff function e-core is a convex set.) For the purpose of dem-
when some player types are scarce. onstrating
a contradiction, suppose that
uv ¼ uv1 , uv2 represents an equal treatment payoff
Example 3 (Wooders 2008a) Let T = 2 and let vector in the uniform e-core of [sv; (T, C)]. The
(T, C) be the pregame given by following inequalities must hold:
s1 þ s2 when s1 > 0 3uv1 þ vuv2 v þ 2,
Cðs1 , s2 Þ ¼
0 otherwise: 2uv1 þ vuv2 v þ 2, and
3
uv1 :
The function C obviously satisfies PCB. But 4
there is a problem in defining limC(s1,s2)/s1 + s2
as s1 + s2 tends to infinity, since the limit depends on which is impossible. A payoff vector which
how
v vit is approached.
v v Consider the sequence
assigns each player zero is, however, in the weak
s1 , s2 where s1 , s2 = (0, v); then lim C sv1 , sv2 e-core for any e > vþ3
1
. But it is not very appealing,
/sv1 , sv2 = 0. Now suppose
v v in contrast that sv1 , sv2 in situations such as this, to ignore a relatively
= (1, v); then lim C s1 , s2 /sv1 , sv2 = 1. This illus- small group of players (in this case, the players of
trates why, to obtain the result that games with many type 1) who can have a large effect on per capita
players are market games either it must be required payoffs. This leads us to the next concept.
that there are no scarce types or some assumption To treat the scarce types problem, Wooders
limiting the effects of scarce types must be made. (1992a, b, 1993) introduced the condition of
We return to this example in the next section. small group effectiveness (SGE). SGE is appeal-
The next example illustrates that, with only PCB, ing technically since it resolves the scarce types
uniform approximate cores of games with many problem. It is also economically intuitive and
players derived from pregames may be empty. appealing; the condition defines a class of econo-
mies that, when there are many players, generate
Example 4 (Wooders 2008a) Consider a pre- competitive markets. Informally, SGE dictates
game (T, C) where T = {1, 2} and C is the that almost all gains to collective activities can
superadditive cover of the function C0 defined by: be realized by relatively small groups of players.
Thus, SGE is exactly the sort of assumption
0 def jsj if s1 ¼ 2, required to ensure that multiple, relatively small
C ðsÞ ¼
0 otherwise: coalitions, firms, jurisdictions, or clubs, for exam-
ple, are optimal or near-optimal in large
Thus, if a profiles = (s1, s2) has s1 = 2 then the economies.
worth of the profile according to C0 is equal to the A pregame (T, C) satisfies small group effec-
total number of players it represents, s1 + s2, while tiveness, SGE, if:
474 Market Games and Clubs
For each real numbere > 0, per capita boundedness has significant conse-
there is an integer0 ðeÞ quences but is quite innocuous – ruling out the
such that for each profiles,
possibility of average utilities becoming infinite as
SGE
: for some partition sk of swith economies grow large does not seem restrictive.
sk
ðeÞ for each subprofilesk , it holds that But with only per capita boundedness, even the
0P
C ðsÞ k C sk eksk; formation of small coalitions can have significant
(12) impacts on aggregate outcomes. With small group
effectiveness, however, there is no problem of
given e > 0 there is a group size 0(e) such that either large or small coalitions acting together –
the loss from restricting collective activities large coalitions cannot do significantly better then
within groups to groups containing fewer that relatively small coalitions.
0(e) members is at most e per capita (Wooders Roughly, the property of large games we next
1992a). (Exactly the same definition applies to introduce is that relatively small groups of players
situations with a compact metric space of player make only “asymptotic negligible” contributions
types, c.f. Wooders (1988, 1992a).) to per-capita payoffs of large groups. A pregame
SGE also has the desirable feature that if there (O, C) satisfies asymptotic negligibility if, for any
are no ‘scarce types’ – types of players that appear sequence of profiles {f v} where
in vanishingly small proportions- then SGE and
PCB are equivalent. kf v k ! 1as
0v ! 1,
sðf v Þ ¼ s f v for all v and v0 and
v (13)
Theorem 3 (Wooders 1994) With ‘thickness,’ C ðf Þ
limv!1 exists,
SGE = PCB) (1) Let (T, C) be a pregame satisfy- kf v k
ing SGE. Then the pregame satisfies PCB.
(2) Let (T, C) be a pregame satisfying PCB. then for any sequence of profiles {‘v} with
Then given any positive real number r, construct a
new pregame (T, Cr) where the domain of Cr is k‘ v k
lim ¼ 0, (14)
restricted to profiles s where, for each t = 1,. . .,T, v!1 kf v k
either ksstk > r or st = 0. Then (T, Cr) satisfies SGE
on its domain. it holds that
It can also be shown that small groups are
effective for the attainment of nearly all feasible C kf v þ ‘v k
limv!1 exists, and
outcomes, as in the above definition, if and only if kf v þ ‘ v k
(15)
small groups are effective for improvement – any C kf v þ ‘v k C ðf v Þ
limv!1 ¼ lim v!1 :
payoff vector that can be significantly improved kf v þ ‘ v k kf v k
upon can be improved upon by a small group (see
Proposition 3.8 in Wooders 1992b). Theorem 4 (Wooders 1992b, 2008b) A pregame
(T, C) satisfies SGE if and only if it satisfies PCB
Remark 4 Under a stronger condition of strict and asymptotic negligibility Intuitively, asymptotic
small group effectiveness, which dictates that negligibility ensures that vanishingly small per-
(e) in the definition of small group effectiveness centages of players have vanishingly small effects
can be chosen independently of e, stronger results on aggregate per-capita worths. It may seem para-
can be obtained than those presented in this sec- doxical that SGE, which highlights the importance
tion and the next. We refer to Winter and Wooders of relatively small groups, is equivalent to asymp-
(1990) for a treatment of this case. totic negligibility. To gain some intuition, however,
think of a marriage model where only two-person
Remark 5 (On the importance of taking into marriages are allowed. Obviously two-person
account scarce types) Recall the quotation from groups are (strictly) effective, but also, in large
von Neumann and Morgenstern and the discus- player sets, no two persons can have a substantial
sion following the quotation. The assumption of affect on aggregate per-capita payoffs.
Market Games and Clubs 475
Remark 6 Without some assumptions ensuring Then for any x RTþ the limit (16) exists. More-
essential superadditivity, at least as incorporated over, U() is well-defined, concave and
into our definition of feasibility, nonemptiness of 1-homogeneous and the convergence is uniform
approximate cores of large games cannot be in the sense that, given e > 0 there is an integer Z
expected; superadditivity assumptions (or the such that for all profiles s with ksk it holds that
close relative, essential superadditivity) are
heavily relied upon in all papers on large games
U s C ðsÞ e:
cited. In the context of economies, superadditivity ks k ks k
is a sort of monotonicity of preferences or produc-
tion functions assumption, that is, superadditivity
From Wooders (1994) (Theorem 4), if arbi-
of C implies that for all s, s0 ZTþ , it holds that trarily small percentages of players of any type
0
C(s + s ) C(s) + C(s0). Our assumption of small that appears in games generated by the pregame
group effectiveness, SGE, admits non- are ruled out, then the above result holds under per
monotonicities. For example, suppose that ‘two capita boundedness (Wooders 1994) (Theorem 6).
is company, three or more is a crowd,’ by suppos- As noted in the introduction to this paper, for the
ing there is only one commodity and by setting TU case, the concavity of the limiting utility func-
C(2) = 2, C(n) = 0 for n 6¼ 2. The reader can tion, for the model of Wooders (1983) was first
verify, however, that this example satisfies small noted by Aumann (1987). The concavity is shown
group effectiveness since C *(n) = n if n is even to hold with a compact metric space of player
and C *(n) = n 1 otherwise. Within the context types in Wooders (1988) and is simplified to the
of pregames, requiring the superadditive cover finite types case in Wooders (1994).
payoff to be approximately realizable by parti- Theorem 5 follows from the facts that the func-
tions of the total player set into relatively small tion U is superadditive and 1-homogeneous on its
groups is the weakest form of superadditivity domain. Since U is concave, it is continuous on
required for the equivalence of games with many the interior of its domain; this follows from PCB.
players and concave markets. Small group effectiveness ensures that the func-
tion U is continuous on its entire domain
Derivation of Markets from Pregames (Wooders 1994)(Lemma 2).
Satisfying SGE
With SGE and PCB in hand, we can now derive a
Theorem 6 (Wooders 1994) Let (T, C) be a
premarket from a pregame and relate these
pregame satisfying small group effectiveness
concepts.
and let (T, U) denote the derived direct market
To construct a limiting direct premarket from a
pregame. Then (T, U) is a totally balanced mar-
pregame, we first define an appropriate utility
ket game. Moreover, U is one-homogeneous,
function. Let (T, C) be a pregame satisfying
that is, U(lx) = lU(x) for any non-negative
SGE. For each vector X in RTþ define
real number l.
In interpretation, T denotes a number of types
def C ðf v Þ of players/commodities and U denotes a utility
U ðxÞ ¼ kxk lim (16)
v!1 kf v k function on RTþ. Observe that when U is restricted
to profiles (in ZTþ), the pair (T, U) is a pregame with
where the sequence {f v} satisfies the property that every game [n; (T, U)] has a
nonempty core; thus, we will call (T, U) the pre-
market generated by the pregame (T, C). That
fv x every game derived from (T, U) has a nonempty
limv!1 ¼ and kf v k ! 1 (17)
kf v k kx k core is a consequence of the Shapley and Shubik
(1969) result that market games derived from
Theorem 5 (Wooders 1988, 1994) Assume the markets with concave utility functions are totally
pregame (T, C) satisfies small group effectiveness. balanced.
476 Market Games and Clubs
Cores and Approximate Cores It is well known that the e-core of a game with
transferable utility is nonempty if and only if the
The concept of the core clearly was important in equal-treatment e-core is nonempty.
the work of Shapley and Shubik (1966, 1969, Continuing with the notation above, for any
1975) and is also important for the equivalence s RTþ , let P(s) denote the set of subgradients
of games with many players and market games. to the function U at the point s;
Thus, we discuss the related results of non-
emptiness of approximate cores and convergence def
PðsÞ ¼ p RT : p sand p s0 U ðs0 Þ
of approximate cores to the core of the ‘limit’ – the (19)
game where all players have utility functions foralls0 RTþ :
derived from a pregame and large numbers of
players. First, some terminology is required. The elements in P(s) can be interpreted as equal-
A vector p is a subgradient at x of the concave treatment core payoffs to a limiting game with the
function U if U(y) U(x) p (y x) for all y. mass of players of type t given by st. The core payoff
One might think of a subgradient as a bounding to a player is simply the value of the one unit of a
hyperplane. To avoid any confusion it might be commodity (himself and all his attributes, including
helpful to note that, as Mas-Colell (1985) endowments of resources) that he owns in the direct
remarks: “Strictly speaking, one should use the market generated by a game. Thus P() is called the
term subgradient for convex functions and super- limiting core correspondence for the pregame (T,
gradient for concave. But this is cumbersome”, C). Of course P() is also the limiting core corre-
(pp. 29–30 in Mas-Colell 1985). spondence for the pregame (T, U).
Market Games and Clubs 477
1 X X
jN j1
1
^ ð f Þ < d,
dist½Cð f ; eÞ, Pð f Þ < d and dist Cð f ; eÞ, P SH ðv, iÞ ¼
jN j J¼0 jN j 1
S Nnfig
where ‘dist’ is the Hausdorff distance with J jSj¼J
Theorem 10 (Wooders and Zame 1987) Let (T, In the simplest case, the utility of an individual
C) be a superadditive pregame satisfying bound- depends on the club profile (the numbers of par-
edness of marginal contributions. For each e > 0 ticipants of each type) in his club. The total worth
there is a number d(e) > 0 and an integer m(e) with of a group of players is the maximum that it can
the following property: achieve by splitting into clubs. The results pre-
If [n, (T, C)] is a game derived from the pregame, sented in this section immediately apply. When
for which nt > m(e) for each t, then the Shapley there are many participants, club economies can
value of the game is in the (weak) e-core. be represented as markets and the competitive
Similar results hold within the context of private goods exchange economies (cf. Shapley (1964), Shapley and Shubik (1969), Champsaur (1975), Mas-Colell (1977), Cheng (1981) and others). Some of these results are for economies without money, but all treat private goods exchange economies with divisible goods and concave, monotone utility functions. Moreover, they all treat either replicated sequences of economies or convergent sequences of economies. That games satisfying SGE are asymptotically equivalent to balanced market games clarifies the contribution of the above result. In the context of the prior results developed in this paper, the major shortcoming of the Theorem is that it requires BMC. This author conjectures that the above result, or a close analogue, could be obtained with the milder condition of SGE, but this has not been demonstrated.

Economies with Clubs

By a club economy we mean an economy where participants in the economy form groups – called clubs – for the purposes of collective consumption and/or production with the group members. The groups may possibly overlap. A club structure of the participants in the economy is a covering of the set of players by clubs. Provided that utility functions are quasi-linear, such an economy generates a game of the sort discussed in this essay. The worth of a group of players is the maximum total worth that the group can achieve by forming clubs. The most general model of clubs in the literature at this point is Allouch and Wooders (2008). Yet, if one were to assume that utility functions were all quasi-linear and the set of possible types of participants were finite, the results of this paper would apply.
In the simplest case, the utility of an individual depends on the club profile (the numbers of participants of each type) in his club. The total worth of a group of players is the maximum that it can achieve by splitting into clubs. The results presented in this section immediately apply. When there are many participants, club economies can be represented as markets and the competitive payoff vectors for the market are approximated by equal-treatment payoff vectors in approximate cores. Approximate cores converge to equal-treatment and competitive equilibrium payoffs.
A more general model making these points is treated in Shubik and Wooders (1982a). For recent reviews of the literature, see Conley and Smith (2005) and Kovalenkov and Wooders (2005). (Other approaches to economies with clubs/local public goods include Casella and Feinstein (2002), Demange (1994), Haimanko, Le Breton and Weber (2004), and Konishi, Le Breton and Weber (1998). Recent research has treated clubs as networks.)
Coalition production economies may also be viewed as club economies. We refer the reader to Böhm (1974), Sondermann (1974), Shubik and Wooders (1983b), and, for a more recent treatment and further references, Sun, Trockel and Yang (2008).
Let us conclude this section with some historical notes. Club economies came to the attention of the economics profession with the publication of Buchanan (1965). The author pointed out that people care about the numbers of other people with whom they share facilities such as swimming pool clubs. Thus, there may be congestion, leading people to form multiple clubs. Interestingly, much of the recent literature on club economies with many participants and their competitive properties has roots in an older paper, Tiebout (1956). Tiebout conjectured that if public goods are 'local' – that is, subject to exclusion and possibly congestion – then large economies are 'market-like'. A first paper treating club economies with many participants was Pauly (1970), who showed that, when all players have the same preferred club size, then the core of the economy is nonempty if and only if all participants in the economy can be partitioned into groups of the
preferred size. Wooders (1978) modeled a club economy as one with local public goods and demonstrated that, when individuals within a club (jurisdiction) are required to pay the same share of the costs of public good provision, then outcomes in the core permit heterogeneous clubs if and only if all types of participants in the same club have the same demands for local public goods and for congestion. Since these early results, the literature on clubs has grown substantially.

With a Continuum of Players

Since Aumann (1964) much work has been done on economies with a continuum of players. It is natural to question whether the asymptotic equivalence of markets and games reported in this article holds in a continuum setting. Some such results have been obtained.
First, let N = [0, 1] be the unit interval with Lebesgue measure and suppose there is a partition of N into a finite set of subsets N_1, . . ., N_T where, in interpretation, a point in N_t represents a player of type t. Let C be given. Observe that C determines a payoff for any finite group of players, depending on the numbers of players of each type. If we can aggregate partitions of the total player set into finite coalitions, then we have defined a game with a continuum of players and finite coalitions.
For a partition of the continuum into finite groups to 'make sense' economically, it must preserve the relative scarcities given by the measure. This was done in Kaneko and Wooders (1986). To illustrate their idea of measurement-consistent partitions of the continuum into finite groups, think of a census form that requires each three-person household to label the players in the household #1, #2, or #3. When checking the consistency of its figures, the census taker would expect the numbers of people labeled #1 in three-person households to equal the numbers labeled #2 and #3. For consistency, the census taker may also check that the number of first persons in three-person households in a particular region is equal to the number of second persons and third persons in three-person households in that region. It is simple arithmetic. This consistency should also hold for k-person households for any k. Measurement consistency is the same idea with the word "number" replaced by "proportion" or "measure".
One can immediately apply results reported above to the special case of TU games of Kaneko-Wooders (1986) and conclude that games satisfying small group effectiveness and with a continuum of players have nonempty cores and that the payoff function for the game is one-homogeneous. (We note that there have been a number of papers investigating cores of games with a continuum of players that have come to the conclusion that non-emptiness of exact cores does not hold, even with balancedness assumptions; cf. Weber (1979, 1981).) The results of Wooders (1994) show that the continuum economy must be representable by one where all players have the same concave, continuous, one-homogeneous utility functions. Market games with a continuum of players and a finite set of types are also investigated in Azrieli and Lehrer (2007), who confirm these conclusions.
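The bookkeeping behind measurement consistency can be sketched in a few lines (an illustration invented here, not taken from Kaneko and Wooders): describe a partition of the population by the measure of groups of each profile and check, type by type, that the masses add up to the population measures.

```python
# Population measures of each player type (hypothetical numbers for the example).
population = {"a": 0.6, "b": 0.4}

# A candidate partition into finite groups: keys are group profiles (the types of
# the members), values are the measures of groups with that profile.
partition = {("a", "a", "b"): 0.2,   # mixed three-person groups
             ("a",): 0.2,            # singletons of type a
             ("b",): 0.2}            # singletons of type b

def is_measurement_consistent(population, partition, tol=1e-9):
    """Check that, for every type, the mass of that type used across all groups
    equals the mass of that type in the underlying population."""
    used = {t: 0.0 for t in population}
    for profile, measure in partition.items():
        for t in profile:
            used[t] += measure
    return all(abs(used[t] - population[t]) < tol for t in population)

print(is_measurement_consistent(population, partition))   # True: 2*0.2 + 0.2 = 0.6 and 0.2 + 0.2 = 0.4
```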
Other Related Concepts and Results

In an unpublished 1972 paper due to Edward Zajac (1972), which has motivated a large amount of literature on 'subsidy-free pricing', cost sharing, and related concepts, the author writes: "A fundamental idea of equity in pricing is that 'no consumer group should pay higher prices than it would pay by itself. . .'. If a particular group is paying a higher price than it would pay if it were severed from the total consumer population, the group feels that it is subsidizing the total population and demands a price reduction".
The "dual" of the cost allocation problem is the problem of surplus sharing and subsidy-free pricing. (See, for example, Moulin (1988, 1992) for excellent discussions of these two problems.) Tauman (1987) provides an excellent survey. Some recent works treating cost allocation and subsidy-free pricing include Moulin (1988, 1992). See also the recent notion of "Walras' core" in Qin, Shapley and Shimomura (2006).
Another related area of research has been into whether games with many players satisfy some notion of the Law of Demand of consumer theory (or the Law of Supply of producer theory). Since games with many players resemble market games, which have the property that an increase in the endowment of a commodity leads to a decrease in its price, such a result should be expected. Indeed, for games with many players, a Law of Scarcity holds – if the number of players of a particular type is increased, then core payoffs to players of that type do not increase and may decrease. (This result was observed by Scotchmer and Wooders (1988).) See Kovalenkov and Wooders (2005, 2006) for the most recent version of such results and a discussion of the literature. Laws of scarcity in economies with clubs are examined in Cartwright, Conley and Wooders (2006).

Some Remarks on Markets and More General Classes of Economies

Forms of the equivalence of outcomes have been obtained for economies where individuals have concave utility functions that are not necessarily linear in money. These include Billera (1974), Billera and Bixby (1974) and Mas-Colell (1975). A natural question is whether the results reported in this paper can extend to nontransferable utility games and economies where individuals have utility functions that are not necessarily linear in money. So far the results obtained are not entirely satisfactory. Non-emptiness of approximate cores of games with many players, however, holds in substantial generality; see Kovalenkov and Wooders (2003) and Wooders (2008b).

Conclusions and Future Directions

The results of Shapley and Shubik (1969), showing equivalence of structures, rather than equivalence of outcomes of solution concepts in a fixed structure (as in Aumann 1964, for example), are remarkable. So far, this line of research has been relatively little explored. The results for games with many players have also not been fully explored, except for in the context of games, such as those derived from economies with clubs, and with utility functions that are linear in money.
Per capita boundedness seems to be about the mildest condition that one can impose on an economic structure and still have scarcity of per capita resources in economies with many participants. In economies with quasi-linear utilities (and here, I mean economies in a general sense, as in the glossary) satisfying per capita boundedness and where there are many substitutes for each type of participant, as the number of participants grows, these economies resemble or (as if they) are market economies where individuals have continuous and monotonic increasing utility functions. Large groups cannot influence outcomes away from outcomes in the core (and outcomes of free competition) since large groups are not significantly more effective than many small groups (from the equivalence, when each player has many close substitutes, between per capita boundedness and small group effectiveness). But if there are not many substitutes for each participant, then, as we have seen, per capita boundedness allows small groups of participants to have large effects and free competition need not prevail (cores may be empty and price-taking equilibrium may not exist). The condition required to ensure free competition in economies with many participants, without assumptions of "thickness", is precisely small group effectiveness.
But the most complete results relating markets and games, outlined in this paper, deal with economies in which all participants have utility functions that are linear in money and in games with side payments, where the worth of a group can be divided in any way among the members of the group without any loss of total utility or worth. Nonemptiness of approximate cores of large games without side payments has been demonstrated; see Wooders (1983, 2008b) and Kovalenkov and Wooders (2003). Moreover, it has been shown that when side payments are limited, then approximate cores of games without side payments treat similar players similarly (Kovalenkov and Wooders 2001).
Results for specific economic structures, relating cores to price-taking equilibrium, can treat situations that are, in some respects, more general. A substantial body of literature shows that certain classes of club economies have nonempty cores and also investigates price-taking equilibrium in these situations. Fundamental results are provided by Gale and Shapley (1962), Shapley and Shubik (1972), and Crawford and Kelso (1982) and many more recent papers. We refer the reader to Roth and Sotomayor (1990) and to Two-Sided Matching Models, by Ömer and Sotomayor in this encyclopedia. A special feature of the models of these papers is that there are two sorts of players or two sides to the market; examples are (1) men and women, (2) workers and firms, (3) interns and hospitals, and so on.
Going beyond two-sided markets to clubs in general, however, one observes that the positive results on nonemptiness of cores and existence of price-taking equilibria only hold under restrictive conditions. A number of recent contributions, however, provide specific economic models for which, when there are many participants in the economy, as in exchange economies, it holds that price-taking equilibrium exists, cores are non-empty, and the set of outcomes of price-taking equilibrium is equivalent to the core (see, for example, Allouch and Wooders 2008; Allouch et al. 2008; Ellickson et al. 1999; Wooders 1989, 1997).

Bibliography

Allouch N, Wooders M (2008) Price taking equilibrium in economies with multiple memberships in clubs and unbounded club sizes. J Econ Theor 140:246–278
Allouch N, Conley JP, Wooders M (2008) Anonymous price taking equilibrium in Tiebout economies with a continuum of agents: existence and characterization. J Math Econ. https://doi.org/10.1016/j.jmateco.2008.06.003
Aumann RJ (1964) Markets with a continuum of traders. Econometrica 32:39–50
Aumann RJ (1987) Game theory. In: Eatwell J, Milgate M, Newman P (eds) The new Palgrave: a dictionary of economics. Palgrave MacMillan, Basingstoke
Aumann RJ, Dreze J (1974) Cooperative games with coalition structures. Int J Game Theory 3:217–237
Aumann RJ, Shapley S (1974) Values of non-atomic games. Princeton University Press, Princeton
Azrieli Y, Lehrer E (2007) Market games in large economies with a finite number of types. Econ Theor 31:327–342
Bennett E, Wooders M (1979) Income distribution and firm formation. J Comp Econ 3:304–317. http://www.myrnawooders.com/
Bergstrom T, Varian HR (1985) When do market games have transferable utility? J Econ Theor 35(2):222–233
Billera LJ (1974) On games without side payments arising from a general class of markets. J Math Econ 1(2):129–139
Billera LJ, Bixby RE (1974) Market representations of n-person games. Bull Am Math Soc 80(3):522–526
Böhm V (1974) The core of an economy with production. Rev Econ Stud 41:429–436
Bondareva O (1963) Some applications of linear programming to the theory of cooperative games. Problemy kibernetiki 10 (in Russian; see English translation in Selected Russian papers in game theory 1959–1965. Princeton University Press, Princeton)
Buchanan J (1965) An economic theory of clubs. Economica 33:1–14
Cartwright E, Conley J, Wooders M (2006) The law of demand in Tiebout economies. In: Fischel WA (ed) The Tiebout model at 50: essays in public economics in honor of Wallace Oates. Lincoln Institute of Land Policy, Cambridge
Casella A, Feinstein JS (2002) Public goods in trade on the formation of markets and jurisdictions. Intern Econ Rev 43:437–462
Champsaur P (1975) Competition vs. cooperation. J Econ Theory 11:394–417
Cheng HC (1981) On dual regularity and value convergence theorems. J Math Econ 8:37–57
Conley J, Smith S (2005) Coalitions and clubs; Tiebout equilibrium in large economies. In: Demange G, Wooders M (eds) Group formation in economies; networks, clubs and coalitions. Cambridge University Press, Cambridge
Conley JP, Wooders M (1995) Hedonic independence and taste-homogeneity of optimal jurisdictions in a Tiebout economy with crowding types. Ann D'Econ Stat 75(76):198–219
Crawford VP, Kelso AS (1982) Job matching, coalition formation, and gross substitutes. Econometrica 50:1483–1504
Debreu G, Scarf H (1963) A limit theorem on the core of an economy. Int Econ Rev 4:235–246
Demange G (1994) Intermediate preferences and stable coalition structures. J Math Econ 1994:45–48
Ellickson B, Grodal B, Scotchmer S, Zame W (1999) Clubs and the market. Econometrica 67:1185–1218
Gale D, Shapley LS (1962) College admissions and the stability of marriage. Am Math Mon 69:9–15
Garratt R, Qin C-Z (1997) On a market for coalitions with indivisible agents and lotteries. J Econ Theor 77(1):81–101
Gillies DB (1953) Some theorems on n-person games. PhD Dissertation, Department of Mathematics, Princeton University, Princeton
Haimanko O, Le Breton M, Weber S (2004) Voluntary formation of communities for the provision of public projects. J Econ Theor 115:1–34
Hildenbrand W (1974) Core and equilibria of a large economy. Princeton University Press, Princeton
Hurwicz L, Uzawa H (1977) Convexity of asymptotic average production possibility sets. In: Arrow KJ, Hurwicz L (eds) Studies in resource allocation processes. Cambridge University Press, Cambridge
Kalai E, Zemel E (1982a) Totally balanced games and games of flow. Math Oper Res 7:476–478
Kalai E, Zemel E (1982b) Generalized network problems yielding totally balanced games. Oper Res 30:998–1008
Kaneko M, Wooders M (1982) Cores of partitioning games. Math Soc Sci 3:313–327
Kaneko M, Wooders M (1986) The core of a game with a continuum of players and finite coalitions; the model and some results. Math Soc Sci 12:105–137. http://www.myrnawooders.com/
Kaneko M, Wooders M (2004) Utility theories in cooperative games, Chapter 19. In: Handbook of utility theory, vol 2. Kluwer, Dordrecht, pp 1065–1098
Kannai Y (1972) Continuity properties of the core of a market. Econometrica 38:791–815
Konishi H, Le Breton M, Weber S (1998) Equilibrium in a finite local public goods economy. J Econ Theory 79:224–244
Kovalenkov A, Wooders M (2001) Epsilon cores of games with limited side payments: nonemptiness and equal treatment. Games Econ Behav 36(2):193–218
Kovalenkov A, Wooders M (2003) Approximate cores of games and economies with clubs. J Econ Theory 110:87–120
Kovalenkov A, Wooders M (2005) A law of scarcity for games. Econ Theor 26:383–396
Kovalenkov A, Wooders M (2006) Comparative statics and laws of scarcity for games. In: Aliprantis CD, Matzkin RL, McFadden DL, Moore JC, Yannelis NC (eds) Rationality and equilibrium: a symposium in honour of Marcel K. Richter. Studies in economic theory series, vol 26. Springer, Berlin, pp 141–169
Mas-Colell A (1975) A further result on the representation of games by markets. J Econ Theor 10(1):117–122
Mas-Colell A (1977) Indivisible commodities and general equilibrium theory. J Econ Theory 16(2):443–456
Mas-Colell A (1979) Competitive and value allocations of large exchange economies. J Econ Theor 14:307–310
Mas-Colell A (1980) Efficiency and decentralization in the pure theory of public goods. Q J Econ 94:625–641
Mas-Colell A (1985) The theory of general economic equilibrium. Econometric Society Publication No. 9. Cambridge University Press, Cambridge
Moulin H (1988) Axioms of cooperative decision making. Econometric Society Monograph No. 15. Cambridge University Press, Cambridge
Moulin H (1992) Axiomatic cost and surplus sharing, Chapter 6. In: Arrow K, Sen AK, Suzumura K (eds) Handbook of social choice and welfare, 1st edn, vol 1. Elsevier, Amsterdam, pp 289–357
von Neumann J, Morgenstern O (1953) Theory of games and economic behavior. Princeton University Press, Princeton
Owen G (1975) On the core of linear production games. Math Program 9:358–370
Pauly M (1970) Cores and clubs. Public Choice 9:53–65
Qin C-Z, Shapley LS, Shimomura K-I (2006) The Walras core of an economy and its limit theorem. J Math Econ 42(2):180–197
Roth A, Sotomayor M (1990) Two-sided matching; a study in game-theoretic modeling and analysis. Cambridge University Press, Cambridge
Scotchmer S, Wooders M (1988) Monotonicity in games that exhaust gains to scale. IMSSS Technical Report No. 525, Stanford University
Shapley LS (1952) Notes on the N-person game III: some variants of the von Neumann-Morgenstern definition of solution. Rand Corporation Research Memorandum RM-817, 1952
Shapley LS (1964) Values of large games-VII: a general exchange economy with money. Rand Memorandum RM-4248-PR
Shapley LS (1967) On balanced sets and cores. Nav Res Logist Q 9:45–48
Shapley LS, Shubik M (1960) On the core of an economic system with externalities. Am Econ Rev 59:678–684
Shapley LS, Shubik M (1966) Quasi-cores in a monetary economy with nonconvex preferences. Econometrica 34:805–827
Shapley LS, Shubik M (1969) On market games. J Econ Theor 1:9–25
Shapley LS, Shubik M (1972) The assignment game 1; The core. Int J Game Theor 1:11–30
Shapley LS, Shubik M (1975) Competitive outcomes in the cores of market games. Int J Game Theor 4:229–237
Shapley LS, Shubik M (1977) Trade using one commodity as a means of payment. J Political Econ 85:937–968
Shubik M (1959a) Edgeworth market games. In: Luce FR, Tucker AW (eds) Contributions to the theory of games IV. Annals of mathematical studies 40. Princeton University Press, Princeton, pp 267–278
Shubik M (1959b) Edgeworth market games. In: Luce FR, Tucker AW (eds) Contributions to the theory of games IV. Annals of mathematical studies 40. Princeton University Press, Princeton, pp 267–278
Shubik M, Wooders M (1982a) Clubs, markets, and near-market games. In: Wooders M (ed) Topics in game theory and mathematical economics: essays in honor of Robert J Aumann. Field Institute Communication Volume, American Mathematical Society, originally Near Markets and Market Games, Cowles Foundation Discussion Paper No. 657
Shubik M, Wooders M (1982b) Near markets and market games. Cowles Foundation Discussion Paper No. 657. http://www.myrnawooders.com/
Shubik M, Wooders M (1982c) Clubs, markets, and near-market games. In: Wooders M (ed) Topics in game theory and mathematical economics: essays in honor of Robert J Aumann. Field Institute Communication
Volume, American Mathematical Society, originally Near Markets and Market Games, Cowles Foundation Discussion Paper No. 657
Shubik M, Wooders M (1983a) Approximate cores of replica games and economies: part II set-up costs and firm formation in coalition production economies. Math Soc Sci 6:285–306
Shubik M, Wooders M (1983b) Approximate cores of replica games and economies: part I replica games, externalities, and approximate cores. Math Soc Sci 6:27–48
Shubik M, Wooders M (1983c) Approximate cores of replica games and economies: part II set-up costs and firm formation in coalition production economies. Math Soc Sci 6:285–306
Shubik M, Wooders M (1986) Near-markets and market-games. Econ Stud Q 37:289–299
Sondermann D (1974) Economics of scale and equilibria in coalition production economies. J Econ Theor 8:259–291
Sun N, Trockel W, Yang Z (2008) Competitive outcomes and endogenous coalition formation in an n-person game. J Math Econ 44:853–860
Tauman Y (1987) The Aumann-Shapley prices: a survey. In: Roth A (ed) The Shapley value: essays in honor of Lloyd S Shapley. Cambridge University Press, Cambridge
Tauman Y, Urbano A, Watanabe J (1997) A model of multiproduct price competition. J Econ Theor 77:377–401
Tiebout C (1956) A pure theory of local expenditures. J Political Econ 64:416–424
Weber S (1979) On ε-cores of balanced games. Int J Game Theor 8:241–250
Weber S (1981) Some results on the weak core of a non-sidepayment game with infinitely many players. J Math Econ 8:101–111
Winter E, Wooders M (1990) On large games with bounded essential coalition sizes. University of Bonn Sonderforschungsbereich 303 Discussion Paper B-149. http://www.myrnawooders.com/. Intern J Econ Theor (2008) 4:191–206
Wooders M (1977) Properties of quasi-cores and quasi-equilibria in coalition economies. SUNY-Stony Brook Department of Economics Working Paper No. 184, revised (1979) as A characterization of approximate equilibria and cores in a class of coalition economies. State University of New York Stony Brook Economics Department. http://www.myrnawooders.com/
Wooders M (1978) Equilibria, the core, and jurisdiction structures in economies with a local public good. J Econ Theor 18:328–348
Wooders M (1983) The epsilon core of a large replica game. J Math Econ 11:277–300. http://www.myrnawooders.com/
Wooders M (1988) Large games are market games 1. Large finite games. C.O.R.E. Discussion Paper No. 8842. http://www.myrnawooders.com/
Wooders M (1989) A Tiebout theorem. Math Soc Sci 18:33–55
Wooders M (1991a) On large games and competitive markets 1: theory. University of Bonn Sonderforschungsbereich 303 Discussion Paper No. B-195, revised August 1992. http://www.myrnawooders.com/
Wooders M (1991b) The efficaciousness of small groups and the approximate core property in games without side payments. University of Bonn Sonderforschungsbereich 303 Discussion Paper No. B-179. http://www.myrnawooders.com/
Wooders M (1992a) Inessentiality of large groups and the approximate core property; An equivalence theorem. Econ Theor 2:129–147
Wooders M (1992b) Large games and economies with effective small groups. University of Bonn Sonderforschungsbereich 303 Discussion Paper No. B-215. (Revised: In: Mertens J-F, Sorin S (eds) Game-theoretic methods in general equilibrium analysis. Kluwer, Dordrecht). http://www.myrnawooders.com/
Wooders M (1993) The attribute core, core convergence, and small group effectiveness; The effects of property rights assignments on the attribute core. University of Toronto Working Paper No. 9304
Wooders M (1994) Equivalence of games and markets. Econometrica 62:1141–1160. http://www.myrnawooders.com/
Wooders M (1997) Equivalence of Lindahl equilibria with participation prices and the core. Econ Theor 9:113–127
Wooders M (2007) Core convergence in market games and club economics. Rev Econ Design (to appear)
Wooders M (2008a) Small group effectiveness, per capita boundedness and nonemptiness of approximate cores. J Math Econ 44:888–906
Wooders M (2008b) Games with many players and abstract economies permitting differentiated commodities, clubs, and public goods (submitted)
Wooders M, Zame WR (1987) Large games; Fair and stable outcomes. J Econ Theor 42:59–93
Zajac E (1972) Some preliminary thoughts on subsidization. Presented at the Conference on Telecommunications Research, Washington
Learning in Games

John Nachbar
Department of Economics, Washington University, St. Louis, MO, USA

Article Outline

Glossary
Definition of the Subject and Its Importance
Introduction
Deterministic Learning
Stochastic Learning
Future Directions
Bibliography

Glossary

Bayesian learning In repeated games, a model in which each player best responds to her prior, which is a probability distribution over her opponent's behavior strategies.
Behavior strategy In a repeated game, a behavior strategy for player i gives, for each possible date and each possible history of play in the game up to that date, a probability distribution over i's actions next period. This includes the possibility that player i may play some action for certain.
Belief learning In repeated games, a model in which players best respond to prediction rules.
Prediction rule In a two-player repeated game, a deterministic prediction rule gives a probability distribution over the opponent's actions next period as a deterministic function of the history of the game. In a stochastic prediction rule, the distribution over the opponent's actions can depend on history probabilistically. A deterministic prediction rule that gives player 2's forecast about player 1's actions is formally equivalent to a behavior strategy for player 1.
Repeated games A repeated game is an extensive form game representing a repeated strategic interaction. The interaction being repeated is called the stage game. In a discounted repeated game, payoffs in the repeated game are a geometrically weighted sum of the payoffs each period from the stage game. The weight on period t payoffs is δ^t, where δ ∈ (0, 1) is the discount factor. A player who is patient has a discount factor close to 1.

Definition of the Subject and Its Importance

In the context of this entry, learning refers to a particular class of dynamic game theoretic models. In models in this class, players are rational in the sense that they forecast the future behavior of their opponents and optimize, or ε-optimize, with respect to their forecasts. But players are not necessarily in equilibrium; in particular, their forecasts are not necessarily accurate. Two objectives are to model out-of-equilibrium behavior by sophisticated players and to understand when, or whether, play might converge to equilibrium.
Learning models are a branch of a larger literature on out-of-equilibrium behavior in dynamic games. In other branches of the literature, players are modeled as "adaptive." Players do not forecast and they do not optimize. Rather, they follow some other form of behavioral rule, such as imitation, regret minimization, or reinforcement. Learning models, especially those that attempt to capture sophisticated behavior, are most appropriate in settings where players have a good understanding of their strategic environment and where the stakes are high enough to make deliberation worthwhile. For surveys of evolutionary/adaptive models, see Sandholm (2007b), Young (2008a, b), and Camerer (2008).
the initial probability of L to be 1/2 and the parameter k to be 2. Thus, fictitious play can be thought of as a Bayesian model in which each player is certain that the other is playing an i.i.d. strategy, but isn't sure which i.i.d. strategy. Note that since each player believes the other is i.i.d., each believes she has no influence on the other's future behavior. Hence, as noted earlier, myopic optimization is optimal for any discount factor.
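A minimal sketch of this Bayesian reading (the numbers below, an initial probability of L of 1/2 and prior weight k = 2, are just the ones used in the text): with a Beta prior over the opponent's i.i.d. probability of playing L, the posterior-predictive probability of L is exactly the fictitious-play forecast.

```python
def fictitious_play_forecast(history, p0=0.5, k=2):
    """Fictitious-play forecast of P(opponent plays L next period)."""
    n_L = sum(1 for a in history if a == "L")
    return (k * p0 + n_L) / (k + len(history))

def beta_posterior_predictive(history, alpha=1.0, beta=1.0):
    """Posterior-predictive P(L) under a Beta(alpha, beta) prior over the opponent's
    i.i.d. probability of L; Beta(1, 1) corresponds to p0 = 1/2 and k = 2."""
    n_L = sum(1 for a in history if a == "L")
    n_R = len(history) - n_L
    return (alpha + n_L) / (alpha + beta + n_L + n_R)

history = ["L", "R", "L", "L", "R"]
print(fictitious_play_forecast(history), beta_posterior_predictive(history))  # identical: 4/7
```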
Likewise, the Cournot prediction rule is Bayesian. In particular, one can form the degenerate prior from the Cournot prediction rule. To my knowledge, the Cournot rule has no compelling nondegenerate Bayesian interpretation.
The fact that both the Cournot dynamic and fictitious play can be interpreted as Bayesian underscores the fact that there is no presumption that Bayesian players are sophisticated. I take up the issue of sophisticated learning below, especially in section "Sophisticated Learning."

What Should "Convergence" Mean?

One of the characteristics of the literature on learning in games is a running dialog about what "converges to equilibrium" ought to mean.

Convergence in What Game?

For concreteness, suppose that we are looking for convergence to Nash equilibrium; similar comments apply to convergence to other forms of equilibrium, such as correlated equilibrium.
In the models that are the focus of this entry, learning takes place within the context of a repeated game. Therefore, it is natural to look for convergence to Nash equilibria of the repeated game. Repeated play of a stage-game Nash equilibrium always constitutes a Nash equilibrium of the repeated game, but typically there are also other types of repeated game equilibria.
First, even if players do play stage-game Nash equilibria, they need not play the same stage-game Nash equilibrium in every period. As an example, consider the repeated battle of the sexes, a stage game of which is given in Fig. 1. Alternation between (a, a) in odd periods and (b, b) in even periods, both of which are stage-game Nash equilibria, is a natural repeated game Nash equilibrium.

Learning in Games, Fig. 1 Battle of the sexes
        a        b
a    8, 10     0, 0
b     0, 0    10, 8

Learning in Games, Fig. 2 Matching pennies
        a        b
a    1, −1    −1, 1
b    −1, 1    1, −1

Second, if the discount factor is close enough to 1, the folk theorem implies that there will typically be many repeated game Nash equilibria, and some of these may involve play of actions that are not part of any stage-game equilibrium (for more on the folk theorem, see Fudenberg and Tirole (1991)). The standard example is the repeated prisoner's dilemma, in which the repeated game can have Nash equilibria that sustain cooperation, even though cooperation is strictly dominated in the stage game.
More generally, learning stories take place in the context of a larger dynamic game. Learning may lead to Nash equilibrium play in the continuation of this larger game, but Nash equilibrium in the continuation of the larger game need not correspond to Nash equilibrium play in a component (such as a stage game) of that larger game.

Convergence in What Sense?

Until the 1990s, most of the work on learning focused on whether the empirical marginal frequencies of realized play converged to a Nash equilibrium of the stage game.
As an illustration, consider repeated matching pennies, the stage game for which is given in Fig. 2. The Nash equilibrium of repeated matching pennies, for any discount factor, calls for players to randomize 50:50 in every period, regardless of history. Under convergence of the empirical marginal frequencies, play is said to converge to Nash equilibrium if each player plays action a half of the time. This criterion is satisfied if play alternates deterministically between (a, a) and (b, b). In contrast, a Nash equilibrium play path looks random; in particular, with high probability, over any finite set of dates, the four pure
outcomes (a, a), (a, b), (b, a), and (b, b) occur about equally often. Convergence of the empirical marginal frequencies is thus a very weak convergence criterion.
A somewhat tougher criterion is convergence of the empirical joint frequencies. In the matching pennies example, this would require that each of the four pure action profiles gets played 1/4 of the time. This convergence criterion eliminates the previous example but it is still extremely weak. It would, for example, allow convergence to the deterministic sequence (a, a), (a, b), (b, a), (b, b), (a, a), . . . to count as convergence to a Nash equilibrium. Note that along this sequence, (a, a) gets played for certain at dates t = 1, 5, 9, . . . . In contrast, in the Nash equilibrium, (a, a) gets played only 1/4 of the time at those dates. These difficulties with defining convergence in terms of empirical frequencies were first emphasized in Fudenberg and Kreps (1993); see also the related Jordan (1993).
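The point can be checked mechanically. The following sketch (purely illustrative, not from the original text) computes empirical marginal and joint frequencies along the deterministic cycle (a, a), (a, b), (b, a), (b, b), . . .: the marginals match the 50:50 Nash mixture and each joint profile gets frequency 1/4, even though play is perfectly predictable.

```python
from collections import Counter

T = 4000
cycle = [("a", "a"), ("a", "b"), ("b", "a"), ("b", "b")]
play = [cycle[t % 4] for t in range(T)]            # deterministic play path

marg1 = Counter(p[0] for p in play)                # player 1's empirical marginal
marg2 = Counter(p[1] for p in play)                # player 2's empirical marginal
joint = Counter(play)                              # empirical joint frequencies

print({a: c / T for a, c in marg1.items()})        # {'a': 0.5, 'b': 0.5}
print({a: c / T for a, c in marg2.items()})        # {'a': 0.5, 'b': 0.5}
print({prof: c / T for prof, c in joint.items()})  # each profile 0.25
# Yet (a, a) is played for certain at dates t = 1, 5, 9, ...
```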
Choice of convergence standard is thus a matter of balance. Choose too weak a standard, as in the examples above, and convergence is arguably not meaningful. Choose too strong a standard and one gets impossibility results, even though positive results are still available for other, weaker but arguably still satisfactory, forms of convergence. For a stark example, consider the battle of the sexes game of Fig. 1. If players start in the repeated game Nash equilibrium in which they play (a, a) in odd periods and (b, b) in even periods then play converges (trivially) to Nash equilibrium play in any continuation game. But note that, strictly speaking, we get a different Nash equilibrium depending on whether the starting date of the continuation game is odd or even. So a convergence standard that requires that play be close to the same repeated game Nash equilibrium in every continuation game yields non-convergence, even in this trivial example. For a more subtle example illustrating the trade-offs in choosing the right convergence standard, see section "Payoff Uncertainty."
Loosely, the strongest form of convergence that one can generally hope for is that, to an outside observer, play over finite continuation histories looks asymptotically like play of some Nash equilibrium or ε-Nash equilibrium of the repeated game.

Convergence to What Sort of Equilibrium?

Although my focus here is primarily on Nash equilibrium, other solution concepts are often of interest. Obvious alternatives are rationalizable strategy profiles and correlated equilibria, and both have received attention (e.g., Bernheim 1984; Foster and Vohra 1997; Nyarko 1994).
Two other variants of Nash equilibrium have also proved important in the literature. First, it is often natural to assume that players ε-optimize rather than exactly optimize. If players only ε-optimize then convergence will be to an ε-Nash equilibrium rather than to an exact Nash equilibrium. It is common in the literature to model myopic players as using a logit selection from the stage game's ε-best response correspondence. In this case, ε-Nash equilibrium in the stage game takes the form of a quantal response equilibrium (QRE), McKelvey and Palfrey (1995), a solution concept that has become important in the experimental game theory literature (see, for example, Goeree and Holt 2001). While any Nash equilibrium is an ε-Nash equilibrium, ε-Nash equilibria are typically not Nash equilibria, although the difference is small if ε is small.
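For concreteness, here is a sketch (not part of the original text) that computes a logit quantal response equilibrium of the Fig. 1 battle-of-the-sexes stage game by damped fixed-point iteration; the precision parameter lam and the damping factor are arbitrary choices for the example, and the iteration simply seeks a fixed point of the logit response map.

```python
import numpy as np

u1 = np.array([[8.0, 0.0], [0.0, 10.0]])    # row player payoffs (Fig. 1)
u2 = np.array([[10.0, 0.0], [0.0, 8.0]])    # column player payoffs

def logit(x, lam):
    z = np.exp(lam * (x - x.max()))          # numerically stabilized softmax
    return z / z.sum()

def logit_qre(lam=0.5, damping=0.5, iters=5000):
    p = np.array([0.5, 0.5])                 # player 1's mixture over {a, b}
    q = np.array([0.5, 0.5])                 # player 2's mixture over {a, b}
    for _ in range(iters):
        p_new = logit(u1 @ q, lam)           # logit response to 2's mixture
        q_new = logit(u2.T @ p, lam)         # logit response to 1's mixture
        p = damping * p_new + (1 - damping) * p
        q = damping * q_new + (1 - damping) * q
    return p, q

print(logit_qre())   # one logit QRE; as lam grows it approaches a Nash equilibrium
```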
A second important variant arises in repeated games in which the stage game has a nontrivial extensive form. In such settings, the perfect monitoring assumption may be untenable. It may make sense to assume instead that while players observe the outcome of the stage game, they do not observe the full strategy profile for that stage game.
To take a concrete example, consider the repeated ultimatum bargaining game. In the stage game, player 1 makes an offer in the form of an integer x ∈ [0, 100] and player 2 either rejects, yielding a payoff profile of (0, 0), or accepts, yielding a payoff profile of (x, 100 − x). It may make sense to assume that while player 1 can observe player 2's action (accept or reject) in response to the actual offer, she cannot observe player 2's entire stage-game strategy, since this would mean observing how player 2 would have responded to every other possible offer.
For this sort of setting, Fudenberg and Levine (1993a) propose self-confirming equilibrium (SCE) as an alternative to Nash equilibrium.
modeling theme in the learning literature: it is often easier to get positive convergence results for ε optimization than for exact optimization.

Kalai-Lehrer Learning

Kalai and Lehrer (1993a) (hereafter KL) take a Bayesian perspective and ask what conditions on priors are sufficient to give convergence to equilibrium, or approximate equilibrium, play. I find it helpful to characterize KL, and related papers, in the following way.
A player learns to predict the play path if her prediction of next period's play is asymptotically as good as if she knew her opponent's behavior strategy. If the behavior strategies call for randomization then players accurately predict the distribution over next period's play rather than the realization of next period's play. For example, consider a 2 × 2 game in which player 1 has stage-game actions T and B and player 2 has stage-game actions L and R. If player 2 is randomizing 50:50 every period and player 1 learns to predict the play path, then for every γ > 0, there is a time, which depends on the realization of player 2's strategy, after which player 1's next period forecast puts the probability of L within γ of 1/2. (This statement applies to a set of play paths that arises with probability one with respect to the underlying probability model; I gloss over this sort of complication both here and below.) Finally, say that a prior profile (giving a prior for each player) has the learnable best response property (LBR) if there is a profile of best response (or ε-best response) strategies (LBR strategies) such that, if the LBR strategies are played, then each player learns to predict the play path.
If LBR holds, and players are using their LBR strategies, then, asymptotically, the continuation play path is an approximate equilibrium play path of the continuation repeated game. The exact sense in which play converges to equilibrium play depends on the strength of learning and of optimization. See KL and also Sandroni (1998) (both of which focus on exact optimization) and Noguchi (2015b) (which considers ε optimization).
KL, building on work in the probability literature on the merging of measures, notably Blackwell and Dubins (1962), show that a strong form of LBR holds if beliefs satisfy an absolute continuity condition: each player assigns positive probability to any (measurable) set of play paths that has positive probability given the players' actual strategies. A strong sufficient condition for this is that each player assigns positive, even if extremely low, prior probability to her opponent's actual strategy, a condition that KL call grain of truth.
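The role of the grain-of-truth condition can be illustrated with a small simulation (a sketch, not a construction from KL): the opponent actually plays an i.i.d. strategy, the player's prior puts positive weight on a finite set of candidate i.i.d. strategies that includes the true one, and Bayesian updating drives the one-step-ahead forecast of L toward the true probability.

```python
import random

random.seed(0)
true_p = 0.7                                   # opponent's true i.i.d. P(L)
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]         # prior support; grain of truth: 0.7 is included
belief = [1 / len(candidates)] * len(candidates)

for t in range(2000):
    forecast = sum(b * p for b, p in zip(belief, candidates))   # P(next action is L)
    action_is_L = random.random() < true_p
    likelihood = [p if action_is_L else 1 - p for p in candidates]
    belief = [b * l for b, l in zip(belief, likelihood)]         # Bayes update
    total = sum(belief)
    belief = [b / total for b in belief]

print(round(forecast, 3))   # close to 0.7: forecasts merge with the truth along the play path
```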
Universal Convergence

The work described thus far leaves open whether one can write down any deterministic learning model, even an implausible one, that exhibits universal convergence, that is, a learning model that, for a given stage-game form (giving stage-game actions but not payoff functions), gives convergence to Nash equilibrium play in the repeated game for all (or at least for generic) discount factors and stage-game payoff functions.
Implicit in most work on this question is an assumption that the learning model is "uncoupled" in the sense of Hart and Mas-Colell (2003): player i's prior over her opponent's repeated game strategies does not depend on the specification of her opponent's stage-game payoffs. Cournot best response and fictitious play, as typically employed, are both examples of uncoupled learning models. One can get convergence in a "coupled" model simply by having the players play a Nash equilibrium of the repeated game. So, some degree of uncoupling is needed to avoid triviality. On the other hand, full uncoupling is a strong assumption from the perspective of sophisticated learning, since it effectively rules out introspective reasoning about one's opponent. And full uncoupling presumably makes convergence more difficult to achieve. Indeed, the main result of Hart and Mas-Colell (2003) is an impossibility theorem on convergence for certain classes of adaptive learning models.
For belief learning models, there are two competing intuitions on (uncoupled) universal convergence. One intuition is that universal convergence, even to ε-equilibrium, is impossible because universal learnability is impossible: for any prior, there are opposing strategies that a player will fail to forecast, even approximately. One can
show this via a diagonalization argument along the lines of Oakes (1985), and it also follows as a corollary of results discussed in section "Sophisticated Learning." In fact, for any given prior, the set of strategies one can learn to forecast is small, in a sense that can be made precise. A classic reference on the difficulty of prediction, in the context of general stochastic processes, is Freedman (1965). For a useful survey, see the literature review in Al-Najjar (2009).
The competing intuition is that universal learnability is not necessary for convergence. The fact that both players are engaged in a game provides structure that conceivably could force posteriors to be correct, at least along the path of play, even if priors are fundamentally wrong. Something like this happens in fictitious play. In the standard Bayesian interpretation of fictitious play (section "Belief Learning and Bayesian Learning"), each player is certain that her opponent is playing an i.i.d. strategy, even though neither player is (since, for many games, an i.i.d. strategy is not optimal under the fictitious play prediction rule). Yet, in many games, fictitious play generates behavior that looks asymptotically i.i.d. and play does converge to that of a Nash equilibrium.
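The following sketch (illustrative only, not taken from the cited papers) runs classical fictitious play on the Fig. 2 matching pennies game: each player best responds to the opponent's empirical frequencies (with fictitious initial counts and a fixed tie-breaking rule), play itself is deterministic, yet the empirical marginal frequencies approach the 50:50 Nash mixture.

```python
import numpy as np

u1 = np.array([[1.0, -1.0], [-1.0, 1.0]])     # Fig. 2: player 1 payoffs; player 2's are -u1
counts1 = np.array([1.0, 0.0])                 # player 2's fictitious counts of player 1's actions
counts2 = np.array([0.0, 1.0])                 # player 1's fictitious counts of player 2's actions

T = 20000
freq1 = np.zeros(2)                            # realized action counts
freq2 = np.zeros(2)
for t in range(T):
    a1 = int(np.argmax(u1 @ (counts2 / counts2.sum())))        # best reply to 2's empirical mix
    a2 = int(np.argmax((-u1.T) @ (counts1 / counts1.sum())))   # best reply to 1's empirical mix
    freq1[a1] += 1
    freq2[a2] += 1
    counts1[a1] += 1
    counts2[a2] += 1

print(freq1 / T, freq2 / T)                    # both close to [0.5, 0.5]
```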
Based on the first intuition and also on other negative results like those in Hart and Mas-Colell (2003) and Foster and Young (2003), it had been widely believed that universal convergence was impossible for deterministic learning. More recently, however, Noguchi (2015a) has reported a universal convergence result for ε-equilibrium.

Sophisticated Learning

A number of papers investigate classes of prediction rules that exhibit desirable properties, such as the ability to detect certain kinds of patterns in opponent behavior. Important examples include Aoyagi (1996), Fudenberg and Levine (1995), Fudenberg and Levine (1999), and Sandroni (2000).
In Nachbar (2005), I consider the issue of sophistication from a Bayesian perspective. For simplicity, focus on two-player games. Fix a profile of priors and a subset of behavior strategies for each player, and consider the following criteria for these strategy subsets.

• Learnability. For any strategy profile drawn from the strategy subsets, both players learn to predict the play path.
• Richness. If a behavior strategy is included in one of the strategy subsets then (informally) certain variations on that strategy must be included in both strategy subsets. This condition, called CSP in Nachbar (2005), is satisfied automatically if the strategy subsets consist of all strategies satisfying a standard complexity bound, the same bound for both players. Thus richness/CSP holds if the subsets consist of all strategies with k-period memory, or all strategies that are automaton implementable, or all strategies that are Turing implementable, and so on.
• Consistency. Each player's subset contains a best response to her belief. The motivating idea is that, for priors to be considered sophisticated, a necessary (but not sufficient) condition is that the priors can be represented as probability distributions whose supports (informally speaking) contain sets satisfying these criteria. Note that typically priors have multiple representations. For example, as discussed earlier, the prior for fictitious play could be represented either as a beta distribution over i.i.d. strategies or as a degenerate prior that assigns probability one to the associated reduced form. The question, therefore, is not whether all representations of a given prior satisfy the above criteria but whether any do. Nachbar (2005) studies this question and concludes that in a large number of cases, there is no representation of any prior that satisfies these criteria.

Consider, for example, the Bayesian interpretation of fictitious play in which priors are probability distributions over the i.i.d. strategies. The set of i.i.d. strategies satisfies learnability and richness. But for any stage game in which neither player has a weakly dominant action, the i.i.d. strategies violate consistency: any player who is optimizing will not be playing i.i.d. The main result in Nachbar (2005) implies that any other interpretation of fictitious play will violate at least one of the above criteria.
As shown in Nachbar (2005), this feature of Bayesian fictitious play extends to all Bayesian learning models. For large classes of repeated games, for any profile of priors, and for any strategy subsets satisfying richness, if learnability holds, then consistency fails.
Let me make a few remarks. First, since the set of all strategies satisfies richness and consistency, it follows that for any profile of priors there is a strategy profile that the players will not learn to predict. This can also be shown directly by a diagonalization argument along the lines of Oakes (1985) and Dawid (1985). The impossibility result of Nachbar (2005) can be viewed as a game theoretic version of Dawid (1985). For a description of what subsets are learnable, see Noguchi (2015b).
Second, suppose that the strategy subsets are generated by some standard definition of complexity, the same for both players. Then, as noted above, richness holds. Suppose further that there are priors for which learnability holds. This will be the case, for example, for Turing implementable strategies, since the set of such strategies is countable. Then, for such priors, for large classes of repeated games, consistency fails: best responses must violate the complexity bound.
Third, if one constructs a Bayesian learning model satisfying learnability and consistency, then LBR (see section "Kalai-Lehrer Learning") holds, and, if players play their LBR strategies, play converges to equilibrium play. This identifies a potentially attractive class of Bayesian models in which convergence obtains. The impossibility result says, however, that if learnability and consistency hold, then player beliefs must be partially equilibrated in the sense of, in effect, excluding some of the strategies required by richness.
Fourth, the main result in Nachbar (2005) is robust along a number of dimensions. It holds under ε optimization, for ε small. It holds for fairly weak definitions of prediction (e.g., definitions that allow occasional but persistent forecasting errors). And the learnability condition can be relaxed. A weaker learnability condition would require that a player learn to predict only when her own strategy is optimal. A variation of the result states that, for a large class of repeated games, if, say, player 1's strategy is (ε-)optimal and drawn from her strategy subset, and if those subsets satisfy the richness condition, then there is a strategy in the opponent's subset for which player 1 cannot learn to predict the play path.
Last, consistency is not necessary for convergence. See the discussion of fictitious play in section "Universal Convergence." The impossibility result is a statement about the ability to construct Bayesian models with certain properties; it is not a statement about convergence to equilibrium per se.

Payoff Uncertainty

Suppose that, at the start of the repeated game, each player is privately informed of his or her stage-game payoff function, which remains fixed throughout the course of the repeated game. Refer to player i's stage-game payoff function as her payoff type. Assume that the joint distribution over payoff functions is independent (to avoid correlation issues that are not central to my discussion) and commonly known.
Each player can condition her behavior strategy in the repeated game on her realized payoff type. A mathematically correct way of representing this conditioning is via distributional strategies; see Milgrom and Weber (1985).
For any prior about player 2, now a probability distribution over player 2's distributional strategies, and given the probability distribution over player 2's payoff types, there is a behavior strategy for player 2 in the repeated game that is equivalent in the sense that it generates the same distribution over play paths. Again, this is essentially Kuhn's theorem.
Say that a player learns to predict the play path if her forecast of next period's play is asymptotically as good as if she knew the reduced form of her opponent's distributional strategy. This definition specializes to the previous one if the distribution over types is degenerate. If distributional strategies are in Nash equilibrium (also known in this context as a Bayesian Nash equilibrium), then, in effect, each player is optimizing with respect to a degenerate belief that puts probability one on her opponent's actual distributional
strategy, and in this case players trivially learn to predict the path of play.
One can define LBR (see section "Kalai-Lehrer Learning") for distributional strategies and, as in the payoff certainty case, one can show that LBR implies convergence to Nash equilibrium play in the repeated game with payoff types.
More interestingly, Nash equilibrium play in the repeated game with payoff types implies convergence to Nash equilibrium play of the realized repeated game – the repeated game determined by the realized type profile. This line of research was initiated by Jordan (1991). Other important papers include Kalai and Lehrer (1993a) (KL), Jordan (1995), Nyarko (1998), and Jackson and Kalai (1999) (the last studies recurring rather than repeated games).
Suppose first that the realized type profile has positive probability. In this case, if a player learns to predict the play path then, as shown by KL, her forecast is asymptotically as good as if she knew both her opponent's distributional strategy and her opponent's realized type. LBR then implies that actual play, meaning the distribution over play paths generated by the realized behavior strategies, converges to equilibrium play of the realized repeated game. For example, suppose that the type profile for matching pennies gets positive probability. In the unique equilibrium of repeated matching pennies, players randomize 50:50 in every period. Therefore, LBR implies that if the matching pennies type profile is realized then each player's behavior strategy in the realized repeated game involves 50:50 randomization asymptotically.
If the distribution over types admits a density, so that no type profile receives positive probability, then convergence is more complicated. Suppose that players are myopic and that the realized stage game is like matching pennies, with a unique and fully mixed equilibrium. Given myopia, the unique equilibrium of the realized repeated game calls for repeated play of the stage-game equilibrium. In particular, it calls for players to randomize. It is not hard to show, however, that in a type space game with a density, exact optimization calls for each player to play a pure strategy for almost every realized type (this is a generalization of a point made in the context of fictitious play in section "Convergence in Classical Learning Models"). Thus, for almost every realized type profile in a neighborhood of a game like matching pennies, actual play (again meaning the distribution over play paths generated by the realized behavior strategies) cannot converge to Nash equilibrium play of the realized repeated game, even if the distributional strategies are in Nash equilibrium. Foster and Young (2001) provide a generalization for non-myopic players.
This impossibility result is not robust to weakening optimization to ε optimization; the positive convergence results for ε-optimizing fictitious play provide one illustration; see section "Convergence in Classical Learning Models." A more subtle point, however, is that, even for exact optimization, a form of convergence obtains that, while weaker than convergence of actual play, is still very strong.
For simplicity, assume that each player knows the other's distributional strategy and that these strategies form a (Bayesian) Nash equilibrium. Then to an outsider, for almost any type profile, observed play looks asymptotically like Nash equilibrium play in the realized repeated game (this follows from the main theorem in Nyarko (1998)). In particular, in a neighborhood of a game like matching pennies, for almost any type profile, observed play looks random. Since, by Foster and Young (2001), actual play in this setting cannot converge to equilibrium and in particular cannot be random, the implication is that convergence to equilibrium involves a form of purification in the sense of Harsanyi (1973), a point that has been emphasized by Nyarko (1998) and Jackson and Kalai (1999). To a player in the game, opponent behavior likewise looks random because, even if she knows her opponent's distributional strategy, she does not know her opponent's type. As play proceeds, each player in effect learns more about her opponent's type, but never enough to zero in on her opponent's realized, pure, behavior strategy.
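Purification can be seen in a toy simulation (a sketch under assumed uniform payoff shocks, not a construction from the papers cited): add a small private payoff shock to action a for each player in matching pennies; the equilibrium cutoff rule "play a exactly when the shock is nonnegative" is pure for almost every type, yet the population behavior it generates looks like the 50:50 mixed equilibrium.

```python
import random

random.seed(1)
eta = 0.1          # size of the private payoff perturbation (assumed Uniform(-eta, eta))
trials = 100000

def cutoff_strategy(theta):
    # Pure best response when the opponent plays 'a' with probability 1/2:
    # the payoff difference between a and b reduces to theta, so play a iff theta >= 0.
    return "a" if theta >= 0 else "b"

plays_a = 0
for _ in range(trials):
    theta = random.uniform(-eta, eta)     # realized payoff type
    if cutoff_strategy(theta) == "a":
        plays_a += 1

print(plays_a / trials)   # about 0.5: pure strategies type by type, mixed-looking in aggregate
```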
Finally, the difficulties with characterizing sophistication in Bayesian learning models extend to models with payoff uncertainty, with learnability, richness, and consistency redefined in terms of distributional strategies; see Nachbar (2001).
The Foster and Vohra (1998) calibration result has spawned another literature, distinct from its application to learning, on whether it is possible for an observer to distinguish between an expert (someone who knows the true form of some stochastic process) and a charlatan (someone who uses the Foster and Vohra (1998) prediction rule, or one of its more elaborate cousins, to pass the observer's tests). Papers in this area include Lehrer (2001), Sandroni (2003), Dekel and Feinberg (2006), Al-Najjar and Weinstein (2008), and Feinberg and Stewart (2008).

Future Directions

Directions for future work, both for the learning literature and, more broadly, for the general literature on out-of-equilibrium dynamics in games, include the following:

1. Benchmark learning. One of the goals of the literature is to establish a benchmark model of sophisticated learning. Nachbar (2005) sheds doubt on whether this is possible, but the desiderata considered in Nachbar (2005) are not based on axiomatic, decision theoretic criteria.
2. Environments. This survey has focused on repeated two-player games, but many other learning environments and patterns of interaction are possible. One can, for example, study environments in which players are linked through a network as in Blume (1993), or environments in which players are behaviorally heterogeneous. The possibilities are endless. The task is to enrich the scope of research without simply generating an ever expanding catalog of models.
3. Empirical testing. One objective of the learning literature is to understand how real people behave in real games. For a survey of work tying learning models to actual behavior see Camerer (2008). The appropriateness of a model may depend on the environment, and, at least in some cases, behavioral heterogeneity may be relevant.
4. Application. An understanding of how players behave outside of equilibrium will enrich the application of game theoretic models. One example concerns institutional design, an instance of which is the choice of game for selling goods (a particular type of auction, for instance). Different institutions may have similar Nash equilibria but very different dynamic properties, and the dynamic properties could therefore play a role in the choice of institution.

Bibliography

Al-Najjar N (2009) Decision makers as statisticians. Econometrica 77(5):1339–1369
Al-Najjar N, Weinstein J (2008) Comparative testing of experts. Econometrica 76(3):541–559
Aoyagi M (1996) Evolution of beliefs and the Nash equilibrium of normal form games. J Econ Theory 70:444–469
Aumann R (1964) Mixed and behaviour strategies in infinite extensive games. In: Dresher M, Shapley LS, Tucker AW (eds) Advances in game theory. Annals of mathematics studies, vol 52. Princeton University Press, Princeton, pp 627–650
Basu K, Weibull AW (1991) Strategy subsets closed under rational behavior. Econ Lett 36:141–146
Benaim M, Hirsch M (1999) Mixed equilibria arising from fictitious play in perturbed games. Games Econ Behav 29:36–72
Bernheim BD (1984) Rationalizable strategic behavior. Econometrica 52(4):1007–1028
Blackwell D, Dubins L (1962) Merging of opinions with increasing information. Ann Math Stat 38:882–886
Blume LE (1993) The statistical mechanics of strategic interaction. Games Econ Behav 5(3):387–424
Brown GW (1951) Iterative solutions of games by fictitious play. In: Koopmans TJ (ed) Activity analysis of production and allocation. Wiley, New York, pp 374–376
Camerer C (2008) Behavioral game theory. In: Blume LE, Durlauf SN (eds) The new Palgrave dictionary of economics, 2nd edn. Macmillan, New York
Cournot A (1838) Researches into the mathematical principles of the theory of wealth. Kelley, New York. Translation from the French by Nathaniel T. Bacon. Translation publication date: 1960
Dawid AP (1985) The impossibility of inductive inference. J Am Stat Assoc 80(390):340–341
Dekel E, Feinberg Y (2006) Non-Bayesian testing of an expert. Rev Econ Stud 73:893–906
Feinberg Y, Stewart C (2008) Testing multiple forecasters. Econometrica 76(3):561–582
Foster D, Vohra R (1997) Calibrated learning and correlated equilibrium. Games Econ Behav 21:40–55
Foster D, Vohra R (1998) Asymptotic calibration. Biometrika 85:379–390
Foster D, Young P (2001) On the impossibility of predicting the behavior of rational agents. Proc Natl Acad Sci 98:12848–12853
Learning in Games 497
Foster D, Young P (2003) Learning, hypothesis testing, Jordan JS (1993) Three problems in learning mixed-
and Nash equilibrium. Games Econ Behav 45:73–96 strategy Nash equilibria. Games Econ Behav
Freedman D (1965) On the asymptotic behavior of Bayes 5(3):368–386
estimates in the discrete case II. Ann Math Stat Jordan JS (1995) Bayesian learning in repeated games.
36:454–456 Games Econ Behav 9:8–20
Fudenberg D, Kreps D (1993) Learning mixed equilibria. Kalai E, Lehrer E (1993a) Rational learning leads to Nash
Games Econ Behav 5(3):320–367 equilibrium. Econometrica 61(5):1019–1045
Fudenberg D, Levine DK (1993a) Self-confirming equilib- Kalai E, Lehrer E (1993b) Subjective equilibrium in
rium. Econometrica 61(3):523–545 repeated games. Econometrica 61(5):1231–1240
Fudenberg D, Levine DK (1993b) Steady state learning Kalai E, Samet D (1984) Persistent equilibria in strategy
and Nash equilibrium. Econometrica 61(3):547–574 form games. Int J Game Theory 13:129–144
Fudenberg D, Levine DK (1995) Universal consistency Kalai E, Lehrer E, Smorodinsky R (1999) Calibrated fore-
and cautious fictitious play. J Econ Dyn Control casting and merging. Games Econ Behav
19:1065–1089 18(1/2):151–169
Fudenberg D, Levine DK (1998) Theory of learning in Kets W, Voorneveld M (2008) Learning to be prepared. Int
games. MIT Press, Cambridge, MA J Game Theory 37(3):333–352
Fudenberg D, Levine DK (1999) Conditional universal Kuhn HW (1964) Extensive games and the problem of
consistency. Games Econ Behav 29:104–130 information. In: Dresher M, Shapley LS, Tucker AW
Fudenberg D, Levine DK (2006) Superstition and rational (eds) Contributions to the theory of games,
learning. Am Econ Rev 96:630–651 vol II. Annals of mathematics studies, vol 28. Princeton
Fudenberg D, Levine DK (2009) Learning and equilib- University Press, Princeton, pp 193–216
rium. Ann Rev Econ 1:385–420. Harvard University, Lehrer E (2001) Any inspection is manipulable.
Cambridge, MA Econometrica 69(5):1333–1347
Fudenberg D, Takahashi S (2011) Heterogeneous beliefs McKelvey R, Palfrey T (1995) Quantal response equilib-
and local information in stochastic fictitious play. rium for normal form games. Games Econ Behav
Games Econ Behav 71(1):100–120. Harvard Univer- 10:6–38
sity, Cambridge, MA Milgrom P, Roberts J (1991) Adaptive and sophisticated
Fudenberg D, Tirole J (1991) Game theory. MIT Press, learning in repeated normal form games. Games Econ
Cambridge, MA Behav 3:82–100
Germano F, Lugosi G (2007) Global Nash convergence of Milgrom P, Weber R (1985) Distributional strategies for
Foster and Young’s regret testing. Games Econ Behav games with incomplete information. Math Oper Res
60:135–154 10:619–632
Goeree JK, Holt CA (2001) Ten little treasures of game Moulin H (1984) Dominance-solvability and Cournot sta-
theory and ten intuitive contradictions. Am Econ Rev bility. Math Soc Sci 7(1):83–102
91:1402–1422 Nachbar JH (2001) Bayesian learning in repeated games of
Hahn F (1977) Exercises in conjectural equilibrium. Scand incomplete information. Soc Choice Welf 18(2):303–326
J Econ 79:210–226 Nachbar JH (2005) Beliefs in repeated games.
Harsanyi J (1973) Games with randomly disturbed pay- Econometrica 73:459–480
offs: a new rationale for mixed-strategy equilibrium Nachbar J (2008) Learning and evolution in games: belief
points. Int J Game Theory 2:1–23 learning. In: Blume LE, Durlauf S (eds) The new Pal-
Hart S, Mas-Colell A (2000) A simple adaptive procedure grave dictionary of economics, 2nd edn. Palgrave Mac-
leading to correlated equilibrium. Econometrica Millan Ltd, New York
68(5):1127–1150 Noguchi Y (2015a) Bayesian learning, smooth approxi-
Hart S, Mas-Colell A (2001) A general class of adaptive mate optimal behavior, and convergence to e-Nash
strategies. J Econ Theory 98:26–54 equilibrium. Econometrica 83(1):353–373. Kanto
Hart S, Mas-Colell A (2003) Uncoupled dynamics do not Gakuin University, Tokyo
lead to Nash equilibrium. Am Econ Rev 93:1830–1836 Noguchi Y (2015b) Merging with a set of probability
Hart S, Mas-Colell A (2006) Stochastic uncoupled dynam- measures: a characterization. Theor Econ
ics and Nash equilibrium. Games Econ Behav 10(2):411–444. Kanto Gakuin University, Tokyo
57:286–303 Nyarko Y (1994) Bayesian learning leads to correlated
Hofbauer J, Sandholm W (2002) On the global conver- equilibria in normal form games. Economic Theory
gence of stochastic fictitious play. Econometrica 4:821–841
70(6):2265–2294 Nyarko Y (1998) Bayesian learning and convergence to
Hurkens S (1995) Learning by forgetful players. Games Nash equilibria without common priors. Economic
Econ Behav 11:304–329 Theory 11(3):643–655
Jackson M, Kalai E (1999) False reputation in a society of Oakes D (1985) Self-calibrating priors do not exist. J Am
players. J Econ Theory 88(1):40–59 Stat Assoc 80(390):339
Jordan JS (1991) Bayesian learning in normal form games. Sanchirico CW (1996) A probabilisitc model of learning in
Games Econ Behav 3:60–81 games. Econometrica 64(6):1375–1393
498 Learning in Games
Sandholm W (2007a) Evolution in Bayesian games II: Shapley L (1962) Some topics in two-person games. Ann
stability of purified equilibrium. J Econ Theory Math Stat 5:1–28
136:641–667 Voorneveld M (2004) Preparation. Games Econ Behav
Sandholm W (2007b) Evolutionary game theory. In: Ency- 48:403–414
clopedia of complexity and systems science. Springer. Voorneveld M (2005) Persistent retracts and preparation.
Forthcoming Games Econ Behav 51:228–232
Sandroni A (1998) Necessary and sufficient conditions for Young P (1993) The evolution of conventions.
convergence to Nash equilibrium: the almost absolute Econometrica 61(1):57–84
continuity hypothesis. Games Econ Behav 22:121–147 Young P (2008a) Adaptive heuristics. In: Blume LE,
Sandroni A (2000) Reciprocity and cooperation in repeated Durlauf S (eds) The new Palgrave dictionary of eco-
coordination games: the principled-player approach. nomics, 2nd edn. Palgrave MacMillan Ltd
Games Econ Behav 32(2):157–182 Young P (2008b) Stochastic adaptive dynamics. In: Blume
Sandroni A (2003) The reproducible properties of correct LE, Durlauf S (eds) The new Palgrave dictionary of
forecasts. Int J Game Theory 32(1):151–159 economics, 2nd edn. Palgrave MacMillan Ltd.
Fair Division

Steven J. Brams1 and Christian Klamler2
1 Department of Politics, New York University, New York, NY, USA
2 Institute of Public Economics, University of Graz, Graz, Austria

Article Outline

Glossary
Introduction
Cutting Cakes
Conclusion
Future Directions
Bibliography

Glossary

Efficiency An allocation is efficient if there is no other allocation that is strictly preferred by at least one agent and not worse for any other agent.
Envy-freeness An allocation is envy-free if no agent strictly prefers any of the other agents' portions.
Equitability An allocation is equitable if every agent values his or her portion the same.
Maximinality An allocation is maximin if the worst ranked item received by any of the agents is as highly ranked as possible.
Proportionality An allocation is proportional if each of the n agents gets at least 1/n of the good or goods in his or her valuation.

Introduction

Over the last decades, fairness has become a major issue in many different research areas such as economics and computer science. Various books (e.g., Brams and Taylor 1996; Robertson and Webb 1998; Moulin 2003) and surveys (e.g., Thomson 2016; Bouveret et al. 2016; Procaccia 2016) have given many results. The goal in this survey is to single out certain issues of fairness and discuss some contributions. In particular, we focus on (i) cake-cutting, i.e., the division of a single heterogeneous divisible good, and (ii) the allocation of indivisible items.

Issues of fairness come up in many different situations, be it a divorce settlement, the division of land, the sharing of a common resource, the allocation of costs, or simply the division of a birthday cake. From a philosophical point of view, fairness concepts have been widely studied (see, e.g., Rawls 1971; Roemer 1996; Ryan 2006; Young 1994). But how can we transform this work into concepts useful in disciplines such as economics or computer science? The goal of this review is to address certain specific situations in which fairness considerations might play a role and how normative and/or algorithmic concepts can help in finding acceptable fair-division rules and/or solutions.

The above situations and most other fair-division problems do have certain aspects in common. First, there are at least two agents. These might be human beings, but they could also be states, firms, or other entities. Second, there is a resource that is going to be divided. Such a resource can be a heterogeneous good (such as a cake with different toppings), a set of indivisible items, costs or benefits of a joint project, or anything else that might have to be divided or allocated in a division process. Third, there is information about the agents' preferences over the object(s) to be divided. Such preferences could be in the form of value functions (as in cake-cutting), an ordinal ranking over sets of objects (such as in the division of indivisible items), or simply the idea that a smaller cost share is preferred to a larger cost share. Many other aspects of an allocation problem might, of course, be of importance in different situations. For an overview, see Thomson (2016).

In general, the fairness of a division procedure is determined by the properties it satisfies, i.e., a normative approach is followed.
Different fair-division situations might ask for different such properties. However, because the existence of a fair-division procedure or solution does not by itself guarantee that one can also find such a procedure or solution, an important part of the literature is concerned with the algorithmic aspects of fair-division procedures. In this case, an additional problem arises with respect to the computational complexity of fair-division procedures, i.e., even if we have a procedure at hand to deal with a certain problem, will it still be able to find a solution in reasonable time when the fair-division problem increases in size (i.e., the number of agents and/or the resource grows)? Recently, various book-length studies have been published dealing with computational aspects in Economics (see Rothe 2016) and Social Choice Theory (see Brandt et al. 2016).

Cutting Cakes

A cake is usually taken as a metaphor for a single heterogeneous good. How to divide such a good dates back about 2,800 years, when in Hesiod's Theogony the Greek gods Prometheus and Zeus argued about how to divide an ox. Eventually they agreed on the following division procedure: Prometheus divided the ox into two piles, and Zeus chose one (see Brams and Taylor 1996). Their procedure, now called cut-and-choose, can be seen as the standard example for two-agent situations in the fair-division literature. Many other situations can be analyzed in such a framework, such as the division of land or time slots for the use of a machine, to mention just a few.

First attempts to tackle the cake-cutting problem came from Polish mathematicians in the 1940s, in particular by Hugo Steinhaus, Stefan Banach, and Bronislaw Knaster (see Brams and Taylor 1996; Brams 2006 for a historical overview). Moreover, many books and surveys have been written about this topic; see, e.g., Brams and Taylor (1996), Robertson and Webb (1998), and Procaccia (2013, 2016).

In principle, the cake is seen as the 0–1 interval that has to be divided by n agents who have different valuations of the respective parts of the interval. (Formally, the agents have value functions, e.g., represented by a probability distribution function, over the interval.) The goal is to divide the interval into pieces, sometimes requiring them to be contiguous, which would imply the use of the minimal number of cuts, i.e., n − 1. In the case that an agent does not receive one connected piece, the usual assumption is that the total value of the collection of subpieces is simply the sum of values of the smaller pieces, i.e., additive preferences are assumed. From a normative perspective, the procedures discussed in this literature focus essentially on four properties:

• Efficiency: An allocation is efficient if there is no other allocation that is strictly preferred by at least one agent and not worse for any other agent. (This is in the spirit of Pareto optimality, a concept widely used in Economics.)
• Proportionality: An allocation is proportional if each of the n agents gets at least 1/n of the total cake in his or her valuation.
• Envy-freeness: An allocation is envy-free if no agent strictly prefers any of the other agents' pieces.
• Equitability: An allocation is equitable if every agent values his or her piece the same.

Early theoretical results on cake-cutting were mostly concerned with the existence of certain allocations, e.g., Lyapunov's theorem (1940) and results by Dvoretsky et al. (1951) (see also Barbanel 2005). In particular, Lyapunov's theorem shows that there always exists an allocation in which every agent receives a piece that she values at 1/n of the total cake and she also values every other agent's piece at 1/n. However, an agent's piece could consist of a large collection of subpieces, which, from a practical point of view (obtaining only a pile of crumbs), might make such an allocation unsatisfactory.

Let us now turn to cake-cutting procedures (or algorithms) that help in guaranteeing certain fairness aspects in terms of the properties previously defined. Essentially there are two types of procedures, namely, discrete algorithms and those that use moving knives. Discrete procedures are, in general, algorithms that use a finite number of
discrete steps to reach the final allocation. In moving-knife procedures, on the other hand, an agent has to make continuous decisions and/or valuations while the knife moves along the cake (see Robertson and Webb 1998). In the cake-cutting literature, moving-knife procedures are considerably more complex with respect to the decisions that have to be made by the agents.

As indicated earlier in the example of Prometheus and Zeus when dividing an ox, there exists a simple two-agent cake-cutting procedure called cut-and-choose. The procedure works as follows:

1. One agent (the cutter) cuts the cake in two pieces.
2. The other agent (the chooser) chooses one of the two pieces. The remaining piece goes to the cutter.

We illustrate the procedure with the following example:

Example 1 Consider a cake (the 0–1 interval) where one half, i.e., the subinterval [0, 1/2], is made of strawberry, and the other half, i.e., the subinterval [1/2, 1], is made of chocolate. (As cakes are assumed to be nonatomic, i.e., one single point on the interval does not have any value, we do not have to be concerned about open versus closed intervals.) Assume that agent A likes both flavors equally and so only cares about the size of her piece. Agent B, on the other hand, does not like strawberry at all and only seeks pieces containing as much chocolate as possible. If those preferences are not known to the agents and agent A is the cutter, where should she cut the cake? Obviously, to maximize the minimum share she might receive, she should cut the cake in such a way that both pieces have equal value to her. This leads to her cutting the cake at point 1/2, generating the two pieces [0, 1/2] (containing only strawberry) and [1/2, 1] (containing only chocolate). Agent B, the chooser, will now go for the piece that he considers to be larger. Given his preferences, this will be the piece [1/2, 1]. Eventually, A will get all of the strawberry and B all of the chocolate, and, hence, A thinks of having a piece of value 1/2 and B a piece of value 1.

Is this a fair allocation? According to our previously defined properties, we see that the allocation is proportional as each of the n = 2 agents gets a piece of value at least 1/n = 1/2. In addition, it is also envy-free, because no agent strictly prefers the other agent's piece. (In general, envy-freeness implies proportionality; for two agents, envy-freeness and proportionality are equivalent. Beware that agent A is indifferent between her piece and agent B's piece.) However, the allocation is not equitable, because the values that the agents attach to their pieces are not equal (1/2 ≠ 1). Finally, the allocation is also efficient, because any other allocation that increases the value of one agent's piece would decrease the value of the other agent's piece. (There are situations, i.e., preferences of the agents, in which an allocation determined by one cut will be Pareto dominated by an allocation determined by, e.g., two or more cuts.) Certainly, A could be better off by guessing about B's preferences and making the cut at a different point. However, this is risky, because by misjudging the other agent's preferences, she could get a smaller piece (and, hence, risk-averse agents might not want to do this). What can easily be seen, however, is that the allocation does depend on who is the cutter and who is the chooser. If agent B were the cutter, he would, to maximize the minimum-size piece that he could get, cut the cake at point 3/4, i.e., each of the two pieces contains exactly half of the chocolate. In that case A would pick the first piece, i.e., interval [0, 3/4], and leave the second piece, i.e., interval [3/4, 1], to agent B. In that case, A receives a value of 3/4 and B a value of 1/2. Again, this is efficient, proportional, and envy-free, but it is not equitable.

The original cut-and-choose procedure is a discrete algorithm, because it only requires two decisions from the agents. First, the cutter has to cut the cake where he/she thinks the two pieces have the same value. Second, the chooser has to evaluate the two pieces and choose one of them. However, there is also a moving-knife variant of cut-and-choose, introduced by Dubins and Spanier (1961), in which a knife is moved along the interval from 0 to 1 and the first agent to stop the knife receives the piece from 0 to the stopping point and the other agent the remainder. The fairness properties of this procedure are the same as those of the discrete version of cut-and-choose.
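As a concrete illustration, here is a small numerical sketch of cut-and-choose for the strawberry–chocolate cake of Example 1 (the value densities, grid resolution, and function names are assumptions of this sketch, not part of the article): the cutter cuts where her two pieces are equal in her own valuation, and the chooser takes the piece he values more.

```python
import numpy as np

GRID = np.linspace(0.0, 1.0, 1001)   # candidate cut points

def piece_value(density, a, b, k=400):
    """Approximate value of the piece [a, b]: the integral of the agent's valuation density."""
    if b <= a:
        return 0.0
    xs = np.linspace(a, b, k)
    return float(np.mean(density(xs)) * (b - a))

# Assumed densities for Example 1: A values only size; B puts all value on the chocolate half.
density_A = lambda x: np.ones_like(x)
density_B = lambda x: np.where(x < 0.5, 0.0, 2.0)

def cut_and_choose(cutter, chooser):
    # Step 1: the cutter cuts so that the two pieces are (approximately) equal in her valuation.
    cut = min(GRID[1:-1], key=lambda c: abs(piece_value(cutter, 0, c) - piece_value(cutter, c, 1)))
    pieces = [(0.0, cut), (cut, 1.0)]
    # Step 2: the chooser takes the piece he values more; the cutter keeps the other one.
    chooser_piece = max(pieces, key=lambda p: piece_value(chooser, *p))
    cutter_piece = pieces[0] if chooser_piece == pieces[1] else pieces[1]
    return cut, cutter_piece, chooser_piece

cut, a_piece, b_piece = cut_and_choose(density_A, density_B)
# With A as the cutter: cut ≈ 1/2, A keeps [0, 1/2] (worth 1/2 to her), B takes [1/2, 1] (worth 1 to him).
```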
Although the cut-and-choose procedure seems promising, it cannot really be extended to more than two agents without losing its fairness properties. If we were to go from two to three agents, guaranteeing envy-freeness with two cuts (the minimal number) becomes difficult. In the 1960s, Selfridge and Conway, however, independently discovered a discrete procedure for three agents using up to five cuts. Following Brams and Taylor (1996), the Selfridge-Conway procedure for three agents, A, B and C, can be stated as follows:

1. Let A cut the cake into three pieces she considers of equal value.
2. B trims (if necessary) the piece he considers to be of highest value such that the trimmed piece equals in value his second most preferred piece.
3. C chooses from the three pieces the one she considers of largest value.
4. B chooses from the remaining two pieces. In case they contain the piece that B previously trimmed, he must choose this one.
5. A receives the remaining piece. If there was no trimming by B in step 2, stop the procedure.
6. Because only B or C could have received the trimmed piece, let the one of them who did not receive it divide the remainder, which was created in step 2, into three pieces she or he considers to be of equal value.
7. Now, let the agent who previously received the trimmed piece choose first, then A chooses, and finally the cutter of the remainder receives the rest.

Let us show why this leads to an envy-free allocation. Obviously, after the first round of cutting and choosing, the allocation will be envy-free. Because A cuts the cake in the first stage, and therefore will definitely receive one of the untrimmed pieces in step 5, she will not envy either of the other agents. B, as the second chooser, will also not envy anyone, because he receives a tied-for-largest piece, and C, as the first chooser, will get her most preferred piece. Now, let us determine whether the allocation of the remainder of the trimmed piece could make someone envious. Assume that C received the trimmed piece (the same argument holds if B received it). Hence, because B divides the remainder, he will again receive a tied-for-largest piece. So his total value will be at least as large as the total value of any other player. C is going to choose first. Therefore, she will not envy any of the other agents. Finally, A, the second chooser, will definitely not envy B, because she chooses before him. But she also will not envy C, because A originally preferred her piece in the first round over the piece C received plus the remainder. Thus, the final allocation is envy-free. However, we needed up to five cuts to achieve this allocation.

Actually, guaranteeing fairness in the division of a cake among three or more agents turns out to be difficult if we want to ensure an upper bound on the number of cuts to be made. This computational aspect of fair-division procedures has recently attracted a lot of attention (see, e.g., Brandt et al. 2016 in their Handbook of Computational Social Choice). Most of the computational results rely on a framework introduced by Robertson and Webb (1998) in which, focusing on the decisions to be made by the agents, they distinguish between cuts, i.e., cutting the cake at a specific value, and evaluations of a certain subpiece of the cake. It is the number of those queries which gives an indication of the complexity of a procedure. Using this approach, it has been shown that proportionality is definitely easier to achieve than envy-freeness (see Woeginger and Sgall 2007; Procaccia 2009). Actually, given n players, there exists a procedure that achieves proportionality and uses at most n log n queries, whereas for any algorithm that leads to envy-freeness at least n² queries are necessary. The least complex proportional procedure, as shown by Edmonds and Pruhs (2006), is an algorithm by Even and Paz (1984). (Even and Paz 1984 also designed a randomized protocol that uses an expected number of O(n) cuts. See also Brams et al. 2011 on an analysis of the divide-and-conquer algorithm.) Essentially it works as follows (for simplicity assume the number of players to be a power of 2):

1. Each agent makes a mark at a point where he/she values the left side of the cake to be equal to the right side, i.e., we have a cut query for each of the n agents.
2. Divide the n agents into two subgroups, namely, the first n/2 agents with their marks farthest to the left and the other agents with their marks farthest to the right. Cut the cake between those two sets.
3. Continue by asking the agents to mark the point at which they value the left side of their subpiece equal to the right side of their subpiece. Again, this requires a cut query for each of the n agents.
4. Repeat this until there are only two agents for each remaining subpiece. Then use cut-and-choose.
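A recursive sketch of this divide-and-conquer idea is given below (an illustration under assumptions of this sketch, not the protocol's exact bookkeeping): each agent is represented by a value function cdf(x) giving its value of the piece [0, x], each round's cut is placed between the two middle marks, and the two-agent base case simply cuts between the two remaining marks, which is enough to preserve proportionality.

```python
def even_paz(agents, a=0.0, b=1.0):
    """Divide-and-conquer proportional cake-cutting in the spirit of Even-Paz (a sketch).
    agents: list of (name, cdf) pairs, where cdf is nondecreasing and continuous and cdf(x)
    is the agent's value of [0, x]; the number of agents is assumed to be a power of 2.
    Returns a list of (name, (left, right)) contiguous pieces."""
    if len(agents) == 1:
        name, _ = agents[0]
        return [(name, (a, b))]

    def half_mark(cdf):
        # Cut query: the point splitting [a, b] into two halves of equal value to this agent.
        target = (cdf(a) + cdf(b)) / 2
        lo, hi = a, b
        for _ in range(60):            # bisection
            mid = (lo + hi) / 2
            if cdf(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    marked = sorted(agents, key=lambda ag: half_mark(ag[1]))
    half = len(agents) // 2
    cut = (half_mark(marked[half - 1][1]) + half_mark(marked[half][1])) / 2
    return even_paz(marked[:half], a, cut) + even_paz(marked[half:], cut, b)

# Example: four agents with uniform valuations each receive a contiguous quarter of the cake.
agents = [("P1", lambda x: x), ("P2", lambda x: x), ("P3", lambda x: x), ("P4", lambda x: x)]
print(even_paz(agents))
```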
As can easily be seen, the Even-Paz procedure leads to a proportional allocation, because each agent in any round is temporarily assigned a subpiece which he/she values at least 1/2 of the previous subpiece. This, eventually, leads to each agent receiving a piece she values at least 1/n. Interestingly, the pieces are also contiguous (in contrast to the Selfridge-Conway allocation). However, envy-freeness is not guaranteed, as an agent does not have any influence on the allocation in subgroups it does not belong to. How many queries are needed in this procedure? Because in each round each of the n agents makes a decision and there are log n rounds, the total number of queries is n log n.

As indicated earlier, envy-freeness is much harder to achieve. However, there are some procedures for three and four agents that have been devised and satisfy envy-freeness. Stromquist (1980) introduced a moving-knife procedure that uses four simultaneously moving knives and assigns contiguous pieces. Barbanel and Brams (2004) achieved such an allocation with only two simultaneously moving knives. Extensions to four agents turn out to be even more difficult. Brams et al. (1997) devised a moving-knife procedure which requires up to 11 cuts; however, the extension of the three-agent procedure by Barbanel and Brams (2004) to four agents requires only up to five cuts. Beyond four agents, Su (1999) provided an ε-approximate algorithm that relies on Sperner's lemma, but it requires convergence to an exact division. Essentially, the only exact moving-knife procedure, introduced by Brams and Taylor (1995), requires an unbounded number of cuts.

This is, from a computational point of view, unsatisfactory. Stromquist (2008) showed that, assuming contiguous pieces, a bounded algorithm does not exist. Actually, not even the restriction to simpler preference information (such as uniformly distributed valuations of the cake) helps, as was shown by Kurokawa et al. (2013) (see also Brams et al. 2012a for a discussion of structured valuations). However, recently Aziz and Mackenzie (2016) have found a discrete finite and bounded algorithm for n agents. The bounds are, unfortunately, still very large.

Finally, Brams et al. (2013) show that if one adds equitability as a desirable property to be satisfied, then together with efficiency and envy-freeness such an allocation does not exist for situations with at least three agents. Actually, this is independent of the number of cuts allowed.

If one slightly extends the cake-cutting framework, it is also possible to talk about pies. Whereas cakes are represented by closed intervals, pies are infinitely divisible, heterogeneous, and atomless one-dimensional continuums whose endpoints are topologically identified. More intuitively, cakes are represented as lines, whereas pies are given as circles. Obviously, because the minimal number of cuts necessary to allocate pieces of a cake to n agents is n − 1, for pies this number is n. Barbanel et al. (2009) show that pie-cutting is also a difficult problem, because for three or more players, there exists a pie and corresponding preferences for which no allocation is envy-free and efficient. However, an allocation that is both equitable and efficient always exists. In addition, in a two-agent setting, in contrast to cake-cutting, pie-cutting allows for positive results with respect to agents being entitled to pieces of a certain size (see Brams et al. 2008).

Dividing Indivisible Items
The previous section on cake-cutting involved one divisible item. For two agents, it was easily shown that an envy-free allocation exists. Of course, there also exist many situations in which we are concerned with the division of different
items. If each of these items were divisible, we could obviously fall back on the cake-cutting setting by just putting the items next to each other (see Jones 2002). The problem is that in many cases items are not divisible, as, e.g., when dividing a painting. Thus, not even in the simplest setting of two agents can we achieve envy-freeness by dividing one indivisible item. The only envy-free allocation would be to throw away the item, but this is, obviously, an inefficient solution. If, eventually, one agent receives the object, then the other agent will envy the receiving agent.

Recently, this branch in the fair-division literature has attracted some attention, as can be seen in a survey paper by Bouveret et al. (2016) and papers by Bouveret and Lang (2008), Bouveret et al. (2010), and Brams et al. (2012b, 2014, 2015, 2017). If the resource to be divided is a set of items, we have to compare different bundles, i.e., we need to go from single items to bundles of items. Of course, we could just let the agents evaluate each of the bundles or provide a ranking of them. However, for just 15 items, this leads to considering 2^15 = 32,768 subsets. (A different way to state preferences would be via compact preference representation. This uses a sort of intermediate language as a proxy for representing the agents' preferences. See Bouveret et al. 2016 for an overview.) Although under certain conditions we might reduce the number of relevant comparisons, the general task still seems highly unrealistic. The approach by Herreiner and Puppe (2002) is based on such a ranking of subsets. Their procedure, called the descending demand procedure, simultaneously goes down the rankings of the agents until, for the first time, the demands of all the agents can jointly be satisfied. (This is related to the fallback bargaining algorithm by Brams and Kilgour 2001.) Although envy-freeness cannot be guaranteed, the goal is to create balanced allocations based on a maximin idea, i.e., maximize the minimum rank of each agent's share.

In the previous approach of evaluating or ranking all subsets, the task for the agents to state those preferences is, in general, considered to be impractical. Other approaches, however, have been devised. A very simple and attractive procedure to resolve this fair-division problem, called Adjusted Winner (AW), was introduced by Brams and Taylor (1996) (see also their popular book, Brams and Taylor 1999). It is based on assigning values to the single items and assuming an additivity condition (hence, there are no complementarities or substitutabilities between items, i.e., the value of an item for the agent does not depend on which other items the agent receives), which makes the value of any set of items simply the sum of the values of the items contained in that set. In addition, their procedure relies on one important assumption, namely, that eventually one of the items can be divided (without knowing in advance which item this will be).

The procedure works as follows:

1. Each agent distributes 100 points among the items. Given the point distributions, we (provisionally) assign each item to the agent that gave more points to it. If an item received the same number of points from the agents, we allocate it randomly.
2. Determine the sum of points of the items that an agent received. If it is exactly the same for both agents, stop. If the sums are different, transfer the item with the lowest point-value ratio from the agent with the currently higher total sum to the other agent. Calculate the new total sums. In case it is still higher for the first agent, continue by transferring the item with the next-lowest point-value ratio. If it is the same, stop the procedure. If, after the transfer, the receiving agent has a higher total sum, proceed to the last step.
3. Divide the last transferred item so as to equalize the total sums for the agents.
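The point-transfer logic can be sketched in a few lines. The following is an illustrative implementation, not the authors' own code; the function name, the deterministic tie rule (ties go to A rather than being broken randomly), and the use of exact fractions are choices of this sketch. Applied to the point distributions of the divorce example in Table 1 below, it should transfer the dog and then a small fraction of custody, equalizing both totals at roughly 65.77.

```python
from fractions import Fraction

def adjusted_winner(vA, vB):
    """Two-agent Adjusted Winner with additive point valuations (a sketch).
    vA, vB: dicts item -> points (assumed positive). Returns shares[item] = fraction of the
    item given to A, plus the two (equalized) point totals."""
    items = list(vA)
    # Step 1: provisionally give each item to the agent who values it more (ties to A here).
    shares = {o: Fraction(1 if vA[o] >= vB[o] else 0) for o in items}

    def totals():
        tA = sum(shares[o] * vA[o] for o in items)
        tB = sum((1 - shares[o]) * vB[o] for o in items)
        return tA, tB

    tA, tB = totals()
    # Steps 2-3: move items from the leader to the laggard in order of increasing
    # "kept value / given-up value" ratio, splitting the last item if needed.
    while tA != tB:
        if tB > tA:
            o = min((x for x in items if shares[x] == 0), key=lambda x: Fraction(vB[x], vA[x]))
            if tB - vB[o] >= tA + vA[o]:
                shares[o] = Fraction(1)                        # full transfer to A
            else:
                shares[o] = (tB - tA) / (vA[o] + vB[o])        # split to equalize totals
        else:
            o = min((x for x in items if shares[x] == 1), key=lambda x: Fraction(vA[x], vB[x]))
            if tA - vA[o] >= tB + vB[o]:
                shares[o] = Fraction(0)                        # full transfer to B
            else:
                shares[o] = 1 - (tA - tB) / (vA[o] + vB[o])    # split to equalize totals
        tA, tB = totals()
    return shares, (tA, tB)
```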
Let us illustrate AW with the following example.

Example 2 Consider a divorce in which there are two agents, A and B, and six items: the apartment, custody of the kids, the dog, a music collection, jewelry, and a painting. The value that an agent i attaches to any item o is given by vi(o). Those values are given in Table 1, where the last column determines the ratio of the value assigned by player B to the value assigned by player A.

Fair Division, Table 1 Divorce example
Object        Agent A   Agent B   Ratio vB(o)/vA(o)
Apartment     35        5         0.14
Custody       25        40        1.6
Dog           20        25        1.25
Music coll.   10        3         0.3
Jewelry       7         15        2.14
Painting      3         12        4

Given the point distributions by the agents, we first assign every item to the agent that values it more, i.e., agent A receives the apartment and the music collection and agent B the other four items. As this leads to a total sum of points of 45 for A and 92 for B, we need to transfer the item with the lowest point-value ratio currently belonging to B to agent A. This is the dog. After transferring this item, A has a total value of 65 points and B a total value of 67 points. Hence, an additional transfer needs to be made. The item with the next-lowest point-value ratio is custody; however, we cannot transfer all of it to A but have to divide it so that we equalize the total sums of the players. We do this by solving the following equation: 35 + 10 + 20 + 25a = 15 + 12 + 40(1 − a). This leads to a ≈ 0.03, i.e., 3% of the item needs to be transferred to A. Practically, this could mean that there are additional visiting rights that agent B concedes to agent A. Eventually, both agents receive a set of items that each of them values at 65.77.

AW is an attractive procedure. As can be shown, it satisfies the main (fairness) properties of efficiency, proportionality, envy-freeness, and equitability. However, it cannot deal with cases in which the item to be divided is not eventually divisible or agents are not able to attach values to each item. Hence, matters turn out to be more complicated when there is no possibility to divide an item or get precise value information from the agents.

Recently, various procedures have been devised to deal with this sort of situation. The framework in which those procedures work relies, in principle, on the possibility of comparing sets of items based on the ranking of items. (For an introduction to the literature on ranking sets of objects, refer to Barbera et al. 2004.)

Consider a set of four items, 1, 2, 3, and 4, which are ranked by an agent in the following way: 1 ≻ 2 ≻ 3 ≻ 4. That is, the agent most prefers item 1, then item 2, and so on. How can we compare subsets of these items if the only available information is the agent's preference ranking of the items? If we assume no synergies between the items, i.e., the items are neither complements nor substitutes, then obviously the set {1, 2} should be considered better than the set {3, 4}, because each item in the first set is preferred to every item in the second set. Also, {1, 3} can be seen as being better than {2, 4}, because each item in the first set can be assigned to an item in the second set to which it is preferred. Of course, this is not the case when we compare {2, 3} to {1, 4}, because there is no different item for both 2 and 3 to which they are preferred in the other set.
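This pairwise comparison of equal-sized bundles is easy to mechanize. The following sketch (function name and the specific assertions are mine, not the article's) matches the i-th best item of one bundle against the i-th best item of the other and declares dominance only if every matched pair favors the first bundle.

```python
def dominates(X, Y, ranking):
    """Return True if bundle X is at least as good as the equal-sized bundle Y for an agent
    whose ordinal ranking (best item first) is given: compare the i-th best item of X with
    the i-th best item of Y for every i."""
    rx = sorted(ranking.index(o) for o in X)   # positions of X's items, best first
    ry = sorted(ranking.index(o) for o in Y)
    return all(a <= b for a, b in zip(rx, ry))

ranking = [1, 2, 3, 4]
assert dominates({1, 3}, {2, 4}, ranking)       # {1, 3} is better than {2, 4}
assert not dominates({2, 3}, {1, 4}, ranking)   # {2, 3} and {1, 4} are incomparable
assert not dominates({1, 4}, {2, 3}, ranking)
```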
Obviously, in this context, equitability becomes a meaningless condition. However, envy-freeness can, to a certain extent, still be a relevant property. Bouveret et al. (2010) define the concept of necessary envy-freeness. (See also Brams et al. 2003 for a similar definition under a different name.) An allocation (SA, SB), where SA is the set of items allocated to agent A and SB the set of items allocated to agent B, is necessarily envy-free whenever for every item in SA there is an item in SB that agent A prefers and vice versa for agent B. (This is also called assuredly envy-free in Brams et al. 2001.) Hence, there is one major fairness concept that can be ascertained even in this case. Brams et al. (2014) also discuss another interesting property, called maximinality, which requires the worst ranked item received by any of the agents to be as highly ranked as possible. This seems plausible in situations in which the worst item does have an essential impact on the quality of the set of items received by an agent. An example might be the choice of teams from a pool of workers, where the performance of the team crucially depends on the worker with lowest quality.

To see how algorithms might work in this framework, let us discuss two procedures which
have recently been devised, the sequential algorithm (SA) and the singles-doubles procedure (SD) introduced by Brams et al. (2014, 2015). Let us start with the apparently simpler SA. It is based on the agents' rankings of the items and works as follows (beware that any statements about efficiency and envy-freeness require that the agents receive sets of items of the same size; that is why Brams et al. (2015) assume the number of items to be a multiple of the number of agents):

1. On the first round, descend the ranks of the two or more agents, one rank at a time, stopping at the first rank at which each agent can be given a different item (at or above this rank). This is the stopping point for that round; the rank reached is its depth, which is the same for each agent. Assign one item to each agent in all possible ways that are at or above this depth (there may be only one). This may give rise to one or more SA allocations.
2. On subsequent rounds, continue the descent, increasing the depth of the stopping point on each round. At each stopping point, assign items not yet allocated in all possible ways until all items are allocated.
3. At the completion of the descent, if SA gives more than one possible allocation, choose one that is efficient (Pareto optimal) and, if possible, envy-free.

Let us illustrate SA with the following example:

Example 3 We use the same items as in the AW example. However, consider just the ordinal rankings of the six items and not their precise valuations. This leads to the following preferences given in Table 2 (starting with the most preferred item on top to the least preferred item at the bottom):

Fair Division, Table 2 Preference rankings
PA            PB
Apartment     Custody
Custody       Dog
Dog           Jewelry
Music coll.   Painting
Jewelry       Apartment
Painting      Music coll.

The stopping point in round 1 is depth 1, where agent A obtains the apartment and agent B custody. At depth 2, we cannot give different items to the agents, because agent B has already received custody. Hence, in round 2 we must descend to depth 3, to give the agents different items, namely, the dog to A and jewelry to B. Finally, in round 3, we descend to depth 4 and assign the music collection to A and the painting to B. This leads to a final allocation in which the apartment, the dog, and the music collection are assigned to A and custody, jewelry, and the painting are assigned to B. Notice that this unique allocation is efficient, envy-free, and maximin according to our previous definitions.

Although in the previous example SA provides a normatively satisfying allocation, this, unfortunately, is not always the case. As shown in Brams et al. (2015), whenever SA provides multiple allocations, certain normative properties cannot be guaranteed for more than one of them. However, it is shown that in a two-agent division problem, SA produces at least one allocation that is efficient and, if an envy-free allocation exists, then SA will give at least one allocation that is envy-free and efficient.

In a similar spirit, Brams et al. (2014) devised an algorithm, called the singles-doubles procedure (SD), for a two-agent division problem with an even number of items that restricts the number of possible outcomes and guarantees envy-freeness, if it is possible.

Before stating the algorithm, let us define a couple of concepts. The maximin rank m is the smallest rank such that every item comes up in either A's or B's ranking at or before that rank (equivalently, the maximum over all items of the better of its two ranks). A single (for an agent i) is an item that comes up only in agent i's ranking at or before the maximin rank. A double comes up in both players' rankings at or before the maximin rank. The SD algorithm can now be stated as follows:

1. Determine the maximin rank m.
2. Assign to each agent its singles. If all items are allocated, stop the procedure.
3. Identify, for each agent, its most preferred unassigned double. If these are different, assign them accordingly. If they are the same, identify the agent who can be assigned its second most preferred unassigned double while still satisfying envy-freeness. (This can easily be checked by looking at each rank k in an agent's ranking. An agent is not envious if he/she receives strictly more than half of the items from its ranking up to rank k for any odd k < n.) Break any ties at random.
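The following sketch (mine, not the authors' implementation) captures the same idea in a slightly different way: it assigns the singles and then searches over equal splits of the doubles for one that passes the envy test just described, so it need not follow SD's tie-breaking when several envy-free completions exist. The maximin rank is computed as the smallest rank by which every item has appeared in one of the two rankings, as defined above. On the rankings used in Examples 3 and 4 it returns the allocation derived in the text.

```python
from itertools import combinations

def sd_allocation(rank_a, rank_b):
    """A sketch in the spirit of the singles-doubles procedure for two agents.
    rank_a, rank_b: the agents' rankings (most preferred first) over the same even-sized item set."""
    n = len(rank_a)
    items = set(rank_a)

    def envy_free(bundle, ranking):
        # Envy test from the text: for every odd k < n, the agent holds strictly more
        # than half of its k most preferred items.
        return all(sum(o in bundle for o in ranking[:k]) > k / 2 for k in range(1, n, 2))

    # Maximin rank m: smallest rank by which every item has appeared in A's or B's ranking.
    m = next(k for k in range(1, n + 1) if set(rank_a[:k]) | set(rank_b[:k]) == items)

    singles_a = set(rank_a[:m]) - set(rank_b[:m])
    singles_b = set(rank_b[:m]) - set(rank_a[:m])
    doubles = sorted(set(rank_a[:m]) & set(rank_b[:m]), key=rank_a.index)

    # Complete the allocation by splitting the doubles evenly; keep a split that is envy-free.
    for half in combinations(doubles, len(doubles) // 2):
        bundle_a = singles_a | set(half)
        bundle_b = singles_b | (set(doubles) - set(half))
        if envy_free(bundle_a, rank_a) and envy_free(bundle_b, rank_b):
            return bundle_a, bundle_b
    return None  # no envy-free allocation of this form exists

ranks_A = ["Apartment", "Custody", "Dog", "Music coll.", "Jewelry", "Painting"]
ranks_B = ["Custody", "Dog", "Jewelry", "Painting", "Apartment", "Music coll."]
# Expected: A gets {Apartment, Dog, Music coll.}, B gets {Custody, Jewelry, Painting}.
print(sd_allocation(ranks_A, ranks_B))
```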
We illustrate the SD algorithm with the following example:

Example 4 Let us again use the same rankings as in the previous example, now presented in Table 3.

Fair Division, Table 3 Preference rankings
PA            PB
Apartment     Custody
Custody       Dog
Dog           Jewelry
Music coll.   Painting
-----------------------
Jewelry       Apartment
Painting      Music coll.

The horizontal line after rank 4 indicates the maximin rank, i.e., every item comes up in either A's or B's ranking before the line. First, we want to check for existence of a maximin and envy-free allocation. This can easily be done by checking whether the sets of objects up to every odd rank are the same for both agents or not. If not, a maximin and envy-free allocation exists (for a precise definition of the relevant condition, see Brams et al. 2014). As we see in the preferences of Table 3, the top-ranked items are different for both agents. We further need to check rank 3, where we need to compare agent A's set of items consisting of the apartment, custody, and the dog with agent B's set consisting of custody, the dog, and jewelry. Because those sets are different, we proceed with rank 5, which, as can be easily seen, also leads to different sets of items. Hence, an envy-free and maximin allocation exists. Now, let us determine the singles and doubles. As the maximin rank is 4, we know that agent A's singles must be the apartment and the music collection, whereas for agent B the singles are jewelry and the painting. Hence, this leaves us with two doubles, namely, custody and the dog. Because both agents prefer custody over the dog, we need to check whether we can assign the dog to one of the two agents without creating envy. This is only possible for agent A. Therefore, we need to assign custody to agent B. The final allocation, therefore, assigns the apartment, the dog, and the music collection to agent A and custody, jewelry, and the painting to agent B. As can easily be checked, this allocation is envy-free and maximin.

Although SA and the SD procedure provide the same outcome for the above rankings, this is, in general, not the case. Moreover, note that SA always outputs a complete allocation, whereas the SD procedure only works whenever an envy-free allocation exists. On the other hand, if an envy-free allocation exists, the SD procedure will only output allocations which are envy-free, whereas SA might also output allocations which do not satisfy this property.

A few other procedures for the two-agent case do exist, e.g., the undercut procedure by Brams et al. (2012b) (see also Vetschera and Kilgour 2014 for a discussion of related contested pile methods, and an extension by Aziz 2015 who simplifies the procedure and allows for more general conditions) or the Trump rule by Pruhs and Woeginger (2012). Extending these to more than two agents makes the problem significantly harder. Brams et al. (2001) provide a list of paradoxes showing what can go wrong with respect to efficiency and envy-freeness in determining fair shares for the agents.

Another very simple class of procedures is picking sequences. These demand rather little preference information from the agents and require only a sequence according to which the agents choose their items. For example, if we have five items and three agents, A, B, and C, then the sequence ABCCB indicates that agent A chooses an item first, then B chooses an item, then C selects two items, and finally B receives the final item. Computationally, this procedure is very appealing because it only requires small bits of information from the agents.
508 Fair Division
Using sequences is also helpful in determining whether an allocation is efficient. As shown by Brams and King (2005), an allocation is efficient if and only if it is the product of sincere choices by the agents in some sequence. In case of underlying additive preferences, Bouveret and Lang (2011) have studied sequences with respect to their best social performance in utilitarian and egalitarian settings. Under certain assumptions, it turns out that a sequence of strict alternation (e.g., ABCABC... for three agents) maximizes utility for society.
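Sincere picking under a given sequence is straightforward to simulate; the sketch below is an illustration of the idea, and the item names and rankings in the usage example are hypothetical.

```python
def sincere_allocation(sequence, rankings):
    """Allocate items by sincere picking: in the given order of turns, each agent takes its
    most preferred item among those still available.
    sequence: e.g. "ABCCB"; rankings: dict agent -> list of items, most preferred first."""
    remaining = set(rankings[sequence[0]])
    bundles = {agent: [] for agent in rankings}
    for agent in sequence:
        pick = next(item for item in rankings[agent] if item in remaining)
        bundles[agent].append(pick)
        remaining.remove(pick)
    return bundles

rankings = {
    "A": ["w", "x", "y", "z", "v"],
    "B": ["x", "w", "v", "y", "z"],
    "C": ["y", "x", "w", "z", "v"],
}
print(sincere_allocation("ABCCB", rankings))   # A gets w; B gets x and v; C gets y and z
```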
If we turn from envy-freeness to proportionality, other approaches are possible. One is to adapt the cut-and-choose algorithm from cake-cutting (see Budish 2011). Because we are concerned with indivisible items, we need only look at so-called maximin shares, i.e., the highest value that an agent can guarantee for himself or herself in case he/she were to divide the items into n piles and choose last. Procaccia and Wang (2014) show that there are cases in which such a maximin share will not be achieved. However, they also prove that there is always a guarantee to each agent of receiving at least 2/3 of the maximin share.
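For small instances, an agent's maximin share can be computed by brute force, as in the sketch below (the function name and the numbers in the usage line are illustrative assumptions, not from the article): enumerate every way of sorting the items into n piles and keep the split whose worst pile is most valuable.

```python
from itertools import product

def maximin_share(values, n):
    """Brute-force 1-out-of-n maximin share for an agent with additive valuations.
    values: the agent's item values; n: number of agents. Exponential in the number of
    items, so only meant for small illustrative instances."""
    best = 0
    for labels in product(range(n), repeat=len(values)):   # every assignment of items to piles
        piles = [0] * n
        for value, pile in zip(values, labels):
            piles[pile] += value
        best = max(best, min(piles))                        # the divider expects to choose last
    return best

print(maximin_share([7, 5, 4, 3, 1], n=2))   # 10: the agent can split into {7, 3} and {5, 4, 1}
```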
Conclusion

In this survey, we have discussed two important kinds of fair division: cake-cutting and the division of indivisible items. Various normative concepts and applicable procedures have been introduced. Clearly, envy-freeness is the major normative property in this respect, but it cannot always be satisfied. We also discussed recent results from the literature on Computational Social Choice that concern the computational complexity of fair-division algorithms.

Future Directions

Patently, fair division is a hard problem, whatever the things being divided are. While some conflicts are ineradicable, the trade-offs that best resolve them are by no means evident. Neither are the best algorithms for solving fair-division problems, or their computational complexity; we expect continuing progress to be made in these areas. We also expect practical problems of fair division, ranging from the splitting of the marital property in a divorce to determining who gets what in an international dispute, to be more and more solved by algorithms available for online use (see, e.g., http://www.spliddit.org or http://www.nyu.edu/projects/adjustedwinner/).

Bibliography

Aziz H (2015) A note on the undercut procedure. Soc Choice Welf 45(4):723–728
Aziz H, Mackenzie S (2016) A discrete and bounded envy-free cake cutting protocol for any number of agents. Preprint, https://arxiv.org/abs/1604.03655
Barbanel JB (2005) The geometry of efficient fair division. Cambridge University Press, Cambridge
Barbanel JB, Brams SJ (2004) Cake division with minimal cuts: envy-free procedures for three persons, four persons, and beyond. Math Soc Sci 48(3):251–269
Barbanel JB, Brams SJ, Stromquist W (2009) Cutting a pie is not a piece of cake. Am Math Mon 116(6):496–514
Barbera S, Bossert W, Pattanaik PK (2004) Ranking sets of objects. In: Barbera S, Hammond PJ, Seidl C (eds) Handbook of utility theory, vol 2. Springer, New York
Bouveret S, Lang J (2008) Efficiency and envy-freeness in fair division of indivisible goods: logical representation and complexity. J Artif Intell Res 32:525–564
Bouveret S, Lang J (2011) A general elicitation-free protocol for allocating indivisible goods. Proceedings of the 22nd international joint conference on artificial intelligence, pp 73–78
Bouveret S, Endriss U, Lang J (2010) Fair division under ordinal preferences: computing envy-free allocations of indivisible goods. Proceedings of the 19th European conference on artificial intelligence
Bouveret S, Chevaleyre Y, Maudet N (2016) Fair allocation of indivisible goods. In: Brandt F, Conitzer V, Endriss U, Lang J, Procaccia AD (eds) Handbook of computational social choice. Cambridge University Press, New York
Brams SJ (2006) Fair division. In: Weingast BR, Wittman D (eds) Oxford handbook of political economy. Oxford University Press, New York
Brams SJ, Kilgour DM (2001) Fallback bargaining. Group Decis Negot 10(4):287–316
Brams SJ, King DL (2005) Efficient fair division: help the worst off or avoid envy? Ration Soc 17(4):387–421
Brams SJ, Taylor AD (1995) An envy-free cake division protocol. Am Math Mon 102(1):9–18
Brams SJ, Taylor AD (1996) Fair division: from cake-cutting to dispute resolution. Cambridge University Press, New York
Brams SJ, Taylor AD (1999) The win-win solution: guaranteeing fair shares to everybody. W.W. Norton, New York
Brams SJ, Taylor AD, Zwicker WS (1997) A moving-knife solution to the four-person envy-free cake division problem. Proc Am Math Soc 125(2):547–554
Brams SJ, Edelman PH, Fishburn PC (2001) Paradoxes of fair division. J Philos 98(6):300–314
Brams SJ, Edelman PH, Fishburn PC (2003) Fair division of indivisible items. Theor Decis 55(2):147–180
Brams SJ, Jones MA, Klamler C (2008) Proportional pie-cutting. Int J Game Theory 36:353–367
Brams SJ, Jones MA, Klamler C (2011) Divide and conquer: a proportional, minimal-envy cake-cutting procedure. SIAM Rev 53(2):291–307
Brams SJ, Feldman M, Morgenstern J, Lai JK, Procaccia A (2012a) On maxsum fair cake divisions. Proceedings of the 26th AAAI conference on artificial intelligence, pp 1340–1346
Brams SJ, Kilgour DM, Klamler C (2012b) The undercut procedure: an algorithm for the envy-free division of indivisible items. Soc Choice Welf 39:615–631
Brams SJ, Jones MA, Klamler C (2013) N-person cake-cutting: there may be no perfect division. Am Math Mon 120:35–47
Brams SJ, Kilgour DM, Klamler C (2014) Two-person fair division of indivisible items: an efficient envy-free algorithm. Not AMS 61:130–143
Brams SJ, Kilgour DM, Klamler C (2015) How to divide things fairly. Math Mag 88(5):338–348
Brams SJ, Kilgour DM, Klamler C (2017) Maximin envy-free division of indivisible items. Group Decis Negot 26:115–131
Brandt F, Conitzer V, Endriss U, Lang J, Procaccia AD (eds) (2016) Handbook of computational social choice. Cambridge University Press, New York
Budish E (2011) The combinatorial assignment problem: approximate competitive equilibrium from equal incomes. J Polit Econ 119(6):1061–1103
Dubins LE, Spanier EH (1961) How to cut a cake fairly. Am Math Mon 68:1–17
Dvoretsky A, Wald A, Wolfovitz J (1951) Relations among certain ranges of vector measures. Pac J Math 1:59–74
Edmonds J, Pruhs K (2006) Cake cutting really is not a piece of cake. Proceedings of the 17th annual ACM-SIAM symposium on discrete algorithms, pp 271–278
Even S, Paz A (1984) A note on cake cutting. Discret Appl Math 7:285–296
Herreiner DK, Puppe C (2002) A simple procedure for finding equitable allocations of indivisible goods. Soc Choice Welf 19:415–430
Jones MA (2002) Equitable, envy-free and efficient cake cutting for two people and its application to divisible goods. Math Mag 75(4):275–283
Kurokawa D, Lai JK, Procaccia A (2013) How to cut a cake before the party ends. Proceedings of the 27th AAAI conference on artificial intelligence, pp 555–561
Lyapounov A (1940) Sur les fonctions-vecteurs completement additives. Bull Acad Sci USSR 4:465–478
Moulin H (2003) Fair division and collective welfare. MIT Press, Cambridge
Procaccia AD (2009) Thou shalt covet thy neighbor's cake. Proceedings of the 21st international joint conference on artificial intelligence, pp 239–244
Procaccia AD (2013) Cake cutting: not just a child's play. Commun ACM 56(7):78–87
Procaccia AD (2016) Cake cutting algorithms. In: Brandt F, Conitzer V, Endriss U, Lang J, Procaccia AD (eds) Handbook of computational social choice. Cambridge University Press, New York
Procaccia AD, Wang J (2014) Fair enough: guaranteeing approximate maximin shares. Proceedings of the 15th ACM conference on economics and computation, pp 675–692
Pruhs K, Woeginger GJ (2012) Divorcing made easy. In: Kranakis E, Krizanc D, Luccio F (eds) FUN 2012, LNCS 7288. Springer, pp 305–314
Rawls J (1971) A theory of justice. Harvard University Press, Cambridge
Robertson J, Webb W (1998) Cake-cutting algorithms. A K Peters, Natick
Roemer JE (1996) Theories of distributive justice. Harvard University Press, Cambridge
Rothe J (ed) (2016) Economics and computation: an introduction to algorithmic game theory, computational social choice and fair division. Springer, Berlin
Ryan A (2006) Fairness and philosophy. Soc Res 73:597–606
Stromquist W (1980) How to cut a cake fairly. Am Math Mon 87(8):640–644
Stromquist W (2008) Envy-free cake divisions cannot be found by finite protocols. Electron J Comb 15:R11:1–10
Su FE (1999) Rental harmony: Sperner's lemma in fair division. Am Math Mon 106:930–942
Thomson W (2016) Introduction to the theory of fair allocation. In: Brandt F, Conitzer V, Endriss U, Lang J, Procaccia AD (eds) Handbook of computational social choice. Cambridge University Press, New York
Vetschera R, Kilgour DM (2014) Fair division of indivisible items between two players: design parameters for contested pile methods. Theor Decis 76(4):547–572
Woeginger GJ, Sgall J (2007) On the complexity of cake cutting. Discret Optim 4(2):213–220
Young HP (1994) Equity: in theory and practice. Princeton University Press, Princeton
Social Choice Theory

Salvador Barberà
MOVE, Universitat Autònoma de Barcelona and Barcelona GSE, Barcelona, Spain

Article Outline

Glossary
Definition of the Subject
Introduction
Cyclical Patterns and Arrow's Impossibility Theorem
Sen's Result on the Impossibility of a Paretian Liberal
Incentives: The Gibbard-Satterthwaite Theorem
Escaping Impossibilities
Voting Rules: A Gallery of Proposals
Broader Horizons
Future Research
Bibliography

Keywords

Aggregation rules · Voting methods · Social choice functions · Impossibility theorems · Arrow's impossibility theorem · Chaos theorems · Characterizations · Strategy-proofness · Single peakedness · Liberalism

Glossary

Aggregation rules These are methods that combine information about the preferences of agents in society and turn them into binary relations, interpreted as "collective preferences," that may or may not inherit the properties of those attributed to individuals.
Arrow's impossibility theorem This pioneering result expresses the logical impossibility of aggregating individual transitive preferences into social transitive preferences, when a society faces more than two alternatives, while respecting the Arrowian conditions of Independence of Irrelevant Alternatives, Non-Dictatorship, Universal Domain, and Pareto.
Chaos theorems Cyclical patterns in social preferences arise in many cases, under a wide variety of aggregation rules. In multidimensional settings, where social alternatives can be identified with vectors of characteristics, chaos theorems prove that such cyclical patterns can emerge, even if individual preferences are restricted to be saturated and concave, in almost arbitrary forms.
Characterizations Many aggregation rules, social choice functions, and voting rules have been proposed and used by societies. Results that make explicit the unique sets of properties that characterize single rules, or classes of them, illuminate the advantages and the drawbacks of such proposals.
Impossibility theorems Some important results in social choice theory state that certain combinations of axioms cannot be jointly satisfied by any aggregation rule, or by any social choice function, or by any voting method. These results highlight the existence of unavoidable trade-offs between normatively attractive properties that one might demand from such procedures. They are best interpreted as invitations to explore the possibility of satisfying attractive subsets of such properties through the choice of appropriate rules.
Liberalism This is one of the many axioms that can be imposed on social decision-making models. It requires agents to be endowed with power to choose between those alternatives that only differ in aspects that are of their sole concern.
Single peakedness This is a condition on preference profiles, requiring alternatives to be linearly ordered in such a way that, for each individual, any alternative is worse than all
others that are "between" it and his or her unique best.
Social choice functions These are methods that combine information about the preferences of agents in society and turn them into a social decision.
Strategy-proofness This is a property regarding the incentives of agents to reveal their true characteristics when participating in a collective decision process.
Voting methods These are collective decision-making procedures based on two definitory elements: the information that agents are allowed to express when filling a ballot, and the rules to be used when determining a winner from the set of votes emitted by society members.

Definition of the Subject

The use of voting methods dates back to ancient times, and many rules have been designed and used by different societies to take into account the opinions of their members. Yet, the systematic study of such methods followed a rather discontinuous path. It reached one initial peak in the Middle Ages, lived a golden age during the Enlightenment, and was revived in the middle of the twentieth century thanks to the works of Black and, very especially, of Arrow, who set the grounds for a comprehensive view of the subject of social choice and provided new tools of analysis. Modern social choice theory is the result of this extended view. Its distinctive characteristic is the normative study of the values and procedures involved in collective decision-making, through the use of the axiomatic method. Kenneth Arrow, John Harsanyi, Amartya Sen, and Eric Maskin are Nobel laureates with leading contributions to social choice.

Introduction

The use of voting methods dates back to ancient times, and many different rules have been designed by different societies to rule their assemblies, to pass judgment on disputed issues, to appoint officers, or to choose political and religious leaders. A letter of Pliny the Younger to Titus Aristo already contains a rich discussion of strategic voting in the Roman Senate. And the early Byzantine church already used methods that we now call rules of k names to arbitrate between the secular and the religious powers. Yet, the systematic study of such methods followed a rather discontinuous path. It reached one early peak in the Middle Ages, starting with Ramon Llull's early proposals of voting systems, whose use and study have persisted to our days. Then followed a wide gap, until the Enlightenment marked a golden age, with authors as influential as Borda and Condorcet, who connected voting with the great issues of political philosophy. Then, another period of lesser activity followed, until, in the middle of the twentieth century, the works of Arrow (1951, 1963) and Black (1948, 1958) marked the beginning of a period of renewed interest in voting, and a definite extension of many of the ideas already advanced in its study to cover an even longer list of topics and concerns. The volume on Classics of Social Choice, edited by McLean and Urken (1995), collects many historical contributions to the subject. The editors' introduction highlights the fact that authors in each of these periods of intense discussion were not only distant in time but also mainly ignored their predecessors.
Social choice theory is the result of this extended view. Its topics of interest include the design of collective decision-making mechanisms, whether through voting or by other means, and the analysis of procedures to aggregate preferences, information, or judgments; these reflect a concern for different ethical and pragmatic principles, ranging from justice and equity to efficiency, incentives, and computational complexity. One distinctive characteristic of social choice theory is the use of axiomatic analysis, and its focus on normative, rather than descriptive, aspects. This distinguishes social choice from a large part of political science and of political economy, whose choice of topics and methodology is positive. Methods leading, directly or indirectly, to collective
decision-making, are modeled as functions, and their desirable properties are expressed in the form of axioms that each potential method may or may not satisfy. This allows discussing the ability of each method to satisfy some of these desiderata, to compare alternative procedures, and eventually to characterize each one of them in terms of the properties they can meet. Alternatively, when demands on single methods are incompatible, impossibility results are proven and become a helpful tool to stop fruitless debates and to suggest profitable trade-offs between competing principles.
We concentrate in the text on a few essential results that have marked the development of social choice theory, from Arrow to our days, and add a necessarily incomplete list of references. The surveys and textbooks that are mentioned as additional references should help the interested reader to compensate for possible biases in the present list.

Cyclical Patterns and Arrow's Impossibility Theorem

Consider a society where three voters, 1, 2, and 3, must decide which of three alternatives x, y, and z to choose. Assume that their preferences are expressed by a ranking of those alternatives, as shown in the following table, where alternatives are listed by order of preference for each of the agents (Table 1).

Social Choice Theory, Table 1 Agents' preferences
R1   R2   R3
x    z    y
y    x    z
z    y    x

Clearly, x is preferred to y by a majority of voters, and with the same criterion y is preferred to z, and z is preferred to x. The resulting comparisons between these alternatives do not yield a transitive binary relation, but a cyclical one. This simple example is often referred to as the "paradox of voting." Already known by Condorcet (1785) in the eighteenth century, it announces a host of problems that generalize to many different contexts and apply to many rules other than simple majority. The example and related ones show that simple majority cannot be used to aggregate the transitive preferences of individuals over more than two alternatives and always guarantee the transitivity of the aggregate (not even its acyclicity, in case social indifferences are allowed). Hence, simple majority is not an aggregator that delivers objects of the same kind as the inputs it aggregates. Moreover, it is not even a proper method to select best decisions. This is because a best alternative may not exist for cyclical binary relations; hence, majority rule cannot always be used to determine which alternative is socially best, a purpose that it could serve if its outputs were transitive, or at least acyclic. Thus, majority voting, which is an extremely attractive rule to aggregate preferences and to adopt decisions when only two alternatives are at stake, runs into difficulties when it comes to properly order, or simply to choose from, sets of three or more alternatives.
In his seminal book Social Choice and Individual Values (1951, 1963), Arrow extended this remark well beyond the case of majority voting and showed that the problems we just pointed at will be shared by any voting rule satisfying some conditions that he considered minimal.
Let A be the set of alternatives and R be the set of all possible rational (that is, reflexive, complete, and transitive) preference relations on A. For each i ∈ N, let Ri ⊆ R be the family of i's admissible preferences.
A preference aggregation rule on ×i∈N Ri ⊆ R^n is a map F : ×i∈N Ri → B, where B denotes the set of all reflexive and complete binary relations on A.
Elements of R^n are called preference profiles, denoted by RN = (R1, . . ., Rn).
A preference aggregation rule F has universal domain if all individual preferences are admissible, that is, if for each i ∈ N, Ri is the set of all orderings on A (i.e., if Ri = R).
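For readers who want to verify the cyclical majority pattern of Table 1 mechanically, a minimal sketch follows; it assumes each ranking is encoded as a Python list from best to worst, and the helper names are purely illustrative.

# Each voter's ranking, listed from best to worst (the profile of Table 1).
profile = {
    1: ["x", "y", "z"],
    2: ["z", "x", "y"],
    3: ["y", "z", "x"],
}

def prefers(ranking, a, b):
    """True if alternative a is ranked above alternative b."""
    return ranking.index(a) < ranking.index(b)

def pairwise_majority(profile, a, b):
    """Return the alternative preferred to the other by a strict majority."""
    votes_a = sum(prefers(r, a, b) for r in profile.values())
    return a if votes_a > len(profile) - votes_a else b

for a, b in [("x", "y"), ("y", "z"), ("z", "x")]:
    print(pairwise_majority(profile, a, b))
# Prints x, y, z: x beats y, y beats z, and z beats x, so the social relation
# generated by simple majority is cyclical (neither transitive nor acyclic).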
A preference aggregation rule F satisfies independence of irrelevant alternatives on ×i∈N Ri ⊆ R^n if the social preferences between alternatives x and y depend only on the individual preferences between x and y. Formally, for two preference profiles (R1, . . ., Rn) and (R′1, . . ., R′n) such that for all individuals i, alternatives x and y have the same order in Ri as in R′i, alternatives x and y have the same order in F(R1, . . ., Rn) as in F(R′1, . . ., R′n).
A preference aggregation rule F satisfies the Pareto condition if, when an alternative x is ranked strictly above y by all individual orderings R1, . . ., Rn, then x is ranked strictly above y by F(R1, . . ., Rn).
A preference aggregation rule satisfies non-dictatorship if there is no individual i whose strict preferences always prevail. That is, there is no i ∈ N such that for all (R1, . . ., Rn) ∈ ×i∈N Ri, x ranked strictly above y by Ri implies x ranked strictly above y by F(R1, . . ., Rn), for all x and y.
A social welfare function F on ×i∈N Ri ⊆ R^n is a preference aggregation rule whose elements in the range are transitive orders (i.e., F : ×i∈N Ri → R).

Theorem 1 (Arrow's Impossibility Theorem, 1951–1963) When the number of alternatives is larger than two, no social welfare function can simultaneously satisfy the conditions of universal domain, independence of irrelevant alternatives, Pareto, and nondictatorship.

Notice that Arrow starts by formalizing the methods that would solve his aggregation problem as functions, then imposes desiderata on them, in terms of the axioms that one may want these functions to satisfy, to conclude that these axioms are mutually incompatible. Arrow's theorem can be proved in many different ways. See, for example, Sen (1970), Mas-Colell et al. (1995), Fishburn (1970), Barberà (1983b), Geanakoplos (2005), Yu (2012).
Other results in social choice that are also obtained by the axiomatic method need not be negative or may involve other classes of functions than the ones Arrow referred to. We shall present these different setups and corresponding results, positive or negative.
At any rate, Arrow's impossibility theorem opened the door to numerous interpretations and qualifications and generated a whole body of literature that we will discuss below. Before we do, let us present two other important results that also had a strong impact on the development of social choice theory.

Sen's Result on the Impossibility of a Paretian Liberal

Amartya Sen proposed the following puzzle, regarding the possibility of allowing individuals to freely choose the characteristics of alternatives that are of their sole concern, while respecting the Pareto principle. He started with a colorful example. Two agents, one of them (V) lascivious, the other (D) a prude, must decide whether one of them reads a copy of Lady Chatterley's Lover. The alternatives are that V reads it (v), that D reads it (d), and that none of them does (n). Suppose that D's ideal is that nobody reads, but he thinks that, if someone must, he prefers to do it himself, rather than letting V enjoy the book. On his side, V considers it a waste that nobody reads the book, prefers to read it himself, but enjoys even more the prospect that the prude is the one to do it. The resulting preference orders would be as follows (Table 2).

Social Choice Theory, Table 2 Agents' D and V preferences
Prude (D)   Lascivious (V)
n           d
d           v
v           n

Sen proposed that a minimal condition of liberalism should allow agents to decide about those aspects of the decision that are of their exclusive concern. In that case, each agent should be able to choose between reading the book or not, and the social ranking of alternatives should agree with the result of these free choices. Hence, the social preference would have to rank n over d, because D prefers not to read, and v over n, because V prefers to read. If social preferences were transitive, then v would be declared better than d. But this ranking contradicts the Pareto principle, since v is Pareto dominated by d!
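A small computational check of this example, under the stated decisiveness assignments, is sketched below; the encoding of the two preference orders and the helper names are illustrative only.

# Preferences from Table 2, best to worst.
prefs = {"D": ["n", "d", "v"], "V": ["d", "v", "n"]}

def above(agent, a, b):
    r = prefs[agent]
    return r.index(a) < r.index(b)

social = set()   # strict social comparisons (a, b), meaning "a socially above b"

# Minimal liberalism: D is decisive over {n, d}, V over {v, n}.
social.add(("n", "d") if above("D", "n", "d") else ("d", "n"))
social.add(("v", "n") if above("V", "v", "n") else ("n", "v"))

# Weak Pareto: unanimous strict preferences become social preferences.
for a in "ndv":
    for b in "ndv":
        if a != b and above("D", a, b) and above("V", a, b):
            social.add((a, b))

print(sorted(social))
# [('d', 'v'), ('n', 'd'), ('v', 'n')]: n over d, d over v, and v over n form a
# cycle, so no acyclic (let alone transitive) social preference can satisfy
# both minimal liberalism and the Pareto condition at this profile.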
The contradiction raised by this example was formally stated and extended in Sen (1970) to general situations, in the following terms. Let us assume that each individual i ∈ N has a "protected or recognized private sphere" Di consisting of at least one pair of personal alternatives over which this individual is decisive both ways in the social choice process, i.e., (x, y) ∈ Di if and only if (y, x) ∈ Di, with x ≠ y. Di is called symmetric in this case. And decisiveness means that whenever (x, y) ∈ Di and xPiy, then xPy, and whenever (y, x) ∈ Di and yPix, then yPx for society.
A preference aggregation rule satisfies liberalism if for each individual i, there is at least one pair of personal alternatives (x, y) ∈ A × A such that the individual is decisive both ways in the social choice process. Therefore, (x, y) ∈ Di and xPiy imply xPy, and (y, x) ∈ Di and yPix imply yPx for society.
Minimal liberalism is the weaker requirement that the above property should at least hold for two agents.
Finally, Sen concentrated attention on aggregation rules whose images are acyclic, in order to ensure nonempty social choices. The result then is:

Theorem 2 (Sen's impossibility of a Paretian liberal, 1970) There is no preference aggregation rule generating acyclic social preferences that satisfies the universal domain condition, weak Pareto efficiency, and minimal liberalism.

This "Paretian liberal" paradox gave rise to many comments and reformulations that we'll discuss below.

Incentives: The Gibbard-Satterthwaite Theorem

The possibility of strategic behavior on the part of voters has been recognized since ancient times: Pliny the Younger already discussed an instance of it in one of his letters, and medieval authors were clearly concerned by the possibility that voters might not vote for the best candidate. The idea that one may cast a "useful vote," rather than a "sincere" one, is in the mind of any participant in a decision process. But Arrow explicitly decided not to tackle the subject, and in spite of some early and brilliant contributions by Farquharson (1969) and Vickrey (1960), it was only in the 1970s that the topic entered in full force into social choice theory, thanks to the simultaneous discovery by Gibbard (1973) and Satterthwaite (1975) of what is now called the Gibbard-Satterthwaite theorem, and the contemporary work of Prasanta Pattanaik (1976, 1978).
Here is the seminal result in this area, again in the form of an impossibility theorem.
A social choice function on ×i∈N Ri ⊆ R^n is a function f : ×i∈N Ri → A.
A social choice function f on ×i∈N Ri is nondictatorial if there is no individual d such that for all (R1, . . ., Rn) ∈ ×i∈N Ri, f(R1, . . ., Rn) is a most preferred alternative for Rd in A.
A social choice function f on ×i∈N Ri is manipulable at RN ∈ ×i∈N Ri by coalition C ⊆ N if there exists R′C ∈ ×i∈C Ri (with R′i ≠ Ri for any i ∈ C) such that f(R′C, RN\C) Pi f(RN) for all i ∈ C. A social choice function is group strategy-proof if it is not manipulable at any RN by any coalition C ⊆ N, and it is (individually) strategy-proof if it is not manipulable by any singleton agent.

Theorem 3 (Gibbard-Satterthwaite impossibility result) There is no social choice function that is nondictatorial, strategy-proof, and has at least three possible outcomes in the range.

The Gibbard-Satterthwaite theorem shows that when society must eventually choose out of more than two alternatives, using a nondictatorial rule, there will exist preference profiles where an agent would gain from not declaring her true preferences. Telling the truth is not a weakly dominant strategy, because it is not always best. In other terms, conditioning one's vote on those used by other agents may be optimal, and strategic voting will be the rule. Given its importance, many different proofs of this theorem have been developed. See Gibbard (1973) for the original one, Mas-Colell et al. (1995), Schmeidler and Sonnenschein (1978), Barberà (1983a).
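As an illustration of the kind of manipulation the theorem guarantees, the sketch below applies the Borda count with alphabetical tie-breaking, one concrete nondictatorial social choice function with three outcomes in its range (the profile and names are purely illustrative), and exhibits a situation where one voter gains by misreporting.

def borda_winner(profile, alternatives):
    """Borda count with alphabetical tie-breaking among top scorers."""
    scores = {a: 0 for a in alternatives}
    for ranking in profile:
        for pos, a in enumerate(ranking):
            scores[a] += len(alternatives) - 1 - pos
    best = max(scores.values())
    return min(a for a in alternatives if scores[a] == best)

alts = ["a", "b", "c"]
truthful = [["a", "b", "c"], ["b", "c", "a"], ["c", "a", "b"]]
print(borda_winner(truthful, alts))     # 'a' (all scores tie; break to 'a')

# Voter 2's true ranking is b > c > a, so 'a' is her worst outcome.
misreport = [["a", "b", "c"], ["c", "b", "a"], ["c", "a", "b"]]
print(borda_winner(misreport, alts))    # 'c', which voter 2 prefers to 'a'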
In fact, there exist interesting parallels between Arrow's and the Gibbard-Satterthwaite theorems, which have been stressed by many authors, including Satterthwaite (1975), Muller and Satterthwaite (1977, 1985), Reny (2001), Eliaz (2004), and Yu (2013).

Escaping Impossibilities

As we already signaled, the axiomatic method is the distinctive mark of social choice theory. Impossibility results are not its only outputs, although the famous ones we quoted above are very salient. Their importance resides in the fact that they pave the way to possibility results: while setting the limits of what can be achieved, and clearly establishing that no collective decision-making rule is by any means perfect, they also indicate the directions in which it is possible to identify some attractive ones. In what follows we describe what are often called "escape routes" away from impossibilities. Some allow for real escapes, others retain a negative flavor, proving the depth of those conflicts that were unveiled by the original results.

Escaping Arrow's Impossibility
Arrow's framework is particular in several ways, as it concentrates on social welfare functions, whose domain consists of profiles of preference orders and whose images must also be transitive. One possible way to escape the impossibility is by changing the domain or the range of definition of the functions we consider. This will be discussed later, but first consider those changes that keep the original framework and only depart from it in some specific aspect.
One of them is to weaken the requirement of transitivity that is imposed on the aggregate preferences. After all, transitivity is sufficient for the existence of maximal elements given any finite set of alternatives to choose from, but not necessary. By relaxing the transitivity condition on their image, aggregation rules that satisfy all the rest of Arrow's conditions can be identified and characterized. Mas-Colell and Sonnenschein (1972), Plott (1973), Blair et al. (1976), Sen (1977a), and Blair and Pollack (1979) are early examples of an extensive literature on the subject, enriched still more recently by the work of Bossert and Suzumura (2010).
A second route of escape from the impossibility is found by excluding certain preference profiles from the domain of preference aggregation rules, thus weakening the universal domain requirement. The most celebrated criterion to restrict preference profiles is Black's (1948) single peakedness condition.
For any i ∈ N, let t(Ri) denote the best alternative of Ri on A, also called its peak. A preference profile RN is single peaked if there exists a linear order < on A such that for each agent i ∈ N and any two alternatives x, y ∈ A, if t(Ri) < x < y or y < x < t(Ri), then xPiy.
In many applications, the order relative to which single peakedness is predicated arises naturally from the interpretation of the situation that is modeled. This is the case, for example, when alternatives are political candidates, whose position on a left-right spectrum is agreed upon by all voters, or locations of some public facility on a linear space. In other cases, determining the existence of an order that turns the profile into a single peaked one is itself a question that needs analysis. Hence, one may ask whether there are properties that characterize single peakedness without the need to refer explicitly to the underlying order. Indeed there are two, which were discovered with a considerable time gap between them. One condition was identified in seminal papers by Sen (1966) and Sen and Pattanaik (1969), and it involves the ranking of triples of alternatives by triples of agents.
A preference profile RN satisfies Condition 1 if for any three agents and each triple of alternatives, there exists one alternative that no agent ever ranks as being worse than the other two.
Condition 1 is necessary for a profile to be a candidate to satisfy single peakedness, but not sufficient. Actually, it is one of three conditions that Sen and Pattanaik (1969) collected under the common name of value restriction. Much more recently, Ballester and Haeringer (2011) identified a second necessary condition, this time involving four alternatives, but only two agents at a time.
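Condition 1 lends itself to a direct computational reading (Condition 2, stated next, can be checked analogously): for every triple of agents and every triple of alternatives, some alternative in the triple must never be ranked last, within the triple, by any of the three agents. The sketch below is illustrative only; profiles are lists of rankings from best to worst.

from itertools import combinations

def ranks_worst(ranking, a, triple):
    """True if, within the triple, alternative a is the lowest-ranked one."""
    return all(ranking.index(a) > ranking.index(b) for b in triple if b != a)

def satisfies_condition_1(profile, alternatives):
    for agents in combinations(range(len(profile)), 3):
        for triple in combinations(alternatives, 3):
            if not any(
                not any(ranks_worst(profile[i], a, triple) for i in agents)
                for a in triple
            ):
                return False
    return True

# The Condorcet-cycle profile of Table 1 violates Condition 1 ...
cycle = [["x", "y", "z"], ["z", "x", "y"], ["y", "z", "x"]]
# ... while this profile, single peaked on the order x < y < z, satisfies it.
peaked = [["x", "y", "z"], ["y", "x", "z"], ["z", "y", "x"]]
print(satisfies_condition_1(cycle, ["x", "y", "z"]))    # False
print(satisfies_condition_1(peaked, ["x", "y", "z"]))   # True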
A preference profile RN satisfies Condition 2 if for any two agents i and j, and every four alternatives x, y, z, w such that xPiyPiz and zPjyPjx, it cannot be that wPiy and wPjy.
These two authors prove that Conditions 1 and 2, together, characterize single peaked preference profiles. This result nicely closes a gap in our understanding of single peakedness. The authors also characterized, in a similar manner, other related domain restrictions based on the ordering of alternatives.
An immediate consequence of assuming that all preference profiles in the domain of an aggregation rule are single peaked is that the social preference obtained by simple majority voting is transitive when the number of agents is odd, and quasi-transitive in all cases. Since majority voting satisfies all other conditions demanded by Arrow's impossibility theorem, restricting preferences to be single peaked allows us to escape from the impossibility. Other early conditions were due to Inada (1964, 1969). In fact, many other nice aggregation rules can also be defined on single peaked domains. See Austen-Smith and Banks (1999).
A very classical observation regarding majority voting as a social choice function is that for all single peaked preference profiles, it selects the median of the peaks' distribution whenever this is unique. This median voter result is extremely useful to analyze political and location problems, among other applications. Since single peakedness is such a fruitful condition on domains and leads to nice positive results, it is natural to ask whether some alternative condition on preference profiles may lead to similar conclusions.
Arrow stressed that single peakedness relates the agents' preferences to an underlying one-dimensional ranking that expresses a similarity of their views regarding alternatives, even if they value them differently. That led him to interpret that similarity among agents is at the root of the solution to the aggregation problem. Another, not necessarily contradictory, order reflecting similarity of views is at the basis of single crossing, which can be expressed as follows.
A preference profile RN satisfies single crossing if there exist a linear order > on the set of alternatives and a linear order >′ on the set of agents such that for all i, j ∈ N with j >′ i and all x, y ∈ A with y > x, if yPix then yPjx.
The implications of single crossing on the design of aggregation and decision rules are quite parallel to those of single peakedness. For an odd set of agents, the preference of the median voter, according to the reference ranking of agents, actually coincides with the majoritarian social preference and is thus transitive. And the top alternative for this median agent is the majority winner. Again, slight qualifications must be added when the number of voters is even. Thus, single crossing performs very well.
Given the similarities of results, one could ask whether single peakedness and single crossing are somewhat part of the same family of domain conditions. Indeed, they are. In Barberà and Moreno (2011), it is proven that, in fact, there is a common root: a weaker condition, called top monotonicity, can be imposed on domains, is implied by both single peakedness and single crossing, and still retains the common property that a median voter is well defined and would choose the majoritarian outcome.
Still another related domain restriction is intermediateness (see Grandmont (1978) and Gans and Smart (1996)). Extensions of single peakedness to multidimensional settings have been proposed. One of them is to impose similar conditions on a tree, rather than on a line, which still helps define domain restrictions allowing for possibility results. See Demange (1982).
However, the most ambitious and probably most natural extension of single peakedness is the one adopted by most of political science and location theory, whereby agents' preferences are concave and have a single maximal element on a multidimensional space. Very strong negative results, sometimes termed "chaos" theorems, have proven the persistence and pervasiveness of cycles under rules satisfying Arrow's conditions even in these restricted domains. Major early contributions to this very important literature for political science were made by McKelvey (1976, 1979) and Schofield (1978).
For similar reasons, economists have also studied carefully the consequences on Arrow's theorem of imposing domain restrictions that are natural in economic environments.
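The median voter observation above can be checked directly for a simple special case: voters whose single-peaked preferences are given by the distance to an ideal point on a line (one particular instance of single peakedness; the numbers and names below are illustrative). With an odd number of voters, the median peak defeats every other alternative in pairwise majority votes.

def majority_prefers(peaks, a, b):
    """Number of voters strictly preferring a to b when utility is -|x - peak|."""
    return sum(abs(a - p) < abs(b - p) for p in peaks)

alternatives = range(0, 11)      # locations 0..10 on a line
peaks = [2, 3, 7, 8, 9]          # one ideal point per voter (odd number of voters)
median_peak = sorted(peaks)[len(peaks) // 2]

is_condorcet_winner = all(
    majority_prefers(peaks, median_peak, b) > len(peaks) / 2
    for b in alternatives if b != median_peak
)
print(median_peak, is_condorcet_winner)   # 7 True: the median peak beats every rival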
Until here, we have essentially discussed changes in Arrow's demands that allow for some relaxations of his impossibility result, and seen that some departures are more productive than others. But a more radical escape from Arrow's impossibility results comes from changing the very framework of discussion.
By assuming that the inputs to be aggregated were preference orders, rather than utility functions, Arrow explicitly ruled out the possibility of giving the values of utilities any meaning other than ordinal. And he proposed the condition of independence of irrelevant alternatives as a guarantee that interpersonal comparisons of utility would be ruled out. In an important departure from Arrow's framework, Sen (1977b), d'Aspremont and Gevers (1977), Hammond (1976), and Roberts (1980) proposed the study of social welfare functionals, rather than social welfare functions, as a wider framework to analyze the informational basis of aggregation problems. A very lucid account of their main conclusions is contained in Moulin (1988, Chap. 2).
Denote by U the set of all possible utility functions on A.
A social welfare functional F : U^n → R is a rule that assigns a rational preference relation F(u1, . . ., un) among the alternatives in A to every possible profile of individual utility functions u1(·), . . ., un(·) defined on A. The strict preference relation derived from F(u1, . . ., un) is Fp(u1, . . ., un).
A social welfare functional may use more or less of the many possible pieces of information contained in the numerical values of a utility profile. It is most useful to think of how much information it does not use, by considering the families of utility profiles that share the same social image. Based on the idea that, if these families are obtained through certain changes in utilities, then the invariance of their images shows that the differences between these utility changes are not taken into account by the functional, the following definitions become natural.
We say that the social welfare functional F : U^n → R is invariant to common cardinal transformations if F(u1, . . ., un) = F(u′1, . . ., u′n) whenever the profiles of utility functions (u1, . . ., un) and (u′1, . . ., u′n) differ only by a common change of origin and units, that is, whenever there are numbers b > 0 and a such that ui(x) = b u′i(x) + a for all i and x ∈ A. If the invariance is only with respect to common changes of origin (i.e., we require b = 1) or of units (i.e., we require a = 0), then we say that F(·) is invariant to common changes of origin or of units, respectively.
The social welfare functional F : U^n → R does not allow interpersonal comparisons of utility if F(u1, . . ., un) = F(u′1, . . ., u′n) whenever there are numbers bi > 0 and ai such that ui(x) = bi u′i(x) + ai for all i and x ∈ A. If the invariance is only with respect to independent changes of origin (i.e., we require bi = 1 for all i) or only with respect to independent changes of units (i.e., we require ai = 0 for all i), then we say that F(·) is invariant to independent changes of origins or of units, respectively.
Fundamental rules in the history of economic thought, like utilitarianism, or in the theory of justice, like leximin or maximin, have been characterized on the basis of the different degrees of cardinality or comparability that they allow. For nice simple proofs, see Blackorby et al. (1984). These tie in with classical and deep philosophical proposals advanced by Harsanyi (1953, 1955) and Rawls (1971), among other thinkers.
Allowing for outputs other than orders to be in the range of a rule is also a possibility: for example, lotteries over preferences, rather than a single one, or probabilistic expressions of binary relations, could be the result of an aggregation exercise. See Fishburn (1973, Chap. 18), Barberà and Sonnenschein (1978), and Barberà and Valenciano (1983).
Finally, let us not forget that the basic assumption precipitating Arrow's impossibility is the existence of at least three alternatives to be ranked. Limiting a priori the range of alternatives to be only two is again a way of escape. The classical paper by May (1952) already contains a characterization of majority voting that works well for the case of two alternatives. See also the first part of Fishburn's (1973) book for a number of additional problems regarding binary choices that are also of interest to collective decision-making.
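To make the informational-basis point concrete, here is an illustrative comparison of two classical social welfare functionals, the utilitarian sum and the Rawlsian maximin, on a two-alternative, two-person example. Both are unaffected by a common affine rescaling of the utilities, while independent changes of origin leave the utilitarian choice intact but can reverse the maximin choice; all numbers are made up for illustration.

def utilitarian(u):   # picks the alternative with the highest sum of utilities
    return max(u, key=lambda alt: sum(u[alt]))

def maximin(u):       # picks the alternative whose worst-off individual is best off
    return max(u, key=lambda alt: min(u[alt]))

# Two alternatives, two individuals; purely illustrative utility numbers.
u = {"x": (4, 1), "y": (2, 2)}
print(utilitarian(u), maximin(u))              # x y

# A common change of origin and units (b = 2, a = 3) leaves both choices intact.
common = {alt: tuple(2 * v + 3 for v in vals) for alt, vals in u.items()}
print(utilitarian(common), maximin(common))    # x y

# Independent changes of origin (add 0 to person 1 and 10 to person 2) leave the
# utilitarian choice unchanged but reverse the maximin choice: the two
# functionals rely on different pieces of the utility information.
shifted = {alt: (vals[0], vals[1] + 10) for alt, vals in u.items()}
print(utilitarian(shifted), maximin(shifted))  # x x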
Escaping Sen's Paretian Liberal Paradox
A good summary of the polemics surrounding Sen's negative result is provided in Gaertner (2009, Chap. 4). Its publication was quickly followed by a large set of proposals, most of them centered on qualifying his axiom of minimal liberalism and discussing the exact implications of his result on the assignment of rights to decide among individuals whose interests conflict, while still respecting his basic framework. The large list of references that one can find in Sen (1982, Chap. 14), including important proposals by Gibbard (1974) among others, proves the seminal character of Sen's work and the quick diffusion of his ideas. Later on, the perspective of commentators and critics shifted, to question whether his original formulation was the appropriate one to debate the issue of liberalism and private rights, and other authors proposed the use of alternative models and tools, based on the study of power distribution in game forms. Major references are Pattanaik (1996), Pattanaik and Suzumura (1996), Gaertner (1986, 1993), and Gaertner et al. (1992).

Escaping the Gibbard-Satterthwaite Impossibility
Like in the preceding cases, one way to obtain positive results regarding the possibility to design strategy-proof rules is by changing the set of objects on which agents can express their preferences and/or the outcomes in the range of the functions under study. One early reaction in that direction was to allow for more than one alternative to be chosen by society, and for agents to have preferences over sets of alternatives. Not much is gained by taking this route. See Barberà (1977), Kelly (1977), Duggan and Schwartz (2000), Benoit (2002), Barberà et al. (2001), and Taylor (2005). Some authors have explored the consequences of relaxing manipulability and studying its costs and benefits (Campbell and Kelly 2009, 2010).
Another possibility, yielding much richer theoretical proposals, though hard to use in practice, is to allow for lotteries over alternatives to be the outcome of voting. See Zeckhauser (1973), Gibbard (1977, 1978), Barberà (1979), Barberà et al. (1998), and Dutta et al. (2002).
When it comes to weakening the assumptions of the theorem within the original framework, the main possibility (other than restricting the range to only take two values) consists in restricting the domain. And, indeed, domain restrictions are very fruitful in avoiding manipulations.
Again, single peakedness comes to the rescue. See Blin and Satterthwaite (1976) and Moulin (1980a). The latter author characterized all the rules on this domain that are strategy-proof and unanimous, i.e., such that, whenever all agents agree on what alternative is their best, then this is the social choice. Without loss of generality, we can identify our finite set of ordered alternatives with integer numbers. Then, the class of minmax rules is defined as follows.
A social choice function f is a minmax rule associated with a set of integers (aS)S⊆N if for each preference profile RN, f(RN) = minS⊆N (maxi∈S {aS, t(Ri)}).
Actually, it is easy to see that all of these rules are not only strategy-proof, but also satisfy the stronger condition of group strategy-proofness. That fact can be explained by the structure that single peakedness imposes on the domain of admissible preference profiles.
There are other one-dimensional domain restrictions that also admit strategy-proof rules. Interesting cases are those where preferences are single plateaued or single dipped. See Berga (1998), or Peters et al. (1991, 1992) for an analysis of one-dimensional location problems. Other, more abstract domain restrictions were defined by Kalai and Muller (1977), Kalai and Ritz (1980), and Blair and Muller (1983).
In addition, nondictatorial strategy-proof rules can also be defined when alternatives are multidimensional and satisfy domain restrictions that extend the notion of single peakedness. But their ability to satisfy strategy-proofness depends very much on how the notion of single peakedness is extended. Results are very negative under Euclidean preferences (see Border and Jordan (1983), Barberà and Peleg (1990), and Peremans et al. (1997)). By contrast, possibilities may be obtained when we extend single peakedness by using the L1-norm, as introduced in Barberà et al. (1993).
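Before turning to the multidimensional construction, the one-dimensional minmax rules defined above can be sketched in a few lines, under the assumption that alternatives are integers and preferences are single peaked, so that only the peaks matter. The particular parameters (aS) used here, which make the rule reduce to the plain median of the peaks, are just one illustrative choice.

from itertools import chain, combinations
from statistics import median

LOW, HIGH = 0, 10                      # the range of integer alternatives

def subsets(n):
    agents = range(n)
    return chain.from_iterable(combinations(agents, k) for k in range(n + 1))

def minmax_rule(peaks, a):
    """f(RN) = min over S of max({a_S} union {t(Ri) : i in S})."""
    return min(max([a[S]] + [peaks[i] for i in S]) for S in subsets(len(peaks)))

peaks = [2, 9, 5]
n = len(peaks)
# Illustrative parameters: a_S is the top alternative for "small" coalitions
# and the bottom alternative for coalitions holding a strict majority.
a = {S: (LOW if len(S) > n / 2 else HIGH) for S in subsets(n)}

print(minmax_rule(peaks, a), median(peaks))   # 5 5: this choice of (a_S) yields the median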
Consider alternatives to be elements of the Cartesian product of K integer intervals A = B1 × . . . × BK, where Bk = {ak, . . ., bk} for each k = 1, . . ., K, and endow this set with the L1-norm. Given a, b ∈ A, the minimal box containing a and b is defined by

MB(a, b) = {g ∈ A : ‖a − b‖ = ‖a − g‖ + ‖g − b‖}.

A preference Ri ∈ R is (multidimensional) single peaked on A if it has a unique maximal element t(Ri) ∈ A, and for any g, b ∈ A, if b ∈ MB(g, t(Ri)) then bRig. Denote by P the set of (multidimensional) single peaked preferences on A.
Notice that the distance between any two alternatives a and b is the length of any shortest path between a and b, and the extension of single peakedness is based on the idea that one alternative is better than another if it is "closer" to the best.
Functions defined on Cartesian domains of preferences and Cartesian ranges satisfying the above condition will be strategy-proof if and only if they can be described as follows.
A left coalition system on Bk is a correspondence C that assigns to any ak ∈ Bk a collection of coalitions C(ak) satisfying the following conditions:

(1) Coalition Monotonicity: if W ∈ C(ak) and W ⊆ W′, then W′ ∈ C(ak).
(2) Outcome Monotonicity: if bk > ak and W ∈ C(ak), then W ∈ C(bk).
(3) C(bk) = 2^N.

A special case of the setup in Barberà et al. (1993), when only two values are possible in each dimension, was used by Barberà et al. (1991) to study voting rules to elect members of a club. Many other contributions have been made along similar lines: Serizawa (1999), Le Breton and Sen (1999), Weymark (1999), Nehring and Puppe (2007a, b). Unfortunately, if the domain is restricted and cannot contain some of the conceivable combinations of one-dimensional values, then new and challenging difficulties arise, limiting the variety of voting rules that can still be strategy-proof. See Barberà et al. (1997, 2005), Reffgen and Svensson (2012), Reffgen (2015).
Moreover, it is no longer the case that such rules are also group strategy-proof, as in the one-dimensional case. The equivalence between these two conditions depends, once more, on specific features of the domains for which voting rules are defined (see Barberà et al. (2010) and Le Breton and Zaporozhets (2009)). In spite of these caveats, we are far from the negative result of Gibbard and Satterthwaite in multidimensional spaces. This contrasts with the pervasive negative implications of multidimensionality for preference aggregation.
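For the club-membership special case just mentioned, one concrete member of this family is voting by quota: each candidate is admitted if and only if the set of voters who want her in reaches a fixed quota (componentwise majority below). A minimal sketch follows, assuming separable preferences so that each voter is described by the set of candidates she would ideally admit; all names are illustrative.

def voting_by_quota(ideal_sets, candidates, quota):
    """Admit every candidate supported by at least `quota` voters.

    ideal_sets: one set per voter, listing the candidates that voter wants in
    (the voter's ideal club); preferences are assumed separable across candidates.
    """
    elected = set()
    for c in candidates:
        support = sum(c in ideal for ideal in ideal_sets)
        if support >= quota:
            elected.add(c)
    return elected

candidates = {"ann", "bob", "carol"}
ideal_sets = [{"ann", "bob"}, {"ann"}, {"ann", "carol"}]
print(voting_by_quota(ideal_sets, candidates, quota=2))   # {'ann'}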
Voting Rules: A Gallery of Proposals

strategies, and the results of their scrutiny are the outcome function. For an example of voting rules that can be represented directly as social choice functions, consider the Borda rule. It actually provides us with a social welfare function, which satisfies all of Arrow's requirements except for independence of irrelevant alternatives, and is a member of the large family of scoring rules, all of which assign weights to alternatives based on how voters rank them, and which differ in the way that these weights are defined.
Scoring rules, with an additional tie-breaking criterion when several alternatives attain the same maximal score, also define social choice functions that have been elegantly axiomatized through interesting properties they all share. See Young (1975) and also Saari (1995, 2000).
They all fail, however, to comply with Condorcet consistency, the natural requirement that whenever an alternative would defeat all others in pairwise majority contests, and hence a majority winner exists, it should be chosen. Simple majority was characterized in an early paper by May (1952). Extending its basic principles to choose from more than two alternatives requires complementing it in some way when simple majority leads to a top cycle. Two examples of Condorcet consistent voting rules are the Copeland and the Simpson rules: see Moulin (1988, Chap. 9) for a careful study of these rules, and of the conflict between the principles underlying the two important classes of social choice functions based on scoring and on Condorcet consistency, already exhibited in Fishburn (1984).
Other rules based on majorities (simple or qualified) can be defined through the use of trees, establishing the order of elimination by vote at each node where an alternative confronts others. Examples of different rules include the amendment and the successive procedures. Apesteguia et al. (2014) provide an axiomatization of these two rules. These and other sequential rules were studied in Banks (1985) and Banks and Bordes (1988). See Moulin (1988) for a nice introduction to the subject and Austen-Smith and Banks (2005) for further details. The use of such methods gives rise to the possibility of agenda manipulation that has been studied in Barberà and Gerber (2017), following the lead of different political scientists, like Shepsle and Weingast (1984). See Moulin and Peleg (1982), who studied the possibility of implementing rules through a proper assignment of decision power. Other phenomena, like the no-show paradox, also arise when considering classes of rules.
Yet another seminal idea to construct voting rules consists in using some metric and trying to approach the voters' preferences by an appropriate distance-minimizing choice of social preferences or decisions. An important method based on this idea is the Kemeny rule. See Gaertner (2006, Chap. 6).
Approval voting is an interesting method proposed by Brams and Fishburn (1978) that is better described in terms of game forms. The ways in which voters can fill ballots, their strategies, are sets of alternatives, to be interpreted as the list of all alternatives they approve of. Then, the way to arrive at one chosen alternative consists in selecting the one that is approved by most voters (with a tie-breaking rule if necessary). The nice properties of approval voting have given rise to additional proposals. For example, one could admit that voters can not only approve of some candidates but also disapprove of others. See Felsenthal (1989), Alcantud and Laruelle (2014).
A method to select candidates that has been used since ancient times is based on the idea of balancing the decision power of two different groups of agents, by letting one side approve of a fixed number k of candidates and then allowing the other side to select one out of them. These rules have been studied by Barberà and Coelho (2010, 2017), who called them rules of k names.
Another proposal on how to choose among candidates has been made recently by Balinski and Laraki (2010), who call it majority judgment. These authors present it as a radical departure from standard voting theory. It essentially consists of allowing voters to submit numerical assessments of each candidate, from a given scale, and then choosing the candidate that obtains the highest median value.
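The failure of scoring rules to be Condorcet consistent, mentioned above, can be seen in a short numerical experiment. In the illustrative profile below (five voters, three alternatives), alternative a beats every rival in pairwise majority votes, yet the Borda count selects b.

profile = 3 * [["a", "b", "c"]] + 2 * [["b", "c", "a"]]   # five illustrative voters

def borda_scores(profile):
    m = len(profile[0])
    scores = {alt: 0 for alt in profile[0]}
    for ranking in profile:
        for pos, alt in enumerate(ranking):
            scores[alt] += m - 1 - pos
    return scores

def condorcet_winner(profile):
    alts = profile[0]
    def beats(a, b):
        return sum(r.index(a) < r.index(b) for r in profile) > len(profile) / 2
    winners = [a for a in alts if all(beats(a, b) for b in alts if b != a)]
    return winners[0] if winners else None

print(borda_scores(profile))       # {'a': 6, 'b': 7, 'c': 2}: Borda picks b
print(condorcet_winner(profile))   # 'a', which defeats both rivals 3 votes to 2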
A different proposal (Casella 2005) involves the use of storable votes. It is based on the remark that voters who participate in making a series of decisions would like to have much to say on those issues they strongly care about, even at the expense of losing some decision power on others. Hence, endowing agents with storable voting rights that they may use in any amount, in order to give more or less support to selected decisions, allows each voter to express the intensity of his or her interest regarding different issues.
Another route is suggested by the observation that, in many practical cases, it may be acceptable to let chance play a role in adopting a final decision. In fact, randomizing is often a way to provide a decision process with a sense of fairness. We may decide by chance who has to accept a dangerous task or who will benefit from public housing for which there is an excess demand. And in many civilizations and periods, drawing by lot who had to serve in public office was a way to fight corruption. These ideas justify the analysis of voting rules whose outcome is a lottery over alternatives, rather than a single alternative, or even lotteries over social preferences. Thanks to the enlargement of possible outcomes that these extensions provide, voting rules that introduce randomness allow for the solution of different aggregation and incentive issues. They also allow us to better understand the reasons why difficulties to design satisfactory rules arise in the subset of cases where only deterministic decisions are allowed. We have already introduced references to the use of lotteries in social choice in a previous section.

Broader Horizons

This already long article has not touched upon a number of very important pieces of literature that are deeply related to the ones we have surveyed and would deserve full articles on their own.
An essential one is the study of committee decisions, building from the essential result that is usually known as Condorcet's Jury Theorem. The framework is one where voters are essentially in agreement regarding their objectives but differ on the information they hold and hence on the best course of action that must be taken. The paradigmatic case is that of a jury: all jurors would like to declare a defendant guilty if he is, and to acquit him if he is innocent, but they may differ on the signals they have received regarding that crucial fact. Discovering truth becomes the objective of voting, a view that is deeply rooted in medieval writings, where the objective is to elicit God's will, or in the Enlightenment attempts to define the will of the people. Young (1988) provides a fascinating account of Condorcet's own internal debate regarding the issue and of his seminal result: when the jury faces two options, the signals they receive are independently drawn, and the correct signal is more likely to obtain than the wrong one, simple majority is the maximum likelihood estimator of truth. Extensions of this result are numerous. An important line of research has emphasized the consequences of information exchange among voters and the possible consequences of strategic behavior in that context. See Nitzan and Paroush (1985), Austen-Smith and Banks (1996), and Austen-Smith and Feddersen (2006, 2009).
Another important direction of development is that of judgment aggregation. Starting from what is known as the doctrinal paradox, this line of work, essentially developed by philosophers, investigates the difficulties that arise when complex decisions must be adopted, and the aggregation of different pieces of information can follow different patterns. Specifically, the original paradox compares the result of letting each agent aggregate different pieces of information through his own logical reasoning, and then vote on an overall decision, versus the possibility of aggregating each piece of partial evidence through partial votes, and then applying the collective logic to draw a final conclusion. Many aspects are investigated, after this starting point, many of which parallel, and eventually generalize, the classical findings of social choice theory. See List and Pettit (2002), Dietrich (2006), List and Polak (2010), and List (2012).
Voting rules are mechanisms that societies employ to make collective decisions, but certainly not the only ones. When the objects we decide upon have private components, and alternatives are complete descriptions of who gets what, one can hardly choose among them by vote. Markets allocate resources in ways that are different from voting, in spite of analogies made by expressions like "the
Bibliography

Austen-Smith D, Banks JS (1996) Information aggregation, rationality and the Condorcet Jury Theorem. Am Polit Sci Rev 90(1):34–45
Austen-Smith D, Feddersen T (2006) Deliberation, preference uncertainty and voting rules. Am Polit Sci Rev 100(2):209–217
Austen-Smith D, Feddersen T (2009) Information aggregation and communication in committees. Philos Trans R Soc B 364(1518):763–769
Baigent N (2002) Topological theories of social choice. In: Arrow KJ, Sen AK, Suzumura K (eds) Handbook of social choice and welfare, vol 2. Elsevier, Amsterdam, pp 301–334
Ballester MA, Haeringer G (2011) A characterization of the single-peaked domain. Soc Choice Welf 36:305–322
Banks JS (1985) Sophisticated voting outcomes and agenda control. Soc Choice Welf 1:295–306
Banks JS, Bordes G (1988) Voting games, indifference, and consistent sequential choice rules. Soc Choice Welf 5:31–44
Barberà S (1977) The manipulation of social choice mechanisms that do not leave too much to chance. Econometrica 45(7):1573–1588
Barberà S (1979) Majority and positional voting in a probabilistic framework. Rev Econ Stud 46(2):379–389
Barberà S (1983a) Strategy-proofness and pivotal voters: a direct proof of the Gibbard-Satterthwaite theorem. Int Econ Rev 24(2):413–418
Barberà S (1983b) Pivotal voters: a simple proof of Arrow's theorem. In: Pattanaik PK, Salles M (eds) Social choice and welfare. North-Holland, Amsterdam, pp 31–35
Barberà S (2001) An introduction to strategy-proof social choice functions. Soc Choice Welf 18:619–653
Barberà S, Coelho D (2010) On the rule of k names. Games Econom Behav 70:44–61
Barberà S, Coelho D (2017) Balancing the power to appoint officers. Games Econom Behav 101:189–203
Barberà S, Gerber A (2017) Sequential voting and agenda manipulation. Theor Econ 12(1):211–247
Barberà S, Peleg B (1990) Strategy-proof voting schemes with continuous preferences. Soc Choice Welf 7:31–38
Barberà S, Sonnenschein H (1978) Preference aggregation with randomized social orderings. J Econ Theory 18(2):244–254
Barberà S, Valenciano F (1983) Collective probabilistic judgements. Econometrica 51(4):1033–1046
Barberà S, Sonnenschein H, Zhou L (1991) Voting by committees. Econometrica 59:595–609
Barberà S, Gul F, Stacchetti E (1993) Generalized median voter schemes and committees. J Econ Theory 61:262–289
Barberà S, Massó J, Neme A (1997) Voting under constraints. J Econ Theory 76(2):298–321
Barberà S, Bogomolnaia A, van der Stel H (1998) Strategy-proof probabilistic rules for expected utility maximizers. Math Soc Sci 35(2):89–103
Barberà S, Dutta B, Sen A (2001) Strategy-proof social choice correspondences. J Econ Theory 101(2):374–394
Barberà S, Massó J, Neme A (2005) Voting by committees under constraints. J Econ Theory 122:185–205
Barberà S, Berga D, Moreno B (2010) Individual versus group strategy-proofness: when do they coincide? J Econ Theory 145(5):1648–1674
Barberà S, Moreno B (2011) Top monotonicity: a common root for single peakedness, single crossing and the median voter result. Games Econom Behav 73(2):345–359
Benoît J-P (2002) Strategic manipulation in voting games when lotteries and ties are permitted. J Econ Theory 102(2):421–436
Berga D (1998) Strategy-proofness and single-plateaued preferences. Math Soc Sci 35:105–120
Bergson A (1938) A reformulation of certain aspects of welfare economics. Q J Econ 52(2):310–334
Black D (1948) On the rationale of group decision making. J Polit Econ 56:23–34
Blackorby C, Donaldson D, Weymark JA (1984) Social choice with interpersonal utility comparisons: a diagrammatic introduction. Int Econ Rev 25:327–356
Blair D, Bordes G, Kelly J, Suzumura K (1976) Impossibility theorems without collective rationality. J Econ Theory 13(3):361–379
Blair DH, Muller E (1983) Essential aggregation procedures on restricted domains of preferences. J Econ Theory 30(1):34–53
Blair D, Pollak R (1979) Collective rationality and dictatorship: the scope of the Arrow theorem. J Econ Theory 21:186–194
Blin JM, Satterthwaite MA (1976) Strategy-proofness and single peakedness. Public Choice 26:51–58
Border K, Jordan JS (1983) Straightforward elections, unanimity and phantom voters. Rev Econ Stud 50:153–170
Brams SJ, Fishburn PC (1978) Approval voting. Am Polit Sci Rev 72(3):831–847
Brams SJ, Fishburn PC (2002) Voting procedures. In: Arrow KJ, Sen AK, Suzumura K (eds) Handbook of social choice and welfare, vol 1. North-Holland, Amsterdam, pp 173–236
Campbell DE, Kelly JS (2002) Impossibility theorems in the Arrovian framework. In: Arrow KJ, Sen AK, Suzumura K (eds) Handbook of social choice and welfare, vol 1. North-Holland, Amsterdam
Campbell DE, Kelly JS (2009) Gains from manipulating social choice rules. Econ Theory 40(3):349–371
Campbell DE, Kelly JS (2010) Losses due to manipulation of social choice rules. Econ Theory 45(3):453–467
Casella A (2005) Storable votes. Games Econom Behav 51(2):391–419 (special issue in honor of Richard D. McKelvey)
Chichilnisky G (1980) Social choice and the topology of spaces of preferences. Adv Math 37:165–176
D'Aspremont C, Gevers L (1977) Equity and the informational basis of collective choice. Rev Econ Stud 44:199–209
de Borda JC (1781) Mémoire sur les élections au scrutin. Hist Acad Roy Sci:657–665
de Condorcet M (1785) Essai sur l'application de l'analyse à la probabilité des décisions rendues à la pluralité des voix. Paris
Demange G (1982) Single-peaked orders on a tree. Math Soc Sci 3(4):389–396
Dietrich F (2006) Judgment aggregation: (im)possibility theorems. J Econ Theory 126(1):286–298
Duggan J, Schwartz T (2000) Strategic manipulability without resoluteness or shared beliefs: Gibbard-Satterthwaite generalized. Soc Choice Welf 17(1):85–93
Dutta B, Peters H, Sen A (2002) Strategy-proof probabilistic mechanisms in economies with pure public goods. J Econ Theory 106(2):392–416
Dutta B, Peters H, Sen A (2007) Strategy-proof cardinal decision schemes. Soc Choice Welf 28(1):163–179
Eliaz K (2004) Social aggregators. Soc Choice Welf 22:317–330
Farquharson R (1969) Theory of voting. Yale University Press, New Haven
Feldman A, Serrano R (2008) Arrow's impossibility theorem: two simple single-profile versions. Harv Coll Math Rev 2:46–57
Felsenthal DS (1989) On combining approval with disapproval voting. Soc Choice Welf 34(1):53–60
Fishburn PC (1970) Arrow's impossibility theorem: concise proof and infinite voters. J Econ Theory 2:103–106
Fishburn PC (1973) The theory of social choice. Princeton University Press, Princeton
Fishburn PC (1984) Discrete mathematics in voting and group choice. SIAM J Algebra Discrete Methods 5(2):263–275
Gaertner W (1986) Pareto, interdependent rights exercising and strategic behaviour. J Econ Suppl 5:79–98
Gaertner W (1993) Rights and game forms, types of preference orderings and Pareto inefficiency. In: Diewert WE, Spremann K, Stehling E (eds) Mathematical modelling in economics. Essays in honor of Wolfgang Eichhorn. Springer, Berlin/Heidelberg/New York
Gaertner W (2009) A primer in social choice theory, revised edn. Oxford University Press, New York
Gaertner W, Pattanaik PK, Suzumura K (1992) Individual rights revisited. Economica 59:161–177
Gans JS, Smart M (1996) Majority voting with single-crossing preferences. J Public Econ 59:219–237
Gärdenfors P (1973) Positionalist voting functions. Theor Decis 4:1–24
Geanakoplos J (2005) Three brief proofs of Arrow's impossibility theorem. Econ Theory 26(1):211–215
Gibbard A (1973) Manipulation of voting schemes: a general result. Econometrica 41:587–602
Gibbard A (1974) A Pareto-consistent libertarian claim. J Econ Theory 7(4):388–410
Gibbard A (1977) Manipulation of schemes that mix voting with chance. Econometrica 45:665–681
Gibbard A (1978) Straightforwardness of game forms with lotteries as outcomes. Econometrica 46(3):595–614
Grandmont JM (1978) Intermediate preferences and the majority rule. Econometrica 46:317–330
Hammond PJ (1976) Equity, Arrow's conditions, and Rawls' difference principle. Econometrica 44:793–804
Harsanyi JC (1953) Cardinal utility in welfare economics and in the theory of risk-taking. J Polit Econ 61:434–435
Harsanyi JC (1955) Cardinal welfare, individualistic ethics, and interpersonal comparisons of utility. J Polit Econ 63:309–321
Inada KI (1964) A note on the simple majority decision rule. Econometrica 32:316–338
Inada KI (1969) The simple majority decision rule. Econometrica 37:490–506
Inada KI (1970) Majority rule and rationality. J Econ Theory 2:27–40
Kalai E, Muller E (1977) Characterization of domains admitting nondictatorial social welfare functions and nonmanipulable voting procedures. J Econ Theory 16(2):457–469
Kalai E, Ritz Z (1980) Characterization of the private alternatives domains admitting Arrow social welfare functions. J Econ Theory 22(1):23–36
Kelly JS (1977) Strategy-proofness and social welfare functions without single-valuedness. Econometrica 45(2):439–446
Kirman AP, Sondermann D (1972) Arrow's theorem, many agents and invisible dictators. J Econ Theory 5(2):267–277
Le Breton M, Sen A (1999) Separable preferences, strategyproofness and decomposability. Econometrica 67(3):605–628
Le Breton M, Weymark J (1996) An introduction to Arrovian social welfare functions on economic and political domains. In: Schofield N (ed) Collective decision making: social choice and political economy. Kluwer, Dordrecht
Le Breton M, Zaporozhets V (2009) On the equivalence of coalitional and individual strategy-proofness properties. Soc Choice Welf 33(2):287–309
List C (2012) The theory of judgment aggregation: an introductory review. Synthese 187(1):179–207
List C, Pettit P (2002) Aggregating sets of judgments: an impossibility result. Econ Philos 18(1):89–110
List C, Polak B (eds) (2010) Symposium: judgment aggregation. J Econ Theory 145(2):441–638
Mas-Colell A, Sonnenschein H (1972) General possibility theorems for group decisions. Rev Econ Stud 39:185–192
Maskin E (1995) Majority rule, social welfare functions, and game forms. In: Basu K, Pattanaik PK, Suzumura K (eds) Choice, welfare and development. Festschrift for Amartya Sen. Clarendon Press, Oxford
Maskin E, Sjöström T (2002) Implementation theory. In: Arrow K, Sen AK, Suzumura K (eds) Handbook of social choice and welfare, vol 1. Elsevier Science, Amsterdam
Social Choice Theory 527
social choice and welfare, vol 1. Elsevier Science, Pattanaik PK, Peleg B (1986) Distribution of power under
Amsterdam stochastic social choice rules. Econometrica
May KO (1952) A set of lndependent necessary and 54(4):909–921
suffcient conditions for simple majority decision. Pattanaik PK, Suzumura K (1996) Individual rights and
Econometrica 20:680–684 social evaluation. Oxf Econ Pap 48:194–212
McKelvey RD (1976) Intransitivities in multidimensional Peremans W, Peters H, van der Stel H, Storcken T (1997)
voting models and some implications for agenda con- Strategy-proofness on Euclidean spaces. Soc Choice
trol. J Econ Theory 12(3):472–482 Welf 14:379–401
McKelvey R (1979) General conditions for global intransitiv- Peters H, van der Stel H, Storken T (1991) On uncompro-
ities in formal voting models. Econometrica 47:1085–1111 misingness and strategy-proofness, Reports in opera-
McLennan A (1980) Randomized preference aggregation: tions research and systems theory-report M 91–15.
additivity of power and strategy proofness. J Econ The- University of Limburg, Holland
ory 22(1):1–11 Peters H, van der Stel H, Storken T (1992) Pareto optimal-
Moulin H (1979) Dominance solvable voting schemes. ity, anonymity, and strategy-proofness in location prob-
Econometrica 47(6):1337–1351 lems. Int J Game Theory 21:221–235
Moulin H (1980a) On strategy-proofness and single- Plott CR (1973) Path independence, rationality and social
peakedness. Public Choice 35(4):437–455 choice. Econometrica 41(6):1075–1091
Moulin H (1980b) Implementing efficient, anonymous and Reffgen A (2015) Strategy-proof social choice on multiple
neutral social choice functions. J Math Econ and multi-dimensional single peaked domains. J Econ
7(3):249–269 Theory 157:349–383
Moulin H (1983) The strategy of social choice, Advanced Reffgen A, And Svensson L-G (2012) Strategy-proof voting
textbooks in economics, vol 18. North-Holland, for multiple public goods. Theor Econ 7(3):663–688
Amsterdam Reny PJ (2001) Arrow’s theorem and the Gibbard-
Moulin H (1994) Social choice. In: Aumann RJ, Hart Satterthwaite theorem: a unified approach. Econ Lett
S (eds) Handbook of game theory with economic appli- 70:99–105
cations, vol 2. Elsevier, North Holland, Amsterdam. Roberts K (1980) Possibility theorems with interpersonally
pp 1091–1125 comparable welfare levels. Rev Econ Stud 47:409–420
Moulin H, Peleg B (1982) Cores of effectivity functions and Roth A (2008) What have we learned from market design?,
implementation theory. J Math Econ 10(1):115–145 Hahn Lecture. Econ J 118(527):285–310
Mueller D (1978) Voting by veto. J Public Econ 10:57–75 Saari DG (2000) Mathematical structure of voting para-
Muller E, Satterthwaite MA (1977) The equivalence of doxes. II positional voting. Econ Theory 15(1):55–102
strong positive association and strategy-proofness. Samuelson PA (1967) Arrow’s Mathematical politics. In:
J Econ Theory 14:412–418 Hook S (ed) Human values and economic policy. New
Muller E, Satterthwaite MA (1985) Strategy-proofness: York University Press, New York, pp 41–52
the existence of dominant-strategy mechanisms. In: Saporiti A (2009) Strategy-proofness and single crossing.
Hurwicz L, Schmeidler D, Sonnenschein H (eds) Social Theor Econ 4:127–163
goals and social organization. Essays in memory Satterthwaite MA (1975) Strategy-proofness and Arrow’s
of Elisha Pazner. Cambridge University Press, conditions: existence and correspondence theorems for
New York, 131–172 voting procedures and social welfare functions. J Econ
Myerson RB (2008) Perspectives on mechanism design in Theory 10:187–217
economic theory. Am Econ Rev 98(3):586–603 Satterthwaite MA, Sonnenschein H (1981) Strategy-proof
Nehring K, Puppe C (2007a) Efficient and strategy-proof allocation mechanisms at differentiable points. Rev
voting rules: a characterization. Games Econom Behav Econ Stud 48:587–597
59(1):132–153 Schmeidler D, Sonnenschein H (1978) Two proofs of the
Nehring K, Puppe C (2007b) The structure of strategy- Gibbard-Satterthwaite theorem on the possibility of a
proof social choice: general characterization and possi- strategy-proof social choice function. In: Gottinger
bility results on median spaces. J Econ Theory HW, Leinfellner W (eds) Decision theory and social
135(1):269–305 ethics. Reidel, Dordrecht, pp 227–234
Nicolò A (2004) Efficiency and truthfulness with Leontief Schofield N (1978) Instability in simple dynamic games.
preferences. A note on two- agent, two-good econo- Rev Econ Stud 45:575–594
mies. Rev Econ Des 8(4):373–382 Schummer J (1977) Strategy-proofness versus efficiency
Pattanaik PK (1976) Threats, counterthreats and strategic on restricted domains of exchange economies. Soc
voting. Econometrica 44:91–103 Choice Welf 14(1):47–56
Pattanaik PK (1978) Strategy and group choice. North- Sen AK (1966) A possibility theorem on majority deci-
Holland, Amsterdam sions’. Econometrica 34(2):491–499
Pattanaik PK (1996) On modelling individual rights: some Sen AK (1970) The impossibility of a Paretian liberal.
conceptual issues. In: Arrow KJ, Sen AK, Suzumura J Polit Econ 78:152–157
K (eds) Social choice reexamined. Palgrave Macmillan, Sen AK (1977a) Social choice theory: a re-examination.
UK. ISBN: 978-0-312-12741-1 Econometrica 45:53–89
528 Social Choice Theory
Sen AK (1977b) On weights and measures: informational Balinski M, Laraki R (2010) Majority judgment: measur-
constraints in social welfare analysis. Econometrica ing, ranking, and electing. MIT Press, Cambridge, MA
45:1539–1572 Black D (1958) The theory of committees and elections.
Sen AK, Pattanaik PK (1969) Necessary and sufficient Cambridge University Press, Cambridge
conditions for rational choice under majority decision. Bossert W, Suzumura K (2010) Consistency choice and
J Econ Theory 1:178–202 rationality. Harvard University Press, Cambridge, MA
Serizawa S (1999) Strategy-proof and symmetric social Brandt F, Conitzer V, Endriss U, Lang J, Procaccia AD
choice functions for public good economies. (eds) (2016) Handbook of computational social choice.
Econometrica 67(1):121–145 Cambridge University Press, Cambridge, UK
Shepsle KA, Weingast BR (1984) Uncovered sets and Fishburn P (1973) The theory of social choice. Princeton
sophisticated voting outcomes with implications for University Press, Princeton
agenda institutions. Am J Polit Sci 28:49–74 Gaertner W (2001) Domain conditions in social choice
Slinko A (2002) The asymptotic strategy-proofness of the theory. Oxford University Press, Oxford, UK.
plurality and the run-off rules. Soc Choice Welf Gaertner W (2006) A primer in social choice theory.
19:313–324 Oxford University Press, Oxford, UK
Smith J (1973) Aggregation of preferences with variable Kelly JS (1988) Social choice theory. An introduction.
electorate. Econometrica 41(6):1027–1041 Springer, Berlin/Heidelberg/New York
Sönmez T (1999) Strategy-proofness and essentially Krantz DH, Luce RD, Suppes P, Tversky A (1971) Foun-
single-valued cores. Econometrica 67:677–689 dations of measurement, vol I: Additive and polyno-
Taylor AD (2005) Social choice and the Mathematics of mial representations. Academic Press, New York
manipulation. Cambridge University Press, New York Luce RD, Krantz DH, Suppes P, Tversky A (1990) Foun-
Vickrey W (1960) Utility, strategy and social decision dations of measurement, vol III: representation, axi-
rules. Q J Econ 74:507–535 omatization, and invariance. Academic, New York
Vorsatz M (2007) Approval voting on dichotomous pref- Mas-Colell A, Whinston MD, Green J (1995) Microeco-
erences. Soc Choice Welf 28(1):127–141 nomic theory. Oxford University Press, Oxford (Part 5)
Weymark JA (1999) Decomposable strategy-proof social McLean I, Urken AB (1995) Classics of social choice. The
choice functions. Jpn Econ Rev 50(3):343–355 University of Michigan Press, Michigan
Wilson R (1972) Social choice without the Pareto princi- Moulin H (1988) Axioms of cooperative decision making.
ple. J Econ Theory 5:478–486 Econometric society monographs, vol 15. Cambridge
Young HP (1974) An axiomatization of Borda’s rule. University Press, Cambridge, UK. ISBN:
J Econ Theory 9:43–52 9780521360555
Young HP (1975) Social choice scoring functions. SIAM Nitzan S, Paroush J (1985) Collective decision making:
J Appl Math 28:824–838 an economic outlook. Cambridge University Press,
Young HP (1988) Condorcet’s theory of voting. Am Polit New York
Sci Rev 82:1231–1244 Pattanaik PK (1978) Strategy and group choice. North-
Yu NN (2012) A one-shot proof of Arrow’s impossibility Holland, Amsterdam
theorem. Econ Theory 50(2):523–525 Peleg B (1984) Game theoretic analysis of voting in com-
Yu NN (2013) A one-shot proof of Arrow’s theorem and mittees. Cambridge University Press, Cambridge
the Gibbard-Satterthwaite theorem. Econ Theory Bull Rawls J (1971) A theory of justice. Harvard University
1(2):145–149 Press, Cambridge, MA. ISBN: 9780674000780
Zeckhauser R (1973) Voting systems, honest preferences Saari DG (1995) Basic geometry of voting. Springer,
and Pareto optimality. Am Polit Sci Rev 67:934–946 Berlin/Heidelberg/New York
Zhou L (1991) Impossibility of strategy-proof mechanisms Sen AK (1970) Collective choice and social welfare.
in economies with pure public goods. Rev Econ Stud Holden-Day, San Francisco/Cambridge
58:107–119 Sen AK (1982) Choice, welfare and measurement. Basil
Blackwell, Oxford, UK
Sen AK (1992) Inequality reexamined. Harvard University
Reference Books Press, Cambridge, MA
Arrow KJ (1951, 1963) Social choice and individual Suppes P, Krantz DH, Luce RD, Tversky A (1989) Foun-
values, 2nd edn. Wiley, New York dations of measurement, vol II: Geometrical, threshold,
Arrow KJ, Sen AK, Suzumura K (eds) (2001/2010) Hand- and probabilistic respresentations. Academic,
book of social choice and welfare, vols 1 and 2. North- New York
Holland, Amsterdam Suzumura K (1983) Rational choice, collective decisions,
Austen-Smith D, Banks JS (1999) Positive political theory and social welfare. Cambridge University Press,
I. Collective preference. The University of Michigan Cambridge
Press, Michigan. ISBN: 9780472068944 Suzumura K (2016) Choice, preferences and procedures.
Austen-Smith D, Banks JS (2005) Positive political theory Harvard University Press, Cambridge, MA
II: strategy and structure. University of Michigan Press, Young P (1994) Equity: in theory and practice. Princeton
Michigan University Press, Princeton
Voting

Alvaro Sandroni, Jonathan Pogach, Michela Tincani, Antonio Penta and Deniz Selman
University of Pennsylvania, Philadelphia, PA, USA

Article Outline

Definition of the Subject
Introduction
The Collective Choice Problem
Voting Rules
Welfare Economics
Arrow's Impossibility Theorem
Political Ignorance and the Condorcet Jury Theorem
Gibbard-Satterthwaite Theorem
Political Competition and Strategic Voting
The Common Value Setting with Strategic Agents
Future Directions
Bibliography

Glossary

Arrow's impossibility theorem Arrow's Impossibility Theorem states that there does not exist a complete social ranking over alternatives that meets minimum impositions of egalitarianism and efficiency, No Dictatorship and Pareto Optimality, respectively. Consequently, there is no voting mechanism that can simultaneously satisfy basic notions of egalitarianism and efficiency.
Cost of voting Any sacrifice in utility that voting entails, such as cognitive costs, the time cost of going to the polls, etc.
Collective or social choice problem A collective or social choice problem is a setting in which a group of individuals must jointly decide on a single alternative from a set. The outcome of such a problem potentially affects the welfare of all individuals.
Common values A common values problem is one in which agents share preferences over the alternatives but may differ in their information regarding the attributes of the alternatives.
Condorcet jury theorem In a common values setting, the result that under majority voting the correct candidate will (almost always) be elected so long as the following assumptions are satisfied: (1) each voter's belief is correct with a probability higher than half, (2) each voter votes according to her belief, and (3) there is a large number of voters. In political science this result is sometimes referred to as the wisdom of crowds.
Condorcet paradox See voting cycle.
Downsian model of political competition A model in which candidates strategically position themselves on a unidimensional policy space in order to win an election.
Efficiency Efficiency is a broad criterion that may be used to evaluate the social value of alternative outcomes by demanding that society make use of all valuable resources. Examples of efficiency measures are utilitarianism and Pareto Optimality.
Egalitarianism Egalitarianism is a broad criterion used to evaluate the social value of alternative outcomes by demanding that welfare or resources be evenly distributed across the population.
Gibbard-Satterthwaite theorem The Gibbard-Satterthwaite Theorem states that, under some assumptions, every non-dictatorial social choice function is manipulable.
Majority voting Majority voting is a voting rule that stipulates that each agent vote for a single alternative; an alternative that receives more than half of all votes is the collective choice.
Manipulability of a social choice function A social choice function is said to be manipulable if there is some agent who, given the social choice function, prefers to misreport his true preferences. In a voting context, this translates into voting for an alternative different from the one that is most preferred.
Median voter theorem The median voter theorem states that if individuals' preferences are single peaked, then there exists an alternative that beats all others in a pairwise majority vote. Single peakedness requires that each individual has a bliss point (most preferred alternative) and that alternatives are less preferred the further they are from the bliss point. The selected alternative is then the median bliss point, and the voter who has this bliss point is the median voter.
No Dictatorship The No Dictatorship criterion of Arrow's desiderata demands that there be no single individual whose preferences always determine those of society. Any egalitarian arrangement must satisfy the no dictatorship criterion, though arrangements that satisfy no dictatorship need not be egalitarian.
Pairwise majority voting rule A pairwise majority voting rule compares each pair of alternatives in a majority vote. Depending on agents' preferences and votes, this rule may lead to a voting cycle.
Paradox of voting The puzzle of why there is high voter turnout in large elections, when the probability that any single vote is determinant for the outcome is extremely small.
Pareto optimality or Pareto efficiency An outcome is Pareto Optimal or Pareto Efficient if it is not possible to increase the welfare of one individual without lessening the welfare of another.
Political ignorance A state in which voters are not well informed about the issues and/or candidates they must vote on.
Social choice function A social choice function is a mapping of all individuals' preferences into an alternative, the social choice.
Strategic abstention The tactical decision of an uninformed citizen to abstain in order to allow informed citizens with the same preferences to determine the outcome.
Strategic voting Voting by agents who aim to maximize their utility and might do so by misreporting their true preferences over electoral outcomes.
Utilitarianism Utilitarianism is a conception of efficiency that evaluates outcomes on aggregate utility.
Utility of voting Benefit to citizens from voting, usually divided into two components: a non-instrumental component, which includes utility derived from the mere act of voting and not related to the actual outcome of the election, and an instrumental component, given by the utility of the outcome a voter would induce if determining the outcome, weighted by the probability that his vote determines the outcome.
Voting cycle or Condorcet's paradox A voting cycle or Condorcet's Paradox results when every feasible alternative is beaten by another in a pairwise majority vote. As such, any collective choice is less preferred than some other alternative by more than half of the population.
Voting rule A voting rule is a mapping of votes into a collective choice. Examples of different voting rules include majority voting and plurality voting. The collective choice may vary under alternative voting rules.

Definition of the Subject

Voting is a fundamental mechanism that individuals use to reach an agreement on which one of many alternatives to implement. The individuals might all be affected by the outcome of such a process and might have conflicting preferences and/or information over alternatives. In a voting mechanism, preferences and information are aggregated as individuals submit votes and a voting rule maps the compilation of votes into the alternative that is to be selected. The use of voting as a means of making a group decision dates back at least to ancient Greece, though the works of the French Revolutionary contemporaries Condorcet and Borda are among the pioneering contributions to voting theory. Meanwhile, welfare economists such as Bentham suggested formal definitions of socially desirable outcomes. As voting theory and welfare economics evolved, Arrow's result in the middle of the twentieth century showed that no mechanism, voting or otherwise, can produce outcomes consistent with some of welfare economists' definitions of socially desirable states.
Moreover, results in the early 1970s suggested that voters have incentives to misrepresent their true preferences in elections. Contemporary voting theory has developed new models of strategic behavior to address questions on how political agents behave and which outcomes voting might produce.

Introduction

Voting is one of the most commonly used ways of making collective decisions. Two issues arise when a group of people must find an agreement on a choice that will potentially affect the welfare of all. First, individuals might have conflicting preferences over the set of alternatives they are choosing from. An interesting question is what is the best way to aggregate individuals' preferences in order to reach a common decision when preferences conflict. Second, even if agents share the same preferences, they might possess different information about the alternatives. In this situation, an interesting question is what is the best way to aggregate people's conflicting information so as to make the right choice.

The optimal ways to aggregate preferences and information are among the most important questions that the literature on voting has tried to answer. Since voting concerns societies, as opposed to single individuals, this literature relies on works in the field of welfare economics. This branch of economics is primarily concerned with the analysis and the definition of the welfare of a society. There is no agreement among social scientists on a single definition of the welfare of a society. However, two concepts are prominent: efficiency, which concerns the minimal waste of scarce resources, and egalitarianism, which concerns the equal distribution of those resources. Using these concepts to define the welfare of a society, social scientists have attempted to answer questions on preference and information aggregation in making a collective choice.

Early works on different voting techniques date back to at least the late eighteenth century, with the works of French mathematicians such as Borda (1781) and Condorcet (1976). Their works were among the first to analyze voting procedures with formal mathematical tools, which have since been prominently used in the literature on voting.

Continuing in this tradition, a fundamental result in voting theory is Arrow's Impossibility Theorem. This is a formal treatment of the problem of aggregation of preferences that reached a striking result: there is no way, under some assumptions, to aggregate individuals' preferences that is minimally egalitarian and minimally efficient.

Unlike the problem of aggregation of conflicting preferences, studies on the aggregation of conflicting information obtain positive results. In particular, the Condorcet Jury Theorem finds a mathematical justification for the phenomenon that is known in political science as the wisdom of the crowds, that is, the observation that democracies seem to be better at making decisions than single individuals. However, Condorcet's result relies on the assumption that people vote sincerely, and subsequent works on voters' behavior suggest that this is not always the case. A fundamental theoretical result, known as the Gibbard-Satterthwaite theorem, shows formally that, under some assumptions, in every election at least one individual has an incentive to vote non-sincerely, i.e., to vote strategically.

Gibbard and Satterthwaite's result is a starting point for a branch of the voting literature that deals with strategic voting. Numerous works analyze strategic voting through the use of mathematical tools such as game theory. The latter has been used not only to explain voters' behavior, but also to describe competing candidates' behaviors in an election. The research in this game-theoretic literature focuses on the outcome of political competition and on the development of a theory of turnout. The former is analyzed through the use of a model of political competition. A main result is that in a two-party election, both parties choose the same political platform in order to maximize the probability of winning an election. The latter is motivated by the paradox of voting, which refers to the empirical observation that citizens vote even when the probability that their vote determines the outcome of the election is negligible, such as in large elections. At present, there is no widely accepted theory to explain this phenomenon.
As with the paradox of voting, the voting literature still has many unanswered interesting questions. As will be mentioned in the section on future directions, the voter's behavior is still only partially understood and much needs to be explained. On a broader level, the study of the historical evolution of democracies needs further developments. Much has been written on the subject of voting, and some interesting results have been obtained, but much still needs to be explained.

The Collective Choice Problem

The basic framework for understanding voting is a collective or social choice problem: a group of agents must reach an agreement on which alternative to select, and this decision potentially affects the welfare of everyone. Such problems could range from "what should be taught in public schools?" to "who should be president?" Consider two possible alternatives, A and B, and a group of individuals who must jointly choose one of the two. It might be the case that some individuals in society prefer A and some prefer B. These conflicting preferences pose a challenge in determining the appropriate social choice. Alternatively, consider a scenario where A and B are different characteristics that two candidates running for a public office may possess. Suppose that all individuals agree that A is more desirable than B, i.e., this is a common values setting. However, individuals differ in their information; some think that candidate 1 possesses trait A, while others think candidate 2 does. In this case, the challenge is finding the best way to balance the conflicting information in arriving at a collective choice.

There are several ways to resolve the problems of conflicting preferences and information. For example, people can bargain to reach an agreement, or they can fight. A collective choice problem might also be solved through a dictatorship, or even through the toss of a coin. A fundamental mechanism used to resolve these conflicts is an election in which people submit votes on the feasible alternatives and a voting rule maps the collection of votes into a social choice. As can be seen in the next section, there exists a multitude of voting rules.

Before discussing voting rules, one should note the following distinction between elections: those in which the alternatives are policies and those in which the alternatives are candidates. The former is known as direct democracy, a common example of which is a referendum. In a referendum, citizens vote on a particular proposal, such as the adoption of a new law. The outcome of such an election is then to implement or not implement the proposal. This is in contrast to the case, known as representative democracy, where the election is over candidates. The collective choice in such a system is an agent or group of agents with the responsibility of choosing policy. The following voting rules apply to both situations.

Voting Rules

Majority voting is one of the most commonly used voting rules. The rule prescribes that each citizen vote for a single alternative, and an alternative becomes the social choice if it receives more than half of all votes. Clearly, when there are more than two alternatives, majority voting does not necessarily produce a social choice.

To ensure a comparison between alternatives, one can resort to a pairwise majority voting rule, in which alternatives are voted over pair by pair with a majority vote. That is, a majority vote is held between every pair of feasible alternatives, and for each majority vote the winner is deemed socially preferable to the loser. However, this voting rule might generate an intransitive social preference in which society prefers x to y, y to z, but z to x. Consider a three-individual committee made up of voters 1, 2 and 3 who must choose one of three alternatives, x, y and z. Individual preferences are such that voter 1 prefers x to y to z, voter 2 prefers z to x to y, and voter 3 prefers y to z to x. By pairwise majority voting x beats y, which in turn beats z, which in turn beats x. This intransitivity over alternatives is known as a voting cycle or Condorcet's Paradox.
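A minimal computational sketch of the committee just described makes the cycle explicit. The Python code below is illustrative and not part of the original article; it simply tallies each pairwise majority vote for the three ballots given in the text.

from itertools import combinations

# Each ballot lists the alternatives from most to least preferred
# (the three-voter committee of the example above).
ballots = [
    ["x", "y", "z"],   # voter 1: x > y > z
    ["z", "x", "y"],   # voter 2: z > x > y
    ["y", "z", "x"],   # voter 3: y > z > x
]

def pairwise_winner(a, b, ballots):
    """Return the majority winner of the pair (a, b), or None on a tie."""
    a_votes = sum(1 for r in ballots if r.index(a) < r.index(b))
    b_votes = len(ballots) - a_votes
    if a_votes == b_votes:
        return None
    return a if a_votes > b_votes else b

for a, b in combinations(["x", "y", "z"], 2):
    print(a, "vs", b, "-> majority prefers", pairwise_winner(a, b, ballots))
# Output: x beats y, y beats z, and z beats x, i.e., a voting cycle
# (Condorcet's Paradox): no alternative survives every pairwise contest.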
In the presence of a voting cycle, pairwise majority voting might not produce an overall winner, as each alternative might be beaten by another. So, an agenda setter could end a cycle by specifying the order in which the alternatives are to be compared in a pairwise vote. However, pairwise majority voting then places all the decision power in the hands of the agenda setter. Returning to the previous example, suppose the agenda prescribes that voting be carried out between alternatives x and y first and that the winner is then to be compared with z. In the first round of voting x beats y, and z then beats x, so z becomes the collective choice. However, the agenda setter could instead choose an initial comparison between y and z. Since y beats z in the first round and x beats y in the second, x would then be the collective choice. Hence, the agenda setter decides the outcome of the election by choosing the order of the pairwise voting.

A plurality voting rule is an alternative way to make a collective choice: agents each vote for one alternative and the alternative with the most votes is chosen.

Other voting rules require agents to submit scores or rankings of all available alternatives, rather than just voting for a single one. In a Borda count, agents rank all alternatives, assigning larger numbers to those that are more preferred. The voting rule sums the scores for each alternative across individuals, and the alternative with the highest sum is the social choice.

A supramajority voting rule stipulates that the 'status quo' alternative is chosen unless another alternative receives at least some specified percentage of the vote larger than fifty percent. In the limit, there might be a unanimity rule that mandates that one hundred percent of the electorate vote for an alternative for it to be chosen against the status quo. Examples of supramajority rules include the passing of constitutional amendments in the United States, where the current constitution is the status quo. Unanimity rules are commonly found in the judicial system, in which all jurors must agree on the guilt of the defendant to override the status quo, the presumption of innocence.

Voting rules might also grant veto power to one or more agents. For example, each of the five permanent members of the fifteen-member United Nations Security Council has the power to veto resolutions on particular matters. Any collective choice must therefore have the approval of all five permanent members.

The different voting rules are not simply different methods of arriving at the same social choice. Rather, the result of an election depends critically on the voting rule that is used. In fact, an alternative that is the social choice according to one voting rule might be the least preferred under another. For instance, consider alternatives x, y, and z and seven voters, three who prefer x to y to z, two who prefer y to z to x, and two who prefer z to y to x. By pairwise majority voting, x loses to both y and z. In contrast, x beats both y and z in a plurality vote (the sketch below reproduces this comparison). This suggests that in order to determine which voting rule to use, one must carefully analyze the various resulting outcomes.

To this end, it is useful to identify criteria that allow one to discriminate among the different outcomes produced under various voting rules. There are two main methods of doing this: efficiency and egalitarianism.
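The seven-voter profile above also shows how a scoring rule such as the Borda count can disagree with both plurality and pairwise majority voting. The following sketch is illustrative and not from the original article; the 2-1-0 Borda weights are one common convention.

from collections import Counter
from itertools import combinations

# Seven voters: three with x > y > z, two with y > z > x, two with z > y > x.
ballots = 3 * [["x", "y", "z"]] + 2 * [["y", "z", "x"]] + 2 * [["z", "y", "x"]]
alternatives = ["x", "y", "z"]

# Plurality: count first choices only.
plurality = Counter(b[0] for b in ballots)

# Borda count: with m alternatives, give m-1 points to the top choice,
# m-2 to the next, and so on down to 0.
borda = {a: sum(len(b) - 1 - b.index(a) for b in ballots) for a in alternatives}

def majority_prefers(a, b):
    return sum(1 for r in ballots if r.index(a) < r.index(b)) > len(ballots) / 2

print("plurality tallies:", dict(plurality))   # x leads with 3 first-place votes
print("Borda scores:", borda)                  # y has the highest Borda score
for a, b in combinations(alternatives, 2):
    print(a, "vs", b, "->", a if majority_prefers(a, b) else b)
# x wins under plurality but loses every pairwise majority contest, while
# y wins the Borda count and all of its pairwise contests.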
Welfare Economics

The analysis of efficiency and egalitarianism of a social state is among the objectives of a discipline called welfare economics. The first criterion for efficiency used by welfare economists dates back at least to Jeremy Bentham (1789) and is known as utilitarianism. According to utilitarianism, the social interest is judged in terms of the total utility of a community. For example, if by moving from arrangement A to arrangement B Mr. 1 benefits more than Ms. 2 suffers, then the movement from A to B is judged as a social welfare improvement. Notice that in order to implement this criterion, the satisfaction intensities of different individuals must be comparable. In the 1930s this criterion was criticized by Lionel Robbins (1938) and other welfare economists who claimed that the comparison of utilities across individuals has no scientific basis. In the 1940s a new criterion was developed which required no comparison of individual utilities: the Pareto criterion. A social outcome is said to be Pareto Optimal (Pareto Efficient) if there is no other outcome that would benefit at least one individual without hurting anyone else. Consider a scenario where ten dollars must be split among two individuals who value money and there are two alternatives: either person 1 receives five dollars, person 2 four, and the remaining dollar is thrown away, or both receive five dollars. Clearly, the first alternative is not Pareto Optimal because person 2 can be made better off and person 1 would remain as well off if 2 is given the dollar that is being thrown away. Notice that the first alternative is also non-utilitarian; in fact, the sum total of utilities cannot be maximized when valuable resources are thrown away. However, a drawback of this efficiency criterion is that there exist multiple non-comparable Pareto Optimal outcomes: any division of the ten dollars among the two individuals is Pareto Optimal as long as no money is thrown away, since to make one person better off one would have to take resources away from the other person. Hence, the Pareto Optimality criterion does not allow one to distinguish among multiple outcomes. Finally, notice that Pareto Optimal outcomes can be extremely non-egalitarian: person 1 receiving ten dollars and person 2 zero is an unequal but Pareto Efficient division.

An alternative criterion often used to discriminate among social outcomes is egalitarianism, which focuses on the distribution of welfare across members of a society. One of the abstract principles behind egalitarianism is the veil of ignorance (Harsanyi 1953, 1977; Rawls 1971). Consider a situation in which two persons must share a cake. Pareto Optimality does not help in selecting a division: as in the ten dollars example, any division of the cake is Pareto Optimal. Suppose that one of the two persons sharing the cake is asked to cut it in two without knowing a priori which piece she will receive. Her ignorance about which piece she will receive makes her cut the cake in two equal shares, an egalitarian division. Notice that there are a number of ways to define egalitarian outcomes. For example, Rawls's (1971) maximin rule suggests that the social objective should be to maximize the welfare of the worst-off individual. Finally, notice that an egalitarian outcome might be extremely inefficient. Returning to the ten dollar example, an arrangement where person 1 is given nine dollars and person 2 one dollar is not as egalitarian as one where both are given two dollars, though the latter does not make use of more than half of the available resources.

Arrow's Impossibility Theorem

Ideally, one would like to use the normative criteria of social efficiency and egalitarianism to discriminate between different voting rules and select the best one. However, a general result known as Arrow's Impossibility Theorem (Arrow 1950) shows mathematically that this is impossible. In his seminal work, Arrow shows that there is no voting mechanism that generates a social consensus on the ordering of the different alternatives while satisfying a number of axioms, among which are the weakest forms of efficiency and egalitarianism: Pareto Optimality and No Dictatorship. The latter, which states that no individual always determines the preferences of society, is a weak form of egalitarianism. While a non-dictatorial society can be quite unequal, any egalitarian society must be non-dictatorial.

A number of possibility results have been obtained by the relaxation of some of Arrow's axioms. For example, pairwise majority voting with a particular restriction on individual tastes, which violates what Arrow called Unrestricted Domain, satisfies Pareto Optimality and No Dictatorship, while generating an ordering of the social alternatives.

Black (1948) noticed that pairwise majority voting produces an outcome that is not subject to Condorcet's paradox when individual preferences are single-peaked: every individual must have a most preferred alternative (bliss point) and between any two alternatives he prefers the one that is closer to his bliss point. An important result in voting theory, called the median voter theorem, shows that when individuals' preferences satisfy this condition, the bliss point of the median voter beats any other alternative by pairwise majority voting. The median voter is found by ordering voters according to their bliss points.
The importance of this theorem derives from its ability to describe how democracies work in practice. It is commonly observed that candidates try to appeal to voters who are politically moderate, or "in the middle": this is consistent with the theory, which suggests that these are the preferences that will eventually prevail in a democratic system.

Political Ignorance and the Condorcet Jury Theorem

As mentioned earlier, voting is not only a way to aggregate conflicting preferences but also a way to aggregate individual, possibly conflicting, information when preferences are partially or totally aligned. This is the case in common value settings. When people would agree on the best choice if given the same information on the alternatives, but differ in the information they actually receive, a natural question is which voting mechanism aggregates information in a way that maximizes the probability of the right decision being made. A result called the Condorcet Jury Theorem (Condorcet 1976) shows that among all the possible voting rules, simple majority rule guarantees that the right decision is made under three crucial assumptions: that each voter has a correct belief with a probability higher than 50%, that each voter votes according to his belief, and that the electorate is very large. Before going into the details of the theorem, it must be mentioned that this result has practical importance. A number of works document that voters are ignorant about both the policy and candidate alternatives on which they are to vote. Campbell et al. (1960) claim that "many people know the existence of few if any of the major issues of policy", while Dye and Zeigler (1970) discuss "mass political ignorance" and "mass political apathy" as playing key roles throughout the history of American politics. More recently, the 2004 American National Election Study found that Americans performed extremely poorly when asked simple questions about the political system and the leaders in charge of it. This evidence is in favor of what is called political ignorance. In a setting where voters are politically ignorant, the Condorcet Jury Theorem provides a valuable insight as to how much political information matters in determining the outcome.

Consider a committee that has to elect an administrator out of two candidates, one "good" and one "bad." Assume that all the members of the committee share the same preferences: they all prefer the good administrator to be selected. Individuals differ, however, in the information they have about which candidate is the good one. Suppose that each voter has a belief about which is the good candidate and votes according to his belief. If each voter has a correct belief with a probability higher than 50%, then by the Law of Large Numbers, as the number of voters becomes very large the probability that more than half of the electorate votes for the right candidate goes to one. Hence, under simple majority the probability that the right choice is made goes to one.

Condorcet's conclusion is that in a common value setting a democratic decision is superior to an individual decision, because each voter makes the wrong decision with a non-negligible positive probability, whereas the population as a whole makes the right decision almost always. As far as political ignorance is concerned, this result shows that it is not necessary for an electorate to be well informed for it to make the right decision. So long as each voter has a correct belief with a probability higher than a half, the electoral outcome will almost always be identical to one in which the electorate was perfectly informed. Therefore, Condorcet's result implies that the ignorance of individual voters is overcome by the aggregation of information in an election.
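The Law of Large Numbers argument can be made concrete with a small calculation. The sketch below is illustrative only; it assumes, as in the classical statement of the theorem, that voters' beliefs are correct independently of one another, and it uses an odd electorate size to avoid ties.

from math import comb

def majority_correct(n, p):
    """Probability that more than half of n independent voters are correct,
    when each is correct with probability p (n odd, so no ties)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

p = 0.55  # each voter is only slightly better than a coin flip
for n in (1, 11, 101, 1001):
    print(n, round(majority_correct(n, p), 4))
# The probability that the majority picks the right candidate rises from
# 0.55 toward one as the electorate grows, which is the content of the
# Condorcet Jury Theorem under sincere voting.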
Gibbard-Satterthwaite Theorem

Thus far, citizens have been treated as if they disregard any tactical considerations when faced with a voting decision. However, a citizen might find it worthwhile to misrepresent his true preferences in order to achieve a social outcome more preferred than the one that would result if he voted naively. Consider an election in which a status quo will be replaced if a simple majority agrees on one of three candidates. Suppose the status quo is a conservative government, and the three alternative candidates to be voted for are a conservative, a moderate and a liberal. Imagine that there are only three voters, and two votes have already been cast: one is for the moderate candidate, one is for the conservative one. Suppose that the last individual who is called to vote is politically liberal. He knows that if he votes for his truly most preferred candidate, the liberal one, there would be a tie and the status quo conservative government would not be replaced. However, by voting for his second most preferred alternative, the moderate candidate, he would break the tie and the status quo government would be replaced with a moderate one. A liberal voter prefers this outcome to the one where the conservatives win. Therefore he has an incentive to misrepresent his true preferences and vote tactically for his second-best alternative (a sketch of this calculation appears at the end of this section).

A powerful result in voting theory called the Gibbard-Satterthwaite theorem (Gibbard 1973; Satterthwaite 1975) shows formally that in most electoral settings at least one citizen has an incentive to vote tactically. Define a social choice function as a mapping of all individuals' preferences into a collective choice. The theorem states that there is no social choice function that is non-dictatorial and non-manipulable, i.e., such that no agent has an incentive to vote tactically. Consequently, for every voting rule there is at least one agent with an incentive to misrepresent her preferences.

It should be mentioned that the Gibbard-Satterthwaite theorem also places other technical restrictions. Furthermore, in a setting such as a majority vote, a misrepresentation of one's true preferences coincides with voting for an alternative which the voter does not rank top. In the original formulation of their theorem, however, Gibbard and Satterthwaite dealt with mechanisms where agents are required to submit a ranking over all alternatives, and a misrepresentation of tastes in their framework does not necessarily coincide with a misrepresentation of only the most preferred alternative (see Mas-Colell et al. 1995).

This result suggests that for a deep understanding of voting one should not focus only on the assumption that citizens vote sincerely. Tactical voting is not only an abstract possibility, it is also an actual behavior that must be considered in any voting analysis. The following sections explore how the literature on voting has dealt with strategic voting.
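The liberal voter's calculation can be checked mechanically. The sketch below is illustrative and not from the original article; the candidate labels and utility numbers are assumptions made only to encode "liberal preferred to moderate preferred to the conservative status quo."

from collections import Counter

def outcome(votes):
    """A candidate replaces the status quo only with a strict majority."""
    candidate, count = Counter(votes).most_common(1)[0]
    return candidate if count > len(votes) / 2 else "status quo (conservative)"

# The liberal voter's (assumed) utilities over the possible outcomes.
utility = {"liberal": 3, "moderate": 2, "conservative": 1,
           "status quo (conservative)": 1}

already_cast = ["moderate", "conservative"]
sincere = outcome(already_cast + ["liberal"])     # no majority: status quo stays
tactical = outcome(already_cast + ["moderate"])   # moderate wins 2 votes out of 3
print("sincere ballot ->", sincere, "| utility", utility[sincere])
print("tactical ballot ->", tactical, "| utility", utility[tactical])
# Misreporting raises the last voter's utility from 1 to 2: exactly the kind of
# profitable manipulation that the Gibbard-Satterthwaite theorem says some
# voter must have under any non-dictatorial rule with three or more alternatives.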
Political Competition and Strategic Voting

The early works of Downs (1957) and Tullock (1967) initiated the analysis of political issues within a strategic framework, where voters and/or candidates are assumed to be rational decision makers. Political competition describes a situation in which candidates strategically position themselves in order to win an election, whereas strategic voting refers to individuals' decision to vote so as to maximize utility, sometimes by misreporting true preferences. This game-theoretical framework developed due to positive arguments such as the Gibbard-Satterthwaite theorem, which suggests that voters have an incentive to behave strategically, and also due to the spread of game theory as a dominant tool in economic analysis.

Political Competition
The classic model analyzing the candidates' choice of positioning on the political spectrum is that of Downs (1957), who adapted the classical Hotelling model (Hotelling 1929) to the analysis of the choice of political platforms by candidates. In the Downsian model of political competition, there is a unidimensional policy space, representing the political spectrum. There are two candidates who position themselves on this policy space. Each voter has a most preferred point on this space and prefers points closer to this point than those further away, i.e., each voter has single-peaked preferences. Downs argues that strategic candidates concerned only with winning position themselves at the point that is most preferred by the median voter. If candidate A were positioned anywhere else, say to the left of the median voter, candidate B could get the majority of votes by positioning himself between candidate A and the median voter; all the agents to the right of B, constituting more than half of the voters, would prefer B to A. Given this scenario, candidate A (for the same reason) would then have an incentive to position himself between B and the median voter, and so on. Hence, the result is that both candidates position themselves at the policy platform most preferred by the median voter. An interesting implication of this result is that under a democracy with two parties, both parties act identically, and therefore there are only as many positions (just one) taken by political parties as there would be in a dictatorship. However, it is crucial that two parties exist so that the competition between them can allow the chosen policy point to represent the preferences of the voters. This is in contrast to a dictatorship, in which the ruling party can implement its own preferred policy without voter approval.

Notice that the example above assumes a two-party system (for models that allow for more than two parties see Besley and Coate 1997; Osborne and Slivinski 1996). Duverger (1972) posits that in a representative democracy with a plurality voting rule, only two parties compete in the elections. This theory, known as Duverger's Law, states that a proportional representation system, in which parties gain seats proportional to the number of votes received, fosters elections with numerous parties. In contrast, a plurality system marginalizes smaller parties and results in only two parties entering into political competition.

The Decision to Vote: The Paradox of Voting
Another issue raised by Downs, one which focuses on voters' behavior rather than candidates', is known as the paradox of voting. It refers to the fact that in a large election, the probability that any single vote determines the outcome is vanishingly small. If every person only votes for the purpose of influencing the outcome of the election, even a small cost of voting would be sufficient to dissuade anyone from voting. Yet, it is commonly observed that turnout is very high, even in large elections. From the large empirical literature on turnout in elections, some facts seem to be acquired knowledge: (1) turnout is higher in more important elections (e.g., Presidential elections in the US have a significantly higher turnout than Gubernatorial elections), (2) turnout is generally higher in close elections (i.e., with smaller margins of victory), and (3) turnout rates are different among groups with different demographic characteristics. For instance, from the thorough work by Wolfinger (1980), it emerges that education has a substantial effect on the probability that one will vote. Income has less of an effect once the impact of other variables has been controlled for. After education, the second most important variable is age, which appears to have a strong positive relationship with turnout. Other socio-economic variables are also important; in particular, racial minorities appear to be less likely to vote. Finally, turnout seems to be significantly influenced by factors such as the weather conditions on the day of the election and voters' distance from the polls (see Coate and Conlin 2004).

Such comparative statics suggest that it is appropriate to model voters' behavior as a rational choice problem within a standard utility maximization framework. The modern theory of voting applies the classic utilitarian framework to the voting problem, positing that agents decide whether or not to vote by comparing the cost of voting with the utility of voting. The traditional starting point for the modern theory of voting is Riker and Ordeshook (1968), who formalize the insights of Downs (1957) and Tullock (1967) in a simple utilitarian model of voting. The cost of voting comprises any sacrifice in utility that voting entails. The utility of voting is usually divided into two components: a noninstrumental component and an instrumental component. The noninstrumental component includes utility derived from the mere act of voting and not related to the actual outcome of the election. It may include, for instance, the sense of civic duty. There is considerable evidence that voters are motivated by a sense of civic duty (see, for example, Blais 2000). The instrumental component is the utility of the outcome a voter induces if her vote determines the outcome, weighted by the probability that her vote actually determines the outcome.
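This decision problem is often summarized by a simple decision rule: vote if p*B + D > C, where p is the probability that the vote is pivotal, B the instrumental benefit from one's preferred outcome winning, D the non-instrumental (duty) benefit, and C the cost of voting. The sketch below only illustrates that rule; the numbers are invented.

def decides_to_vote(p, B, D, C):
    """Vote when expected instrumental benefit plus duty exceeds the cost."""
    return p * B + D > C

# With a realistic (tiny) pivot probability, the instrumental term alone
# cannot justify turning out...
print(decides_to_vote(p=1e-7, B=1000.0, D=0.0, C=1.0))   # False
# ...whereas even a modest non-instrumental (civic duty) term can.
print(decides_to_vote(p=1e-7, B=1000.0, D=2.0, C=1.0))   # True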
The instrumental component of the utility of voting has attracted most of the attention in the literature. It is typically analyzed through the rational theory of voting, which is motivated by an empirical observation: there exists a strong positive correlation between turnout rate and closeness of the election. This fact suggests that, ceteris paribus, voters are more likely to vote if their vote is more likely to make a difference. The main theoretical problem is to endogenize the probability that each voter is pivotal, i.e., that his vote is determinant for the outcome of the election.

Ledyard (1981, 1984) provides some of the early work in the literature on game-theoretical models of the pivotal voter. In these models, voters infer the probability of being pivotal from the equilibrium strategies of other voters. Subsequently, they decide whether or not to vote, trading off the cost of voting against the expected (instrumental) utility of voting. Although Ledyard did not focus on the magnitude of turnout in a strategic model, this question is addressed by Palfrey and Rosenthal (1983, 1985), who model elections with uncertainty about the total number of voters. Voters strategically choose whether or not to vote for their favorite of two candidates. However, Palfrey and Rosenthal's theories do not explain high turnout in large elections when the cost of voting is not very (and unrealistically) low.

Ultimately, the game-theoretic approach to costly voting could not escape the paradox of voting. Since the probability of being pivotal is very small in large elections, the individual incentives to vote cannot justify high turnouts unless the cost of voting is sufficiently small. Conversely, regardless of how small the cost of voting is, the theory posits that there should be low turnout as the election becomes arbitrarily large, which is in contrast to empirical evidence. The puzzle that remains open is how to reconcile the evidence of high turnout in large elections with the responsiveness of turnout levels to the closeness of the election.
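To see why the pivotal probability is so small, consider a crude benchmark (not one of the pivotal-voter models themselves) in which each of the other 2n voters independently supports either of two candidates with probability one half; a single ballot then matters only in the event of an exact tie among the others. The short sketch below computes that tie probability.

from math import comb

def tie_probability(others):
    """Probability of an exact tie among an even number of other voters,
    each voting for either candidate with probability 1/2."""
    n = others // 2
    return comb(others, n) / 2**others

for electorate in (100, 10_000, 1_000_000):
    print(electorate, tie_probability(electorate))
# Roughly 0.08, 0.008 and 0.0008: the chance of being decisive shrinks like
# one over the square root of the electorate, so even small voting costs
# dwarf the instrumental benefit in large elections.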
The Common Value Setting with Strategic Agents

Feddersen and Pesendorfer (1996, 1997) consider the Condorcet Jury Theorem in a strategic setting and reach a different conclusion than Condorcet. They model a voting problem in an almost common value setting as a game, that is, as a situation where agents interact strategically. The following simple example provides the basic insights of their model: suppose that there are three voters, 1, 2, and 3, and two candidates, A and B. Suppose that voter 1 is an A-partisan, meaning he prefers candidate A in all states of the world. Agents 2 and 3 instead prefer candidate A in state SA and candidate B in state SB. Let p be the probability of state SA and 1 − p the probability of SB. Now, suppose that agent 2 is informed, i.e., he knows the state of the world before voting, while agent 3 is not. Finally, suppose that there is no cost of voting and that the election is decided by simple majority rule. In this situation, agent 1 votes for A, agent 3 for B, and agent 2 for A if he observes SA and for B if he observes SB. To understand why the uninformed agent 3 votes for B, notice that by doing so the outcome is that A is always selected in state SA, while B is always selected in state SB. This is clearly the best outcome for agent 3, and in all states this is also the best outcome for the majority of the population. In fact, if the true state is SA, A is selected, which is preferred to B by all voters; if the true state is SB, B is selected, which is preferred to A by two voters out of three.

The uninformed agent in the example votes for B no matter what the prior probability p is, even if p is close to one, that is, even if he is almost sure that A is the right candidate. By voting for B, individual 3 counterbalances the A-partisan’s vote, thereby allowing the informed voter (2) to induce the “right” outcome with probability one. In Condorcet’s argument, if voters vote according to their belief about the state of the world, information is aggregated in a way that induces the right social choice to be made. In this setting, where preferences are only partially aligned, information is aggregated so as to always generate the decision that is preferred by the majority of the population only if uninformed voters vote strategically. If voters instead vote sincerely, as is assumed by Condorcet, the result that the aggregation of information taking place during an election delivers the “right” social choice does not hold. In Feddersen and Pesendorfer’s framework it is strategic voting that induces the “correct” social choice. This is the major difference from Condorcet’s result, which was driven by the assumption of sincere voting.

Recognizing the strategic incentives that voters may have in an election, Feddersen and Pesendorfer (1998) apply a similar analysis to the unanimity rule in juries. They find that, in the context of a common value setting, unanimity voting might result in convicting the innocent more often than other rules, because strategic jurors consider the probability of being pivotal (like voter 3 in the example) and make their decisions conditional on being pivotal.

Now suppose there is another voter, 4, who is uninformed and shares the same preferences as 2 and 3. In this setup, even with a zero cost of voting, agent 4 would abstain, so that the informed voter is pivotal with probability one. Voter 4’s behavior is known as strategic abstention: uninformed voters abstain not because voting is costly, but because by abstaining they allow the informed voters to be pivotal. By abstaining, uninformed voters in effect delegate the decision to the informed voters. In the example, voter 4’s strategic abstention allows information equivalence to arise: making voter 4 informed would not change the outcome of the election, as long as 4 strategically abstains so that the informed voter determines the outcome. Notice that this result is in the same spirit as Condorcet’s Jury Theorem: there, having each voter’s belief accurate with a probability slightly higher than 50% or substantially higher than 50% does not make a difference. As long as voters vote according to their belief, the outcome is the same no matter what the underlying belief accuracy is (as long as it is greater than a half). Although for different reasons, in both Condorcet’s setting and Feddersen and Pesendorfer’s model (under some circumstances), the aggregation of information that takes place during an election ensures that the outcome of the election does not vary if the electorate is made more informed.
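The pivotal logic of this example can be checked mechanically. The following sketch is illustrative only (the function and state names are ours, not the entry’s); it simply tallies the votes described above in each state and confirms that, with voter 3 voting B, the simple-majority outcome coincides with the majority-preferred candidate in both states.

```python
# Hedged sketch of the three-voter example: voter 1 is an A-partisan, voter 2 is
# informed and votes with the state, and the uninformed voter 3 always votes B.
from collections import Counter

def outcome(state):
    votes = [
        "A",                             # voter 1: A-partisan in every state
        "A" if state == "SA" else "B",   # voter 2: informed, follows the state
        "B",                             # voter 3: uninformed, strategically votes B
    ]
    tally = Counter(votes)
    return "A" if tally["A"] > tally["B"] else "B"

assert outcome("SA") == "A"  # in state SA, A is preferred by all three voters
assert outcome("SB") == "B"  # in state SB, B is preferred by voters 2 and 3
```

The same tally shows why voting for A would hurt voter 3: in state SB the ballots would then be A, B, A, and the A-partisan’s candidate would win even though a majority prefers B.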
Morton R (1991) Groups in rational turnout models. Am J Polit Sci 35:758–776
Osborne MJ, Slivinski A (1996) A model of political competition with citizen-candidates. Quart J Econ 111(1):65–96
Palfrey T, Rosenthal H (1983) A strategic calculus of voting. Public Choice 41(1):7–53
Palfrey T, Rosenthal H (1985) Voter participation and strategic uncertainty. Am Polit Sci Rev 79(1):62–78
Persson T, Tabellini G (2001) Political economics. MIT Press, Boston
Rawls J (1971) A theory of justice. Harvard University Press, Cambridge
Riker W, Ordeshook P (1968) A theory of the calculus of voting. Am Polit Sci Rev 62:25–42
Robbins L (1938) Interpersonal comparisons of utility: a comment. Econ J 48(192):635–641
Satterthwaite M (1975) Strategy-proofness and Arrow’s conditions: existence and correspondence theorems for voting procedures and social welfare functions. J Econ Theory 10:187–217
Schram A (1991) Voter behavior in economic perspective. Springer, Heidelberg
Shachar R, Nalebuff B (1999) Follow the leader: theory and evidence on political participation. Am Econ Rev 89(3):525–547
Tullock G (1967) Towards a mathematics of politics. The University of Michigan Press, Ann Arbor
Uhlaner C (1989) Rational turnout: the neglected role of groups. Am J Polit Sci 33(2):390–422
Wolfinger R (1980) Who votes? Yale University Press, New Haven

Books and Reviews
Arrow KJ (1951) Social choice and individual values. Wiley, New York
Arrow KJ, Sen AK, Suzumura K (eds) (2002) Handbook of social choice and welfare, vol 1. Elsevier, Amsterdam
Austen-Smith D, Banks JS (1999) Positive political theory I: collective preference. The University of Michigan Press, Ann Arbor
Cox GW (1997) Making votes count: strategic coordination in the world’s electoral systems. Cambridge University Press, Cambridge
Dummett M (1984) Voting procedures. Clarendon Press, Oxford
Duverger M (1959) Political parties: their organization and activity in the modern state. Methuen and Co Ltd, London
Dworkin R (1981) What is equality? Part 2: equality of resources. Philos Public Aff 10:283–345
Feddersen T (2004) Rational choice theory and the paradox of not voting. J Econ Perspect 18(1):99–112
Katz RS (1997) Democracy and elections. Oxford University Press, Oxford
McLean I (1987) Public choice: an introduction. Basil Blackwell Inc., New York
Milnor AJ (1969) Elections and political stability. Little Brown, Boston
Myerson R (2000) Large Poisson games. J Econ Theory 94(1):7–45
Mueller DC (1989) Public choice II: a revised edition of public choice. Cambridge University Press, Cambridge
Niemi RG, Weisberg HF (eds) (1972) Probability models of collective decision making. Charles E. Merrill Publishing Company, Columbus
Ordeshook PC (1986) Game theory and political theory: an introduction. Cambridge University Press, Cambridge
Ordeshook PC (ed) (1989) Models of strategic choice in politics. The University of Michigan Press, Ann Arbor
Rawls J (1958) Justice as fairness. Philos Rev 67(2):164–194
Riker WH (ed) (1993) Agenda formation. The University of Michigan Press, Ann Arbor
Sen AK (1973) On economic inequality. Oxford University Press, Oxford
Sen AK (2002) Rationality and freedom. The Belknap Press of Harvard University Press, Cambridge
Tideman N (2006) Collective decisions and voting: the potential for public choice. Ashgate, Burlington
Tullock G (1998) On voting. Edward Elgar Publishing, Northampton
Voting Procedures, Complexity of

Olivier Hudry
École Nationale Supérieure des Télécommunications, Paris, France

Article Outline

Glossary
Definition of the Subject
Introduction
Common Voting Procedures
Complexity Results
Further Directions
Bibliography

Glossary

Condorcet winner A candidate is a Condorcet winner if he or she defeats any other candidate in a one-to-one matchup. Such a candidate may not exist; at most, there is only one. Though it could seem reasonable to adopt a Condorcet winner (if any) as the winner of an election, many common voting procedures bypass the Condorcet winner in favor of a winner chosen by other criteria.
Majority relation, strict majority relation In a pairwise comparison method, each candidate is compared to all others, one at a time. If a candidate x is preferred to a candidate y by at least m/2 voters (a majority), where m denotes the number of voters, x is said to be preferred to y according to the majority relation. The strict majority relation is defined in a similar way, but with (m + 1)/2 instead of m/2. If there is no tie, the strict majority relation is a tournament, i.e., a complete asymmetric binary relation, called the majority tournament.
Preference, preference aggregation A voter’s preference is some relational structure defined over the set of candidates. Such a structure depends on the chosen voting procedure and usually ranges between a binary relation on one extreme and a linear order on the other. Given a collection, called a profile, of individual preferences defined on a set of candidates, the aggregation problem consists in computing a collective preference summarizing the profile as well as possible (for a given criterion).
Profile A profile P = (R1, R2, . . ., Rm) is an ordered collection (or a multiset) of m relations Ri (1 ≤ i ≤ m) for a given integer m. As the relations Ri can be the same, another representation of a profile P consists in specifying only the q relations Ri which are different, for an appropriate integer q, and the number mi of occurrences of each relation Ri (1 ≤ i ≤ q): P = ((R1, m1); (R2, m2); . . .; (Rq, mq)).
Social choice function, social choice correspondence A social choice function maps a collection of individual preferences specified on a set of candidates onto a unique candidate, while a social choice correspondence maps it onto a nonempty set of candidates. This provides a way to formalize what constitutes the most preferred choice for a group of agents.
Voting procedure, voting theory A voting procedure is a rule defining how to elect a winner (single-winner election) or several winners (multiple-winner election) or to rank the candidates from the individual preferences of the voters. Voting theory studies the (axiomatic, algorithmic, combinatorial, and so on) properties of the voting procedures designed in order to reach collective decisions.
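The definitions of the majority relation and of a Condorcet winner above translate into a short computation: tabulate, for each ordered pair of candidates, how many voters prefer the first to the second, and check whether some candidate beats every other one. The sketch below is a minimal illustration under the assumption that the profile is a list of linear orders; the candidate and function names are ours, not the entry’s.

```python
# Hedged sketch: pairwise majority counts and Condorcet winner for a profile of
# linear orders, each ranking given as a list of candidates from best to worst.
from itertools import combinations

def pairwise_counts(profile):
    candidates = profile[0]          # assumption: every ranking lists the same candidates
    counts = {(x, y): 0 for x in candidates for y in candidates if x != y}
    for ranking in profile:
        pos = {c: i for i, c in enumerate(ranking)}
        for x, y in combinations(candidates, 2):
            if pos[x] < pos[y]:
                counts[(x, y)] += 1
            else:
                counts[(y, x)] += 1
    return counts

def condorcet_winner(profile):
    candidates = profile[0]
    counts = pairwise_counts(profile)
    for x in candidates:
        if all(counts[(x, y)] > counts[(y, x)] for y in candidates if y != x):
            return x
    return None                      # no Condorcet winner exists

profile = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
print(condorcet_winner(profile))     # 'a': beats b two to one and c three to zero
```

With an odd number of voters and no ties, the strict majority relation obtained this way is a tournament, as noted in the glossary entry above.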
Definition of the Subject

One main concern of voting theory is to determine a procedure (also called, according to the context or the authors, rule, method, social choice function, social choice correspondence, system, scheme, count, rank aggregation, principle, solution, and so on) for choosing a winner from among a set of candidates, based on the preferences of the voters. Each voter’s preference may be expressed as the choice of a single candidate or, more ambitiously, as a ranked list including all or some of the candidates. Such a situation occurs, obviously, in the field of social choice and welfare (for a broader presentation of the field, see, for instance, Aizerman and Aleskerov 1995; Arrow 1963; Arrow and Raynaud 1986; Arrow et al. 2002; Barnett et al. 1995; Barthélemy and Monjardet 1981; Elster and Hylland 1986; Fishburn 1973b; Johnson 1998; Kelly 1987; Moulin 1983; Pattanaik and Salles 1983; and Rowley 1993) and especially of elections (for more about voting theory, see Brams and Fishburn 2002; Dummett 1984; Fischer et al. 2013; Hudry and Monjardet 2010; Hudry et al. 2009; Laslier 2004; Levenglick 1975; Levin and Nalebuff 1995; Merrill and Grofman 1999; Nurmi 1987; Saari 2001; Straffin 1980; Taylor 1995, 2005), but also in many other fields: games, sports, artificial intelligence, spam detection, Web search engines, Internet applications, statistics, and so on.

For a long time, much attention has been paid to the axiomatic properties fulfilled by the different procedures that have been proposed. These properties are important in choosing a procedure, since there is no “ideal” procedure (see next section). More recently, in the late 1980s and the early 1990s, the question arose of the relative difficulty of computing winners according to a given procedure (for an introduction to computational social choice, see, for instance, Chevaleyre et al. (2007)). The first to study the question may have been J. Orlin in 1981, with a result which remained unpublished (Orlin J 1981, unpublished). The first published results are perhaps those of Y. Wakabayashi in her PhD thesis (Wakabayashi 1986) (see also Wakabayashi 1998), where she deals with the aggregation of binary relations into median orders. The first results on the complexity of the aggregation of orders into median orders seem to be those of Bartholdi et al. (1989a, b) and Hudry (1989). From a practical point of view, it is crucial to be able to announce the winner in a “reasonable” time. This raises the question of the complexity of the voting procedures, which should be taken into account to the same extent as their axiomatic characteristics.

Below, we will detail the complexity results for several procedures: the plurality rule (one-round procedure), the plurality rule with runoff (two-round procedure), the preferential voting procedure (STV), Borda’s procedure, Nanson’s procedure, Baldwin’s procedure, Condorcet’s procedure, the Condorcet-Kemeny problem, the Slater problem, prudent orders (G. Köhler, K. J. Arrow, and H. Raynaud), the maximin procedure (P. B. Simpson), the minimax procedure (K. J. Arrow and H. Raynaud), the ranked pairs procedure (T. N. Tideman), Copeland’s procedure, the top cycle solution (J. H. Smith), the uncovered set solution (P. C. Fishburn, N. Miller), the minimal covering set solution (B. Dutta), Banks’s solution, the tournament equilibrium set solution (T. Schwartz), Dodgson’s procedure, Young’s procedure, the approval voting procedure, the majority-choice approval procedure (F. Simmons), and Bucklin’s procedure.

Section “Introduction” is devoted to a historic overview and to basic definitions and notation. The common voting procedures are depicted in section “Common Voting Procedures.” Section “Complexity Results” specifies the complexity of these procedures. Other considerations linked to complexity in the field of voting theory can be found in section “Further Directions.”
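To make the computational concern concrete, consider the Condorcet-Kemeny (median order) problem treated later in this entry. A brute-force search examines all n! linear orders of the candidates, which is only feasible for very small n. The sketch below is our illustration (invented names and a toy profile); it counts pairwise disagreements, a quantity that, for linear orders, is proportional to the symmetric difference distance used later in the entry, so it selects the same median orders.

```python
# Hedged sketch: exhaustive search for a median (Condorcet-Kemeny) linear order.
from itertools import combinations, permutations

def disagreements(order, ranking):
    """Number of candidate pairs ranked in opposite ways by order and ranking."""
    pos_o = {c: i for i, c in enumerate(order)}
    pos_r = {c: i for i, c in enumerate(ranking)}
    return sum(1 for x, y in combinations(order, 2)
               if (pos_o[x] < pos_o[y]) != (pos_r[x] < pos_r[y]))

def kemeny_median(profile):
    candidates = profile[0]
    return min(permutations(candidates),        # n! candidate orders
               key=lambda order: sum(disagreements(order, r) for r in profile))

profile = [["a", "b", "c"], ["b", "c", "a"], ["a", "c", "b"]]
print(kemeny_median(profile))                   # fine for n = 3, hopeless for large n
```

Polynomial procedures such as the plurality rule or Borda’s procedure avoid this blow-up; this is exactly the kind of distinction the complexity results below make precise.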
Introduction

The Search for a “Good” Voting Procedure, from Borda to Arrow
It is customarily agreed that the search for a “good” voting procedure goes back at least to the end of the eighteenth century, to the works of the chevalier Jean-Charles de Borda (1733–1799) (Borda 1784) and of Marie Jean Antoine Nicolas de Caritat, marquis de Condorcet (1743–1794) (Caritat and marquis de Condorcet 1785), and maybe before (for references upon the historical context, see Arrow 1963; Barthélemy and Monjardet 1981;
Black 1958; Guilbaud 1952; Hägele and Pukelsheim 2001; McLean 1995; McLean and Hewitt 1994; McLean and Urken 1995, 1997; McLean et al. 1995, 2007, and references below).

In the 1770s to the 1780s, J. C. de Borda (1784), a member of the French Academy of Sciences, showed that the plurality rule used at that time by the academy was not satisfactory. Indeed, with such a voting procedure, the winner can be contested by a majority of voters who would all agree to choose another candidate instead of the elected winner. (We may notice that the plurality rule with runoff used in many countries has the same defect.) Borda then suggested another procedure (see below). But, as pointed out by Condorcet in 1784 (Caritat and marquis de Condorcet 1785), the procedure advocated by Borda has the same defect as the one depicted by Borda himself for the plurality rule: a majority of dissatisfied voters could agree to constitute a majority coalition against the candidate elected by Borda’s procedure, in favor of another candidate. Condorcet then designed a method based on pairwise comparisons. By nature, this method cannot elect a winner who will give rise to a majority coalition against him or her. But, unfortunately, Condorcet’s method does not always succeed in finding a winner. If such a winner does exist, i.e., if there exists a candidate who defeats any other candidate in such a pairwise comparison, then this candidate is said to be a Condorcet winner; if he or she exists, a Condorcet winner is unique. Actually, the Academy of Sciences decided to adopt Borda’s method (until 1803). By the way, notice that, according to I. McLean et al. (2007), “both Ramon Llull (ca 1232–1316) and Nicolaus of Cusa (also known as Cusanus, 1401–1464) made contributions which have been believed to be centuries more recent. Llull promotes the method of pairwise comparison, and proposes the Copeland rule to select a winner. Cusanus proposes the Borda rule, which should properly be renamed the Cusanus rule.” Despite these historical discoveries, we shall keep the usual names.

After these seminal works, in spite of some works by the Swiss Simon Lhuilier (1750–1840) (Lhuilier 1794) (see also Monjardet 1976), the Spanish Joseph Isidoro Morales (1797), the French Pierre Claude François Daunou (1761–1840) (Daunou 1803), and Pierre Simon, marquis de Laplace (1749–1827) (Laplace (marquis de) 1795), who, through a slightly different approach, rediscovered Borda’s procedure, it seems that the history of the theory of social choice slowed down and almost disappeared until the 1870s or even the 1950s (see Black 1958). In the 1870s, the English reverend Charles Lutwidge Dodgson (1832–1898), also (or maybe better) known as Lewis Carroll, proposed a voting system in which the winner is the candidate who becomes a Condorcet winner with the fewest appropriate changes in voters’ preferences (Dodgson 1873, 1874, 1876). Some years later, the English mathematician Edward J. Nanson (1850–1936) (Nanson 1882) on the one hand and, in 1926, the Australian Joseph M. Baldwin (1878–1945) (Baldwin 1926) on the other hand slightly modified Borda’s method by iteratively eliminating some candidates until only one remains.

None of these methods is utterly satisfactory when we consider the properties which are usually considered desirable. In 1951 and 1963, Kenneth J. Arrow (1963) showed that there does not exist a “good” voting method with respect to some “reasonable” axiomatic properties; this is known as the famous “impossibility theorem.” More precisely, assuming that the preferences of the voters are complete preorders (see below; notice that the set of complete preorders includes the one of linear orders, quite often considered to model the preferences of the voters) and that the result of the voting procedure should also be a complete preorder, K. J. Arrow considered the following properties:

• Unrestricted domain or universality: the voting procedure must be able to provide a result whatever the preferences of the voters are.
• Independence of irrelevant candidates: the collective preference between candidates x and y must depend only on the individual preferences between x and y; in other words, the collective preference between x and y must remain the same as long as the individual preferences between x and y do not change.
• Unanimity (or Pareto property): if a candidate x is preferred to another candidate y by all the voters, then x must be preferred to y in the collective preference too.

K. J. Arrow showed that if there are at least three candidates (things are much more comfortable with only two candidates!) and at least two voters, the only procedure which satisfies all these conditions at once is the dictatorship, in which one voter (the dictator) imposes his or her preference. Though this impossibility theorem ruins the hope to design a voting procedure fulfilling the usual desirable properties, several procedures have been suggested since this date; we shall describe some of them below. Among the ways to escape Arrow’s impossibility theorem, we find the following:

• The definition of other axiomatic systems which would lead to voting procedures which would not be a dictatorship
• The restriction of the individual preferences to more constrained domains
• Adapting the result, when it is not satisfactory with respect to the required axiomatic properties, into a result fulfilling these properties and fitting the genuine result as well as possible, for some criterion which must be defined

The main questions associated with the first possibility are “given some axiomatic properties, what are the voting procedures satisfying these properties?” or, conversely, “given a voting procedure, what is the proper axiomatic system characterizing this procedure?”; we will not consider this direction here. The second possibility will be illustrated below by the restriction of the individual preferences to single-peaked preferences. The third direction was followed by J. G. Kemeny in 1959 (Kemeny 1959), when he studied the aggregation of complete preorders into a median complete preorder (see below). Notice that the median procedure is also attributed to Condorcet; in the sequel, we will refer to the search for a median linear order as the Condorcet-Kemeny problem (as other people rediscovered this problem or some of its variants, Monjardet 1990); the problem is also known under other names (Charon and Hudry 2007). Another related problem, dealing with the majority tournament (see below), is the one stated explicitly by P. Slater in 1961 (Slater 1961) of fitting a tournament into a linear order at minimum distance. This is one of the so-called tournament solutions (see Laslier 1997; Moulin 1986; for references on tournaments, see also McKey 2013; Moon 1968; Reid 2004; Reid and Beineke 1978), whose aim is to determine a winner from a tournament. Besides the Slater solution, we shall describe other tournament solutions, such as the top cycle, the uncovered set, the minimal covering set, Banks’s solution, and the tournament equilibrium set. The other common tournament solutions are polynomial, or their complexities are not completely known (see Hudry 2009 for more details).

Definitions, Notation, and Partially Ordered Sets Used to Model Preferences
Let us assume that we are dealing with m voters who must choose between n candidates denoted x1, x2, . . ., xn or x, y, z, . . .; X denotes this set of candidates; in the following, we suppose that n is large enough. A binary relation R defined on X is a subset of X × X = {(x, y) : x ∈ X and y ∈ X}. We use the notation xRy instead of (x, y) ∈ R, and we write “not xRy” when (x, y) ∉ R.

It is customary to represent the preferences Ri (1 ≤ i ≤ m) of the m voters as an ordered collection (or a multiset) P = (R1, R2, . . ., Rm), called the profile of the m binary relations. Another representation of the preferences of the voters exists. Since these preferences may be the same for two different voters, we may consider only the different relations R1, R2, . . ., Rq arising from the opinions of the voters, where q denotes this number of different opinions. In this way, we combine all the voters sharing the same opinion. Last, if mi (1 ≤ i ≤ q) denotes the number of voters sharing Ri (1 ≤ i ≤ q) as their preference (notice the equality m1 + m2 + . . . + mq = m), we associate this number mi of occurrences with each type Ri of relations to describe P. Then
P can be described as the set of such pairs: P = ((R1, m1); (R2, m2); . . .; (Rq, mq)). Such a representation does represent the data more compactly when m is large with respect to n. Usually, the complexity results are the same for the two representations, because their proofs stand even if m is bounded by a polynomial in n, and in this case, there is no qualitative difference between the two representations. With respect to the results stated in section “Complexity Results,” there would be a difference only for the LNP-completeness results of Theorems 5, 14, and 15 (which then should be replaced only by NP-hardness results). So, in the sequel, we shall consider only the first representation. Moreover, we shall assume that the preference of each voter i (1 ≤ i ≤ m) is given by a binary relation Ri defined on X and that Ri is described by its characteristic vector (see below), which requires n² bits. So the size of the data set is about mn².

We will consider two kinds of elections: we want to elect one candidate, or we want to rank all of them into a partially ordered set (or poset). One of the most common posets is the structure of linear order. Other posets can be defined from the following basic properties:

Reflexivity: ∀x ∈ X, xRx
Irreflexivity: ∀x ∈ X, not xRx
Antisymmetry: ∀(x, y) ∈ X², (xRy and x ≠ y) ⇒ not yRx
Asymmetry: ∀(x, y) ∈ X², xRy ⇒ not yRx
Transitivity: ∀(x, y, z) ∈ X³, (xRy and yRz) ⇒ xRz
Completeness: ∀(x, y) ∈ X² with x ≠ y, xRy or (inclusively) yRx

Combining these properties defines the usual ordered structures. But as reflexivity and irreflexivity do not matter for complexity results (see Hudry 2008), we give below only one version among these two possibilities.

• A partial order is an antisymmetric (if reflexive) or asymmetric (if irreflexive) and transitive binary relation (see Fig. 1); O will denote the set of the partial orders defined on X.
• A linear order is a complete partial order (see Fig. 2); L will denote the set of the linear orders defined on X. If L denotes a linear order defined on X, we will represent L as xσ(1) > xσ(2) > . . . > xσ(n) for some appropriate permutation σ, with the agreement that the notation xσ(i) > xσ(i+1) (for 1 ≤ i < n) means that xσ(i) is preferred to xσ(i+1) according to L, the relationship between the other elements of X being implied by transitivity. The element xσ(1) will be called the winner of L.
• A tournament is a complete and asymmetric binary relation (see Fig. 3); T will denote the set of the tournaments defined on X; notice that a transitive tournament is a linear order and conversely. As a tournament may contain circuits, tournaments are usually not considered an appropriate structure to represent the collective preference that we seek. Tournaments will be [. . .]
Voting Procedures, Complexity of, Fig. 3 A tournament
Voting Procedures, Complexity of, Fig. 5 A complete preorder
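As a companion to the definitions just given, the sketch below (our illustration, not the entry’s) represents a binary relation as a set of ordered pairs over X, checks the basic properties listed above, and uses them to recognize partial orders, linear orders, and tournaments.

```python
# Hedged sketch: property checks for a binary relation R given as a set of pairs.
def is_irreflexive(R, X):   return all((x, x) not in R for x in X)
def is_antisymmetric(R, X): return all(not ((x, y) in R and (y, x) in R)
                                       for x in X for y in X if x != y)
def is_asymmetric(R, X):    return is_irreflexive(R, X) and is_antisymmetric(R, X)
def is_transitive(R, X):    return all((x, z) in R
                                       for x in X for y in X for z in X
                                       if (x, y) in R and (y, z) in R)
def is_complete(R, X):      return all((x, y) in R or (y, x) in R
                                       for x in X for y in X if x != y)

# Structures used in this entry (reflexive and irreflexive variants treated alike).
def is_partial_order(R, X): return is_antisymmetric(R, X) and is_transitive(R, X)
def is_linear_order(R, X):  return is_partial_order(R, X) and is_complete(R, X)
def is_tournament(R, X):    return is_complete(R, X) and is_asymmetric(R, X)

X = {"x", "y", "z"}
cycle = {("x", "y"), ("y", "z"), ("z", "x")}   # a 3-cycle
print(is_tournament(cycle, X), is_linear_order(cycle, X))  # True False
```

The 3-cycle in the example is exactly the kind of circuit that prevents a tournament from being used directly as a collective preference.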
Common Voting Procedures at the ith position by the considered voter. The
score of x is the total number of points that
In this section, we describe the main voting pro- x received. The winner is the candidate with the
cedures, i.e., plurality rule (one-round procedure), maximum score. If the aim is to rank the candi-
plurality rule with runoff (two-round procedure), dates, we may also sort them according to decreas-
preferential voting procedure (STV), Borda’s pro- ing scores and then consider the linear extensions
cedure, Nanson’s procedure, Baldwin’s proce- of the complete preorder provided by this sorting.
dure, Condorcet’s procedure, Condorcet- (Another possibility would be to apply the proce-
Kemeny problem, Slater problem, prudent orders dure n 1 times, after having removed the winner
(G. Köhler, K. J. Arrow, and H. Raynaud), maxi- of the current iteration.) For the plurality rule, the
min procedure (P. B. Simpson), minimax proce- score vector is (1, 0, 0, . . ., 0).
dure (K. J. Arrow and H. Raynaud), ranked pairs There are two rounds in the plurality rule with
procedure (T. N. Tideman), Copeland’s proce- runoff, also called two-round (or two-ballot) pro-
dure, the top cycle solution (J. H. Smith), the cedure. The first round is like the plurality rule
uncovered set solution (P. C. Fishburn, described above. At the end of this first step, if a
N. Miller), the minimal covering set solution candidate has gained at least ðm þ 1Þ=2 points
(B. Dutta), Banks’s solution, the tournament equi- (the strict majority), he or she is the winner. Oth-
librium set solution (T. Schwartz), Dodgson’s pro- erwise, the two candidates with the maximum
cedure, Young’s procedure, approval voting numbers of points remain for a second round,
procedure, majority-choice approval procedure and the others are removed from the election.
(F. Simmons), and Bucklin’s procedure (see also Then the plurality rule is applied again but only
Brams and Fishburn 2002). For other tournament to the remaining two candidates. This method is
solutions, see Laslier (1997) and Moulin (1986). designed to elect only one winner. If we want to
obtain k winners, it is sufficient to apply it k times.
Plurality Rule, Plurality Rule with Runoff, and Notice that the repetition of a given procedure
Preferential Voting Procedure always makes it possible to elect several winners.
One of the easiest voting procedures by which to A generalization of these procedures consists
elect one candidate as a winner is the plurality rule in performing a given number of rounds. For each
(also called one-round procedure or relative round, each voter gives 1 point for his or her
majority, or sometimes first-past-the-post, or win- favorite candidate. If, at the end of a round, there
ner-takes-all, or also majoritarian voting. . .; see is a candidate who has gained at least ðm þ 1Þ=2
Inada (1969)). In this procedure, each voter gives points, he or she is the winner. Otherwise, the
1 point to his or her favorite candidate (so it is not candidates with the lowest numbers of points are
necessary to know the preferences of the voters on eliminated from the competition; the number of
the whole set of candidates). The candidate who candidates eliminated at each round depends on
gains the maximum number of points is the the number of rounds but is such that only two
winner. candidates will remain for the last round. The
This procedure belongs to the family of scor- winner of this last round is the winner of the
ing procedures. In such a procedure, a score vec- election. Special cases are the plurality rule with
tor (s1, s2, . . ., sn) is fixed independently of the runoff, described above, and the one with at most
voters, with s1 s2 sn . For each n 1 rounds. For this latter case, exactly one
voter, a candidate x receives si points if x is ranked candidate is removed at each round. This variant,
in which the candidate who is the least often candidate ranked at the ith position, we may
ranked at the first position is removed, is also choose to assign other values. For instance,
known as preferential voting (or preference vot- given an integer k, we may credit k points to the
ing) or as single transferable vote (STV) or as candidate ranked at the first position, k 1 points
instant-runoff voting (IRV). to the second, and so on, through the candidate
Other variants of these voting procedures may ranked at the kth position, who gains 1 point,
be defined by the successive eliminations of the after which the following candidates gain
losers. For instance, the candidate who is most nothing (the shape of the score vector is
often ranked last is removed, and we iterate this ðk, k 1, . . . , 1, 0, . . . , 0Þ). For k ¼ 1, this system
process while there remain at least two candidates. is the plurality rule. For k n 1, this system
gives the same results as Borda’s procedure.
Borda’s Procedure and Some Variants Other systems are based on Borda’s procedure
(Nanson’s and Baldwin’s Procedures) but with the elimination of some candidates.
As related above, Borda considered the plurality Nanson’s procedure (Nanson 1882) modifies
rule unsatisfactory because the winner can be Borda’s procedure by eliminating the candidates
contested by a majority of voters who would all whose Borda scores are below the average Borda
agree to choose another candidate instead of the score and by repeating the computations of the
elected winner (the same holds for the plurality Borda scores with respect to the remaining candi-
rule with runoff, see Example 1). Borda suggested dates after these eliminations, until there remains
another procedure in which the voters rank the only one candidate. Another variant of Borda’s
n candidates according to their preferences. For procedure is one suggested by Baldwin in 1926
each voter, the candidate who is ranked first is (Baldwin 1926); as in Nanson’s procedure, candi-
given n 1 points, a candidate ranked second is dates are iteratively removed from the election.
given n 2 points, and so on: more generally, a But, in Baldwin’s procedure, only one candidate is
candidate ranked at the ith position is given n i removed at each iteration, the one whose Borda’s
points. Then, all these points are summed up for score is the lowest.
each candidate: this sum is the Borda score s B of
the candidate. The candidate with a maximum
Borda score is the Borda winner. So Borda’s pro-
cedure is also a scoring procedure, of which the Condorcet’s Procedure
score vector is ðn 1, n 2, . . . , 1, 0Þ. Condorcet designed a method based on pairwise
Using Borda’s procedure, one can easily obtain comparisons. More precisely, for each candidate
a ranking of all the candidates. A first possibility x and each candidate y with x 6¼ y, we compute
consists, as for the plurality rule or the plurality the number mxy that we will call the pairwise
rule with runoff, in iterating Borda’s procedure comparison coefficient below, of voters who
n 1 times, after having removed the Borda win- prefer x to y. Then x is considered as better
ner of the current iteration. But we may also apply than y if a majority of voters prefers x to y, i.e.,
Borda’s procedure only once and then rank the if we have mxy > myx . This defines the (strict)
candidates according to the decreasing values of majority relation T: xTy , mxy > myx . In some
their Borda scores. This gives a complete pre- cases, there exists a Condorcet winner, i.e., a
order. Any linear extension of this complete pre- candidate x defeating any other candidate: 8y 6¼
order can be considered as the collective ranking x, mxy > myx . If there exists a Condorcet winner,
according to Borda’s procedure. Notice that these then he or she is unique. It may even happen that
two possibilities do not necessarily provide the T is a linear order and allows us to rank all the
same rankings (see Example 1). candidates. Such is the case for the following
Several variants of Borda’s procedure have example (due to B. Monjardet, private commu-
been studied. For instance, we may apply other nication), which illustrates the previous voting
score vectors: instead of n i points given to the procedures.
Example 1 Assume that m ¼ 13 voters must sB ðxÞ ¼ 14 , sB ðyÞ ¼ 15 , and sB ðzÞ ¼ 10 . Thus,
rank n ¼ 4 candidates x, y, z, and t. Suppose the z is removed and the remaining computations are
preferences of the voters are given by the follow- as in Nanson’s procedure: here also x is the winner
ing linear orders: (but it may happen that Nanson’s procedure and
Baldwin’s procedure do not provide the same
winners).
Let us now compute the pairwise comparison
• The preferences of two voters are x > y >
coefficients mxy necessary to apply Condorcet’s
z > t.
procedure:
• The preference of one voter is y > z > x > t.
• mxy ¼ 2 þ 5 ¼ 7; myx ¼ 1 þ 1 þ 4 ¼ 6;
• The preference of one voter is y > z > t > x.
• mxz ¼ 2 þ 5 ¼ 7; mzx ¼ 1 þ 1 þ 4 ¼ 6
• The preferences of four voters are z > y >
• mxt ¼ 2 þ 1 þ 4 ¼ 7; mtx ¼ 1 þ 5 ¼ 6;
x > t.
• myz ¼ 2 þ 1 þ 1 þ 5 ¼ 9; mzy ¼ 4;
• The preferences of five voters are t > x >
y > z. • myt ¼ 2 þ 1 þ 1 þ 4 ¼ 8; mty ¼ 5;
• mzt ¼ 2 þ 1 þ 1 þ 4 ¼ 8; mtz ¼ 5.
According to the plurality rule, t is the winner The bold values are the ones greater than the
with 5 points (2 points for x and for y, 4 points for majority ðm þ 1Þ=2. These values show that the
z). According to the plurality rule with runoff, z is majority relation, here, is a linear order, namely,
the winner (the four voters who voted for x or x > y > z > t.
y prefer z to t). The Borda scores sB of the candi- By the way, this example shows the impor-
dates are: tance of the voting procedure, as already noticed
by Borda (see Mascart 1919): four procedures and
four different winners; so all four of our candi-
• sB ðxÞ ¼ 2 3 þ 5 2 þ 5 1 þ 1 0 ¼ 21; dates may claim to be the winners of the
• sB ðyÞ ¼ 2 3 þ 6 2 þ 5 1 þ 0 0 ¼ 23; election. . . . It is often the case that different pro-
• sB ðzÞ ¼ 4 3 þ 2 2 þ 2 1 þ 5 0 ¼ 18; cedures provide different winners.
• sB ðtÞ ¼ 5 3 þ 0 2 þ 1 1 þ 7 0 ¼ 16.
Median Orders, Condorcet-Kemeny Problem,
and Slater Problem
So the winner according to Borda’s procedure As Condorcet himself discovered, his procedure
is y, and the ranking of the four candidates is y > may lead to a majority relation which is not tran-
x > z > t (notice that if, while there are at least sitive. The simplest example in this respect is one
two candidates, we apply the variant consisting in with n ¼ 3 candidates x, y, and z and with m ¼ 3
removing the winner and reapplying Borda’s pro- voters whose preferences are, respectively, x >
cedure, then we obtain the orders y > x > z > t y > z, y > z > x, and z > x > y. It is easy to verify
and y > z > x > t as the possible rankings; the that the majority relation T is defined by xTy, yTz,
distance to the ranking provided by one applica- and zTx, hence a lack of transitivity. If there is no
tion of Borda’s procedure may be much more tie (which is necessarily the case if m is odd, since
important). In Nanson’s procedure, as the Borda the individual preferences of the voters are
scores of z and t are below the average (which is assumed to be linear orders), then T is a tourna-
equal to 19.5), z and t are removed and only x and ment, called the majority tournament. We may or
y remain. Then the Borda scores of x and y for the may not weigh T if we want to take the intensity of
second step become sB ðxÞ ¼ 7 and sB ðyÞ ¼ 6 ; the preferences into account. If T is not weighted,
hence, x is the winner according to Nanson’s the search for a winner or for a ranking of the
procedure. In Baldwin’s procedure, t is first candidates leads to the definition of tournament
removed. The Borda scores of x, y, and z become solutions (see Laslier (1997) for a comprehensive
study of these and Hudry (2009) for a survey of their Problems Pm ðY , Z Þ . For Y belonging to
complexities, some of which are given below). In fA, C , L, O, P , R, T g and Z also belonging to
the Condorcet-Kemeny problem, T is weighted, and fA, C , L, O, P , R, T g , for a positive integer m,
the aim is to compute a linear order or, more gener- Pm ðY , Z Þ denotes the following problem: given
ally, a poset fitting T “as well as possible.” a finite set X of n elements and given a profile P of
To specify what “as well as possible” means, we m binary relations all belonging to Y , find a
use the symmetric difference distance d defined, for relation R* belonging to Z minimizing D over Z:
two binary relations R and S defined on X, by DðP, R Þ ¼ MinfDðP, RÞ f or R Z g.
With this notation, the initial problem, possibly
considered by Condorcet, is Pm(L, L), consisting in
dðR, SÞ ¼ jfðx, yÞ X2 : xRy and xSy or aggregating m linear orders into a median linear
ðxRy and xSyÞjg: order. Similarly, the problem considered by J. G.
Kemeny (1959), consisting in aggregating
This quantity d(R, S) measures the number of m complete preorders into a median complete pre-
disagreements between R and S. Though it is order, is Pm ðC , C Þ. A Condorcet-Kemeny winner is
possible to consider other distances, d is widely the winner of any median linear order (for references
used and is appropriate for many applications. J. P. about the Condorcet-Kemeny problem, see, e.g.,
Barthélemy (1979) shows that d satisfies a number Charon and Hudry 2007; Jünger 1985; Monjardet
of naturally desirable properties. J. P. Barthélemy 2008a, b; Reinelt 1985; and references therein).
and B. Monjardet (1981) recall that d(R, S) is the We may easily state the problems Pm ðY , Z Þ as
Hamming distance between the characteristic vec- 0–1 linear programming problems (see Barthélemy
tors (see below) of R and S and point out the links and Monjardet 1981; Charon and Hudry 2007;
between d- and the L 1-metric or the square of the Hudry 1989, 2008; Wakabayashi 1986 for
Euclidean distance between these vectors (see instance) for any profile P ¼ ðR1 , R2 , . . . , Rm Þ of
also Monjardet 1979, 1990). m binary relations Ri ð1 i mÞ all belonging to
So for a profile P ¼ ðR1 , R2 , . . . , Rm Þ of Y . To this end, consider the characteristic vectors
m relations, we can define the remoteness r i ¼ r ixy 2
of the relations Ri ð1 i mÞ
ðx,yÞ X
D(P, R) between a relation R and the profile P by
defined by r ixy ¼ 1 if xRiy and r ixy ¼ 0 otherwise
X
m and similarly the characteristic vector r ¼
DðP, RÞ ¼ dðR, Ri Þ: r xy ðx,yÞ X2 of any binary relation R. Then, after
i¼1
somecomputations, we obtain DðP, RÞ ¼
X m X
X
The remoteness D(P, R) measures the total num- C axy r xy, where C ¼ r ixy is a con-
ber of disagreements between P and R (see Hudry ðx,yÞ X2 i¼1 ðx,yÞ X2
y and the majority. It is a non-positive or non- • Τhe preferences of three voters are x > y > z > t.
negative integer with the same parity as m. To • Τhe preferences of two voters are y > t > z > x.
obtain a 0–1 linear programming statement of • Τhe preference of one voter is t > z > x > y.
Pm ðY , Z Þ, it is then sufficient to express the con- • The preference of one voter is x > z > y > t.
straints defining the set Z , which is easy for the • The preference of one voter is t > y > x > z.
posets described above. For instance, the transitiv- • The preference of one voter is z > t > y > x.
ity of R can be expressed by the following
inequalities: The quantities mxy involved in Condorcet’s
procedure are the following, where the bold
8ðx, y, zÞ X3 , 0 r xy þ r yz r xz 1: values, again, denote those greater than the major-
ity ðm þ 1Þ=2:
As stated above, it is also common to represent • mxy ¼ 5; myx ¼ 4;
a preference R defined on X by a graph. The • mxz ¼ 5; mzx ¼ 4;
properties of the graph are the properties of R: it • mxt ¼ 4; mtx ¼ 5;
can be antisymmetric, complete, transitive, and so • myz ¼ 6; mzy ¼ 3;
on. Similarly, the profile P ¼ ðR1 , R2 , . . . , Rm Þ • myt ¼ 6; mty ¼ 3;
can also be represented by a directed, weighted,
• mzt ¼ 5; mtz ¼ 4.
complete, and symmetric graph GP ¼ ðX, U X Þ: its
set of vertices is X, and G P contains all the
Here, the majority relation is not a linear
possible arcs except the loops, i.e., UX ¼
order but the tournament of Fig. 7. Figure 7
X X fðx, xÞ f or x Xg . (The loops would
also displays the graph GP summarizing the
be associated with reflexivity; this property, as
data. We may observe that the majority tourna-
well as irreflexivity, has no impact on the com-
ment is given by the arcs of GP with a positive
plexity status of the studied problems, hence this
weight.
simplification.) The weights of the arcs (x, y) give
For these data, it is not too difficult to verify
the intensity of the preference for x over y. The
that there is only one median linear order, which
computations above lead us to assign axy as the
is x > y > z > t, this order keeps all the positive
weight of any arc (x, y) of GP. With this choice,
weights except that of the arc (t, x), of which
minimizing D(P, R) is the same, from the graph
the weight is the minimum, while the arcs with
theoretic point of view, as drawing from G P a
positive weights do not define a linear order.
subset of arcs with a maximum total weight and
Hence, x is the only Condorcet-Kemeny
satisfying the structural properties required from
winner.
R. Notice that characterizations of the graphs that
Attention has also been paid to P1 ðT , L Þ, i.e.,
we can associate with profiles P have been pro-
the approximation of a tournament (which can be
vided by different authors (see Charon and Hudry
the majority tournament of the election) by a
2007; Debord 1987a, b; Erdös and Moser 1964;
linear order at minimum distance. This problem
Hudry 2008; Mc Garvey 1953; Stearns 1959); the
is also known as Slater problem, since P. Slater
construction of the profiles can be done in poly-
explicitly stated it in 1961 (Slater 1961); see also
nomial time, which allows us to study the com-
Charon and Hudry (2007) for a survey of this
plexities of the problems Pm ðY , Z Þ through their
problem. A linear order L* at minimum distance
graph theoretic representations GP.
(with respect to the symmetric difference dis-
Example 2 illustrates these considerations.
tance) from the tournament T constituting the con-
sidered instance of P1 ðT , L Þ is called a Slater
Example 2 Assume that m ¼ 9 voters must rank order of T : dðT, LÞ ¼ min L dðT, LÞ. This mini-
n ¼ 4 candidates x, y, z, and t. Assume, also, that mum distance is usually called the Slater index i
the preferences of the voters are given by the (T) of T : dðT, LÞ ¼ iðT Þ. A Slater winner of T is
following linear orders: the winner of any Slater order of T.
Voting Procedures, Complexity of, Fig. 7 The majority tournament (left) and the graph GP (right) associated with the data of Example 2 and weighted by the quantities axy = mxy − myx
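For readers who want to recompute the figures quoted for Example 2, the hedged sketch below (ours, not the entry’s) derives the Copeland scores, defined below in the subsection on Copeland’s procedure, from the pairwise comparison coefficients mxy of Example 2: one point per pairwise win and half a point per tie.

```python
# Hedged sketch: Copeland scores from the pairwise coefficients m_xy of Example 2.
m = {("x", "y"): 5, ("y", "x"): 4, ("x", "z"): 5, ("z", "x"): 4,
     ("x", "t"): 4, ("t", "x"): 5, ("y", "z"): 6, ("z", "y"): 3,
     ("y", "t"): 6, ("t", "y"): 3, ("z", "t"): 5, ("t", "z"): 4}
candidates = ["x", "y", "z", "t"]

def copeland_score(c):
    score = 0.0
    for other in candidates:
        if other == c:
            continue
        if m[(c, other)] > m[(other, c)]:
            score += 1.0          # pairwise win
        elif m[(c, other)] == m[(other, c)]:
            score += 0.5          # tie
    return score

print({c: copeland_score(c) for c in candidates})
# {'x': 2.0, 'y': 2.0, 'z': 1.0, 't': 1.0} -> x and y are the Copeland winners
```

The positive-weight arcs of the graph GP in Fig. 7 (those with axy = mxy − myx > 0) are exactly the arcs of the majority tournament on which these scores are based.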
cannot be winners. Then we must choose between Top Cycle: Smith’s Solution
adding xLy or xLz or tLx or zLt, since mxy, mxz, Another tournament solution is the so-called top
mtx, and mzt are equal. We cannot add all of them cycle (also called the Smith set) (Smith 1973).
simultaneously, because of incompatibilities with Any directed graph G can be decomposed into
yLz or yLt already fixed. If we choose to add xLy, its strongly connected components. We may then
xLz, and zLt, we obtain the linear order x > y > define another graph H derived from G. The ver-
z > t, and x is a winner. If we choose to add tLx tices of H are associated with the strongly
and zLt, we obtain the linear order y > z > t > x, connected components of G. Let x (respectively
and y is a winner. Here again, x and y are the y) be a vertex of G associated with the strongly
winners. connected components Cx (respectively Cy) of
G. There will exist an arc (x, y) from x to y if
there is at least one arc in G from Cx to Cy (notice
Tournament Solutions
that, in this case, all the arcs between Cx and Cy
The procedures described in this section apply to
are from Cx toward Cy). Then H does not contain
tournaments, which can function as the majority
any circuits. Moreover, if G is a tournament, H is a
tournament of an election. They deal with
linear order and admits a winner (but H contains
unweighted tournaments, but some of them can be
only one vertex if G is strongly connected). The
extended to weighted tournaments, as in the case of
top cycle of G is the strongly connected compo-
the solution designed by P. Slater (see above).
nent of G associated with the winner of H.
The top cycle solution, when applied to a
Number of Wins: Copeland’s Procedure tournament T, consists of considering all vertices
The procedure designed by A. H. Copeland in of the top cycle of T as the winners of T. For
1951 (Copeland 1951) is also based on the instance, for the tournament of Fig. 7, which is
mxy’s. For any two different candidates x and y, strongly connected, the top cycle contains the four
we set Cðx, yÞ ¼ 1 if mxy > myx , Cðx, yÞ ¼ 0:5 if vertices x, y, z, and t, which are the four winners
mxy ¼ myx , and Cðx, yÞ ¼ 0 if mxy < myx . The according to this solution.
Copeland score C(x) of x is CðxÞ ¼
X
y6¼x
Cðx, yÞ. The Copeland winner is any can-
Uncovered Set: Fishburn’s and Miller’s Solution
didate with a maximum Copeland score (here A refinement of the top cycle is provided by the
also, we may rank the candidates according to set of uncovered vertices. Given a tournament
decreasing Copeland scores). T and two vertices x and y of T, we say that x
The application of Copeland’s procedure to covers y if (x, y) is an arc of T and if, for any arc
Example 2 gives CðxÞ ¼ 2, CðyÞ ¼ 2, CðzÞ ¼ 1, (y, z), (x, z) is also an arc of T. (In other words,
and CðtÞ ¼ 1 . Here, x and y are the Copeland considering T as a majority tournament, x beats y
winners. and any vertex beaten by y is also beaten by x.)
The Copeland score has a meaning from a A vertex is said to be uncovered if no other vertex
graph theoretic point of view. If we assume that covers it. The uncovered set of T is noted UC(T).
there is no tie, the majority relation is a tourna- Adopting the elements of UC(T) as the winners of
ment (the majority tournament). Then the T has been independently suggested by
Copeland score of a candidate x is also the out- P. Fishburn in 1977 (Fishburn 1977) and by
degree of the vertex associated with x. Copeland’s N. Miller in 1980 (Miller 1980).
procedure is one of the common tournament solu- For the tournament of Fig. 7, it is easy to see
tions (see Laslier 1997). It can be extended to that z is covered by y, while x, y, and t are uncov-
weighted tournaments X by sorting the vertices x ered: here, x, y, and t are the three winners
according to the sum m of the pairwise
y6¼x xy according to this solution.
comparison coefficients mxy; this variant leads to The definition of a covered vertex can easily be
the same ranking as Borda’s procedure. extended to weighted tournaments, though the
possibility of weights equal to 0 makes the situa- the tournament equilibrium set. To define it, we
tion more difficult than expected (see Charon and need extra definitions. Let G be a directed graph.
Hudry (2006) for details). The top set TS(G) of G is defined as the union of
the strongly connected components of G with no
Minimal Covering Set: Dutta’s Solution in-coming arcs (if G is a tournament, then TS(G)
The uncovered set can also be refined, for exam- is equal to TC(G)).
ple, by the minimal covering set proposed by Let Sol be a tournament solution and T be a
B. Dutta (1988). Consider a tournament tournament. When Sol is applied to T, we define
T defined on X. We say that a subset Y of X is a the contestation graph G(Sol, T) associated with
covering set of T if we have the following prop- Sol and T as follows: The vertex set of G(Sol, T) is
erty: 8x X Y, x 2
= UCðY [ fxgÞ. For instance, X; there is an arc (x, y) from x to y in G(Sol, T) if
UC(T) is a covering set of T. Let X(T) denote the and only if (x, y) is an arc of T and x is a winner
set of the covering sets of T. Then there exists a according to Sol when Sol is applied to the sub-
minimal element of X(T) with respect to inclusion; tournament of T induced by the predecessors of y
this minimal element is called the minimal cover- in T. In other words, the arcs (x, y) of G(Sol, T)
ing set MC(T) of T. describe the following situation: if y is considered
For the tournament of Fig. 7, the minimal cov- as a possible winner, then x will contest the elec-
ering set contains x, y, and t, which are the three tion of y because x beats y in T and because x is a
winners according to this solution, here as for the winner among the candidates who beat y. We may
uncovered set. (But there exist tournaments for notice that G(Sol, T) is a subgraph of T. By con-
which the general inclusion MCðT Þ UCðT Þ is sidering the top set TS[G(Sol, T)] of G(Sol, T), we
strict.) get a new tournament solution. T. Schwartz pro-
ved in Schwartz (1990) that there exists a unique
Maximal Transitive Subtournaments: Banks’s tournament solution that he called the tournament
Solution equilibrium set TEQ, which is a fixed point with
Among other tournament solutions, one designed respect to this process: 8T, TEQðT Þ ¼
by J. Banks in 1985 (Banks 1985) is of interest. TS½GðTEQ, T Þ .
When the tournament being considered, T, is tran- For the majority tournament of Example 2 (see
sitive (i.e., T is a linear order), there exists a Fig. 7), the tournament equilibrium set contains x,
unique winner who is selected as the winner of y, and t.
T by the usual tournament solutions. If that is not
the case, we may consider the transitive sub- Dodgson’s Procedure
tournaments of T which are maximal with respect In the procedure proposed in 1876 by C. L. Dodg-
to inclusion and then select the winner of each of son (or Lewis Carroll) (Dodgson 1876), each
them as the winners of T. This defines the Banks’s voter ranks all the candidates into a linear order.
solution (Banks 1985): a Banks winner of T is the If a Condorcet winner exists, he or she is also the
winner of any maximal (with respect to inclusion) Dodgson winner. Otherwise, Dodgson’s proce-
transitive subtournament of T. dure consists of choosing as winners all the can-
If we consider the majority tournament of didates who are “closest” to being Condorcet
Example 2 (see Fig. 7), three Banks winners winners: for each candidate x X, let D(x) be the
exist: x (because of the maximal transitive sub- Dodgson score of x, defined as the minimum
tournament x > y > z ), y (because y > z > t ), number of swaps between consecutive candidates
and t (because t > x). in the preferences of the voters such that x
becomes a Condorcet winner; a Dodgson winner
Tournament Equilibrium Set: T. Schwartz’s is any candidate x* minimizing D: DðxÞ ¼
Solution MinxX DðxÞ.
The solution that we deal with in this subsection Specifically, consider a profile P ¼
was designed by T. Schwartz (1990) and is called ðL1 , L2 , . . . , Lm Þ of m linear orders. Let i be an
date, this voting procedure is the same as the plu- candidate has at least the absolute majority
rality rule. In this respect, the approval voting pro- ðm þ 1Þ=2 , he or she is the winner. Otherwise,
cedure can be seen as a generalization of the second choices are added to the first choices. Once
plurality rule. again, if a candidate has the absolute majority
More formally, this procedure assumes that the ðm þ 1Þ=2 , he or she becomes a winner. Other-
preferences of the voters are complete preorders wise, we consider third choices and so on, until at
with only two classes (one class for the approved least one candidate obtains the absolute majority:
candidates and one for the disapproved ones). then he or she becomes a winner. Since after the
These preorders must then be aggregated into a first round there are more votes than voters,
collective complete preorder also with two clas- Bucklin’s procedure is sometimes considered as
ses, one of them with only one element (the win- a variant of approval voting procedure, but a main
ner) and the other class with all the other elements difference is that Bucklin’s procedure may require
(the losers). Extension for electing several candi- several rounds. It can also be considered as a
dates simultaneously is immediate. variant of scoring procedures, since it is the
Variants (sometimes known as range voting, rat- same as iteratively applying a scoring procedure
ings summation, average voting, cardinal ratings, with successively (1, 0, 0, . . . , 0), (1, 1, 0, 0, . . . ,
0–99 voting, the score system, or the point system) 0), (1, 1, 1, 0, . . . , 0), . . . as the score vectors, until
can be based on the same idea. For instance, each a candidate obtains the majority.
voter has a maximum number of points, and he or Once again, consider Example 2. In the first round,
she can share them out among the candidates as he x gets 4 points; y, 2 points; z, 1 point; t, 2 points. As
or she pleases, with or without a constraint on the there is no candidate with at least 5 points, we con-
maximum number of points per candidate. We can sider the second choices. Then we obtain: x keeps
also add several rounds and thus define new pro- 4 points; y obtains 6 points; z, 3 points; t, 5 points. As
cedures. The majority-choice approval procedure y and t obtain at least 5 points, the process stops here
(MCA) designed by F. Simmons in 2002 can also and y and t are the Bucklin winners.
be seen as a variant of approval voting. In this A variant of Bucklin’s procedure would be to
system, each voter has three possibilities for rating consider, at the last iteration, the candidates with a
each candidate: “favored,” “accepted,” or maximum number of points as the winners (only y
“disapproved.” If a candidate is ranked “favored” for Example 2).
by an absolute majority of the voters, then any
candidate marked “favored” by a maximum number
of voters is a winner. Otherwise, the winner is any Complexity Results
candidate with the largest number of “favored” or
“accepted” marks. It is sometimes required that this After a reminder of the complexity classes that are
number be at least ðm þ 1Þ=2 ; otherwise, no one useful for our purposes, this section examines the
will be elected. Ties can be broken using the number complexity of the voting procedures described in
of “favored” marks. We may, of course, obtain new the previous section (see also Faliszewski et al.
variants by increasing the number of levels. 2009a). Among those tournament solutions which
can be applied to the majority tournament, we
Bucklin’s Procedure restrict ourselves to the solutions depicted above;
The procedure proposed by James W. Bucklin in the reader interested in the complexity of the other
the early twentieth century is also called the Grand common tournament solutions will find some
Junction system (Grand Junction is a city in Col- answers in Hudry (2009).
orado, where Bucklin’s procedure was applied
from 1909 to 1922). In this procedure, we first Main Complexity Classes
count, for each candidate x, the number of times Some complexity classes z are well known: P, NP,
that x is ranked first, as in the plurality rule. If a co-NP, and the sets of polynomial problems, of
Complexity Results

After a reminder of the complexity classes that are useful for our purposes, this section examines the complexity of the voting procedures described in the previous section (see also Faliszewski et al. 2009a). Among those tournament solutions which can be applied to the majority tournament, we restrict ourselves to the solutions depicted above; the reader interested in the complexity of the other common tournament solutions will find some answers in Hudry (2009).

Main Complexity Classes
Some complexity classes are well known: P, NP, co-NP, and the sets of polynomial problems, of NP-complete problems, and of NP-hard problems... Let us recall some other notations (for references on the theory of complexity, see, for instance, Ausiello et al. 2003; Garey and Johnson 1979; Hemaspaandra 2000; Johnson 1990); see also Aaronson and Kuperberg (2013) for a list of about 500 complexity classes. Like D. S. Johnson in Johnson (1990), we will distinguish between decision problems and other types of problems. Let n denote the size of the data. The class L^NP, or P^NP[log], P^NP[log n], Θ^p_2, or also P^NP_||, contains those decision problems which can be solved by a deterministic Turing machine with an oracle in the following manner: the oracle can solve an appropriate NP-complete problem in unit time, and the number of consultations of the oracle is upper bounded by log(n). The class P^NP or Δ^p_2 (or sometimes simply Δ_2), or also P^NP[n^O(1)] = ∪_{k≥0} P^NP[n^k], is defined similarly but with a polynomial of n instead of log(n). Notice the inclusions NP ∪ co-NP ⊆ L^NP ⊆ P^NP. For problems which are not decision problems (i.e., optimization problems in which we look for the optimal value of a given function f, or search problems in which we look for an optimal solution of f, or enumeration problems in which we look for all the optimal solutions of f; these problems are sometimes called function problems), these classes are extended with an "F" in front of their names (see Johnson 1990); for instance, we thus obtain the classes FL^NP = FΘ^p_2 ... and FP^NP = FΔ^p_2 .... The usual definition of "completeness" is extended to these classes in a natural way. Though the notation Δ^p_2 or Θ^p_2 is sometimes more common in complexity theory (especially when dealing with the polynomial hierarchy), we shall keep the pseudonyms P^NP and L^NP (as well as FP^NP and FL^NP), as perhaps being more informative.

Similarly, there exist refinements of P. In particular, the class L denotes the subset of P containing the (decision) problems which can be solved by an algorithm using only logarithmic space (in a deterministic Turing machine), the input itself not being counted as part of memory. The classes AC^0 and TC^0 are more technical. The first one, AC^0, consists of all the problems solvable by uniform constant-depth Boolean circuits with unbounded fan-in and a polynomial number of gates. The second one, TC^0, consists of all the problems solvable by polynomial-size, bounded-depth, and unbounded fan-in Boolean circuits augmented by so-called "threshold" gates, i.e., unbounded fan-in gates that output "1" if and only if more than half their inputs are nonzero (see Johnson (1990) and the references therein for details). A problem of TC^0 is said to be TC^0-complete if it is complete under AC^0 Turing reductions. Notice the inclusions AC^0 ⊆ TC^0 ⊆ L ⊆ P.

Complexity Results for the Usual Voting Procedures
We can now examine complexity results of the voting procedures described above. When dealing with linear orders, we assume that we can have direct access to the ordered list of the candidates, especially to their winners (see above). Remember that n denotes the number of candidates and m the number of voters.

Theorem 1 The following procedures are polynomial.

• The plurality rule (one-round procedure) is in O(n + m).
• The plurality rule with runoff (two-round procedure) is in O(n + m).
• The preferential voting procedure (STV) is in O(nm + n²).
• Borda's procedure is in O(nm).
• If the preferences of the voters are known through a profile of complete preorders with two classes, the approval voting procedure (and its variant suggested by F. Simmons) is in O(nm).
• Bucklin's procedure is in O(nm).

More generally, the previous polynomial results can usually be extended to procedures based on score vectors, often with a complexity of O(nm).

The previous results are rather easy to obtain. For instance, for the plurality rule, an efficient way to determine the winners consists of computing an n-vector V which provides, for each candidate x, the number of voters who prefer x. As the preferred candidate of each voter is assumed to be accessible directly, computing V requires O(n + m) operations (or fewer if we assume that voters sharing similar preferences are gathered; see subsection "Definitions, Notation and Partially Ordered Sets Used to Model Preferences"). More precisely, we first initialize V to 0 in O(n). Then, for each voter, we consider his or her preferred candidate x, and we increment V(x) by 1 in O(1) for each voter and thus in O(m) for all the voters. By scanning V in O(n), we determine the maximum value contained in V and, once again by scanning V in O(n), the winners: they are the candidates x whose values V(x) are maximum. The whole process requires O(n + m) operations.
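The counting argument just described translates directly into code. A minimal Python sketch, assuming that each ballot is summarized by the voter's preferred candidate, as in the discussion above:

```python
def plurality_winners(preferred, candidates):
    """preferred: list of length m giving each voter's top candidate;
    candidates: list of the n candidates. Runs in O(n + m)."""
    V = {c: 0 for c in candidates}        # initialize V to 0 in O(n)
    for x in preferred:                   # one O(1) increment per voter
        V[x] += 1
    best = max(V.values())                # first scan of V in O(n)
    return [c for c in candidates if V[c] == best]   # second scan in O(n)

print(plurality_winners(["a", "b", "a", "c", "a", "b"], ["a", "b", "c"]))
# ['a']
```

The same skeleton, with the single increment replaced by a weighted sum over the ranks of each ballot, gives the O(nm) bound for Borda's procedure and for other score-vector procedures.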
When it succeeds in finding an order (or at least a Condorcet winner), Condorcet's procedure is also polynomial. This is also the case for Simpson's procedure, since computing the pairwise comparison coefficients m_xy can obviously be done in polynomial time. The same happens for Tideman's ranked pairs procedure, since the detection of a circuit in a graph can be done in a time proportional to the number of arcs of this graph, i.e., here in O(n²) (see Cormen et al. 1990), and for the prudent order procedure or the minimax procedure.

Theorem 2 When there exists a Condorcet winner, Condorcet's procedure is polynomial, in O(n²m). The prudent orders procedure, the minimax procedure, Simpson's procedure (maximin procedure), and Tideman's procedure (ranked pairs procedure) are also polynomial.

When there is no Condorcet winner, the situation is more difficult to manage. In this case, we may pay attention to the problems called P_m(Y, Z) above. The problems P_m(Y, T) and P_m(Y, R), i.e., the aggregation of m preferences into a tournament or into a binary relation in which no special property is required, are polynomial for any m and any set Y, as specified by Theorem 3.

Theorem 3 Let m be any integer with m ≥ 1 and let Y be any subset of R. Consider a profile P ∈ Y^m. Then P_m(Y, R) and P_m(Y, T) are polynomial.

For the Condorcet-Kemeny problem, and more generally for the problems P_m(Y, Z) with Y belonging to {A, C, L, O, P, R, T} and Z to {A, C, L, O, P} and where m is a positive integer, most are NP-hard when m is large enough (see Alon 2006; Barthélemy et al. 1989; Bartholdi et al. 1989a; Charbit et al. 2007; Conitzer 2006; Dwork et al. 2001; Hemaspaandra et al. 2005; Hudry 1989, 2008, 2010, 2012, 2013b; Wakabayashi 1986, 1998); see also Hudry (2008) for the NP-hardness of similar problems P_m(Y, Z) when extended to other posets, including interval orders, interval relations, semiorders, weak orders, and quasi-orders, and Hudry (2013a) for results dealing with other kinds of remoteness.

Theorem 4 For m large enough, we have:

• For any set Y containing ℒ (this is the case in particular for Y belonging to {A, C, L, O, P, R, T} or to unions or intersections of such sets), P_m(Y, Z) is NP-hard, and the decision problem associated with P_m(Y, Z) is NP-complete for Z ∈ {A, C, L}.
• P_m(R, Z) is also NP-hard, and the decision problem associated with P_m(R, Z) is also NP-complete for Z ∈ {O, P}.
• The complexity of P_m(Y, Z) is unknown for Y ∈ {A, C, L, O, P, T} and for Z ∈ {O, P}, but P_m(Y, O) and P_m(Y, P) have the same complexity.

The minimum value of m for which P_m(Y, Z) is NP-hard in Theorem 4 is usually unknown, and moreover, the parity of m plays a role. Table 2 gives the ranges of lower bounds of m from which P_m(Y, Z) is known to be NP-hard; for lower values of m, when not trivial, the complexity of P_m(Y, Z) is usually unknown. (This is the case, for instance, for P_1(R, C), P_1(R, O), or P_1(R, P); notice that P_2(L, L) is polynomial.) A question mark (?) means that the complexity of the problem is still unknown.
Voting Procedures, Complexity of, Table 2  Lower bounds of m from which P_m(Y, Z) is known to be NP-hard

Median relation (Z)    | ℒ ⊆ Y, m odd | ℒ ⊆ Y, m even | Y = T, m odd | Y = T, m even | Y = R, m odd | Y = R, m even
Acyclic relation (A)   | Θ(n²)        | 4             | 1            | 2             | 1            | 2
Complete preorder (C)  | Θ(n²)        | 4             | 1            | 2             | 1            | 2
Linear order (L)       | Θ(n²)        | 4             | 1            | 2             | 1            | 2
Partial order (O)      | ?            | ?             | ?            | ?             | 3            | 2
Preorder (P)           | ?            | ?             | ?            | ?             | 3            | 2
So the Condorcet-Kemeny problem P_m(L, L) is NP-hard for m odd and large enough and for m even with m ≥ 4 (and, as noted above, is polynomial for m = 2). Similarly, P_m(C, C) (i.e., the problem considered by J. G. Kemeny in Kemeny (1959): the aggregation of complete preorders into a median complete preorder) is NP-hard for m odd and large enough and for m even with m ≥ 4 (see Hudry (2012) for the proof and for other results dealing with median complete preorders or with median weak orders). More specific results deal with P_m(L, L) (see Bartholdi et al. 1989a; Hemaspaandra et al. 2007; Hudry 2013b). To state them, let us define, for each element x of X and for any given profile P, the Condorcet-Kemeny score K(x) of x (with respect to P) as the minimum remoteness D(P, L_x) between P and the linear orders L_x with x as their winner. The Condorcet-Kemeny index of P, K(P), is the minimum remoteness between P and any linear order; thus, it is the minimum that we look for in P_m(ℒ, ℒ); it is also the minimum of the Condorcet-Kemeny scores over X.

Theorem 5 Let P be a profile of m linear orders with m large enough.

1. The following problems are NP-complete:
• Given P, a candidate x ∈ X, and an integer k, is K(x) lower than or equal to k?
• Given an integer k, is K(P) lower than or equal to k?
2. The following problems are NP-hard:
• Given P and two candidates x ∈ X and y ∈ X with x ≠ y, is K(x) lower than or equal to K(y) (in other words, is x "better" than y)? Moreover, if we assume that the profiles P are given by the ordered list of the m preferences of the voters (and not by a set of different linear orders with their multiplicities; see subsection "Definitions, Notation and Partially Ordered Sets Used to Model Preferences"), this problem is L^NP-complete.
• Given P and a candidate x ∈ X, is x a Condorcet-Kemeny winner? Moreover, if we assume that the profiles P are given by the ordered list of the m preferences of the voters, this problem is L^NP-complete.
• Given P and a candidate x ∈ X, is x the unique Condorcet-Kemeny winner? Moreover, if we assume that the profiles P are given by the ordered list of the m preferences of the voters, this problem is L^NP-complete.
• Given P, determine a Condorcet-Kemeny winner of P. Moreover, this problem belongs to FP^NP.
• Given P, determine all the Condorcet-Kemeny winners of P. Moreover, this problem belongs to FP^NP.
• Given P, determine a Kemeny order of P. Moreover, this problem belongs to FP^NP.
3. The following problem belongs to the class co-NP:
• Given P and a linear order L defined on X, is L a median linear order of P?
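For small instances, the quantities K(x) and K(P) defined above can be computed by brute force, which also makes the exponential blow-up behind Theorem 5 visible. A minimal Python sketch, assuming (as elsewhere in this entry) that the remoteness D is the sum over the voters of the symmetric difference distance between linear orders, with orders encoded as lists from best to worst; the three-voter profile is hypothetical:

```python
from itertools import permutations

def sym_diff_distance(order_a, order_b):
    """Symmetric difference distance between two linear orders given as
    lists from best to worst: 2 * (number of pairs ranked oppositely)."""
    pos_b = {c: i for i, c in enumerate(order_b)}
    d = 0
    for i, x in enumerate(order_a):
        for y in order_a[i + 1:]:          # x preferred to y in order_a
            if pos_b[x] > pos_b[y]:        # but y preferred to x in order_b
                d += 2
    return d

def remoteness(profile, order):
    """D(P, L): sum of the distances between L and the voters' orders."""
    return sum(sym_diff_distance(v, order) for v in profile)

def kemeny(profile):
    """Return K(P), one median (Kemeny) order, and the scores K(x)."""
    candidates = list(profile[0])
    K_x = {x: None for x in candidates}
    best_val, best_order = None, None
    for perm in permutations(candidates):          # all n! linear orders
        d = remoteness(profile, list(perm))
        winner = perm[0]
        if K_x[winner] is None or d < K_x[winner]:
            K_x[winner] = d                         # K(x): best order won by x
        if best_val is None or d < best_val:
            best_val, best_order = d, perm          # K(P) and a median order
    return best_val, best_order, K_x

profile = [list("abc"), list("abc"), list("bca")]   # hypothetical 3-voter profile
print(kemeny(profile))
```

Enumerating all n! linear orders is of course only feasible for very small n, which is consistent with the hardness results above.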
Attention has also been paid to P_1(T, ℒ), i.e., the approximation of a tournament (which can be the majority tournament of the election) by a linear order at minimum distance (Slater problem (Slater 1961)). The complexity of the Slater problem derives from a recent result dealing with the problem called the feedback arc set problem, which is known to be NP-complete from work by R. Karp (1972) for general graphs. This problem consists of removing a minimum number of arcs in a directed graph in order to obtain a graph without any circuits. Recent results (Alon 2006; Charbit et al. 2007; Conitzer 2006) show that this problem remains NP-complete even when restricted to tournaments. For a tournament T, removing a minimum number of arcs to obtain a graph without any circuits is the same as reversing a minimum number of arcs to make T transitive, i.e., a linear order (see Charon and Hudry (2007) and Hudry (2010)). From this, we may prove the following theorem (see Hudry (2010) for details):

Theorem 6 For any tournament T, we have the following results:

• The computation of the Slater index i(T) of T is NP-hard; this problem belongs to the class FP^NP; the associated decision problem is NP-complete.
• The computation of a Slater winner of T is NP-hard; this problem belongs to the class FP^NP.
• Checking that a given vertex is a Slater winner of T is NP-hard; this problem belongs to the class L^NP.
• The computation of a Slater order of T is NP-hard; this problem belongs to the class FP^NP.
• The computation of all the Slater winners of T is NP-hard; this problem belongs to the class FP^NP.
• The computation of all the Slater orders of T is NP-hard.
• Checking that a given order is a Slater order is a problem which belongs to the class co-NP.
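To illustrate the equivalence just described between the Slater problem and reversing arcs of a tournament, here is a small brute-force Python sketch (only practical for a handful of candidates); the tournament below is a hypothetical 3-cycle, not an example from this entry.

```python
from itertools import permutations

def slater(tournament):
    """tournament: dict mapping each ordered pair (x, y), x != y, to True
    if x beats y. Returns the Slater index i(T), i.e., the minimum number
    of arcs to reverse to obtain a linear order, and one Slater order."""
    vertices = sorted({x for x, _ in tournament})
    best_cost, best_order = None, None
    for perm in permutations(vertices):            # candidate linear orders
        # arcs of T that point "backwards" with respect to perm
        cost = sum(1 for i, x in enumerate(perm)
                     for y in perm[i + 1:] if tournament[(y, x)])
        if best_cost is None or cost < best_cost:
            best_cost, best_order = cost, perm
    return best_cost, best_order

# Hypothetical 3-cycle: a beats b, b beats c, c beats a.
arcs = {("a", "b"): True, ("b", "a"): False,
        ("b", "c"): True, ("c", "b"): False,
        ("c", "a"): True, ("a", "c"): False}
print(slater(arcs))   # one arc reversal suffices for a 3-cycle
```

A Slater winner of T is then simply the first vertex of a Slater order.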
Similarly, the computation of a median complete preorder of a profile of m tournaments (i.e., with the previous notation, P_m(T, C)) is NP-hard for any m with m greater than or equal to 1 (see Hudry 2012).

Some of the previous results may be generalized to other definitions of remoteness, for instance, if the sum of the symmetric difference distances to the individual preferences is replaced by their maximum, or by the sum of their squares, or by the sum of any of their positive powers, as specified below (see Hudry 2013a):

Theorem 7 Let F denote any remoteness defined for any profile P such that, when m is equal to 1 with P = (R_1), the minimization of F(P, R) over the relations R belonging to A yields the same optimal solutions as the minimization of d(R_1, R) over the same set. Then, for any fixed m with m ≥ 1, the aggregation of m binary relations or m tournaments obtained by minimizing F (the problems similar to P_m(R, Z) and P_m(T, Z) for Z ∈ {A, L} but with respect to F) is NP-hard, and the associated decision problems are NP-complete.

An important case exists for which P_m(L, L) becomes polynomial for any m: it is the one for which the voters' preferences are single-peaked linear orders. To define them, assume that we can order the candidates on a line, from left to right, and assume that this linear order O does not depend on the voters (from a practical point of view, O is not always easy to define, even for political elections). For any voter a, let x_a denote the candidate preferred by a. The preference of a is said to be O-single-peaked if, for any candidates y and z with x_a ≠ y ≠ z ≠ x_a and located on the same side of O from x_a, y is preferred to z by a if and only if y is closer to x_a than z with respect to O. Let U_O denote the set of O-single-peaked linear orders. D. Black (1958) showed that, for any order O, the aggregation of O-single-peaked linear orders is an O-single-peaked linear order (see also Conitzer 2007 for another study dealing with single-peaked preferences). Hence the polynomiality of the aggregation of O-single-peaked linear orders into an O-single-peaked linear order:

Theorem 8 For any linear order O defined on X and any positive integer m, for any set Z containing U_O as a subset (this is the case for the sets A, C, ℒ, O, P, R, T), P_m(U_O, Z) is polynomial. More precisely, P_m(U_O, Z) can be solved in O(n²m).
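The definition of O-single-peakedness above is easy to check directly. A minimal Python sketch (the axis and the two ballots below are hypothetical):

```python
def is_single_peaked(ballot, axis):
    """ballot: a linear order (list, best first); axis: the left-to-right
    order O of the candidates. Checks the definition above: on each side
    of the preferred candidate, closer candidates must be preferred."""
    pos = {c: i for i, c in enumerate(axis)}
    rank = {c: i for i, c in enumerate(ballot)}
    peak = ballot[0]
    for y in axis:
        for z in axis:
            if y == z or peak in (y, z):
                continue
            same_side = (pos[y] < pos[peak]) == (pos[z] < pos[peak])
            closer = abs(pos[y] - pos[peak]) < abs(pos[z] - pos[peak])
            if same_side and closer and rank[y] > rank[z]:
                return False      # y is closer to the peak but less preferred
    return True

axis = ["a", "b", "c", "d"]                            # hypothetical axis O
print(is_single_peaked(["b", "c", "a", "d"], axis))    # True
print(is_single_peaked(["a", "d", "b", "c"], axis))    # False
```

A profile all of whose ballots pass this test for a common axis O falls under Theorem 8, and a median order can then be computed in O(n²m).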
considered. For instance, Copeland’s procedure is contradiction between the two results if P and
polynomial, as specified by the next theorem. NP are different). The polynomiality of the mini-
mal covering set solution (through n resolutions of
Theorem 9 Copeland’s procedure is polynomial, a linear programming problem, which can be done
in O(n2m). in polynomial time using L. Khachiyan’s algo-
rithm (Khachiyan 1979)) and the NP-hardness of
If we assume that the tournament related to the tournament equilibrium set solution are new
Copeland’s procedure is already computed, and results, respectively, due to F. Brandt and
if we do not take the memory space necessary to F. Fischer (2008) and to Brandt et al. (2008).
code this tournament into account, it is easy to see
that the memory space necessary to compute the Theorem 12 Let T be a tournament.
maximum of the Copeland scores and then to
decide whether a given vertex is a Copeland win- • Computing the uncovered elements of T can be
ner can be bounded by a constant. So deciding done within the same complexity as the multi-
whether a given vertex is a Copeland winner is a plication of two (n n) matrices and so can be
problem belonging to class L. In fact, a stronger done in O(n2.38) operations.
result is shown by F. Brandt, F. Fischer, and • Computing the elements of the minimal cover-
P. Harrenstein in Brandt et al. (2006): ing set of T can be done in polynomial time
with respect to n.
Theorem 10 Checking that a given vertex is a • The following problem is NP-complete: given
Copeland winner is a TC0-complete problem. a tournament T and a vertex x of T, is x a Banks
winner of T?
Similar results can be stated for Smith’s tour- • Computing a Banks winner is polynomial and,
nament solution (top cycle): more precisely, can be done in O(n2)
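A minimal Python sketch of Copeland's procedure, computing the majority tournament from the profile and then the Copeland scores; the O(n²m) cost comes from the n(n - 1)/2 pairwise comparisons, each over m ballots. The profile is hypothetical.

```python
def copeland_winners(profile):
    """profile: list of ballots, each a linear order (list, best first)."""
    candidates = list(profile[0])
    rank = [{c: i for i, c in enumerate(b)} for b in profile]   # O(nm)
    score = {c: 0 for c in candidates}
    for i, x in enumerate(candidates):
        for y in candidates[i + 1:]:
            wins_x = sum(1 for r in rank if r[x] < r[y])        # O(m) per pair
            if wins_x * 2 > len(profile):
                score[x] += 1        # x beats y in the majority tournament
            elif wins_x * 2 < len(profile):
                score[y] += 1        # majority ties (even m) give no point here
    best = max(score.values())
    return [c for c in candidates if score[c] == best]

profile = [list("abc"), list("bca"), list("bac")]   # hypothetical profile
print(copeland_winners(profile))                    # ['b']
```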
If we assume that the tournament related to Copeland's procedure is already computed, and if we do not take the memory space necessary to code this tournament into account, it is easy to see that the memory space necessary to compute the maximum of the Copeland scores and then to decide whether a given vertex is a Copeland winner can be bounded by a constant. So deciding whether a given vertex is a Copeland winner is a problem belonging to class L. In fact, a stronger result is shown by F. Brandt, F. Fischer, and P. Harrenstein in Brandt et al. (2006):

Theorem 10 Checking that a given vertex is a Copeland winner is a TC^0-complete problem.

Similar results can be stated for Smith's tournament solution (top cycle):

Theorem 11 Let T be a tournament.

• The computation of the Smith winners of T (the elements of the top cycle of T) can be done in O(n²) (or even in linear time with respect to the cardinality of the top cycle if the sorted scores are known).
• Checking that a given vertex is a Smith winner is a TC^0-complete problem (Brandt et al. 2006).

For the other tournament solutions depicted above, we have the results stated in Theorem 12 (see Hudry 2009 for details and for results on other tournament solutions). This theorem shows that checking whether a given vertex of a given tournament is a Banks winner is NP-complete (Woeginger 2003) (and Brandt et al. 2008 for an alternative proof), while computing a Banks winner is polynomial (Hudry 2004). Of course, when such a Banks winner is computed, we do not choose the winner that we compute among the set of Banks winners (and so there is no contradiction between the two results if P and NP are different). The polynomiality of the minimal covering set solution (through n resolutions of a linear programming problem, which can be done in polynomial time using L. Khachiyan's algorithm (Khachiyan 1979)) and the NP-hardness of the tournament equilibrium set solution are new results, respectively due to F. Brandt and F. Fischer (2008) and to Brandt et al. (2008).

Theorem 12 Let T be a tournament.

• Computing the uncovered elements of T can be done within the same complexity as the multiplication of two (n × n) matrices and so can be done in O(n^2.38) operations.
• Computing the elements of the minimal covering set of T can be done in polynomial time with respect to n.
• The following problem is NP-complete: given a tournament T and a vertex x of T, is x a Banks winner of T?
• Computing a Banks winner is polynomial and, more precisely, can be done in O(n²) operations.
• Computing all the Banks winners of T is an NP-hard problem. More precisely, it is a problem belonging to the class FP^NP.
• The following decision problem is NP-hard (but is not known to be inside NP): given a vertex x of T, does x belong to TEQ(T)?

J. J. Bartholdi III, C. A. Tovey, and M. A. Trick stated in Bartholdi et al. (1989a) that Dodgson's procedure is NP-hard. More precisely, they proved the NP-completeness of the problem of Theorem 13 and the NP-hardness of the first two problems of Theorem 14. Hemaspaandra et al. (1997) sharpened their results (Theorem 14), assuming that P is given by the ordered list of the m preferences of the voters (and not by a set of different linear orders with their multiplicities).

Theorem 13 The following problem is NP-complete: given a profile P of linear orders defined on X, a candidate x ∈ X, and an integer k, is the Dodgson score D(x) of x less than or equal to k?
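On very small instances, the Dodgson score in Theorem 13 can be computed by exhaustive search. A Python sketch, assuming the usual definition of D(x) as the minimum number of exchanges of consecutive candidates in the voters' rankings needed to make x a Condorcet winner (the three-voter profile below is hypothetical):

```python
from collections import deque

def is_condorcet_winner(x, profile):
    m = len(profile)
    others = [c for c in profile[0] if c != x]
    return all(sum(b.index(x) < b.index(y) for b in profile) * 2 > m
               for y in others)

def dodgson_score(x, profile):
    """Breadth-first search over the profiles reachable by swapping two
    consecutive candidates in one ballot; the depth is the number of swaps."""
    start = tuple(tuple(b) for b in profile)
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        state, k = queue.popleft()
        if is_condorcet_winner(x, [list(b) for b in state]):
            return k
        for i, ballot in enumerate(state):
            for j in range(len(ballot) - 1):
                b = list(ballot)
                b[j], b[j + 1] = b[j + 1], b[j]       # one elementary switch
                nxt = state[:i] + (tuple(b),) + state[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, k + 1))

profile = [list("abc"), list("bca"), list("cab")]   # hypothetical 3-cycle profile
print(dodgson_score("a", profile))                  # 1
```

The state space explored by this search grows very quickly with n and m, in line with the NP-completeness stated in Theorem 13.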
Theorem 14 The following problems are NP-hard and, more precisely, L^NP-complete if the considered profiles P are given by the ordered lists of the m preferences of the voters.

Notice that all these problems become obviously polynomial if we assume that the number of candidates is upper bounded by a constant.

V. Raman and S. Saurabh (2006) give several algorithms to solve the parameterized version of the feedback arc set problem applied to tournaments. The best complexity of their algorithms for this problem is O(2.415^k · n^ω), where ω denotes the exponent of the running time of the best matrix multiplication algorithm (for instance, in the method designed by D. Coppersmith and S. Winograd (1987), ω is about 2.376). Parameterized complexity appears also in Christian et al. (2006), McCabe-Dansted (2006), and McCabe-Dansted et al. (2006) for Condorcet-Kemeny's, Dodgson's, and Young's procedures.

Another direction deals with algorithms with approximation guarantees (see Ausiello et al. (2003) and Vazirani (2003) for a presentation of this field and Charon and Hudry (2007) and Fischer et al. (2013) for results on this topic in the context of voting theory). For instance, while the feedback arc set problem is APX-hard in general (Even et al. 1998) (remember that, from a practical point of view, this implies that we do not know a polynomial-time approximation scheme (PTAS) to solve this problem), Ailon et al. (2005) designed randomized 3-approximation and 2.5-approximation algorithms for the feedback arc set problem when restricted to unweighted tournaments. The 3-approximation algorithm has been derandomized by A. van Zuylen (2005), still for unweighted tournaments and with an approximation ratio equal to 3. These methods may be adapted to weighted tournaments with an approximation ratio equal to 5 (see Ailon et al. 2005; Coppersmith et al. 2006). N. Alon (2006) (see also Ailon and Alon 2007) shows that, for any fixed ε > 0, it is NP-hard to approximate the minimum size of a feedback arc set for a tournament on n vertices up to an additive error of n^(2-ε) (but approximating it up to an additive error of εn² can be done in polynomial time). Ranking the vertices of a tournament according to their out-degrees (Copeland's procedure) for unweighted tournaments, or according to the sum of the weights of the arcs leaving them minus the sum of the weights of the arcs entering them (Borda's procedure) for weighted tournaments, also provides a method with approximation ratio equal to 5, irrespective of how ties are broken (see Coppersmith et al. 2006; Tideman 1987).
Probabilistic algorithms (see Alon and Spencer (2000) for a global presentation of these methods) have also been applied to the feedback arc set problem (or rather to the search for a maximum subdigraph without circuits, which is the same for tournaments), for unweighted (Czygrinow et al. 1999; de la Vega 1983; Poljak and Turzík 1986; Poljak et al. 1988; Spencer 1971, 1978, 1987) or weighted (Czygrinow et al. 1999) tournaments.

While NP-hardness is usually considered a drawback (because the CPU time necessary to solve the instances quickly becomes prohibitive, because the algorithms may be difficult to explain to the voters, and so on), it can also be an asset with respect to manipulation, bribery, or other attempts to control the election (by adding or deleting candidates or voters). This is an emerging but already flourishing topic, which is becoming a subject of its own (see Bartholdi and Orlin 1991; Bartholdi et al. 1989b, 1992; Chamberlin 1985; Conitzer and Sandholm 2002a, b, 2003, 2006; Conitzer et al. 2003; Elkin and Lipmaa 2006; Faliszewski et al. 2006, 2009a, b; Gibbard 1973; Hemaspaandra and Hemaspaandra 2007; Hemaspaandra et al. 2006, 2007; LeGrand et al. 2006; Maus et al. 2006; Mitlöhner et al. 2006; Moulin 1980, 1985; Procaccia and Rosenschein 2006; Procaccia et al. 2006; Saari 1990; Satterthwaite 1975; Smith 1999; Taylor 2005). A voting procedure is said to be manipulable when a voter, who knows how the other voters vote, has the opportunity to benefit by strategic or tactical voting, i.e., when the voter supports a candidate other than his or her sincerely preferred candidate in order to prevent an undesirable outcome (the main difference between manipulation and bribery is that for manipulability, the manipulators are known as a part of the instance, while for bribery, the number of corrupted voters is bounded but these manipulators are not given). It is known from the theorem by A. Gibbard (1973) and M. Satterthwaite (1975) that, for at least three candidates, any voting procedure without a dictator is manipulable. Because of this result, manipulation cannot be precluded in any reasonable voting procedure on at least three candidates. As suggested in Bartholdi
Borda J-C (1784) Mémoire sur les élections au scrutin. Conitzer V (2006) Computing slater rankings using simi-
Histoire de l’Académie Royale des Sciences pour larities among candidates. In: Proceedings of the 21st
1781, Paris, pp 657–665. English translation: de Grazia national conference on artificial intelligence, AAAI-06,
A (1953) Mathematical derivation of an election sys- Boston, pp 613–619
tem. Isis 44:42–51 Conitzer V (2007) Eliciting single-peaked preferences
Brams SJ, Fishburn PC (1978) Approval voting. Am Polit using comparison queries. In: Proceedings of the 6th
Sci Rev 72(3):831–857 international joint conference on autonomous agents
Brams SJ, Fishburn PC (1983) Approval voting. and multi agent systems (AAMAS-07), Honolulu,
Birkhauser, Boston pp 408–415
Brams SJ, Fishburn PC (2002) Voting procedures. In: Conitzer V, Sandholm T (2002a) Vote elicitation: complex-
Arrow K, Sen A, Suzumura K (eds) Handbook of social ity and strategy-proofness. In: Proceedings of the
choice and welfare, vol 1. Elsevier, Amsterdam, national conference on artificial intelligence (AAAI),
pp 175–236 pp 392–397
Brandt F, Fischer F (2008) Computing the minimal cover- Conitzer V, Sandholm T (2002b) Complexity of manipu-
ing set. Math Soc Sci 58(2):254–268 lating elections with few candidates. In: Proceedings of
Brandt F, Fischer F, Harrenstein P (2006) The computa- the 18th national conference on artificial intelligence
tional complexity of choice sets. In: Endriss U, Lang (AAAI), pp 314–319
J (eds) Proceedings of the conference computational Conitzer V, Sandholm T (2003) Universal voting protocol
social choice 2006. University of Amsterdam, Amster- tweaks to make manipulation hard. In: Proceedings of
dam, pp 63–76 the 18th international joint conference on artificial
Brandt F, Fischer F, Harrenstein P, Mair M (2008) intelligence (IJCAI-03), Acapulco, pp 781–788
A computational analysis of the tournament equilib- Conitzer V, Sandholm T (2006) Nonexistence of voting
rium set. In: Fox D, Gomes CP (eds) Proceedings of rules that are usually hard to manipulate. In: Proceed-
AAAI, pp 38–43 ings of the 21st national conference on artificial intelli-
Caritat MJAN, marquis de Condorcet (1785) Essai sur gence (AAAI-06), Boston, pp 627–634
l’application de l’analyse à la probabilité des décisions Conitzer V, Lang J, Sandholm T (2003) How many candi-
rendues à la pluralité des voix. Imprimerie Royale, dates are needed to make elections hard to manipulate?
Paris Theoretical aspects of rationality and knowledge
Caspard N, Monjardet B, Leclerc B (2007) Ensembles (TARK), pp 201–214
ordonnés finis: concepts, résultats et usages. Springer, Copeland AH (1951) A “reasonable” social welfare func-
Berlin tion. Seminar on applications of mathematics to the
Chamberlin JR (1985) An investigation into the effective social sciences. University of Michigan
manipulability of four voting systems. Behav Sci Coppersmith T, Winograd S (1987) Matrix multiplication
30:195–203 via arithmetic progression. In: Proceedings of 19th
Charbit P, Thomassé S, Yeo A (2007) The minimum feed- annual ACM symposium on theory of computing,
back arc set problem is NP-hard for tournaments. Comb pp 1–6
Probab Comput 16(1):1–4 Coppersmith D, Fleischer L, Rudra A (2006) Ordering by
Charon I, Hudry O (2006) A branch and bound algorithm weighted number of wins gives a good ranking for
to solve the linear ordering problem for weighted tour- weighted tournaments. In: Proceedings of the 17th
naments. Discret Appl Math 154:2097–2116 annual ACM-SIAM symposium on discrete algorithms
Charon I, Hudry O (2007) A survey on the linear ordering (SODA’06), pp 776–782
problem for weighted or unweighted tournaments. 4OR Cormen T, Leiserson C, Rivest R (1990) Introduction to
5(1):5–60 algorithms, 2nd edn. MIT Press, Cambridge, 2001
Charon I, Guénoche A, Hudry O, Woirgard F (1997) New Cox GW (1987) The cabinet and the development of
results on the computation of median orders. Discret political parties in Victorian England. Cambridge Uni-
Math 165–166:139–154 versity Press, New York
Chevaleyre Y, Endriss U, Lang J, Maudet N (2007) A short Czygrinow A, Poljak S, Rödl V (1999) Constructive quasi-
introduction to computational social choice. In: Pro- Ramsey numbers and tournament ranking. SIAM
ceedings of the 33rd conference on current trends in J Discret Math 12(1):48–63
theory and practice of computer science (SOFSEM- Daunou PCF (1803) Mémoire sur les élections au scrutin.
2007). Lecture notes in computer science, vol 4362. Baudoin, Paris, an XI
Springer, Berlin, pp 51–69 de la Vega WF (1983) On the maximal cardinality of a
Christian R, Fellows M, Rosamond F, Slinko A (2006) On consistent set of arcs in a random tournament. J Comb
complexity of lobbying in multiple referenda. In: Pro- Theor B 35:328–332
ceedings of the first international workshop on compu- Debord B (1987a) Caractérisation des matrices de préfér-
tational social choice (COMSOC 2006). University of ences nettes et méthodes d’agrégation associées. Math
Amsterdam, pp 87–96 Sci Hum 97:5–17
Colomer JM, McLean I (1998) Electing popes: approval Debord B (1987b) Axiomatisation de procédures
balloting and qualified-majority rule. J Interdiscip Hist d’agrégation de préférences. Ph D thesis, Université
29(1):1–22 scientifique technologique et médicale de Grenoble
Dodgson CL (1873) A discussion of the various methods Fischer F, Hudry O, Niedermeier R (2013) Weighted tour-
of procedure in conducting elections. Imprint by Gard- nament solutions. In: Brandt F, Conitzer V, Endriss U,
ner EB, Hall EP, Stacy JH. Printers to the University, Lang J, Procaccia A (eds) Handbook of computational
Oxford. Reprinted In: Black D (1958) The theory of social choice. Cambridge University Press, Cambridge,
committees and elections. Cambridge University Press, to appear
Cambridge, pp 214–222 Fishburn PC (1973a) Interval representations for interval
Dodgson CL (1874) Suggestions as to the best method of orders and semiorders. J Math Psychol 10:91–105
taking votes, where more than two issues are to be Fishburn PC (1973b) The theory of social choice.
voted on. Imprint by Hall EP, Stacy JH. Printers to the Princeton University Press, Princeton
University, Oxford. Reprinted In: Black D (1958) The Fishburn PC (1977) Condorcet social choice functions.
theory of committees and elections. Cambridge Uni- SIAM J Appl Math 33:469–489
versity Press, Cambridge, pp 222–224 Fishburn PC (1985) Interval orders and interval graphs, a
Dodgson CL (1876) A method of taking votes on more study of partially ordered sets. Wiley, New York
than two issues. Clarendon Press, Oxford. Reprint In: Garey MR, Johnson DS (1979) Computers and intractabil-
Black D (1958) The theory of committees and elec- ity, a guide to the theory of NP-completeness. Freeman,
tions, Cambridge University Press, Cambridge, New York
pp 224–234; and In: McLean I, Urken A (1995) Clas- Gibbard A (1973) Manipulation of voting schemes.
sics of social choice. University of Michigan Press, Econometrica 41:587–602
Ann Arbor Guilbaud GT (1952) Les théories de l’intérêt général et le
Dom M, Guo J, Hüffner F, Niedermeier R, Truß A (2006) problème logique de l’agrégation. Économie Appl
Fixed-parameter tractability results for feedback set 5(4):501–584; Éléments de la théorie des jeux, 1968.
problems in tournaments, vol 3998, Lecture notes in Dunod, Paris
computer science. Springer, Berlin, pp 320–331 Hägele G, Pukelsheim F (2001) Llull’s writings on elec-
Downey RG, Fellows MR (1999) Parameterized complex- toral systems. Stud Lulliana 3:3–38
ity. Springer, Berlin Hemaspaandra L (2000) Complexity classes. In: Rosen
Dummett M (1984) Voting procedures. Clarendon, Oxford KH (ed) Handbook of discrete and combinatorial math-
Dutta B (1988) Covering sets and a new Condorcet choice ematics. CRC Press, Boca Raton, pp 1085–1090
correspondence. J Econ Theor 44:63–80 Hemaspaandra E, Hemaspaandra L (2007) Dichotomy for
Dwork C, Kumar R, Naor M, Sivakumar D (2001) Rank voting systems. J Comput Syst Sci 73(1):73–83
aggregation methods for the Web. In: Proceedings of Hemaspaandra E, Hemaspaandra L, Rothe J (1997) Exact
the 10th international conference on World Wide Web analysis of Dodgson elections: Lewis Carroll’s 1876
(WWW10), Hong Kong, pp 613–622 voting system is complete for parallel access to NP. J
Elkin E, Lipmaa H (2006) Hybrid voting protocols and ACM 44(6):806–825
hardness of manipulation. In: Endriss U, Lang J (eds) Hemaspaandra E, Spakowski H, Vogel J (2005) The com-
Proceedings of the first international workshop on com- plexity of Kemeny elections. Theor Comput Sci
putational social choice (COMSOC 2006). University 349:382–391
of Amsterdam, pp 178–191 Hemaspaandra E, Hemaspaandra L, Rothe J (2006) Hybrid
Elster J, Hylland A (eds) (1986) Foundations of social elections broaden complexity-theoretic resistance to con-
choice theory. Cambridge University Press, New York trol. In: Proceedings of the first international workshop on
Erdös P, Moser L (1964) On the representation of directed computational social choice (COMSOC 2006), Univer-
graphs as unions of orderings. Magyar Tud Akad Mat sity of Amsterdam, pp 234–247; (2007) Proceedings of
Kutato Int Közl 9:125–132 the 20th international joint conference on artificial intelli-
Even G, Naor JS, Sudan M, Schieber B (1998) Approxi- gence (IJCAI 2007). AAAI Press, pp 1308–1314
mating minimum feedback sets and multicuts in Hemaspaandra E, Hemaspaandra L, Rothe J (2007) Any-
directed graphs. Algorithmica 20(2):151–174 one but him: the complexity of precluding an alterna-
Fagin R, Kumar R, Mahdian M, Sivakumar D, Vee tive. Artif Intell 171(5–6):255–285
E (2005) Rank aggregation: an algorithmic perspective. Homan C, Hemaspaandra L (2006) Guarantees for the
Unpublished manuscript success frequency of an algorithm for finding
Faliszewski P, Hemaspaandra E, Hemaspaandra L (2006) Dodgson-election winners. In: Proceedings of the 31st
The complexity of bribery in elections. In: Endriss U, international symposium on mathematical foundations
Lang J (eds) Proceedings of the first international work- of computer science. Lecture notes in computer sci-
shop on computational social choice (COMSOC 2006). ence, vol 4162. Springer, Berlin, pp 528–539
University of Amsterdam, pp 178–191 Hudry O (1989) Recherche d’ordres médians: complexité,
Faliszewski P, Hemaspaandra E, Hemaspaandra L, Rothe algorithmique et problèmes combinatoires. Ph D thesis,
J (2009a) A richer understanding of the complexity of ENST, Paris
election systems. In: Ravi S, Shukla S (eds) Fundamen- Hudry O (2004) A note on Banks winners. In: Woeginger
tal problems in computing: essays in honor of Professor GJ (ed) Tournaments are difficult to recognize. Soc
Daniel J. Rosenkrantz. Springer, Berlin, pp 375–406 Choice Welf 23:1–2
Faliszewski P, Hemaspaandra E, Hemaspaandra L, Rothe Hudry O (2008) NP-hardness results on the aggregation of
J (2009b) Llull and Copeland voting broadly resist linear orders into median orders. Ann Oper Res
bribery and control. J AI Res 35:275–341 163(1):63–88
Hudry O (2009) A survey on the complexity of tournament Lines M (1986) Approval voting and strategy analysis: a
solutions. Math Soc Sci 57:292–303 venetian. Ex Theor Decis 20:155–172
Hudry O (2010) On the complexity of Slater’s problems. Mascart J (1919) La vie et les travaux du chevalier Jean-
Eur J Oper Res 203:216–221 Charles de Borda (1733–1799): épisodes de la vie
Hudry O (2012) On the computation of median linear scientifique au XVIIIe siècle. Annales de l’université
orders, of median complete preorders and of median de Lyon vol. II (33). New edition, Presses de l’université
weak orders. Math Soc Sci 64:2–10 de Paris-Sorbonne, 2000
Hudry O (2013a) Complexity results for extensions of Maus S, Peters H, Storcken T (2006) Anonymous voting
median orders to different types of remoteness. Ann and minimal manipulability. In: Proceedings of the first
Oper Res. doi10.1007/s10479-013-1342-3 to appear international workshop on computational social choice
Hudry O (2013b) Complexity of computing median linear (COMSOC 2006), University of Amsterdam,
orders and variants. Electron Notes Discrete Math 42:57 pp 317–330
Hudry O, Monjardet B (2010) Consensus theories. An Mc Garvey D (1953) A theorem on the construction of
oriented survey. Math Soc Sci 190:139–167 voting paradoxes. Econometrica 21:608–610
Hudry O, Leclerc B, Monjardet B, Barthélemy J-P (2009) Met- McCabe-Dansted J (2006) Feasibility and approximability
ric and latticial medians. In: Bouyssou D, Dubois D, of Dodgson’s rule. Master’s thesis, University of
Pirlot M, Prade H (eds) Concepts and methods of Auckland
decision-making process. Wiley, New York, pp 771–812 McCabe-Dansted J, Pritchard G, Slinko A (2006)
Inada K (1969) The simple majority decision rule. Approximability of Dodgson’s rule. In: Proceedings
Econometrica 37:490–506 of the first international workshop on computational
Johnson DS (1990) A catalog of complexity classes. In: social choice (COMSOC 2006), University of Amster-
van Leeuwen J (ed) Handbook of theoretical computer dam, pp 234–247
science, vol A, Algorithms and complexity. Elsevier, McKey B (2013) http://cs.anu.edu.au/pp~bdm/data/
Amsterdam, pp 67–161 digraphs.html
Johnson PE (1998) Social choice theory and research, CA, McLean I (1995) The first golden age of social choice,
vol 123, Quantitative applications in the social sci- 1784–1803. In: Barnett WA, Moulin H, Salles M,
ences. Sage, Thousand Oaks Schofield NJ (eds) Social choice, welfare, and ethics:
Jünger M (1985) Polyhedral combinatorics and the acyclic proceedings of the eighth international symposium in
subdigraph problem. Heldermann, Berlin economic theory and econometrics. Cambridge Uni-
Karp RM (1972) Reducibility among combinatorial problems. versity Press, Cambridge, pp 13–33
In: Miller RE, Tatcher JW (eds) Complexity of computer McLean I, Hewitt F (1994) Condorcet: foundations of
computations. Plenum Press, New York, pp 85–103 social choice and political theory. Edward Elgar, Hants
Kelly JS (1987) Social choice theory: an introduction. McLean I, Urken A (1995) Classics of social choice. Uni-
Springer, Berlin versity of Michigan Press, Ann Arbor
Kemeny JG (1959) Mathematics without numbers. Daeda- McLean I, Urken A (1997) La réception des œuvres de
lus 88:571–591 Condorcet sur le choix social (1794–1803): Lhuilier,
Khachiyan L (1979) A polynomial algorithm in linear Morales et Daunou, in Condorcet, Homme des
programming. Sov Math Dokl 20:191–194 Lumières et de la Révolution, Chouillet A-M, Pierre
Köhler G (1978) Choix multicritère et analyse algébrique Crépel (eds) ENS éditions, Fontenay-aux-roses,
de données ordinales. Ph D thesis, université pp 147–160
scientifique et médicale de Grenoble McLean I, McMillan A, Monroe BL (1995) Duncan Black
Lang J (2004) Logical preference representation and com- and Lewis Carroll. J Theor Polit 7:107–124
binatorial vote. Ann Math Artif Intell 42:37–71 McLean I, Lorrey H, Colomer JM (2007) Social choice in
Laslier J-F (1997) Tournament solutions and majority vot- medieval Europe. Workshop Histoire des
ing. Springer, Berlin Mathématiques Sociales, Paris
Laslier J-F (2004) Le vote et la règle majoritaire. Analyse Merrill S III, Grofman B (1999) A unified theory of voting.
mathématique de la politique éditions du CNRS Cambridge University Press, Cambridge
LeGrand R, Markakis E, Mehta A (2006) Approval Miller N (1980) A new solution set for tournaments
voting: local search heuristics and approximation and majority voting: further graph-theoretical
algorithms for the minimax solution. In: Proceedings approaches to the theory of voting. Am J Polit Sci
of the first international workshop on computational 24(1):68–96
social choice (COMSOC 2006), University of Mitlöhner J, Eckert D, Klamler C (2006) Simulating the
Amsterdam, pp 234–247 effects of misperception on the manipulability of voting
Levenglick A (1975) Fair and reasonable election systems. rules. In: Proceedings of the first international work-
Behav Sci 20:34–46 shop on computational social choice (COMSOC 2006),
Levin J, Nalebuff B (1995) An introduction to vote- University of Amsterdam, p 234–247
counting schemes. J Econ Perspect 9(1):3–26 Monjardet B (1976) Lhuilier contre Condorcet au pays des
Lhuilier S (1794) Examen du mode d’élection proposé à la paradoxes. Math Sci Hum 54:33–43
Convention nationale de France en février 1793 et Monjardet B (1979) Relations à éloignement minimum de
adopté à Genève, Genève. Reprint In: (1976) Math relations binaires, note bibliographique. Math Sci Hum
Sci Hum 54:7–24 67:115–122
Monjardet B (1990) Sur diverses formes de la “règle de Reid KB (2004) Tournaments. In: Gross JL, Yellen J (eds)
Condorcet” d’agrégation des préférences. Math Inf Sci Handbook of graph theory. CRC Press, Boca Raton,
Hum 111:61–71 pp 156–184
Monjardet B (2008a) Acyclic domains of linear orders: a Reid KB, Beineke LW (1978) Tournaments. In: Beineke
survey. In: Brams S, Gehrlein WV, Roberts FS (eds) LW, Wilson RJ (eds) Selected topics in graph theory.
The mathematics of preference, choice and order, Academic, London, pp 169–204
essays in honor of Peter C. Fishburn. Springer, Berlin, Reinelt G (1985) The linear ordering problem: algorithms
pp 139–160 and applications, vol 8, Research and exposition in
Monjardet B (2008b) Mathématique Sociale and Mathe- mathematics. Heldermann, Berlin
matics. A case study: Condorcet’s effect and medians. Rothe J, Spakowski H (2006) On determining Dodgson
Electron J Hist Probab Stat 4(1):1–26 winners by frequently self-knowingly correct algo-
Moon JW (1968) Topics on tournaments. Holt, Rinehart rithms and in average-case polynomial time. In: Pro-
and Winston, New York ceedings of the first international workshop on
Morales JI (1797) Memoria matemática sobre el cálculo computational social choice (COMSOC 2006), Univer-
de la opinión en las elecciones. Imprenta Real, sity of Amsterdam, pp 234–247
Madrid. Translated in McLean I, Urken A (1995) Rothe J, Spakowski H, Vogel J (2003) Exact complexity of
Classics of social choice. University of Michigan the winner problem for Young elections. Theor Comput
Press, Ann arbor Syst 36(4):375–386
Moulin H (1980) On strategy-proofness and single peaked- Rowley CK (ed) (1993) Social choice theory, vol 1, The
ness. Public Choice 35:437–455 aggregation of preferences. Edward Elgar, London
Moulin H (1983) The strategy of social choice. North Saari D (1990) Susceptibility to manipulation. Public Choice
Holland, Amsterdam 64:21–41
Moulin H (1985) Fairness and strategy in voting. In: Young Saari D (2001) Decisions and elections, explaining the
HP (ed) Fair allocation, American Mathematical Soci- unexpected. Cambridge University Press, Cambridge
ety. Proc Symp Appl Math 33:109–142 Satterthwaite M (1975) Strategy-proofness and Arrow’s
Moulin H (1986) Choosing from a tournament. Soc Choice conditions: existence and correspondence theorems
Welf 3:272–291 for voting procedures and social welfare functions.
Nanson EJ (1882) Methods of election. Trans Proc R Soc J Econ Theor 10:187–217
Vic 18:197–240 Schwartz T (1990) Cyclic tournaments and cooperative
Nurmi H (1987) Comparing voting systems. D. Reidel, majority voting: a solution. Soc Choice Welf 7:19–29
Dordrecht Simpson PB (1969) On defining areas of voter choice. Q J
Pattanaik PK, Salles M (eds) (1983) Social choice and Econ 83(3):478–490
welfare. North-Holland, Amsterdam Slater P (1961) Inconsistencies in a schedule of paired
Poljak S, Turzík D (1986) A polynomial time heuristic for comparisons. Biometrika 48:303–312
certain subgraph optimization problems with Smith JH (1973) Aggregation of preferences with variable
guaranteed lower bound. Discret Math 58:99–104 electorate. Econometrica 41(6):1027–1041
Poljak S, Rödl V, Spencer J (1988) Tournament ranking Smith D (1999) Manipulability measures of common
with expected profit in polynomial time. SIAM social choice functions. Soc Choice Welf 16:639–661
J Discret Math 1(3):372–376 Spencer J (1971) Optimal ranking of tournaments. Net-
Procaccia A, Rosenschein J (2006) Junta distribution and works 1:135–138
the average-case complexity of manipulating elections. Spencer J (1978) Nonconstructive methods in discrete math-
In: Proceedings of the 5th international joint autono- ematics. In: Rota GC (ed) Studies in combinatorics. Math-
mous agents and multiagent systems, ACM Press, ematical Association of America, Washington, DC,
pp 497–504 pp 142–178
Procaccia A, Rosenschein J, Zohar A (2006) Multi-winner Spencer J (1987) Ten lectures on the probabilistic method.
elections: complexity of manipulation, control, and CBMS-NSF regional conference series in applied
winner-determination. In: Proceedings of the 8th Trad- mathematics N 52, SIAM, Philadelphia
ing Agent Design and Analysis and Agent Mediated Stearns R (1959) The voting problem. Am Math Mon 66:761–
Electronic Commerce Joint International workshop 763
(TADA/AMEC 2006), pp 15–28 Straffin PD Jr (1980) Topics in the theory of voting.
Laplace (marquis de) PS (1795) Journal de l’École Poly- Birkhäuser, Boston
technique, tome II vol. 7–8; Théorie analytique des Taylor AD (1995) Mathematics and politics strategy, vot-
probabilités. Essai philosophique sur les probabilités. ing, power, and proof. Springer, Berlin
Œuvres de Laplace, tome VII, Paris, 1847 Taylor AD (2005) Social choice and the mathematics of
Raman V, Saurabh S (2006) Parameterized algorithms for manipulation. Cambridge University Press, Cambridge
feedback set problems and their duals in tournaments. Tideman TN (1987) Independence of clones as criterion for
Theor Comput Sci 351:446–458 voting rules. Soc Choice Welf 4:185–206
van Zuylen A (2005) Deterministic approximation algo- Farquharson R (1969) Theory of voting. Yale University
rithms for ranking and clusterings. Cornell ORIE tech- Press, New Haven
nical report No. 1431 Feldman AM (1980) Welfare economics and social choice
Vazirani VV (2003) Approximation algorithms. Springer, theory. Martinus Nijhoff, Boston
Berlin Felsenthal DS, Machover M (1998) The measurement of
Wakabayashi Y (1986) Aggregation of binary relations: voting power: theory and practice, problems and para-
algorithmic and polyhedral investigations. Ph D thesis, doxes. Edward Elgar, Cheltenham
Augsburg Gaertner W (2001) Domains conditions in social choice
Wakabayashi Y (1998) The complexity of computing theory. Cambridge University Press, Cambridge
medians of relations. Resenhas 3(3):323–349 Greenberg J (1990) The theory of social situations. Cam-
Weber RJ (1995) Approval voting. J Econ Perspect 9(1):39–49 bridge University Press, Cambridge
Woeginger GJ (2003) Banks winner in tournaments are Grofman B (1981) When is the Condorcet winner the
difficult to recognize. Soc Choice Welf 20:523–528 Condorcet winner? University of California, Irvine
Young HP (1977) Extending Condorcet’s rule. J Econ Grofman B, Owen G (eds) (1986) Information pooling and
Theor 16(2):335–353 group decision making. JAI Press, Greenwich
Heal G (ed) (1997) Topological social choice. Springer,
Berlin
Hillinger C (2004) Voting and the cardinal aggregation
Books and Reviews cardinal of judgments. Discussion papers in economics
Aleskerov FT (1999) Arrovian aggregation models, math- 353, University of Munich
ematical and statistical methods, vol 39, Theory and Holler MJ (ed) (1978) Power voting and voting power.
decision library. Kluwer, Boston Physica, Wurtsburg
Aleskerov FT, Monjardet B (2002) Utility maximisation, Holler MJ, Owen G (eds) (2001) Indices and coalition
choice and preference. Springer, Berlin formation. Kluwer, Boston
Baker KM (1975) Condorcet from natural philosophy to Kemeny J, Snell L (1960) Mathematical models in the
social mathematics. The University of Chicago Press, social sciences. Ginn, Boston
Chicago. Reissued 1982 Laslier J-F (2006) Spatial approval voting. Polit Anal
Balinski M, Young HP (1982) Fair representation. Yale 14(2):160–185
University Press, New Haven Laslier J-F, Van Der Straeten K (2008) A live experiment
Barthélemy J-P, Monjardet B (1988) The median proce- on approval voting. Exp Econ 11:97–105
dure in data analysis: new results and open problems. Lieberman B (ed) (1971) Social choice. Gordon and
In: Bock HH (ed) Classification and related methods of Breach, New York
data analysis. North Holland, Amsterdam Mirkin BG (1979) Group choice. Winston, Washington,
Batteau P, Jacquet-Lagrèze É, Monjardet B (eds) DC
(1981) Analyse et agrégation des préférences dans les Moulin H (2003) Fair division and collective welfare.
sciences économiques et de gestion. Economica, Paris Institute of Technology Press, Boston
Black D (1996) Formal contributions to the theory of Nurmi H (1999) Voting paradoxes and how to deal with
public choice. In: Brady GL, Tullock G (eds) The them. Springer, Berlin
unpublished works of Duncan Black. Kluwer, Boston Nurmi H (2002) Voting procedures under uncertainty.
Bouyssou D, Marchant T, Pirlot M, Tsoukias A, Vincke Springer, Berlin
P (2006) Evaluation and decision models with multiple Pattanaik PK (1971) Voting and collective choice. Harvard
criteria. Springer, Berlin University Press, Cambridge
Campbell DE (1992) Equity, efficiency, and social choice. Pattanaik PK (1978) Strategy and group choice. North
Clarendon, Oxford Holland, Amsterdam
Coughlin P (1992) Probabilistic voting theory. Cambridge Peleg B (1984) Game theoretic analysis of voting in com-
University Press, Cambridge mittees. Cambridge University Press, Cambridge
Danilov V, Sotskov A (2002) Social choice mechanisms. Peleg B, Peters H (2010) Strategic social choice. Springer,
Springer, Berlin Berlin
Dubois D, Pirlot M, Bouyssou D, Prade H (eds) Rothschild E (2001) Economic sentiments: Adam Smith,
(2006) Concepts et méthodes pour l’aide à la décision. Condorcet, and the enlightenment. Harvard University
Hermès, Paris Press, Cambridge
Endriss U, Lang J (eds) (2006) Proceedings of the first Saari DG (1994) Geometry of voting. Springer, Berlin
international workshop on computational social choice, Saari DG (1995) Basic geometry of voting. Springer, Berlin
COMSOC 2006, University of Amsterdam Saari DG (2000) Chaotic elections! American Mathemati-
Enelow J, Hinich M (eds) (1990) Advances in the spatial cal Society, Providence
theory of voting. Cambridge University Press, Schofield N (1984) Social choice and democracy. Springer,
Cambridge Berlin
Schofield N (ed) (1996) Collective decision van Deemen A, Rusinowska A (eds) (2010) Collective
making: social choice and political economy. decision making. Springer, Berlin
Kluwer, Boston Woodall DR (1997) Monotonicity of single-seat preferen-
Schwartz T (1986) The logic of collective choice. Colum- tial election rules. Discret Appl Math 77:81–98
bia University Press, New York Young HP (1974) An axiomatization of Borda’s rule.
Sen AK (1979) Collective choice and social welfare. North J Econ Theor 9:43–52
Holland, Amsterdam Young HP (1986) Optimal ranking and choice from
Sen AK (1982) Choice, welfare and measurement. Basil pairwise comparisons. In: Grofman B, Owen G (eds)
Blackwell, Oxford Information pooling and group decision making. JAI
Suzumura K (1984) Rational choice, collective decisions Press, Greenwich, pp 113–122
and social welfare. Cambridge University Press, Young HP (1988) Condorcet theory of voting. Am Polit Sci
Cambridge Rev 82:1231–1244
Tanguiane AS (1991) Aggregation and representation of Young HP (1995) Optimal voting rules. J Econ Perspect
preferences, introduction to mathematical theory of 9(1):51–64
democracy. Springer, Berlin Young HP, Levenglick A (1978) A consistent extension of
Tideman N (2006) Collective decisions and voting: the Condorcet’s election principle. SIAM J Appl Math
potential for public choice. Ashgate, Burlington 35:285–300
Evolutionary Game Theory

William H. Sandholm
Department of Economics, University of Wisconsin, Madison, USA

Article Outline

Glossary
Definition of the Subject
Introduction
Normal Form Games
Static Notions of Evolutionary Stability
Population Games
Revision Protocols
Deterministic Dynamics
Stochastic Dynamics
Local Interaction
Applications
Future Directions
Bibliography

Glossary

Deterministic evolutionary dynamic A deterministic evolutionary dynamic is a rule for assigning population games to ordinary differential equations describing the evolution of behavior in the game. Deterministic evolutionary dynamics can be derived from revision protocols, which describe choices (in economic settings) or births and deaths (in biological settings) on an agent-by-agent basis.

Evolutionarily stable strategy (ESS) In a symmetric normal form game, an evolutionarily stable strategy is a (possibly mixed) strategy with the following property: a population in which all members play this strategy is resistant to invasion by a small group of mutants who play an alternative mixed strategy.

Normal form game A normal form game is a strategic interaction in which each of n players chooses a strategy and then receives a payoff that depends on all agents' choices of strategy. In a symmetric two-player normal form game, the two players choose from the same set of strategies, and payoffs only depend on own and opponent's choices, not on a player's identity.

Population game A population game is a strategic interaction among one or more large populations of agents. Each agent's payoff depends on his own choice of strategy and the distribution of others' choices of strategies. One can generate a population game from a normal form game by introducing random matching; however, many population games of interest, including congestion games, do not take this form.

Replicator dynamic The replicator dynamic is a fundamental deterministic evolutionary dynamic for games. Under this dynamic, the percentage growth rate of the mass of agents using each strategy is proportional to the excess of the strategy's payoff over the population's average payoff. The replicator dynamic can be interpreted biologically as a model of natural selection, and economically as a model of imitation.

Revision protocol A revision protocol describes both the timing and the results of agents' decisions about how to behave in a repeated strategic interaction. Revision protocols are used to derive both deterministic and stochastic evolutionary dynamics for games.

Stochastically stable state Game-theoretic models of stochastic evolution in games are often described by irreducible Markov processes. In these models, a population state is stochastically stable if it retains positive weight in the process's stationary distribution as the level of noise in agents' choices approaches zero, or as the population size approaches infinity.
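As a hedged illustration of the replicator dynamic described in the glossary (not code from this article), the sketch below integrates the standard single-population equation dx_i/dt = x_i[(Ax)_i - x'Ax] with a simple Euler scheme; the payoff matrix is the good Rock-Paper-Scissors game with w = 2 and l = 1 discussed in Example 4 later in this article.

```python
import numpy as np

def replicator_step(x, A, dt=0.01):
    """One Euler step of the replicator dynamic: each strategy's growth
    rate is its excess payoff over the population's average payoff."""
    payoffs = A @ x
    average = x @ payoffs
    return x + dt * x * (payoffs - average)

# Good Rock-Paper-Scissors payoffs with w = 2, l = 1 (see Example 4 below).
A = np.array([[0.0, -1.0, 2.0],
              [2.0, 0.0, -1.0],
              [-1.0, 2.0, 0.0]])
x = np.array([0.5, 0.3, 0.2])           # an arbitrary initial population state
for _ in range(5000):
    x = replicator_step(x, A)
print(x)   # spirals toward the mixed equilibrium (1/3, 1/3, 1/3) in good RPS
```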
equation models of game dynamics; section "Stochastic Dynamics" studies stochastic models of evolution based on Markov processes; and section "Local Interaction" presents deterministic and stochastic models of local interaction. Section "Applications" records a range of applications of evolutionary game theory, and section "Future Directions" suggests directions for future research. Finally, section "Bibliography" offers an extensive list of primary references.

Normal Form Games

In this section, we introduce a very simple model of strategic interaction: the symmetric two-player normal form game. We then define some of the standard solution concepts used to analyze this model, and provide some examples of games and their equilibria. With this background in place, we turn in subsequent sections to evolutionary analysis of behavior in games.

In a symmetric two-player normal form game, each of the two players chooses a (pure) strategy from the finite set S, which we write generically as S = {1, . . . , n}. The game's payoffs are described by the matrix A ∈ R^{n×n}. Entry A_ij is the payoff a player obtains when he chooses strategy i and his opponent chooses strategy j; this payoff does not depend on whether the player in question is called player 1 or player 2.

The fundamental solution concept of noncooperative game theory is Nash equilibrium (Nash 1951). We say that the pure strategy i ∈ S is a symmetric Nash equilibrium of A if

    A_ii ≥ A_ji for all j ∈ S.   (1)

Thus, if his opponent chooses a symmetric Nash equilibrium strategy i, a player can do no better than to choose i himself.

A stronger requirement on strategy i demands that it be superior to all other strategies regardless of the opponent's choice:

    A_ik > A_jk for all j ≠ i and all k ∈ S.   (2)

When condition (2) holds, we say that strategy i is strictly dominant in A.

Example 1 The game below, with strategies C ("cooperate") and D ("defect"), is an instance of a Prisoner's Dilemma:

        C   D
    C   2   0
    D   3   1 .

(To interpret this game, note that A_CD = 0 is the payoff to cooperating when one's opponent defects.) Since 1 > 0, defecting is a symmetric Nash equilibrium of this game. In fact, since 3 > 2 and 1 > 0, defecting is even a strictly dominant strategy. But since 2 > 1, both players are better off when both cooperate than when both defect.

In many instances, it is natural to allow players to choose mixed (or randomized) strategies. When a player chooses a mixed strategy x from the simplex X = {x ∈ R^n_+ : Σ_{i∈S} x_i = 1}, his behavior is stochastic: he commits to playing pure strategy i ∈ S with probability x_i.

When either player makes a randomized choice, we evaluate payoffs by taking expectations: a player choosing mixed strategy x against an opponent choosing mixed strategy y garners an expected payoff of

    x'Ay = Σ_{i∈S} Σ_{j∈S} x_i A_ij y_j.   (3)

In biological contexts, payoffs are fitnesses, and represent levels of reproductive success relative to some baseline level; Eq. (3) reflects the idea that in a large population, expected reproductive success is what matters. In economic contexts, payoffs are utilities: a numerical representation of players' preferences under which Eq. (3) captures players' choices between uncertain outcomes (von Neumann and Morgenstern 1944).

The notion of Nash equilibrium extends easily to allow for mixed strategies. Mixed strategy x is a symmetric Nash equilibrium of A if

    x'Ax ≥ y'Ax for all y ∈ X.   (4)

In words, x is a symmetric Nash equilibrium if its expected payoff against itself is at least as high as the expected payoff obtainable by any other strategy y against x.
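Condition (4) is easy to check numerically: since y'Ax is linear in y, it suffices to compare x's payoff against x with the payoffs of the pure strategies against x. A minimal NumPy sketch, applied to the Prisoner's Dilemma of Example 1:

```python
import numpy as np

def is_symmetric_nash(x, A, tol=1e-9):
    """Check condition (4): x'Ax >= y'Ax for all y in X.
    By linearity of y -> y'Ax, testing the pure strategies suffices."""
    payoff_against_x = A @ x              # entry j is e_j' A x
    return x @ payoff_against_x >= payoff_against_x.max() - tol

A_pd = np.array([[2.0, 0.0],              # Prisoner's Dilemma of Example 1
                 [3.0, 1.0]])
print(is_symmetric_nash(np.array([0.0, 1.0]), A_pd))   # True: defect
print(is_symmetric_nash(np.array([1.0, 0.0]), A_pd))   # False: cooperate
```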
Note that we can represent the pure strategy i ∈ S using the mixed strategy e_i ∈ X, the ith standard basis vector in R^n. If we do so, then definition (4) restricted to such strategies is equivalent to definition (1).

We illustrate these ideas with a few examples.

Example 2 Consider the Stag Hunt game:

        H   S
    H   h   h
    S   0   s .

Each player in the Stag Hunt game chooses between hunting hare (H) and hunting stag (S). A player who hunts hare always catches one, obtaining a payoff of h > 0. But hunting stag is only successful if both players do so, in which case each obtains a payoff of s > h. Hunting stag is potentially more profitable than hunting hare, but requires a coordinated effort.

In the Stag Hunt game, H and S (or, equivalently, e_H and e_S) are symmetric pure Nash equilibria. This game also has a symmetric mixed Nash equilibrium, namely x* = (x*_H, x*_S) = ((s - h)/s, h/s). If a player's opponent chooses this mixed strategy, the player's expected payoff is h whether he chooses H, S, or any mixture between the two; in particular, x* is a best response against itself.

To distinguish between the two pure equilibria, we might focus on the one that is payoff dominant, in that it achieves the higher joint payoff. Alternatively, we can concentrate on the risk dominant equilibrium (Harsanyi and Selten 1988), which utilizes the strategy preferred by a player who thinks his opponent is equally likely to choose either option (that is, against an opponent playing mixed strategy (x_H, x_S) = (1/2, 1/2)). In the present case, since s > h, equilibrium S is payoff dominant. Which strategy is risk dominant depends on further information about payoffs. If s > 2h, then S is risk dominant. But if s < 2h, H is risk dominant: evidently, payoff dominance and risk dominance need not agree.

Example 3 In the Hawk-Dove game (Maynard Smith 1982), the two players are animals contesting a resource of value v > 0. The players choose between two strategies: display (D) or escalate (E). If both display, the resource is split; if one escalates and the other displays, the escalator claims the entire resource; if both escalate, then each player is equally likely to claim the entire resource or to be injured, suffering a cost of c > v in the latter case.

The payoff matrix for the Hawk-Dove game is therefore

        D     E
    D   v/2   0
    E   v     (v - c)/2 .

This game has no symmetric Nash equilibrium in pure strategies. It does, however, admit the symmetric mixed equilibrium x* = (x*_D, x*_E) = ((c - v)/c, v/c). (In fact, it can be shown that every symmetric normal form game admits at least one symmetric mixed Nash equilibrium (Nash 1951).)

In this example, our focus on symmetric behavior may seem odd: rather than randomizing symmetrically, it seems more natural for players to follow an asymmetric Nash equilibrium in which one player escalates and the other displays. But the symmetric equilibrium is the most relevant one for understanding natural selection in populations whose members are randomly matched in pairwise contests; see section "Static Notions of Evolutionary Stability."

Example 4 Consider the class of Rock-Paper-Scissors games:

        R    P    S
    R   0   -l    w
    P   w    0   -l
    S  -l    w    0 .

Here w > 0 is the benefit of winning the match and l > 0 the cost of losing; ties are worth 0 to both players. We call this game good RPS if w > l, so that the benefit of winning the match exceeds the cost of losing, standard RPS if w = l, and bad RPS if w < l. Regardless of the values of w and l, the unique symmetric Nash equilibrium of this game, x* = (x*_R, x*_P, x*_S) = (1/3, 1/3, 1/3), requires uniform randomization over the three strategies.
Evolutionary Game Theory 577
(w = l) , x is a NSS but not an ESS, while in bad strategies. The simplest population games are
RPS (w < l) , x is neither an ESS nor an NSS. The generated by random matching in normal form
last case shows that neither evolutionary nor neu- games, but the population game framework
trally stable strategies need exist in a given game. allows for interactions of a more intricate nature.
The definition of an evolutionarily stable strat- We focus here on games played by a single
egy has been extended to cover a wide range of population (i.e., games in which all agents play
strategic settings, and has been generalized in a equivalent roles). We suppose that there is a unit
variety of directions. Prominent among these mass of agents, each of whom chooses a pure
developments are set-valued versions of ESS: in strategy from the set S = {1, . . . , n}. The
rough terms, these concepts consider a set of aggregate behavior of these agents is described
mixed strategies Y X to be stable if the no by a population state x X, with xj representing
population playing a strategy in the set can be the proportion of agents choosing pure strategy j.
invaded successfully by a population of mutants We identify a population game with a continuous
playing a strategy outside the set. Hines (1987) vector-valued payoff function F : X ! Rn. The
provides a thorough survey of the first 15 years of scalar Fi(x) represents the payoff to strategy
research on ESS and related notions of stability; i when the population state is x.
key references on set-valued evolutionary solu- Population state x is a Nash equilibrium of F if
tion concepts include (Balkenborg and Schlag no agent can improve his payoff by unilaterally
2001; Swinkels 1992; Thomas 1985). switching strategies. More explicitly, x is a Nash
Maynard Smith’s notion of ESS attempts to cap- equilibrium if
ture the dynamic process of natural selection using a
static definition. The advantage of this approach is xi > 0 implies that Fi ðxÞ
that his definition is often easy to check in applica- Fj ðxÞ for all j S: (8)
tions. Still, more convincing models of natural selec-
tion should be explicitly dynamic models, building Example 9 Suppose that the unit mass of agents are
on techniques from the theories of dynamical sys- randomly matched to play the symmetric normal
tems and stochastic processes. Indeed, this thorough- form game A. At population state x, the (expected)
going approach can help us understand whether and payoff to strategy i is the linear function
when the ESS concept captures the notion of robust- Fi(x) = j SAijxj; the payoffs to all strategies
ness to invasion in a satisfactory way. can be expressed concisely as F(x) = Ax. It is easy
The remainder of this article concerns explicitly to verify that x is a Nash equilibrium of the popu-
dynamic models of behavior. In addition to being lation game F if and only if x is a symmetric Nash
dynamic rather than static, these models will differ equilibrium of the symmetric normal form game A.
from the one considered in this section in two other While population games generated by random
important ways as well. First, rather than looking at matching are especially simple, many games that
populations whose members all play a particular arise in applications are not of this form. In the
mixed strategy, the dynamic models consider biology literature, games outside the random
populations in which different members play differ- matching paradigm are known as playing the
ent pure strategies. Second, instead of maintaining a field models (Maynard Smith 1982).
purely biological point of view, our dynamic models
will be equally well-suited to studying behavior in Example 10 Consider the following model of
animal and human populations. highway congestion (Beckmann et al. 1956;
Monderer and Shapley 1996; Rosenthal 1973;
Sandholm 2001b). A pair of towns, Home and
Population Games Work, are connected by a network of links. To
commute from Home to Work, an agent must
Population games provide a simple and general choose a path i S connecting the two towns.
framework for studying strategic interactions in The payoff the agent obtains is the negation of the
large populations whose members play pure delay on the path he takes. The delay on the path is
Evolutionary Game Theory 579
These protocols can be given a very simple inter- When an agent’s clock rings he chooses an oppo-
pretation: when an agent receives a revision nent at random. If the opponent is playing strategy
opportunity, he chooses an opponent at random j, the agent imitates him with probability propor-
and observes her strategy. If our agent is playing tional to pj. If the agent does not imitate this
strategy i and the opponent strategy j, the agent opponent, he draws a new opponent at random
switches from i to j with probability proportional and repeats the procedure.
^ ij. Notice that the value of the population share
to r
xj is not something the agent need know; this term Direct Evaluation Protocols
in (9) accounts for the agent’s observing a ran- In the previous examples, only strategies currently
domly chosen opponent. in use have any chance of being chosen by a
revising agent (or of being the programmed strat-
Example 11 Suppose that after selecting an oppo- egy of the newborn agent). Under other protocols,
nent, the agent imitates the opponent only if the agents’ choices are not mediated through the
opponent’s payoff is higher than his own, doing so population’s current behavior, except indirectly
in this case with probability proportional to the via the effect of behavior on payoffs. These direct
payoff difference: evaluation protocols require agents to directly
evaluate the payoffs of the strategies they con-
rij ðp, xÞ ¼ xj pj pi þ : sider, rather than to indirectly evaluate them as
under an imitative procedure.
This protocol is known as pairwise propor-
tional imitation (Schlag 1998).
Example 13 Suppose that choices are made
Protocols of form (9) also appear in biological
according to the logit choice rule:
contexts, (Moran 1962; Nowak 2006; Nowak
et al. 2004), where in these cases we refer to
them as natural selection protocols. The biologi- exp 1 pj
rij ðp, xÞ ¼ P : (11)
k S expð pk Þ
1
cal interpretation of (9) supposes that each agent is
programmed to play a single pure strategy. An
agent who receives a revision opportunity dies, The interpretation of this protocol is simple.
and is replaced through asexual reproduction. The Revision opportunities arrive at unit rate. When an
reproducing agent is a strategy j player with prob- opportunity is received by an i player, he switches
^ ij ðp, xÞ , which is propor-
ability rij ðp, xÞ ¼ xj r to strategy j with probability rij(p, x), which is
tional both to the number of strategy j players proportional to an exponential function of strategy
and to some function of the prevalences and fit- j’s payoffs. The parameter > 0 is called the
nesses of all strategies. Note that this interpreta- noise level. If Z is large, choice probabilities
tion requires the restriction under the logit rule are nearly uniform. But if Z
X is near zero, choices are optimal with probability
rij ðp, xÞ
1: close to one, at least when the difference between
jS the best and second best payoff is not too small.
Example 12 Suppose that payoffs are always Additional examples of revision protocols can be
positive, and let found in the next section, and one can construct new
revision protocols by taking linear combinations of
xj pj old ones; see (Sandholm 2017) for further
rij ðp, xÞ ¼ P : (10)
k S xk pk discussion.
replicator dynamic by Taylor and Jonker (1978), of revision opportunities received by agents cur-
and remains an vibrant area of research; Hofbauer rently playing strategy i is approximately Nxi Rdt.
and Sigmund (2003) and Sandholm (2017) offer Since an i player who receives a revision oppor-
recent surveys. In this section, we derive a determin- tunity switches to strategy j with probability rij/R,
istic model of evolution: the mean dynamic gener- the expected number of such switches during the
ated by a revision protocol and a population game. next dt time units is approximately Nxi rijdt.
We study this deterministic model from various Therefore, the expected change in the number of
angles, focusing in particular on local stability of agents choosing strategy i during the next dt time
rest points, global convergence to equilibrium, and units is approximately
nonconvergent limit behavior. !
While the bulk of the literature on deterministic X X
N xj rji ðFðxÞ, x xi rij ðFðxÞ, x dt:
evolutionary dynamics is consistent with the
jS jS
approach we take here, we should mention that
(12)
other specifications exist, including discrete time
dynamics (Akin and Losert 1984; Dekel and Dividing expression (12) by N and eliminating
Scotchmer 1992; Losert and Akin 1983; Weissing the time differential dt yields a differential equa-
1991), and dynamics for games with continuous tion for the rate of change in the proportion of
strategy sets (Bomze 1990, 1991; Friedman and agents choosing strategy i:
Yellin 1997; Hofbauer et al. 2005; Oechssler and X X
Riedel 2001, 2002) and for Bayesian population x_i ¼ xj rji ðFðxÞ, xÞ xi rij ðFðxÞ, xÞ:
games (Dokumaci and Sandholm 2007a; Ely and jS jS
Sandholm 2005; Sandholm 2007a). Also, determin- (M)
istic dynamics for extensive form games introduce
new conceptual issues; see (Binmore et al. 1995a; Equation (M) is the mean dynamic (or mean
Binmore and Samuelson 1999; Cressman 1996, field) generated by revision protocol r in popula-
2000; Cressman and Schlag 1998) and the mono- tion game F. The first term in (M) captures the
graph of Cressman (2003). inflow of agents to strategy i from other strategies,
while the second captures the outflow of agents to
other strategies from strategy i.
Mean Dynamics
As described earlier in section “Definition,” a revi- Examples
sion protocol r, a population game F,and a popu- We now describe some examples of mean dynam-
lation size N define a Markov process XNt on the ics, starting with ones generated by the revision
finite state space xN. We now derive a deterministic protocols from section “Examples.” To do so, we let
process – the mean dynamic
– that describes the
X
expected motion of XNt . In section “Deterministic Fð x Þ ¼ xi Fi ðxÞ
Approximation,” we will describe formally the iS
sense in which this deterministic process provides
a very good approximation
of the behavior of the denote the average payoff obtained by the mem-
stochastic process XNt , at least over finite time bers of the population, and define the excess pay-
horizons and for large population sizes. But having off to strategy i,
noted this result, we will focus in this section on the
deterministic process itself. ^ i ðxÞ ¼ Fi ðxÞ FðxÞ,
F
To compute the expected increment of XNt
over the next dt time units, recall first that each of to be the difference between strategy i’s payoff
the N agents receives revision opportunities via a and the population’s average payoff.
rate R exponential distribution, and so expects to
receive Rdt opportunities during the next dt time Example 14 In Example 11, we introduced the
units. If the current state is x, the expected number pairwise proportional imitation protocol rij(p, x)
582 Evolutionary Game Theory
= xj[pj pi]+. This protocol generates the mean If we take the noise level Z to zero, then the
dynamic probability with which a revising agent chooses
the best response approaches one whenever the
^ i ðxÞ:
x_i ¼ xi F (13) best response is unique. At such points, the logit
dynamic approaches the best response dynamic
Equation (13) is the replicator dynamic (Taylor (Gilboa and Matsui 1991):
and Jonker 1978), the best-known dynamic in
evolutionary game theory. Under this dynamic, x_ BF ðxÞ x, (16)
the percentage growth rate ̇ x i =xi , of each strategy
currently in use is equal to that strategy’s current where
excess payoff; unused strategies always remain
so. There are a variety of revision protocols BF ðxÞ ¼ argmaxy X y0 FðxÞ
other than pairwise proportional imitation that
generate the replicator dynamic as their mean defines the (mixed) best response correspondence
dynamics; see (Björnerstedt and Weibull 1996; for game F. Note that unlike the other dynamics we
Hofbauer 1995a; Hofbauer and Sigmund 2003; consider here, (16) is defined not by an ordinary
Weibull 1996). differential equation, but by a differential inclu-
sion, a formulation proposed in Hofbauer (1995b).
Example 15 In Example 12, we assumed that
payoffs are always positive, and introduced the Example 17 Consider the protocol
protocol rij / xjpj, which we interpreted both as a
h X i
model of biological natural selection and as a
rij ðp, xÞ ¼ pj kS
x k pk :
model of imitation with repeated sampling. The þ
resulting mean dynamic,
When an agent’s clock rings, he chooses a
xi F i ðxÞ ^ i ðxÞ
xi F strategy at random; if that strategy’s payoff is
x_i ¼ P xi ¼ , (14) above average, the agent switches to it with prob-
k S xk F k ðxÞ FðxÞ
ability proportional to its excess payoff. The
resulting mean dynamic,
is the Maynard Smith replicator dynamic
(Maynard Smith 1982). This dynamic only differs X
x_i BM ¼ F^ i ðxÞ x i ^ k ðxÞ ,
F
from the standard replicator dynamic (13) by a þ þ
kS
change of speed, with motion under (14) being
relatively fast when average payoffs are relatively
is called the Brown-von Neumann-Nash (BNN)
low. (In multipopulation models, the two dynam-
dynamic (Brown and von Neumann 1950); see
ics are less similar, and convergence under one
also (Hofbauer 2000; Sandholm 2005a; Skyrms
does not imply convergence under the other – see
1990; Swinkels 1993; Weibull 1996).
(Sandholm 2017; Weibull 1995).)
Example 18 Consider the revision protocol
Example 16 In Example 13 we introduced the
logit choice rule rij(p, x) / exp(1pj). The rij ðp, xÞ ¼ pj pi þ :
corresponding mean dynamic,
When an agent’s clock rings, he selects a strat-
expð1 Fi ðxÞÞ egy at random. If the new strategy’s payoff is
x_j ¼ P 1
xi , (15)
k S expð Fk ðxÞÞ higher than his current strategy’s payoff, he
switches strategies with probability proportional
is called the logit dynamic (Fudenberg and Levine to the difference between the two payoffs. The
1998). resulting mean dynamic,
Evolutionary Game Theory 583
X X
x_i ¼ x j F i ðx Þ F j ðx Þ þ x i Fj ðxÞ Fi ðxÞ þ , best response to the population state changes;
jS jS under the BNN and especially the Smith dynamic,
(17) solutions approach the Nash equlibrium in a less
angular fashion.
is called the Smith dynamic (Smith 1984); see also
(Sandholm 2006).
We summarize these examples of revision pro- Evolutionary Justification of Nash Equilibrium
tocols and mean dynamics in Table 1. One of the goals of evolutionary game theory is to
Figure 1 presents phase diagrams for the five justify the prediction of Nash equilibrium play. For
basic dynamics when the population is randomly this justification to be convincing, it must be based
matched to play standard Rock-Paper-Scissors on a model that makes only mild assumptions
(Example 4). In the phase diagrams, colors repre- about agents’ knowledge about one another’s
sent speed of motion: within each diagram, behavior. This sentiment can be captured by intro-
motion is fastest in the red regions and slowest ducing two desiderata for revision protocols:
in the blue ones.
The phase diagram of the replicator dynamic ðCÞContinuity : r is Lipschitz continuous:
reveals closed orbits around the unique Nash equi- ðSDÞScarcity of data : rij only depends on
librium x ¼ 13 , 13 , 13 . Since this dynamic is pi , pj , and xj :
based on imitation (or on reproduction), each face
and each vertex of the simplex X is an invariant Continuity (C) asks that revision protocols
set: a strategy initially absent from the population depend continuously on their inputs, so that
will never subsequently appear. small changes in aggregate behavior do not lead
The other four dynamics pictured are based on to large changes in players’ responses. Scarcity of
direct ecaluation, allowing agents to select strate- data (SD) demands that the conditional switch
gies that are currently unused. In these cases, the rate from strategy i to strategy j only depend on
Nash equilibrium is the sole rest point, and attracts the payoffs of these two strategies, so that agents
solutions from all initial conditions. (In the case of need only know those facts that are most germane
the logit dynamic, the rest point happens to coin- to the decision at hand (Sandholm 2017). (The
cide with the Nash equilibrium only because of dependence of rij, on xj is included to allow for
the symmetry of the game; see (Hofbauer and dynamics based on imitation.) Protocols that
Sandholm 2002, 2007).) Under the logit and best respect these two properties do not make unreal-
response dynamics, solution trajectories quickly istic demands on the amount of information that
change direction and then accelerate when the agents in an evolutionary model possess.
Evolutionary Game R
Theory, Fig. 1 Five basic a
deterministic dynamics in
standard Rock-Paper-
Scissors. Colors represent
speeds: red is fastest, blue is
slowest
P replicator S
b R c R
P S P S
logit(.08) best response
d R e R
P S P S
BNN Smith
Our two remaining desiderata impose restric- the mean dynamic be precisely the Nash equi-
tions on mean dynamics x_ ¼ V F ðxÞ, linking the libria of the game being played. Positive cor-
evolution of aggregate behavior to incentives in relation (PC) is a restriction on disequilibrium
the underlying game. adjustment: it requires that away from rest
points, strategies’ growth rates be positively
ðNSÞ Nash stationarity :
correlated with their payoffs. Condition
V F ðxÞ ¼ 0 if and only if x NEðFÞ:ðPCÞ
(PC) is among the weakest of the many condi-
Positive correlation :
F F 0 tions linking growth rates of evolutionary
V ðxÞ 6¼ 0 implies that V ðxÞ FðxÞ > 0:
dynamics and payoffs in the underlying game;
Nash stationarity (NS) is a restriction on for alternatives, see (Friedman 1991; Hofbauer
stationary states: it asks that the rest points of and Weibull 1996; Nachbar 1990; Ritzberger
Evolutionary Game Theory 585
Evolutionary Game Theory, Table 2 Families of deterministic evolutionary dynamics and their properties; yes*
indicates that a weaker or alternate form of the property is satisfied
Dynamic Family (C) (SD) (NS) (PC)
Replicator Imitation yes yes no yes
Best response no yes* yes* yes*
Logit Perturbed best response yes yes* no no
BNN Excess payoff yes no yes yes
Smith Pairwise comparison yes yes yes yes
and Weibull 1995; Samuelson and Zhang 1992; stability. As we noted at the onset, an original
Sandholm 2001b; Swinkels 1993). motivation for introducing game dynamics was to
In Table 2, we report how the the five basic provide an explicitly dynamic foundation for May-
dynamics fare under the four criteria above. For nard Smith’s notion of ESS (Taylor and Jonker
the purposes of justifying the Nash prediction, the 1978). Some of the earliest papers on evolutionary
most important row in the table is the last one, game dynamics (Hofbauer et al. 1979; Zeeman
which reveals that the Smith dynamic satisfies all 1980) established that being an ESS is a sufficient
four desiderata at once: while the revision proto- condition for asymptotically stablity under the
col for the Smith dynamic (see Example 18) replicator dynamic, but that it is not a necessary
requires only limited information on the part of condition. It is curious that this connection obtains
the agents who employ it, this information is despite the fact that ESS is a stability condition for
enough to ensure that rest points of the dynamic a population whose members all play the same
and Nash equilibria coincide. mixed strategy, while (the usual version of) the
In fact, the dynamics introduced above can be replicator dynamic looks at populations of agents
viewed as members of families of dynamics that choosing among different pure strategies.
are based on similar revision protocols and that In fact, the implications of ESS for local stabil-
have similar qualitative properties. For instance, ity are not limited to the replicator
the Smith dynamic is a member of the family of dynamic. Suppose that the symmetric normal
pairwise comparison dynamics (Sandholm 2006), form game A admits a symmetric Nash equilibrium
under which agents only switch to strategies that that places positive probability on each strategy in
outperform their current choice. For this reason, S. One can show that this equilibrium is an ESS if
the exact functional forms of the previous exam- and only if the payoff matrix A is negative definite
ples are not essential to establishing the properties with respect to the tangent space of the simplex:
noted above.
In interpreting these results, it is important to z0 Az < 0 for all z TX
remember that Nash stationarity only concerns the n X o
rest points of a dynamic; it says nothing about ¼ ^z Rn : ^
z
iS i
¼ 0 : (18)
whether a dynamic will converge to Nash equilib-
rium from an arbitrary initial state. The question Condition (18) and its generalizations imply
of convergence is addressed in sections “Global local stability of equilibrium not only under the
Convergence” and “Nonconvergence.” There we replicator dynamic, but also under a wide range of
will see that in some classes of games, general other evolutionary dynamics: see (Cressman
guarantees of convergence can be obtained, but 1997; Hofbauer 2000; Hofbauer and Hopkins
that there are some games in which no reasonable 2005; Hofbauer and Sandholm 2006a; Hopkins
dynamic converges to equilibrium. 1999; Sandholm 2007a) for further details.
The papers cited above use linearization and
Local Stability Lyapunov function arguments to establish local sta-
Before turning to the global behavior of evolution- bility. An alternative approach to local stability anal-
ary dynamics, we address the question of local ysis, via index theory, allows one to establish
586 Evolutionary Game Theory
restrictions on the stability properties of all rest for all i and j, or, equivalently, that the matrix A is
points at once – see (Demichelis and Ritzberger symmetric. Since DF(x) = A, this is precisely
2003). what we need for F to be a full potential game.
The full potential function for F is f ðxÞ ¼ 12 x0 Ax,
Global Convergence which is one-half of the average payoff function
P
While analyses of local stability reveal whether a FðxÞ ¼ i S xi Fi ðxÞ ¼ x0 Ax. The common inter-
population will return to equilibrium after a small est assumption defines a fundamental model from
disturbance, they do not tell us whether the popu- population genetics, this assumption reflects the
lation will approach equilibrium from an arbitrary shared fate of two genes that inhabit the same
disequilibrium state. To establish such global con- organism (Fisher 1930; Hofbauer and Sigmund
vergence results, we must restrict attention to 1988, 1998).
classes of games defined by certain interesting
payoff structures. These structures appear in Example 20 In Example 10, we introduced con-
applications, lending strong support for the Nash gestion games, a basic model of network conges-
prediction in the settings where they arise. tion. To see that these games are potential games,
observe that an agent taking path j S affects the
payoffs of agents choosing path i S through
Potential Games
the marginal increases in congestion on the links
A potential game (Beckmann et al. 1956;
f Fi \ Fj that the two paths have in common.
Hofbauer and Sigmund 1988; Monderer and
But since the marginal effect of an agent taking
Shapley 1996; Rosenthal 1973; Sandholm
path i on the payoffs of agents choosing path j is
2001b, 2007c) is a game that admits a potential
identical, full externality symmetry (20) holds:
function: a scalar valued function whose gradient
describes the game’s payoffs. In a full potential X @Fj
@Fi
game F : Rnþ ! Rn (see Sandholm 2007c), all ðxÞ ¼ c0f uf ðxÞ ¼ ðxÞ:
@xj f F \F
@xi
information about incentives is captured by the i j
X ð u f ð xÞ
If F is smooth, then it is a full potential game if f ðx Þ ¼ cf ðzÞ dz,
and only if it satisfies full externality symmetry. fF 0
d
f ðxt Þ ¼ ∇f ðxt Þ0 ̇ x t ¼ Fðxt Þ0 V F ðxt Þ 0:
dt
transported to the surface of the radius 2 sphere game in which strategies represent amounts of
using the Akin transformation. In this case, solu- time committed to waiting for a scarce resource.
tions cross the level sets of the potential function If the two players choose times i and j > i, then the
orthogonally, moving in the direction that j player obtains the resource, worth v, while both
increases potential most quickly. players pay a cost of ci: once the first player leaves,
the other seizes the resource immediately. If both
players choose time i, the resource is split, so
Stable Games
payoffs are 2v ci each It can be shown that for
A population game F is a stable game (Hofbauer
any resource value v R and any increasing cost
and Sandholm 2006a) if
vector c Rn, random matching in a war of attri-
tion generates a stable game (Hofbauer and
ðy xÞ0 ðFðyÞ FðxÞÞ 0 for all x, y X:
Sandholm 2006a).
(21) The flavor of the self-defeating externalities
condition (22) suggests that obedience of incen-
If the inequality in (21) always holds strictly, tives will push the population toward some “cen-
then F is a strictly stable game. tral” equilibrium state. In fact, the set of Nash
If F is smooth, then F is a stable game if and equilibria of a stable game is always convex, and
only if it satisfies self-defeating externalities: in the case of strictly stable games, equilibrium is
unique. Moreover, it can be shown that the
z0 DFðxÞz 0 for all z TX and x X, (22) replicator dynamic converges to Nash equilibrium
from all interior initial conditions in any strictly
where DF(x) is the derivative of F : X ! Rn at x. stable game (Akin 1990; Hofbauer et al. 1979;
This condition requires that the improvements in Zeeman 1980), and that the direct evaluation
the payoffs of strategies to which revising agents dynamics introduced above converge to Nash
are switching are always exceeded by the equilibrium from all initial conditions in all stable
improvements in the payoffs of strategies which games, strictly stable or not (Hofbauer 2000;
revising agents are abandoning. Hofbauer and Sandholm 2006a, 2007; Smith
1984). In each case, the proof of convergence is
Example 22 The symmetric normal form game based on the construction of a Lyapunov function
A is symmetric zero-sum if A is skew-symmetric that solutions of the relevant dynamic descend.
(i.e., if A = A0), so that the payoffs of the The Lyapunov functions for the five basic dynam-
matched players always sum to zero. ics are presented in Table 3.
(An example is provided by the standard Rock- Interestingly, the convergence results for direct
Paper-Scissors game (Example 4).) Under this evaluation dynamics are not restricted to the dynam-
assumption, z0Az = 0 for all z Rn; thus, the ics listed in Table 3, but extend to other dynamics in
population game generated by random matching the same families (cf Table 2). But compared to the
in A , F(x) = Ax, is a stable game that is not conditions for convergence in potential games, the
strictly stable. conditions for convergence in stable games demand
additional structure on the adjustment process
Example 23 Suppose that A satisfies the interior (Hofbauer and Sandholm 2006a).
ESS condition (18). Then (22) holds strictly, so
F(x) = Ax is a strictly stable game. Examples
satisfying this condition include the Hawk-Dove Perturbed Best Response Dynamics in
game (Example 3) and any good Rock-Paper- Supermodular Games
Scissors game (Example 4). Supermodular games are defined by the property
that higher choices by one’s opponents (with respect
Example 24 A war of attrition (Bishop and to the natural ordering on S = {1, . . . , n}) make
Cannings 1978) is a symmetric normal form one’s own higher strategies look relatively more
Evolutionary Game Theory 589
Evolutionary Game Theory, Table 3 Lyapunov functions for five basic deterministic dynamics in stable games
Dynamic Lyapunov function for stable games
P x
Replicator H x ðxÞ ¼ i Sðx Þ xi log xii
X
X
Logit ~ ðxÞ ¼ max y0 F
G ^ ð xÞ y logyi þ i S xi logxi
y intðXÞ iS i
desirable. Let the matrix S R(n 1) n satisfy and maximal Nash equilibria (Topkis 1979). One
Sij = 1 if j > i and Sij = 0 otherwise, so that can take advantage of the monotoncity of best
Sx Rn 1 is the “decumulative distribution func- responses in studying evolutionary dynamics by
tion” corresponding to the “density function” x. The appealing to the theory of monotone dynamical
population game F is a supermodular game if it systems (Smith 1995). To do so, one needs to
exhibits strategic complementarities: focus on dynamics that respect the monotonicity
of best responses and that also are smooth, so that
If Sy Sx, then Fiþ1 ðyÞ Fi ðyÞ the the theory of monotone dynamics can be
Fiþ1 ðxÞ Fi ðxÞ for all i applied. It turns out that the logit dynamic satisfies
< n and x X: (23) these criteria; so does any perturbed best response
dynamic defined in terms of stochastic payoff
If F is smooth, condition (23) is equivalent to perturbations. In supermodular games, these
dynamics define cooperative differential equa-
@ ðFiþ1 Fi Þ tions; consequently, solutions of these dynamics
ðxÞ 0 for all i, j
@ ejþ1 ej from almost every initial condition converge to an
approximate Nash equilibrium (Hofbauer and
< n and x X: (24) Sandholm 2007).
replicator dynamic converges in games with a stable, but not strictly stable (Example 22). In
strictly dominant strategy, and by iterating this argu- this case, for each interior Nash equilibrium x,
ment, one can show that this dynamic converges to the function H x is a constant of motion for the
equilibrium in any game that can be solved by replicator dynamic: its value is fixed along every
iterative deletion of strictly dominated strategies. In interior solution trajectory.
fact, this argument is not specific to the replicator
dynamic, but can be shown to apply to a range of Example 26 Suppose that agents are randomly
dynamics based on imitation (Hofbauer and Weibull matched to play the symmetric zero-sum game A,
1996; Samuelson and Zhang 1992). Even in games given by
which are not dominance solvable, arguments of a
1 2 3 4
similar flavor can be used to restrict the long run
1 0 1 0 1
behavior of imitative dynamics to better-reply 2 1 0 1 0
closed sets (Ritzberger and Weibull 1995); see sec- 3 0 1 0 1
tion “Convergence to Equilibria and to Better-Reply 4 1 0 1 0 .
Closed Sets” for a related discussion.
While the analysis here has focused on imita-
The Nash equilibria of F(x) = Ax are the
tive dynamics, it is natural to expect that elimina-
points on the line segment NE connecting states
tion of dominated strategies will extend to any 1 1
1 1
reasonable evolutionary dynamic. But we will 2 , 0, 2 , 0 and 0, 2 , 0, 2 , asegment that
passes
through the barycenter x ¼ 4 , 4 , 4 , 4 . Figure 3
1 1 1 1
see in section “Survival of Dominated Strategies”
shows solutions to the replicator dynamic that lie
that this is not the case: the elimination of domi-
on the level set H x ðxÞ ¼ :58. Evidently, each of
nated strategies that obtains under imitative
these solutions forms a closed orbit.
dynamics is the exception, not the rule.
Although solution trajectories of the replicator
Nonconvergence dynamic do not converge in zero-sum games, it
The previous section revealed that when certain can be proved that the the time average of each
global structural conditions on payoffs are satis- solution trajectory converges to Nash equilibrium
fied, one can establish global convergence to (Schuster et al. 1981).
equilibrium under various classes of evolution- The existence of a constant of motion is not the
ary dynamics. Of course, if these conditions are only conservative property enjoyed by replicator
not met, convergence cannot be guaranteed. In dynamics for symmetric zero-sum games: these
this section, we offer examples to illustrate some dynamics are also volume preserving after an
of the possibilities for nonconvergent limit appropriate change of speed or change of measure
behavior. (Akin and Losert 1984; Hofbauer 1995a).
until converging to its minimizer, the unique Nash Example 27 Suppose that players are randomly
equilibrium x. matched to play the following symmetric normal
Now, random matching in a symmetric zero- form game (Hofbauer and Sigmund 1998;
sum game generates a population game that is Hofbauer and Swinkels 1996):
Evolutionary Game Theory 591
Evolutionary Game
Theory, Fig. 3 Solutions
of the replicator dynamic in
a zero-sum game. The
solutions pictured lie on the
level set H x ðxÞ ¼ :58
4 10 2 2 0 .
equilibrium of Fe is the barycenter x, it follows that
solutions from most initial conditions converge to
an attractor far from any Nash equilibrium. Figure 5 presents a solution to the replicator
Other examples of games in which many dynamic for this game from initial condition
dynamics fail to converge include monocyclic x0 = (.24, .26, .25, .25). This solution spirals
592 Evolutionary Game Theory
Evolutionary Game
Theory, Fig. 4 Solutions
of the Smith dynamic in (a)
the potential game F0; (b)
the perturbed potential
game Fe , e ¼ 10
1
clockwise about x. Near the rightmost point of each Hofbauer and Sandholm 2006b) prove that dynam-
circuit, where the value of x3 gets close to zero, ics that satisfy continuity (C), Nash stationarity
solutions sometimes proceed along an “outside” (NS), and positive correlation (PC) and that are
path on which the value of x3 surpasses .6. But not based exclusively on imitation must fail to
they sometimes follow an “inside” path on which eliminate strictly dominated strategies in some
x3 remains below .4, and at other times do something games. Thus, evolutionary support for a basic
in between. Which of these alternatives occurs is rationality criterion is more tenuous than the results
difficult to predict from approximate information for imitative dynamics suggest.
about the previous behavior of the system.
While the game in Example 28 has a compli- Example 29 Figure 6a presents the Smith
cated payoff structure, in multipopulation con- dynamic for “bad RPS with a twin”:
texts one can find chaotic evolutionary dynamics R P S T
in very simple games (Sato et al. 2002). R 0 2 1 1
P 1 0 2 2
S 2 1 0 0
Survival of Dominated Strategies T 2 1 0 0 .
In section “Imitation Dynamics in Dominance
Solvable Games,” we saw that dynamics based
on imitation eliminate strictly dominated strategies
along solutions from interior initial conditions. The Nash equilibria of this game are the states on
While this result seems unsurprising, it is actually line segment NE ¼ x X : x ¼ 13 , 13 , c, 13 c ,
extremely fragile: (Berger and Hofbauer 2006; which is a repellor under the Smith dynamic. Under
Evolutionary Game Theory 593
Deterministic Approximation
Stochastic Dynamics In section “Revision Protocols,” we
Ndefined
the
Markovian evolutionary process Xt from a
In section “Revision Protocols” wedefined the revision protocol r, a population game F, and a
stochastic evolutionary process XNt in terms of finite population size N. In section “Mean
594 Evolutionary Game Theory
x_i ¼ V Fi ðxÞ X3
x3
X
¼ xj rji ðFðxÞ, xÞ
jS
X
xi rij ðFðxÞ, xÞ: (M)
X2
jS x2
X1
The
basic link between the Markov process
XNt and its mean dynamic (M) is provided by x1
Kurtz’s Theorem (Kurtz 1970), variations and
extensions of which have been offered in a num- X0=x0
ber of game-theoretic contexts (Benaïm and
Weibull 2003; Binmore and Samuelson 1997; Evolutionary Game Theory, Fig. 7 Deterministic
Börgers and Sarin 1997; Boylan 1995; Sandholm approximation of the Markov process XNt
2003; Tanabe 2006).nConsider the sequence of
o1
Markov processes XNt t0 , supposing the same expected increments. The law of large
N¼N 0
numbers
therefore suggests that the change in
that the initial conditions XN0 converge
to
XNt during this interval should be almost
x0 X. Let {xt}t 0 be the solution to the mean
completely
determined by the expected motion
dynamic (M) starting from x0. Kurtz’s Theorem
of XNt , as described by the mean dynamic (M).
tells us that for each finite time horizon T < 1
and error bound e > 0, we have that
Convergence to Equilibria and to Better-Reply
!
Closed Sets
lim P sup XNt xt < e ¼ 1: (25) Stochastic models of evolution can also be used to
N!1 t ½0, T
address directly the question of convergence to
equilibrium (Dindoš and Mezzetti 2006; Fried-
Thus, when the population size N is large, man and Mezzetti 2001; Josephson 2008; Joseph-
nearly
all sample paths of the Markov process son and Matros 2004; Kukushkin 2004; Monderer
XNt stay within e of a solution of the mean and Shapley 1996; Sandholm 2001a; Peyton
dynamic (M) through time T. By choosing Young 1993a). Suppose that a society of agents
N large enough, we can ensure that with probabil- is randomly matched to play an (asymmetric)
ity close to one, XNt and xt differ by no more than e normal form game that is weakly acyclic in better
for all times t between 0 and T (Fig. 7). replies: from each strategy profile, there exists a
The intuition for this result comes from the law sequence of profitable unilateral deviations lead-
of large numbers. At each revision opportunity,
N ing to a Nash equilibrium. If agents switch to
the increment in the process Xt is strategies that do at least as well as their current
stochastic. Still, at most population states the one against the choices of random samples of
expected number of revision opportunities that opponents, then the society will eventually escape
arrive during the brief time interval I = [t, t + dt] any better-response cycle, ultimately settling
is large – in particular, of order Ndt. Since each upon a Nash equilibrium.
opportunity leads to an increment of the state of Importantly, many classes of normal form
size N1 , the size of the overall change in the state games are weakly acyclic in better replies: these
during time interval I is of order dt. Thus, during include potential games, dominance solvable
this interval there are a large number of revision games, certain supermodular games, and certain
opportunities, each following nearly the same aggregative games, in which each agent’s payoffs
transition probabilities, and hence having nearly only depend on opponents’ behavior through a
Evolutionary Game Theory 595
subsets of the set of strategy profiles that cannot lim P XkN, e ¼ yj X0N, e ¼ x
k!1
be escaped without a player switching to an
inferior strategy (cf. Basu and Weibull 1991; ¼ mN, e ðyÞ for all x, y xN :
Ritzberger and Weibull 1995). Stochastic Second, mN, e almost surelyndescribes
o the limit-
better-reply procedures must lead to a cluster of N, e
ing empirical distribution of Xt :
population states corresponding to a better-reply
closed set; once the society enters such a cluster,
1XK 1
it never departs. P lim 1fXN, e Ag ¼ mN , e ðAÞÞ ¼ 1 for anyA xN :
K!1 K k
k¼0
Stochastic Stability and Equilibirum Selection Thus, if most of the mass in the stationary
To this point, we used stochastic evolutionary distribution mN, e were placed on a single state,
dynamics to provide foundations for deterministic then this state would provide a unique prediction
dynamics and to address the question of conver- of long run behavior.
gence to equilibrium. But stochastic evolutionary With this motivation, consider a sequence of
nn o1 o
dynamics introduce an entirely new possibility: Markov chains X N, e
parametrized
k
that of obtaining unique long-run predictions of k¼0 e ð0eÞ
play, even in games with multiple locally stable by noise levels e that approach zero. Population
equilibria. This form of analysis, which we con- state x xN is said to be stochastically stable if it
sider next, was pioneered by Foster and Peyton retains positive weight in the stationary distribu-
Young (1990), Kandori et al. (1993), and Peyton tions of these Markov chains as e becomes arbi-
Young (1993a), building on mathematical tech- trarily small:
niques due to Freidlin and Wentzell (1998).
lim mN, e ðxÞ > 0:
e!0
Stochastic Stability
To minimize notation, let us describe the evolu- When the stochastically stable state is unique,
tion ofnbehavior it offers a unique prediction of play that is relevant
o1 using a discrete-time Markov
N, e over sufficiently long time spans.
chain X k on xN, where the parameter
k¼0
e > 0 represents the level of “noise” in agents’
decision procedures. The noise ensures that the Bernoulli Arrivals and Mutations
Markov chain is irreducible and aperiodic: any Following the approach of many early contribu-
state in xN can be reached from any other, and tors to the literature, let us consider a model of
there is positive probability that a period passes stochastic evolution based on Bernoulli arrivals
without a change in the state. of revision opportunities and best responses with
mutations. The former assumption means that
n Under o these conditions, the Markov chain during each discrete time period, each agent has
N, e
Xk admits a unique stationary distribution,
probability y (0 , 1] of receiving an opportu-
mN , e, a measure on the state space xN that is nity to update his strategy. This assumption differs
invariant under the Markov chain: than the one we proposed in section “Revision
X
Protocols”; the key new implication is that all
N, e
mN, e ðxÞP Xkþ1 ¼ yj XkN, e ¼ x agents may receive revision opportunities simul-
x XN
taneously. (Models that assume this directly gen-
¼ mN, e ðyÞ for all y xN : erate similar results.) The latter assumption posits
596 Evolutionary Game Theory
that when an agent receives a revision opportu- To determine the stochastically stable state, we
nity, he plays a best response to the current strat- must compute and compare the “improbabilities” of
egy distribution with probability 1 e, and these transitions. If the current state is eH, a transi-
chooses a strategy at random with probability e. tion to eS requires mutations to cause roughly NxS
agents to switch to the suboptimal strategy S, send-
Example 30 Suppose that a population of ing the population into the basin of attraction of eS,
N agents is randomly matched to play the Stag the probability of this event is of order eNxS . Simi-
Hunt game (Example 2): larly, to transit from
eS to eH, mutations must cause
roughly NxH ¼ N 1 xS to switch from S to H;
this probability of this event is of order eN ð1xS Þ.
H S
H h h Which of these rare events is more likely ones
S 0 s .
depends on whether xS is greater than or less than 12 :
If s > 2h, so that xS < 12, then eNxS is much smaller
than eNð1xS Þ , when e is small; thus, state eS is
Since s > h > 0, hunting hare and hunting
stag are both symmetric pure equilibria; the stochastically stable (Fig. 8a). If instead s < 2h, so
that xS > 12, then eN ð1xS Þ < eNxS so eH is stochas-
game also admits the symmetric mixed equilib-
rium x ¼ xH , xS ¼ sh h
s , s .
tically stable (Fig. 8b).
If more than fraction xH of the agents hunt hare, These calculations show that risk dominance –
then hare is the unique best response, while if being the optimal response against a uniformly
more than fraction xS of the agents hunt stag, randomizing opponent – drives stochastic stabil-
then stag is the unique best response. Thus, ity 2 2 games. In particular, when s < 2h, so
under any deterministic dynamic that respects that risk dominance and payoff dominance dis-
payoffs, the mixed equilibrium x divides the agree, stochastic stability favors the former over
state space into two basins of attraction, one for the latter.
each of the two pure equilibria. This example illustrates how under Bernoulli
Now consider our stochastic evolutionary pro- arrivals and mutations, stochastic stability analy-
cess. If the noise level e is small, this process sis is based on mutation counting: that is, on
typically behaves like a deterministic process, determining how many simultaneous mutations
moving quickly toward one of the two pure states, are required to move from each equilibrium into
eH = (1, 0) or es = (0, 1), and remaining there the basin of attraction of each other equilibrium.
for some time. But since the process is ergodic, In games with more than two strategies, complet-
it will eventually leave the pure state it reaches ing the argument becomes more complicated than
first, and in fact will switch from one pure state to in the example above: the analysis, typically
the other infinitely often. based on the tree-analysis techniques of Freidlin
Evolutionary Game
a e Nx*s e N(1–x*s)
Theory, Fig. 8 Equi-
librium selection via
mutation counting in Stag
Hunt games eH X*s = 2 es
5
h = 2, s = 5
e Nx*s e N(1–x*s)
b
eH X*s = 2 es
3
h = 2, s = 3
Evolutionary Game Theory 597
and Wentzell (1998) and Peyton Young (1993a), Another important criticism of the stochastic
requires one to account for the relative difficulties stability literature concerns the length of time
of transitions between all pairs of equilibria. Elli- needed for its predictions to become relevant
son (2000) develops a streamlined method of (Binmore et al. 1995b; Ellison 1993). If the pop-
computing the stochastically stable state based ulation size N is large and the mutation rate e is
on radius-coradius calculations; while this small, then the probability ecN that a transition
approach is not always sufficiently fine to yield a between equilibria occurs during given period is
complete analysis, in the cases where it works it miniscule; the waiting time between transitions is
can be considerably simpler to apply than the tree- thus enormous. Indeed, if the mutation rate falls
analysis method. over time, or if the population size grows over
These techniques have been employed suc- time, then ergodicity may fail, abrogating equilib-
cessfully to variety of classes of games, including rium selection entirely (Robles 1998; Sandholm
pure coordination games, supermodular games, and Pauzner 1998). These analyses suggest that
games satisfying “bandwagon” properties, and except in applications with very long time hori-
games with equilibria that satisfy generalizations zons, the unique predictions generated by ana-
of risk dominance (Ellison 2000; Kandori and lyses of stochastic stability may be
Rob 1995, 1998; Maruta 1997). A closely related inappropriate, and that modelers would do better
literature uses stochastic stability as a basis for to focus on history-dependent predictions of the
evaluating traditional solution concepts for sort provided by deterministic models. At the
extensive form games (Hart 2002; Jacobsen same time, there are frameworks in which sto-
et al. 2001; Kim and Sobel 1995; Kuzmics chastic stability becomes relevant much more
2004; Nöldeke and Samuelson 1993; Samuelson quickly. The most important of these are local
1994, 1997). interaction models, which we discuss in section
A number of authors have shown that varia- “Local Interaction.”
tions on the Bernoulli arrivals and mutations
model can lead to different equilibrium selection Poisson Arrivals and Payoff Noise
results. For instance, Robson and Vega-Redondo Combining the assumption of Bernoulli arrivals
(1996) and Vega-Redondo (1996) show that if of revision opportunities with that of best
choices are determined from the payoffs from a responses with mutations creates a model in
single round of matching (rather than from which the probabilities of transitions between
expected payoffs), the payoff dominant equilib- equilibria are easy to compute: one can focus on
rium rather than the risk dominant equilibrium is events in which large numbers of agents switch to
selected. If choices depend on strategies’ relative a suboptimal strategy at once, each doing so with
performances rather than their absolute perfor- the same probability. But the simplicity of this
mances, then long run behavior need not resemble argument also highlights the potency of the
a Nash equilibrium at all (Bergin and Bernhardt assumptions behind it.
2004; Rhode and Stegeman 1996; Sandholm An appealing alternative approach is to model
1998; Stegeman and Rhode 2004). Finally, if the stochastic evolution using Poisson arrivals of
probability of mutation depends on the current revision opportunities and payoff noise (Binmore
population state, then any recurrent set of the and Samuelson 1997; Binmore et al. 1995b;
unperturbed process (e. g., any pure equilibrium Blume 1997, 2003; Dokumaci and Sandholm
of a coordination game) can be selected in the 2007b; Maruta 2002; Myatt and Wallace 2003;
long run if the mutation rates are specified in an Ui 1998; van Damme and Weibull 2002; Peyton
appropriate way (Bergin and Lipman 1996). This Young 1998b). (One can achieve similar effects
last result suggests that mistake probabilities by looking at models defined in terms of stochas-
should be provided with an explicit foundation, a tic differential equations; see (Beggs 2002;
topic we take up in section “Poisson Arrivals and Cabrales 2000; Foster and Peyton Young 1990;
Payoff Noise.” Fudenberg and Harris 1992; Imhof 2005).) By
598 Evolutionary Game Theory
allowing revision opportunities to arrive in con- the edges of the simplex, implying that the prob-
tinuous time, as we did in section “Revision Pro- abilities of transitions between vertices can be
tocols,” we ensure that agents do not receive determined using birth-and-death chain methods
opportunities simultaneously, ruling out the (Nowak et al. 2004). As a consequence, one can
simultaneous mass revisions that drive the reduce the problem of finding the stochastically
Bernoulli arrival model. (One can accomplish the same end using a discrete time model by assuming that one agent updates during each period; the resulting process is a random time change away from the Poisson arrivals model.) Under Poisson arrivals, transitions between equilibria occur gradually, as the population works its way out of basins of attraction one agent at a time. In this context, the mutation assumption becomes particularly potent, ensuring that the probabilities of suboptimal choices do not vary with their payoff consequences. Under the alternative assumption of payoff noise, one supposes that agents play best responses to payoffs that are subject to random perturbations drawn from a fixed multivariate distribution. In this case, suboptimal choices are much more likely near basin boundaries, where the payoffs of second-best strategies are not much less than those of optimal ones, than they are at stable equilibria, where payoff differences are larger.

Evidently, assuming Poisson arrivals and payoff noise means that stochastic stability cannot be assessed by way of mutation counting. To determine the unlikelihood of escaping from an equilibrium’s basin of attraction, one must not only account for the “width” of the basin of attraction (i.e., the number of suboptimal choices needed to escape it), but also for its “depth” (the unlikelihood of each of these choices). In two-strategy games this is not difficult to accomplish: in this case the evolutionary process is a birth-and-death chain, and its stationary distribution can be expressed using an explicit formula. Beyond this case, one can employ the Freidlin and Wentzell (1998) machinery, although doing so tends to be computationally demanding.

This computational burden is less in models that retain Poisson arrivals, but replace perturbed optimization with decision rules based on imitation and mutation (Fudenberg and Imhof 2006). Because agents imitate successful opponents, the population spends the vast majority of periods on […] stable state in an n strategy coordination game to that of computing the limiting stationary distribution of an n state Markov chain.

Stochastic Stability via Large Population Limits

The approach to stochastic stability followed thus far relies on small noise limits: that is, on evaluating the limit of the stationary distributions μ^{N,ε} as the noise level ε approaches zero. Binmore and Samuelson (1997) argue that in the contexts where evolutionary models are appropriate, the amount of noise in agents’ decisions is not negligible, so that taking the low noise limit may not be desirable. At the same time, evolutionary models are intended to describe behavior in large populations, suggesting an alternative approach: that of evaluating the limit of the stationary distributions μ^{N,ε} as the population size N grows large.

In one respect, this approach complicates the analysis. When N is fixed and ε varies, each stationary distribution μ^{N,ε} is a measure on the fixed state space X^N = {x ∈ X : Nx ∈ Z^n}. But when ε is fixed and N varies, the state space X^N varies as well, and one must introduce notions of weak convergence of probability measures in order to define stochastic stability.

But in other respects taking large population limits can make analysis simpler. We saw in section “Deterministic Approximation” that by taking the large population limit, we can approximate the finite-horizon sample paths of the stochastic evolutionary process {X_t^{N,ε}} by solutions to the mean dynamic (M). Now we are concerned with infinite horizon behavior, but it is still reasonable to hope that the large population limit will again reduce some of our computations to calculus problems.

As one might expect, this approach is easiest to follow in the two-strategy case, where for each fixed population size N, the evolutionary process {X_t^{N,ε}} is a birth-and-death chain. When one
takes the large population limit, the formulas for waiting times and for the stationary distribution can be evaluated using integral approximations (Benaïm and Weibull 2003; Binmore and Samuelson 1997; Blume 2003; Peyton Young 1998b). Indeed, the approximations so obtained take an appealingly simple form (Sandholm 2007d).

The analysis becomes more complicated beyond the two-strategy case, but certain models have proved amenable to analysis. For instance, Fudenberg and Imhof (2006) characterizes large population stochastic stability in models based on imitation and mutation. Imitation ensures that the population spends nearly all periods on the edges of the simplex X, and the large population limit makes evaluating the probabilities of transitions along these edges relatively simple.

If one supposes that agents play best responses to noisy payoffs, then one must account directly for the behavior of the process {X_t^{N,ε}} in the interior of the simplex. One possibility is to combine the deterministic approximation results from section “Deterministic Approximation” with techniques from the theory of stochastic approximation (Benaïm 1998; Benaïm and Hirsch 1999) to show that the large N limiting stationary distribution is concentrated on attractors of the mean dynamic. By combining this idea with convergence results for deterministic dynamics from section “Global Convergence,” Hofbauer and Sandholm (2007) shows that the limiting stationary distribution must be concentrated around equilibrium states in potential games, stable games, and supermodular games.

The results in Hofbauer and Sandholm (2007) do not address the question of equilibrium selection. However, for the specific case of logit evolution in potential games, a complete characterization of the large population limit of the process {X_t^{N,ε}} has been obtained (Benaïm and Sandholm 2007). By combining deterministic approximation results, which describe the usual behavior of the process within basins of attraction, with a large deviations analysis, which characterizes the rare escapes from basins of attraction, one can obtain a precise asymptotic formula for the large N limiting stationary distribution. This formula accounts both for the typical procession of the process along solutions of the mean dynamic, and for the rare sojourns of the process against this deterministic flow.

Local Interaction

All of the game dynamics considered so far have been based implicitly on the assumption of global interaction: each agent’s payoffs depend directly on all agents’ actions. In many contexts, one expects to the contrary that interactions will be local in nature: for instance, agents may live in fixed locations and interact only with neighbors. In addition to providing a natural fit for these applications, local interaction models respond to some of the criticisms of the stochastic stability literature. At the same time, once one moves beyond relatively simple cases, local interaction models become exceptionally complicated, and so lend themselves to methods of analysis very different from those considered thus far.

Stochastic Stability and Equilibrium Selection Revisited

In section “Stochastic Stability and Equilibrium Selection,” we saw that the prediction of risk dominant equilibrium play provided by stochastic stability models is subverted by the waiting-time critique: namely, that the length of time required before this equilibrium is reached may be extremely long. Ellison (1993, 2000) shows that if interactions are local, then selection of the risk dominant equilibrium persists, and waiting times are no longer an issue.

Example 31 In the simplest local interaction model, a population of N agents are located at N distinct positions around a circle. During each period of play, each agent plays the Stag Hunt game (Examples 2 and 30) with his two nearest neighbors, following the same action against both of his opponents. If we suppose that s ∈ (h, 2h), so that hunting hare is the risk dominant strategy, then by definition, an agent whose neighbors play different strategies finds it optimal to choose H himself.
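To make the contagion logic of this example concrete, here is a minimal Python sketch of the circle model under the revision protocol described in the next paragraph (one randomly chosen agent revises per period, best responding to his two neighbors except for rare mutations). The Stag Hunt payoffs s = 3 and h = 2 (chosen only so that s lies in (h, 2h)), the population size, the mutation rate, and the random seed are illustrative assumptions, not values taken from the text.

import random

def payoff(a, b, s=3.0, h=2.0):
    # Stag Hunt payoffs: two stag hunters earn s each; a hare hunter earns h
    # regardless of the opponent; a stag hunter facing a hare hunter earns 0.
    if a == "S":
        return s if b == "S" else 0.0
    return h

def best_response(left, right, s=3.0, h=2.0):
    # Total payoff against the two neighbors on the circle.
    u_s = payoff("S", left, s, h) + payoff("S", right, s, h)
    u_h = payoff("H", left, s, h) + payoff("H", right, s, h)
    return "S" if u_s > u_h else "H"

def periods_until_all_H(N=20, eps=0.001, max_periods=10**6, seed=0):
    # One randomly chosen agent revises per period (a discrete-time stand-in
    # for Bernoulli/Poisson arrivals); with probability eps the revision is a
    # mutation to a random strategy, otherwise a best response to the current
    # choices of the agent's two neighbors.
    rng = random.Random(seed)
    state = ["S"] * N                     # start at the all-S equilibrium
    for t in range(1, max_periods + 1):
        i = rng.randrange(N)
        if rng.random() < eps:
            state[i] = rng.choice(["S", "H"])
        else:
            state[i] = best_response(state[i - 1], state[(i + 1) % N])
        if all(a == "H" for a in state):
            return t
    return None

print(periods_until_all_H())

With these assumptions, a single mutation to H typically sets off the chain reaction described in the following paragraphs, so the run reaches the all-H state after a modest number of revision opportunities.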
Now suppose that there are Bernoulli arrivals of revision opportunities, and that decisions are based on best responses and rare mutations. To move from the all-S state to the all-H state, it is enough that a single agent mutates from S to H. This one mutation begins a chain reaction: the mutating agent’s neighbors respond optimally by switching to H themselves; they are followed in this by their own neighbors; and the contagion continues until all agents choose H. Since a single mutation is always enough to spur the transition from all S to all H, the expected wait before this transition is small, even when the population is large.

In contrast, the transition back from all H to all S is extremely unlikely. Even if all but one of the agents simultaneously mutate to S, the contagion process described above will return the population to the all-H state. Thus, while the transition from all-S to all-H occurs quickly, the reverse transition takes even longer than in the global interaction setting.

The local interaction approach to equilibrium selection has been advanced in a variety of directions: by allowing agents to choose their locations (Ely 2002), or to pay a cost to choose different strategies against different opponents (Goyal and Janssen 1997), and by basing agents’ decisions on the attainment of aspiration levels (Anderlini and Ianni 1996), or on imitation of successful opponents (Alós-Ferrer and Weidenholzer 2006a, b). A portion of this literature, initiated by Blume, develops connections between local interaction models in evolutionary game theory and models from statistical mechanics (Blume 1993, 1995, 1997; Kosfeld 2002; Miękisz 2004). These models provide a point of departure for research on complex spatial dynamics in games, which we consider next.

Complex Spatial Dynamics

The local interaction models described above address the questions of convergence to equilibrium and selection among multiple equilibria. In the cases where convergence and selection results obtain, behavior in these models is relatively simple, as most periods are spent with most agents coordinating on a single strategy. A distinct branch of the literature on evolution and local interaction focuses on cases with complex dynamics, where instead of settling quickly into a homogeneous, static configuration, behavior remains in flux, with multiple strategies coexisting for long periods of time.

Example 32 Cooperating is a dominated strategy in the Prisoner’s Dilemma, and is not played in equilibrium in finitely repeated versions of this game. Nevertheless, a pair of Prisoner’s Dilemma tournaments conducted by Axelrod (1984) were won by the strategy Tit-for-Tat, which cooperates against cooperative opponents and defects against defectors. Axelrod’s work spawned a vast literature aiming to understand the persistence of individually irrational but socially beneficial behavior.

To address this question, Nowak and May (Nowak 2006; Nowak et al. 1994a, b; Nowak and May 1992, 1993) consider a population of agents who are repeatedly matched to play the Prisoner’s Dilemma

        C    D
   C    1   −ε
   D    g    0 ,

where the greedy payoff g exceeds 1 and ε > 0 is small. The agents are positioned on a two-dimensional grid. During each period, each agent plays the Prisoner’s Dilemma with the eight agents in his (Moore) neighborhood. In the simplest version of the model, all agents simultaneously update their strategies at the end of each period. If an agent’s total payoff that period is as high as that of any of his neighbors, he continues to play the same strategy; otherwise, he switches to the strategy of the neighbor who obtained the highest payoff.

Since defecting is a dominant strategy in the Prisoner’s Dilemma, one might expect the local interaction process to converge to a state at which all agents defect, as would be the case in nearly any model of global interaction. But while an agent is always better off defecting himself, he also is better off the more of his neighbors cooperate; and since evolution is based on imitation, cooperators tend to have more cooperators as neighbors than do defectors.
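The synchronous imitate-the-best dynamic of Example 32 is straightforward to simulate. The following Python sketch is one way to do so, not the authors’ code: the C-against-D payoff is read as −ε so that defection is strictly dominant, the 75% initial cooperator share follows the figure captions, the grid is kept smaller than the 100 × 100 grid used in the snapshots discussed next purely for speed, and the value of ε, the number of periods, the random seed, and the arbitrary tie-breaking among equally successful neighbors are all illustrative choices.

import random

def pd_payoff(a, b, g=1.65, eps=0.01):
    # Prisoner's Dilemma of Example 32: C vs C -> 1, C vs D -> -eps,
    # D vs C -> g, D vs D -> 0, with g > 1 and eps > 0 small.
    if a == "C":
        return 1.0 if b == "C" else -eps
    return g if b == "C" else 0.0

NBRS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def step(grid, g=1.65, eps=0.01):
    # One synchronous update: every agent totals its payoff against its eight
    # Moore neighbors (torus), keeps its strategy if no neighbor did strictly
    # better, and otherwise imitates the most successful neighbor.
    L = len(grid)
    score = [[sum(pd_payoff(grid[i][j], grid[(i + di) % L][(j + dj) % L], g, eps)
                  for di, dj in NBRS)
              for j in range(L)] for i in range(L)]
    new = [row[:] for row in grid]
    for i in range(L):
        for j in range(L):
            best_strategy, best_score = grid[i][j], score[i][j]
            for di, dj in NBRS:
                ni, nj = (i + di) % L, (j + dj) % L
                if score[ni][nj] > best_score:
                    best_score, best_strategy = score[ni][nj], grid[ni][nj]
            new[i][j] = best_strategy
    return new

def cooperator_share(g, L=50, periods=50, seed=0):
    rng = random.Random(seed)
    grid = [["C" if rng.random() < 0.75 else "D" for _ in range(L)]
            for _ in range(L)]
    for _ in range(periods):
        grid = step(grid, g)
    return sum(row.count("C") for row in grid) / (L * L)

for g in (1.7, 1.55, 1.65):   # the values used in Figs. 9, 10, and 11
    print(g, round(cooperator_share(g), 3))

The three values of g supplied here are the ones used in the snapshots below, and they fall in the three parameter regions discussed in the next paragraphs.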
In Figs. 9, 10, and 11, we present snapshots of the local interaction process for choices of the greedy payoff g from each of three distinct parameter regions. If g > 5/3 (Fig. 9), the process quickly converges to a configuration containing a few rectangular islands of cooperators in a sea of defectors; the exact configuration depending on the initial conditions. If instead g < 8/5 (Fig. 10), the process moves towards a configuration in which agents other than those in a “web” of defectors cooperate. But for g ∈ (8/5, 5/3) (Fig. 11), the system evolves in a complicated fashion, with clusters of cooperators and of defectors forming, expanding, disappearing, and reforming. But while the configuration of behavior never stabilizes, the proportion of cooperators appears to settle down to about .30.

The specification of the dynamics considered above, based on simultaneous updating and certain imitation of the most successful neighbor, presents a relatively favorable environment for cooperative behavior. Nevertheless, under Poisson arrivals of revision opportunities, or probabilistic decision rules, or both, cooperation can persist for very long periods of time for values of g significantly larger than 1 (Nowak et al. 1994a, b).

The literature on complex spatial dynamics in evolutionary game models is large and rapidly growing, with the evolution of behavior in the spatial Prisoners’ Dilemma being the single most-studied environment. While analyses are typically based on simulations, analytical results have been obtained in some relatively simple settings (Eshel et al. 1998; Herz 1994).

Recent work on complex spatial dynamics has considered games with three or more strategies, including Rock-Paper-Scissors games, as well as public good contribution games and Prisoner’s Dilemmas with voluntary participation. Introducing more than two strategies can lead to qualitatively novel dynamic phenomena, including large-scale

Evolutionary Game Theory, Fig. 9 Local interaction in a Prisoner’s Dilemma; greedy payoff g = 1.7. In Figs. 9, 10, and 11, agents are arrayed on a 100 × 100 grid with periodic boundaries (i.e., a torus). Initial conditions are random with 75% cooperators and 25% defectors. Agents update simultaneously, imitating the neighbor who earned the highest payoff. Blue cells represent cooperators who also cooperated last period, green cells represent new cooperators; red cells represent defectors who also defected last period, yellow cells represent new defectors. (Figs. 9, 10, and 11 created using VirtualLabs (Hauert 2007))
Evolutionary Game Theory, Fig. 10 Local interaction in a Prisoner’s Dilemma; greedy payoff g = 1.55
Evolutionary Game Theory, Fig. 11 Local interaction in a Prisoner’s Dilemma; greedy payoff g = 1.65
spatial cycles and traveling waves (Hauert et al. 2002; Szabó and Hauert 2002; Tainaka 2001). In addition to simulations, the analysis of complex spatial dynamics is often based on approximation techniques from non-equilibrium statistical physics, and much of the research on these dynamics has appeared in the physics literature. Szabó and Fáth (2007) offers a comprehensive survey of work on this topic.

Applications

Evolutionary game theory was created with biological applications squarely in mind. In the prehistory of the field, Fisher (1930) and Hamilton (1967) used game-theoretic ideas to understand the evolution of sex ratios. Maynard Smith (1972, 1974, 1982; Maynard Smith and Price 1973) introduced his definition of ESS as a way of understanding ritualized animal conflicts. Since these early contributions, evolutionary game theory has been used to study a diverse array of biological questions, including mate choice, parental investment, parent-offspring conflict, social foraging, and predator-prey systems. For overviews of research on these and other topics in biology, see (Dugatkin and Reeve 1998; Hammerstein and Selten 1994).

The early development of evolutionary game theory in economics was motivated primarily by theoretical concerns: the justification of traditional game-theoretic solution concepts, and the development of methods for equilibrium selection in games with multiple stable equilibria. More recently, evolutionary game theory has been applied to concrete economic environments, in some instances as a means of contending with equilibrium selection problems, and in others to obtain an explicitly dynamic model of the
phenomena of interest. Of course, these applications are most successful when the behavioral assumptions that underlie the evolutionary approach are appropriate, and when the time horizon needed for the results to become relevant corresponds to the one germane to the application at hand.

Topics in economic theory studied using the methods of evolutionary game theory range from behavior in markets (Agastya 2004; Alós-Ferrer 2005; Alós-Ferrer et al. 2000, 2006; Ania et al. 2002; Ben-Shoham et al. 2004; Droste et al. 2002; Hopkins and Seymour 2002; Lahkar 2007; Vega-Redondo 1997), to bargaining and hold-up problems (Binmore et al. 2003; Burke and Peyton Young 2001; Dawid and Bentley MacLeod 2008; Ellingsen and Robles 2002; Robles 2008; Tröger 2002; Peyton Young 1993b, 1998a, b), to externality and implementation problems (Cabrales 1999; Cabrales and Ponti 2000; Mathevet 2007; Sandholm 2002, 2005b, 2007b), to questions of public good provision and collective action (Myatt and Wallace 2007, 2008a, b). The techniques described here are being applied with increasing frequency to problems of broader social science interest, including residential segregation (Bøg 2006; Dokumaci and Sandholm 2007a; Möbius 2000; Peyton Young 1998b, 2001; Zhang 2004a, b) and cultural evolution (Bisin and Verdier 2001; Kuran and Sandholm 2008), and to the study of behavior in transportation and computer networks (Fischer and Vöcking 2006; Monderer and Shapley 1996; Nagurney and Zhang 1997; Sandholm 2001b, 2003, 2005b; Smith 1984). A proliferating branch of research extends the approaches described in this article to address the evolution of structure and behavior in social networks; a number of recent books (Goyal 2007; Jackson 2017; Vega-Redondo 2007) offer detailed treatments of work in this domain.

Future Directions

Evolutionary game theory is a maturing field; many basic theoretical issues are well understood, but many difficult questions remain. It is tempting to say that stochastic and local interaction models offer the more open terrain for further explorations. But while it is true that we know less about these models than about deterministic evolutionary dynamics, even our knowledge of the latter is limited: while dynamics on one and two dimensional state spaces, and for games satisfying a few interesting structural assumptions, are well understood, the dynamics of behavior in the vast majority of many-strategy games are not.

The prospects for further applications of the tools of evolutionary game theory are brighter still. In economics, and in other social sciences, the analysis of mathematical models has too often been synonymous with the computation and evaluation of equilibrium behavior. The questions of whether and how equilibrium will come to be are often ignored, and the possibility of long-term disequilibrium behavior left unmentioned. For settings in which its assumptions are tenable, evolutionary game theory offers a host of techniques for modeling the dynamics of economic behavior. The exploitation of the possibilities for a deeper understanding of human social interactions has hardly begun.

Acknowledgments The figures in sections “Deterministic Dynamics” and “Local Interaction” were created using Dynamo (Sandholm and Dokumaci 2007) and VirtualLabs (Hauert 2007), respectively. I am grateful to Caltech for its hospitality as I completed this article, and I gratefully acknowledge financial support under NSF Grant SES-0617753.

Bibliography

Agastya M (2004) Stochastic stability in a double auction. Games Econ Behav 48:203–222
Akin E (1979) The geometry of population genetics. Springer, Berlin
Akin E (1980) Domination or equilibrium. Math Biosci 50:239–250
Akin E (1990) The differential geometry of population genetics and evolutionary games. In: Lessard S (ed) Mathematical and statistical developments of evolutionary theory. Kluwer, Dordrecht, pp 1–93
Akin E, Losert V (1984) Evolutionary dynamics of zero-sum games. J Math Biol 20:231–258
Alós-Ferrer C (2005) The evolutionary stability of perfectly competitive behavior. Econ Theory 26:497–516
Alós-Ferrer C, Weidenholzer S (2006a) Contagion and efficiency. J Econ Theory, University of Konstanz and University of Vienna
Alós-Ferrer C, Weidenholzer S (2006b) Imitation, local Binmore K, Samuelson L, Vaughan R (1995b) Musical
interactions, and efficiency. Econ Lett 93:163–168 chairs: modeling noisy evolution. Games Econ Behav
Alós-Ferrer C, Ania AB, Schenk-Hoppé KR (2000) An 11:1–35
evolutionary model of Bertrand oligopoly. Games Econ Binmore K, Samuelson L, Peyton Young H (2003) Equi-
Behav 33:1–19 librium selection in bargaining models. Games Econ
Alós-Ferrer C, Kirchsteiger G, Walzl M (2006) On the evo- Behav 45:296–328
lution of market institutions: the platform design paradox. Bishop DT, Cannings C (1978) A generalised war of attri-
Unpublished manuscript, University of Konstanz tion. J Theor Biol 70:85–124
Anderlini L, Ianni A (1996) Path dependence and learning Bisin A, Verdier T (2001) The economics of cultural trans-
from neighbors. Games Econ Behav 13:141–177 mission and the dynamics of preferences. J Econ The-
Ania AB, Tröger T, Wambach A (2002) An evolutionary ory 97:298–319
analysis of insurance markets with adverse selection. Björnerstedt J, Weibull JW (1996) Nash equilibrium and
Games Econ Behav 40:153–184 evolution by imitation. In: Arrow KJ et al (eds) The
Arneodo A, Coullet P, Tresser C (1980) Occurrence of rational foundations of economic behavior. St. Martin’s
strange attractors in three-dimensional Volterra equa- Press, New York, pp 155–181
tions. Phys Lett 79A:259–263 Blume LE (1993) The statistical mechanics of strategic
Axelrod R (1984) The evolution of cooperation. Basic interaction. Games Econ Behav 5:387–424
Books, New York Blume LE (1995) The statistical mechanics of best response
Balkenborg D, Schlag KH (2001) Evolutionarily stable strategy revision. Games Econ Behav 11:111–145
sets. Int J Game Theory 29:571–595 Blume LE (1997) Population games. In: Arthur WB, Durlauf
Basu K, Weibull JW (1991) Strategy sets closed under SN, Lane DA (eds) The economy as an evolving complex
rational behavior. Econ Lett 36:141–146 system II. Addison-Wesley, Reading, pp 425–460
Beckmann M, McGuire CB, Winsten CB (1956) Studies in Blume LE (2003) How noise matters. Games Econ Behav
the economics of transportation. Yale University Press, 44:251–271
New Haven Bøg M (2006) Is segregation robust? Unpublished manu-
Beggs AW (2002) Stochastic evolution with slow learning. script, Stockholm School of Economics
Econ Theory 19:379–405 Bomze IM (1990) Dynamical aspects of evolutionary sta-
Ben-Shoham A, Serrano R, Volij O (2004) The evolution bility. Monatsh Math 110:189–206
of exchange. J Econ Theory 114:310–328 Bomze IM (1991) Cross entropy minimization in
Benaïm M (1998) Recursive algorithms, urn processes, uninvadable states of complex populations. J Math
and the chaining number of chain recurrent sets. Biol 30:73–87
Ergod Theory Dyn Syst 18:53–87 Börgers T, Sarin R (1997) Learning through reinforcement
Benaïm M, Hirsch MW (1999) On stochastic approxima- and the replicator dynamics. J Econ Theory 77:1–14
tion algorithms with constant step size whose average is Boylan RT (1995) Continuous approximation of dynami-
cooperative. Ann Appl Probab 30:850–869 cal systems with randomly matched individuals. J Econ
Benaïm M, Sandholm WH (2007) Logit evolution in Theory 66:615–625
potential games: reversibility, rates of convergence, Brown GW, von Neumann J (1950) Solutions of games by
large deviations, and equilibrium selection. differential equations. In: Kuhn HW, Tucker AW (eds)
Unpublished manuscript, Université de Neuch^atel and Contributions to the theory of games I. Annals of math-
University of Wisconsin ematics studies, vol 24. Princeton University Press,
Benaïm M, Weibull JW (2003) Deterministic approxima- Princeton, pp 73–79
tion of stochastic evolution in games. Econometrica Burke MA, Peyton Young H (2001) Competition and cus-
71:873–903 tom in economic contracts: a case study of Illinois
Benaïm M, Hofbauer J, Hopkins E (2006) Learning in agriculture. Am Econ Rev 91:559–573
games with unstable equilibria. Unpublished manu- Cabrales A (1999) Adaptive dynamics and the implemen-
script, Université de Neuch^atel, University of Vienna tation problem with complete information. J Econ The-
and University of Edinburgh ory 86:159–184
Berger U, Hofbauer J (2006) Irrational behavior in the Cabrales A (2000) Stochastic replicator dynamics. Int
Brown-von Neumann-Nash dynamics. Games Econ Econ Rev 41:451–481
Behav 56:1–6 Cabrales A, Ponti G (2000) Implementation, elimination of
Bergin J, Bernhardt D (2004) Comparative learning weakly dominated strategies and evolutionary dynam-
dynamics. Int Econ Rev 45:431–465 ics. Rev Econ Dyn 3:247–282
Bergin J, Lipman BL (1996) Evolution with state- Crawford VP (1991) An “evolutionary” interpretation of
dependent mutations. Econometrica 64:943–956 Van Huyck, Battalio, and Beil’s experimental results on
Binmore K, Samuelson L (1997) Muddling through: noisy coordination. Games Econ Behav 3:25–59
equilibrium selection. J Econ Theory 74:235–265 Cressman R (1996) Evolutionary stability in the finitely
Binmore K, Samuelson L (1999) Evolutionary drift and repeated prisoner’s dilemma game. J Econ Theory
equilibrium selection. Rev Econ Stud 66:363–393 68:234–248
Binmore K, Gale J, Samuelson L (1995a) Learning to be Cressman R (1997) Local stability of smooth selection
imperfect: the ultimatum game. Games Econ Behav dynamics for normal form games. Math Soc Sci
8:56–90 34:1–19
Cressman R (2000) Subgame monotonicity in extensive form Fudenberg D, Harris C (1992) Evolutionary dynamics with
evolutionary games. Games Econ Behav 32:183–205 aggregate shocks. J Econ Theory 57:420–441
Cressman R (2003) Evolutionary dynamics and extensive Fudenberg D, Imhof LA (2006) Imitation processes with
form games. MIT Press, Cambridge small mutations. J Econ Theory 131:251–262
Cressman R, Schlag KH (1998) On the dynamic (in)stabil- Fudenberg D, Imhof LA (2008) Monotone imitation dynam-
ity of backwards induction. J Econ Theory 83:260–285 ics in large populations. J Econ Theory 140:229–245
Dafermos S, Sparrow FT (1969) The traffic assignment Fudenberg D, Levine DK (1998) Theory of learning in
problem for a general network. J Res Natl Bur Stand games. MIT Press, Cambridge
B 73:91–118 Gaunersdorfer A, Hofbauer J (1995) Fictitious play,
Dawid H, Bentley MacLeod W (2008) Hold-up and the shapley polygons, and the replicator equation. Games
evolution of investment and bargaining norms. Games Econ Behav 11:279–303
Econ Behav 62:26–52. forthcoming Gilboa I, Matsui A (1991) Social stability and equilibrium.
Dawkins R (1976) The selfish gene. Oxford University Econometrica 59:859–867
Press, Oxford Goyal S (2007) Connections: an introduction to the eco-
Dekel E, Scotchmer S (1992) On the evolution of optimiz- nomics of networks. Princeton University Press,
ing behavior. J Econ Theory 57:392–407 Princeton
Demichelis S, Ritzberger K (2003) From evolutionary to Goyal S, Janssen MCW (1997) Non-exclusive conventions
strategic stability. J Econ Theory 113:51–75 and social coordination. J Econ Theory 77:34–57
Dindoš M, Mezzetti C (2006) Better-reply dynamics and Hamilton WD (1967) Extraordinary sex ratios. Science
global convergence to Nash equilibrium in aggregative 156:477–488
games. Games Econ Behav 54:261–292 Hammerstein P, Selten R (1994) Game theory and evolu-
Dokumaci E, Sandholm WH (2007a) Schelling redux: an tionary biology, Chapter 28. In: Aumann RJ, Hart
evolutionary model of residential segregation. S (eds) Handbook of game theory, vol 2. Elsevier,
Unpublished manuscript, University of Wisconsin Amsterdam, pp 929–993
Dokumaci E, Sandholm WH (2007b) Stochastic evolution Harsanyi JC, Selten R (1988) A general theory of equilib-
with perturbed payoffs and rapid play. Unpublished rium selection in games. MIT Press, Cambridge
manuscript, University of Wisconsin Hart S (2002) Evolutionary dynamics and backward induc-
Droste E, Hommes C, Tuinstra J (2002) Endogenous fluc- tion. Games Econ Behav 41:227–264
tuations under evolutionary pressure in Cournot com- Hart S, Mas-Colell A (2003) Uncoupled dynamics do not
petition. Games Econ Behav 40:232–269 lead to Nash equilibrium. Am Econ Rev 93:1830–1836
Dugatkin LA, Reeve HK (eds) (1998) Game theory and Hauert C (2007) Virtual Labs in evolutionary game theory.
animal behavior. Oxford University Press, Oxford Software. http://www.univie.ac.at/virtuallabs.
Ellingsen T, Robles J (2002) Does evolution solve the Accessed 31 Dec 2007
hold-up problem? Games Econ Behav 39:28–53 Hauert C, De Monte S, Hofbauer J, Sigmund K (2002)
Ellison G (1993) Learning, local interaction, and coordi- Volunteering as red queen mechanism for cooperation
nation. Econometrica 61:1047–1071 in public goods games. Science 296:1129–1132
Ellison G (2000) Basins of attraction, long run equilibria, Herz AVM (1994) Collective phenomena in spatially
and the speed of step-by-step evolution. Rev Econ Stud extended evolutionary games. J Theor Biol 169:65–87
67:17–45 Hines WGS (1987) Evolutionary stable strategies: a review
Ely JC (2002) Local conventions. Adv Econ Theory 2:1(30) of basic theory. Theor Popul Biol 31:195–272
Ely JC, Sandholm WH (2005) Evolution in Bayesian Hofbauer J (1995a) Imitation dynamics for games.
games I: theory. Games Econ Behav 53:83–109 Unpublished manuscript, University of Vienna
Eshel I, Samuelson L, Shaked A (1998) Altruists, egoists, Hofbauer J (1995b) Stability for the best response dynam-
and hooligans in a local interaction model. Am Econ ics. Unpublished manuscript, University of Vienna
Rev 88:157–179 Hofbauer J (2000) From Nash and Brown to Maynard
Fischer S, Vöcking B (2006) On the evolution of selfish Smith: equilibria, dynamics and ESS. Selection 1:81–88
routing. Unpublished manuscript, RWTH Aachen Hofbauer J, Hopkins E (2005) Learning in perturbed asym-
Fisher RA (1930) The genetical theory of natural selection. metric games. Games Econ Behav 52:133–152
Clarendon Press, Oxford Hofbauer J, Sandholm WH (2002) On the global conver-
Foster DP, Peyton Young H (1990) Stochastic evolutionary gence of stochastic fictitious play. Econometrica
game dynamics. Theor Popul Biol 38:219–232. also in 70:2265–2294
Corrigendum 51:77–78 (1997) Hofbauer J, Sandholm WH (2006a) Stable games.
Freidlin MI, Wentzell AD (1998) Random perturbations of Unpublished manuscript, University of Vienna and
dynamical systems, 2nd edn. Springer, New York University of Wisconsin
Friedman D (1991) Evolutionary games in economics. Hofbauer J, Sandholm WH (2006b) Survival of dominated
Econometrica 59:637–666 strategies under evolutionary dynamics. Unpublished
Friedman JW, Mezzetti C (2001) Learning in games by manuscript, University of Vienna and University of
random sampling. J Econ Theory 98:55–84 Wisconsin
Friedman D, Yellin J (1997) Evolving landscapes for pop- Hofbauer J, Sandholm WH (2007) Evolution in games
ulation games. Unpublished manuscript, UC Santa with randomly disturbed payoffs. J Econ Theory
Cruz 132:47–69
Hofbauer J, Sigmund K (1988) Theory of evolution and Kuzmics C (2004) Stochastic evolutionary stability in
dynamical systems. Cambridge University Press, extensive form games of perfect information. Games
Cambridge Econ Behav 48:321–336
Hofbauer J, Sigmund K (1998) Evolutionary games and Lahkar R (2007) The dynamic instability of dispersed price
population dynamics. Cambridge University Press, equilibria. Unpublished manuscript, University Col-
Cambridge lege London
Hofbauer J, Sigmund K (2003) Evolutionary game dynam- Lahkar R, Sandholm WH (2017) The projection dynamic
ics. Bull Am Math Soc (New Ser) 40:479–519 and the geometry of population games. Games Econ
Hofbauer J, Swinkels JM (1996) A universal Shapley Behav
example. Unpublished manuscript, University of Losert V, Akin E (1983) Dynamics of games and genes:
Vienna and Northwestern University discrete versus continuous time. J Math Biol
Hofbauer J, Weibull JW (1996) Evolutionary selection 17:241–251
against dominated strategies. J Econ Theory Lotka AJ (1920) Undamped oscillation derived from the
71:558–573 law of mass action. J Am Chem Soc 42:1595–1598
Hofbauer J, Schuster P, Sigmund K (1979) A note on Mailath GJ (1992) Introduction: symposium on evolution-
evolutionarily stable strategies and game dynamics. ary game theory. J Econ Theory 57:259–277
J Theor Biol 81:609–612 Maruta T (1997) On the relationship between risk-
Hofbauer J, Oechssler J, Riedel F (2005) Brown-von dominance and stochastic stability. Games Econ
Neumann-Nash dynamics: the continuous strategy Behav 19:221–234
case. Unpublished manuscript, University of Vienna Maruta T (2002) Binary games with state dependent sto-
Hopkins E (1999) A note on best response dynamics. chastic choice. J Econ Theory 103:351–376
Games Econ Behav 29:138–150 Mathevet L (2007) Supermodular Bayesian implementa-
Hopkins E, Seymour RM (2002) The stability of price tion: learning and incentive design. Unpublished man-
dispersion under seller and consumer learning. Int uscript, Caltech
Econ Rev 43:1157–1190 Maynard Smith J (1972) Game theory and the evolution of
Imhof LA (2005) The long-run behavior of the stochastic fighting. In: Maynard Smith J (ed) On evolution. Edin-
replicator dynamics. Ann Appl Probab 15:1019–1045 burgh University Press, Edinburgh, pp 8–28
Jackson MO (2017) Social and economic networks. Maynard Smith J (1974) The theory of games and the
Princeton University Press, Princeton evolution of animal conflicts. J Theor Biol 47:209–221
Jacobsen HJ, Jensen M, Sloth B (2001) Evolutionary learn- Maynard Smith J (1982) Evolution and the theory of
ing in signalling games. Games Econ Behav 34:34–63 games. Cambridge University Press, Cambridge
Jordan JS (1993) Three problems in learning mixed- Maynard Smith J, Price GR (1973) The logic of animal
strategy Nash equilibria. Games Econ Behav conflict. Nature 246:15–18
5:368–386 Miękisz J (2004) Statistical mechanics of spatial evolution-
Josephson J (2008) Stochastic better reply dynamics in ary games. J Phys A 37:9891–9906
finite games. Econ Theory 35:381–389 Möbius MM (2000) The formation of ghettos as a local
Josephson J, Matros A (2004) Stochastic imitation in finite interaction phenomenon. Unpublished manuscript, MIT
games. Games Econ Behav 49:244–259 Monderer D, Shapley LS (1996) Potential games. Games
Kandori M, Rob R (1995) Evolution of equilibria in the Econ Behav 14:124–143
long run: a general theory and applications. J Econ Moran PAP (1962) The statistical processes of evolution-
Theory 65:383–414 ary theory. Clarendon Press, Oxford
Kandori M, Rob R (1998) Bandwagon effects and long run Myatt DP, Wallace CC (2003) A multinomial probit model
technology choice. Games Econ Behav 22:84–120 of stochastic evolution. J Econ Theory 113:286–301
Kandori M, Mailath GJ, Rob R (1993) Learning, mutation, Myatt DP, Wallace CC (2007) An evolutionary justification
and long run equilibria in games. Econometrica for thresholds in collective-action problems.
61:29–56 Unpublished manuscript, Oxford University
Kim Y-G, Sobel J (1995) An evolutionary approach to pre- Myatt DP, Wallace CC (2008a) An evolutionary analysis of
play communication. Econometrica 63:1181–1193 the volunteer’s dilemma. Games Econ Behav 62:67–76
Kimura M (1958) On the change of population fitness by Myatt DP, Wallace CC (2008b) When does one bad apple
natural selection. Heredity 12:145–167 spoil the barrel? An evolutionary analysis of collective
Kosfeld M (2002) Stochastic strategy adjustment in coor- action. Rev Econ Stud 75:499–527
dination games. Econ Theory 20:321–339 Nachbar JH (1990) “Evolutionary” selection dynamics in
Kukushkin NS (2004) Best response dynamics in finite games: convergence and limit properties. Int J Game
games with additive aggregation. Games Econ Behav Theory 19:59–89
48:94–110 Nagurney A, Zhang D (1997) Projected dynamical systems
Kuran T, Sandholm WH (2008) Cultural integration and its in the formulation, stability analysis and computation
discontents. Rev Econ Stud 75:201–228 of fixed demand traffic network equilibria. Transp Sci
Kurtz TG (1970) Solutions of ordinary differential equa- 31:147–158
tions as limits of pure jump Markov processes. J Appl Nash JF (1951) Non-cooperative games. Ann Math
Probab 7:49–58 54:287–295
Nöldeke G, Samuelson L (1993) An evolutionary analysis Sandholm WH (1998) Simple and clever decision rules in a
of backward and forward induction. Games Econ model of evolution. Econ Lett 61:165–170
Behav 5:425–454 Sandholm WH (2001a) Almost global convergence to
Nowak MA (2006) Evolutionary dynamics: exploring the p-dominant equilibrium. Int J Game Theory
equations of life. Belknap/Harvard, Cambridge 30:107–116
Nowak MA, May RM (1992) Evolutionary games and Sandholm WH (2001b) Potential games with continuous
spatial chaos. Nature 359:826–829 player sets. J Econ Theory 97:81–108
Nowak MA, May RM (1993) The spatial dilemmas of Sandholm WH (2002) Evolutionary implementation and
evolution. Int J Bifurcat Chaos 3:35–78 congestion pricing. Rev Econ Stud 69:81–108
Nowak MA, Bonhoeffer S, May RM (1994a) More spatial Sandholm WH (2003) Evolution and equilibrium under
games. Int J Bifurcat Chaos 4:33–56 inexact information. Games Econ Behav 44:343–378
Nowak MA, Bonhoeffer S, May RM (1994b) Spatial Sandholm WH (2005a) Excess payoff dynamics and other
games and the maintenance of cooperation. Proc Natl well-behaved evolutionary dynamics. J Econ Theory
Acad Sci U S A 91:4877–4881 124:149–170
Nowak MA, Sasaki A, Taylor C, Fudenberg D (2004) Sandholm WH (2005b) Negative externalities and evolu-
Emergence of cooperation and evolutionary stability tionary implementation. Rev Econ Stud 72:885–915
in finite populations. Nature 428:646–650 Sandholm WH (2006) Pairwise comparison dynamics.
Oechssler J, Riedel F (2001) Evolutionary dynamics on Unpublished manuscript, University of Wisconsin
infinite strategy spaces. Econ Theory 17:141–162 Sandholm WH (2007a) Evolution in Bayesian games II:
Oechssler J, Riedel F (2002) On the dynamic foundation of stability of purified equilibria. J Econ Theory
evolutionary stability in continuous models. J Econ 136:641–667
Theory 107:141–162 Sandholm WH (2007b) Pigouvian pricing and stochastic
Peyton Young H (1993a) The evolution of conventions. evolutionary implementation. J Econ Theory
Econometrica 61:57–84 132:367–382
Peyton Young H (1993b) An evolutionary model of Sandholm WH (2007c) Large population potential games.
bargaining. J Econ Theory 59:145–168 Unpublished manuscript, University of Wisconsin
Peyton Young H (1998a) Conventional contracts. Rev Sandholm WH (2007d) Simple formulas for stationary
Econ Stud 65:773–792 distributions and stochastically stable states. Games
Peyton Young H (1998b) Individual strategy and social Econ Behav 59:154–162
structure. Princeton University Press, Princeton Sandholm WH (2017) Population games and evolutionary
Peyton Young H (2001) The dynamics of conformity. In: dynamics. MIT Press, Cambridge
Durlauf SN, Peyton Young H (eds) Social dynamics. Sandholm WH, Dokumaci E (2007) Dynamo: phase dia-
Brookings Institution Press/MIT Press, Washington, grams for evolutionary dynamics. Software. http://
DC/Cambridge, pp 133–153 www.ssc.wisc.edu/~whs/dynamo
Rhode P, Stegeman M (1996) A comment on “learning, Sandholm WH, Pauzner A (1998) Evolution, population
mutation, and long run equilibria in games”. growth, and history dependence. Games Econ Behav
Econometrica 64:443–449 22:84–120
Ritzberger K, Weibull JW (1995) Evolutionary selection in Sandholm WH, Dokumaci E, Lahkar R (2017) The projec-
normal form games. Econometrica 63:1371–1399 tion dynamic and the replicator dynamic. Games Econ
Robles J (1998) Evolution with changing mutation rates. Behav
J Econ Theory 79:207–223 Sato Y, Akiyama E, Doyne Farmer J (2002) Chaos in
Robles J (2008) Evolution, bargaining and time prefer- learning a simple two-person game. Proc Natl Acad
ences. Econ Theory 35:19–36 Sci U S A 99:4748–4751
Robson A, Vega-Redondo F (1996) Efficient equilibrium Schlag KH (1998) Why imitate, and if so, how?
selection in evolutionary games with random matching. A boundedly rational approach to multi-armed bandits.
J Econ Theory 70:65–92 J Econ Theory 78:130–156
Rosenthal RW (1973) A class of games possessing pure Schuster P, Sigmund K (1983) Replicator dynamics.
strategy Nash equilibria. Int J Game Theory 2:65–67 J Theor Biol 100:533–538
Samuelson L (1988) Evolutionary foundations of solution Schuster P, Sigmund K, Hofbauer J, Wolff R (1981)
concepts for finite, two-player, normal-form games. In: Selfregulation of behaviour in animal societies I: sym-
Vardi MY (ed) Proceedings of the second conference metric contests. Biol Cybern 40:1–8
on theoretical aspects of reasoning about knowledge Selten R (1991) Evolution, learning, and economic behav-
(Pacific Grove, CA, 1988). Morgan Kaufmann Pub- ior. Games Econ Behav 3:3–24
lishers, Los Altos, pp 211–225 Shahshahani S (1979) A new mathematical framework for
Samuelson L (1994) Stochastic stability in games with the study of linkage and selection. Mem Am Math Soc
alternative best replies. J Econ Theory 64:35–65 211:34
Samuelson L (1997) Evolutionary games and equilibrium Shapley LS (1964) Some topics in two person games. In:
selection. MIT Press, Cambridge Dresher M, Shapley LS, Tucker AW (eds) Advances in
Samuelson L, Zhang J (1992) Evolutionary stability in game theory. Annals of mathematics studies, vol 52.
asymmetric games. J Econ Theory 57:363–391 Princeton University Press, Princeton, pp 1–28
Skyrms B (1990) The dynamics of rational deliberation. Tröger T (2002) Why sunk costs matter for bargaining
Harvard University Press, Cambridge outcomes: an evolutionary approach. J Econ Theory
Skyrms B (1992) Chaos in game dynamics. J Log Lang Inf 102:28–53
1:111–130 Ui T (1998) Robustness of stochastic stability.
Smith MJ (1984) The stability of a dynamic model of Unpublished manuscript, Bank of Japan
traffic assignment -an application of a method of van Damme E, Weibull JW (2002) Evolution in games
Lyapunov. Transp Sci 18:245–252 with endogenous mistake probabilities. J Econ Theory
Smith HL (1995) Monotone dynamical systems: an intro- 106:296–315
duction to the theory of competitive and cooperative Vega-Redondo F (1996) Evolution, games, and economic
systems. American Mathematical Society, Providence behaviour. Oxford University Press, Oxford
Stegeman M, Rhode P (2004) Stochastic Darwinian equi- Vega-Redondo F (1997) The evolution of Walrasian
libria in small and large populations. Games Econ behavior. Econometrica 65:375–384
Behav 49:171–214 Vega-Redondo F (2007) Complex social networks. Cam-
Swinkels JM (1992) Evolutionary stability with equilib- bridge University Press, Cambridge
rium entrants. J Econ Theory 57:306–332 Volterra V (1931) Lecons sur la Theorie Mathematique de
Swinkels JM (1993) Adjustment dynamics and rational la Lutte pour la Vie. Gauthier-Villars, Paris
play in games. Games Econ Behav 5:455–484 von Neumann J, Morgenstern O (1944) Theory of games
Szabó G, Fáth G (2007) Evolutionary games on graphs. and economic behavior. Prentice-Hall, Princeton
Phys Rep 446:97–216 Weibull JW (1995) Evolutionary game theory. MIT Press,
Szabó G, Hauert C (2002) Phase transitions and Cambridge
volunteering in spatial public goods games. Phys Rev Weibull JW (1996) The mass action interpretation. Excerpt
Lett 89:11801(4) from “The work of John Nash in game theory: nobel
Tainaka K-I (2001) Physics and ecology of rock-paper- seminar, December 8, 1994”. J Econ Theory
scissors game. In: Marsland TA, Frank I (eds) Com- 69:165–171
puters and games, second international conference Weissing FJ (1991) Evolutionary stability and dynamic
(Hamamatsu 2000). Lecture notes in computer science, stability in a class of evolutionary normal form
vol 2063. Springer, Berlin, pp 384–395 games. In: Selten R (ed) Game equilibrium models
Tanabe Y (2006) The propagation of chaos for interacting I. Springer, Berlin, pp 29–97
individuals in a large population. Math Soc Sci Zeeman EC (1980) Population dynamics from game the-
51:425–152 ory. In: Nitecki Z, Robinson C (eds) Global theory of
Taylor PD, Jonker L (1978) Evolutionarily stable strategies dynamical systems (Evanston, 1979). Lecture notes in
and game dynamics. Math Biosci 40:145–156 mathematics, vol 819. Springer, Berlin, pp 472–497
Thomas B (1985) On evolutionarily stable sets. J Math Zhang J (2004a) A dynamic model of residential segrega-
Biol 22:105–115 tion. J Math Sociol 28:147–170
Topkis D (1979) Equilibrium points in nonzero-sum Zhang J (2004b) Residential segregation in an all-
n-person submodular games. SIAM J Control Optim integrationist world. J Econ Behav Organ 24:533–550
17:773–787
Networks and Stability

Frank H. Page Jr.1 and Myrna Wooders2
1Department of Economics, Indiana University, Bloomington, IN, USA
2Department of Economics, Vanderbilt University, Nashville, TN, USA

Article Outline

Glossary
Definition of the Subject
Introduction
The Primitives
Abstract Games of Network Formation and Stability
Strong Stability, Pairwise Stability, Nash Stability, and Farsighted Consistency
Singleton Basins of Attraction
Future Directions
Bibliography

Glossary

Abstract game of network formation with respect to irreflexive dominance An abstract game of network formation with respect to irreflexive dominance consists of a feasible set of networks 𝔾 equipped with an irreflexive dominance relation >. A dominance relation on 𝔾 is a binary relation on 𝔾 such that for all G and G′ in 𝔾, G′ > G (read G′ dominates G) is either true or false. The dominance relation is irreflexive if G > G is always false.

Abstract game of network formation with respect to path dominance An abstract game of network formation with respect to path dominance consists of a feasible set of networks 𝔾 equipped with a path dominance relation ≥p induced by an irreflexive dominance relation > on 𝔾. Given networks G and G′ in 𝔾, G′ ≥p G (read G′ path dominates G) if either G′ = G or there is a finite sequence of networks in 𝔾 beginning with G and ending with G′ such that each network along the sequence dominates its predecessor.

Heterogeneous networks A heterogeneous network consists of a finite set of nodes together with a finite set of mathematical objects called labeled links or labeled arcs, each identifying a particular type of connection between a pair of nodes. Given finite node set N with typical element i and given finite label set A with typical element a, a heterogeneous linking network G is a finite collection of ordered pairs of the form (a, {i, i′}) called labeled links. Labeled link (a, {i, i′}) ∈ G indicates that nodes i and i′ are connected in network G via a type a link. A heterogeneous directed network G is a finite collection of ordered pairs of the form (a, (i, i′)) called labeled arcs. Labeled arc (a, (i, i′)) ∈ G indicates that nodes i and i′ are connected in network G via a type a arc running from i to i′. In a heterogeneous network (whether it be a linking network or a directed network) connections can differ and are distinguished by type.

Homogeneous networks A homogeneous network consists of a finite set of nodes together with a finite set of mathematical objects called links or arcs, each identifying a connection between a pair of nodes. Given finite node set N with typical element i, a homogeneous linking network G is a finite collection of sets of the form {i, i′} called links. Link {i, i′} ∈ G indicates that nodes i and i′ are connected in network G. A homogeneous directed network G is a finite collection of ordered pairs (i, i′) called arcs. Arc (i, i′) ∈ G indicates that nodes i and i′ are connected in network G via a connection running from i to i′. In a homogeneous network (whether it be a linking network or a directed network) all connections are of the same type.
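The objects defined in this glossary translate directly into simple data structures. The Python sketch below uses a three-network feasible set and a dominance relation invented purely for illustration: it represents a homogeneous linking network as a set of unordered node pairs and checks path dominance as reachability under an irreflexive dominance relation. Nothing in it is specific to the preferences or formation rules discussed later in this entry.

def path_dominates(G_prime, G, feasible, dominates):
    # G' path dominates G if G' == G or some finite sequence of feasible
    # networks leads from G to G', each network dominating its predecessor.
    if G_prime == G:
        return True
    frontier, seen = [G], {G}
    while frontier:
        current = frontier.pop()
        for H in feasible:
            if H not in seen and dominates(H, current):
                if H == G_prime:
                    return True
                seen.add(H)
                frontier.append(H)
    return False

# Homogeneous linking networks on node set {1, 2, 3}: a network is a set of
# links, and a link is an unordered pair of nodes (a frozenset).
empty_net = frozenset()
G_a = frozenset({frozenset({1, 2})})
G_b = frozenset({frozenset({1, 2}), frozenset({1, 3})})
feasible = [empty_net, G_a, G_b]

# A hypothetical irreflexive dominance relation, for illustration only:
# a network dominates any network obtained from it by deleting one link.
def dominates(G_new, G_old):
    return G_old < G_new and len(G_new) == len(G_old) + 1

# For comparison, a homogeneous directed network is a set of ordered pairs
# (loops allowed), and a heterogeneous directed network adds arc-type labels.
G_directed = {(1, 1), (1, 2), (1, 3), (3, 1)}
G_labeled = {("a2", (1, 1)), ("a1", (1, 2)), ("a3", (1, 3))}

print(path_dominates(G_b, empty_net, feasible, dominates))  # True: links added one at a time
print(path_dominates(empty_net, G_b, feasible, dominates))  # False under this relation

In the models discussed below, the dominance relation would instead be built from players' preferences over networks and from the changes to a status quo network that the rules of network formation permit.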
important areas of research that are beyond the scope of this entry.

Feasible Sets The feasible set may consist of networks as simple as homogeneous linking networks or as complex as heterogeneous directed networks. All networks consist of a finite set of nodes (representing, for example, economic agents or players) together with a finite set of mathematical objects called links, labeled links, arcs, or labeled arcs describing the connections between nodes. Here we will focus on homogeneous linking networks, as does most of the literature (see, for example, Myerson (1977), Jackson and Wolinsky (1996), and Jackson and van den Nouweland (2005)), except in our discussion of Nash stability where we will consider homogeneous directed networks (as in Bala and Goyal 2000). What distinguishes homogeneous networks (linking or directed) from heterogeneous networks (linking or directed) is that in a homogeneous network all connections between nodes are of the same type whether represented by a link as in a linking network or by an arc as in a directed network. Thus, in a homogeneous linking network all links are of the same type and in a homogeneous directed network all arcs are of the same type. While homogeneous networks are quite restrictive, they have been very important in developing our understanding of social and economic networks and have proved very useful in many economic applications (see, for example, Belleflamme and Bloch (2004), Bramoulle and Kranton (2007a), Calvo-Armengol (2004), and Furusawa and Konishi (2007)). Page and Wooders (2005, 2008; Page and Kamat 2005; Page et al. 2005) extend the existing literature on economic and social networks by introducing the notion of heterogeneous directed networks. These types of networks potentially have a rich set of applications (in the natural sciences, engineering, sociology, politics, as well as economics) because connections or interactions between nodes can be distinguished by direction or intent as well as by type, intensity, or purpose.

Players’ Preferences We will assume throughout that each player’s preferences are given by an irreflexive binary relation defined on the feasible set of networks. Thus, we will assume that players have strong (or strict) preferences over networks. Under strong preferences, if a player prefers one network to another, then the player’s preference is strict. However, we will comment where appropriate on weak preferences. Under weak preferences, if a player prefers one network to another, then the player’s preference is either strict or indifferent.

Rules of Network Formation We will focus here on three different sets of rules: Jackson-Wolinsky rules (Jackson and Wolinsky 1996), Jackson-van den Nouweland rules (Jackson and van den Nouweland 2005), and Bala-Goyal rules (Bala and Goyal 2000). In particular, in our discussions of pairwise stable homogeneous linking networks we will assume that the rules of network formation are the Jackson-Wolinsky rules. Under the Jackson-Wolinsky rules the addition of a link is bilateral (i.e., the two players that would be involved in the link must agree to adding the link), the subtraction of a link is unilateral (i.e., at least one player involved in the link must agree to subtract or delete the link), and network changes take place one link at a time (i.e., only one link can be added or subtracted at a time). In our discussion of strongly stable homogeneous linking networks, we will assume that the rules of network formation are the Jackson-van den Nouweland rules. Under the Jackson-van den Nouweland rules link addition is bilateral, link subtraction is unilateral, and in any one play of the game several links can be added and/or subtracted. Thus the Jackson-van den Nouweland rules are the Jackson-Wolinsky rules without the one-link-at-a-time restriction. Finally, in our discussion of Nash homogeneous directed networks we will assume that the rules of network formation are the Bala-Goyal rules. Under the Bala-Goyal rules an arc may be added or subtracted unilaterally by the initiating player involved in the arc and in any one play of the game only network changes brought about by an individual player are allowed. Note that all three of these sets of rules can be described as being uniform across networks. Under uniform rules the rules for changing a network are the same no matter which status quo
network is being changed. Page et al. (2005) allow nonuniform rules and introduce a network representation of nonuniform rules.

Dominance Relations Given players’ preferences and the rules of network formation we will define a dominance relation over the feasible set of networks that incorporates both players’ preferences and the rules. Here we will focus on dominance relations that are either direct or indirect. Under direct dominance players are concerned with immediate consequences of their network formation strategies whereas under indirect dominance players are farsighted and consider the eventual consequences of their strategies.

General Results A specification of the primitives induces two types of abstract games over homogeneous networks: (i) a network formation game with respect to the irreflexive dominance relation induced by preferences and rules, and (ii) a network formation game with respect to path dominance induced by this irreflexive dominance relation. We will begin by considering the game with respect to irreflexive dominance and present results on the existence of quasi-stable and stable networks. These results provide a network rendition of classical results from graph theory on the existence of quasi-stable sets and stable sets due to Chvatal and Lovasz (1972), Berge (2001), and Richardson (1953). We will also present a result on the existence and nonemptiness of the set of farsightedly consistent networks. This result is a network rendition of a result due to Chwe (1994) for abstract games.

Next we will consider the game over homogeneous networks with respect to path dominance, and we will conclude that the following results hold:

1. Given preferences and the rules governing network formation, the set of homogeneous networks (linking or directed) contains a unique, finite, disjoint collection of nonempty subsets each constituting a strategic basin of attraction. These basins of attraction are the absorbing sets of the competitive process of network formation modeled via the game.
2. A stable set of homogeneous networks (in the sense of von Neumann-Morgenstern) with respect to path dominance consists of one network from each basin of attraction.
3. The path dominance core, defined as the set of networks having the property that no network in the set is path dominated by any other homogeneous network, consists of one network from each basin of attraction containing a single network. Note that the path dominance core is contained in each stable set and is nonempty if and only if there is a basin of attraction containing a single network. As a corollary, we conclude that any homogeneous network contained in the path dominance core is constrained Pareto efficient.
4. From the results above it follows that if the dominance relation is transitive and irreflexive, then the path dominance core is nonempty.

These results are special cases of results due to Page and Wooders (2008).

Specific Results for Pairwise Stability, Strong Stability, Nash Stability, and Farsighted Consistency
What are the connections between our notions of stability for homogeneous networks (basins of attraction, path dominance stable sets, and path dominance core) and the notions of strong stability (Dutta and Mutuswami 1997; Jackson and van den Nouweland 2005), pairwise stability (Jackson and Wolinsky 1996), Nash stability (Bala and Goyal 2000), and farsighted consistency (Chwe 1994; Page et al. 2005)? From the general results in (Page and Wooders 2005; Page and Wooders 2008) for heterogeneous directed networks, we will conclude for the case of homogeneous networks (linking or directed) that, depending on how we specialize the primitives of the model, the path dominance core is equal to the set of strongly stable networks, the set of pairwise stable networks, or the set of Nash networks. In particular, we will conclude that:

1. If path dominance is induced by a direct dominance relation, then in the set of homogeneous linking networks the path dominance
core is equal to the set of strongly stable networks.
2. If, in addition, the rules of network formation are the Jackson-Wolinsky rules, then in the set of homogeneous linking networks the path dominance core is equal to the set of pairwise stable networks.
3. If path dominance is induced by a direct dominance relation and if the rules of network formation are the Bala-Goyal rules, then in the set of homogeneous directed networks the path dominance core is equal to the set of Nash networks.

We can then conclude from (3) above that the existence of at least one basin of attraction containing a single network is, depending on how we specialize primitives, both necessary and sufficient for either (i) the existence of a strongly stable network, or (ii) a pairwise stable network, or (iii) a Nash network.

For path dominance induced by an indirect dominance relation, we can conclude from our prior results that for the case of homogeneous linking networks with Jackson-Wolinsky or Jackson-van den Nouweland rules or for the case of homogeneous directed networks with Bala-Goyal rules, each strategic basin of attraction has a nonempty intersection with the largest farsightedly consistent set of networks. This result, together with (2) above, implies that there always exists a path dominance stable set of homogeneous networks contained in the largest farsightedly consistent set. Thus, the path dominance core is contained in the largest consistent set. In light of our results on the path dominance core and stability (both strong and pairwise), we conclude that if path dominance is induced by an indirect dominance relation, then any homogeneous network contained in the path dominance core (i.e., the farsighted core) is not only farsightedly consistent but also strongly stable, as well as pairwise stable.

Other papers using indirect dominance (or variations thereof) and farsighted consistency in games (not necessarily network formation games) include Li (1992, 1993), Xue (1998, 2000), Luo (2001), Mariotti and Xue (2002), Diamantoudi and Xue (2003), Mauleon and Vannetelbosch (2004), Bhattacharya (2005), and Herings et al. (2006).

We remark that solution concepts defined using dominance relations have a long and distinguished history in the literature of game theory. First, consider the von Neumann-Morgenstern stable set (see Richardson 1953; von Neumann and Morgenstern 1944). The vN-M stable set is defined with respect to a dominance relation on a set of outcomes and consists of those outcomes that are externally and internally stable with respect to the given dominance relation. Similarly, Gillies (1959) defines the core based on a given dominance relation. These solution concepts, with a few exceptions, have typically been applied to models of economies or cooperative games where the notion of dominance is based on what a coalition can achieve using only the resources owned by its members (cf., Aumann 1964) or a given set of utility vectors for each possible coalition (cf., Scarf 1967). Particularly notable exceptions are Schwartz (1974), Kalai et al. (1976), Kalai and Schmeidler (1977), Shenoy (1980), Inarra et al. (2005), and van Deemen (1991). Their motivations are in part similar to ours in that they take as given a set of possible choices for players (here consisting of a set of networks) and a dominance relation and, based on these, describe a set of possible or likely outcomes called, by Kalai and Schmeidler, the admissible set. While their examples treat direct dominance, their general results have wider applications.

Because our objective here is to provide a unified game theoretic treatment of the main stability notions for network formation games, many topics related to strategic networks are not covered here. For example, we do not discuss the conflict between stability and efficiency which is the main focus of the important papers by Dutta and Mutuswami (1997) and Currarini and Morelli (2000), and Mutuswami and Winter (2002). Nor do we treat the topic of network formation and cooperative games, the topic of the seminal paper by Myerson (1977) and the excellent book by Slikker and van den Nouweland (2001) among many other contributions, or the topic of network formation and evolution treated in Hojman and Szeidl (2006). Our game theoretic approach
G1 = {{i1, i2}, {i1, i3}}
in a heterogeneous linking network, links are still direction, all connections are of the same type –
without orientation or direction and loops are not that is, connections are homogeneous. Finally,
possible. note that in a directed network loops are allowed.
For example, (i1, i1) G3 and therefore in net-
Directed Networks The link orientation prob- work G3 there is an arc running from node i1 to
lem as well as the problem of loops is resolved node i1.
by moving to directed networks. As is the case The following definition, from (Page et al.
with linking networks, there are two categories of 2005), allows for heterogeneous, multiple arcs
directed networks: Homogeneous directed net- by labeling arcs using the set A of arc types.
works and heterogeneous directed networks.
Definition 4 (Heterogeneous Directed Net-
Definition 3 (Homogeneous Directed Net- works) A heterogeneous directed network, G,
works) A homogeneous directed network, G, is a is a subset of A (N N). Given any G A
subset of N N. Given any G N N, each (N N), each ordered pair (a, (i, i′)) G consisting
ordered pair (i, i′) G consisting of a beginning of an arc type and an arc is called a labeled arc in G.
node i and an ending node i0 is called an arc in G. The collection of all labeled directed networks is
The collection of all directed networks is denoted by denoted by P(A (N N)). Thus, A (N N) is
P(N N). the set of all possible labeled arcs and a heteroge-
Thus, N N is the set of all possible arcs and a neous directed network G is simply a subset of all
homogeneous directed network G is simply a possible labeled arcs. For example, given N = {il,
subset of all possible arcs. For example, given i2, i3} and A = {a1, a2, a3} the subset
N = {i1, i2, i3},
G3 = {(i1, i1), (i1, i2), (i1, i3), (i3, i1)}
G4 = {(a2, (i1, i1)), (a1, (i1, i2)), (a1, (i1, i3)), (a1, (i3, i1)), (a3, (i1, i3))}
is a homogeneous directed network. Figure 3
of A (N N) is a heterogeneous directed net-
depicts homogeneous directed network G3.
work. Figure 4 depicts heterogeneous directed
Here, the arc (i1, i3) G3 denotes that nodes i1
network G4.
and i3 are connected by an arc running from node
Here, the labeled arc (a1, (i1, i3)) G4 denotes
i1 to node i3. Note that because (i3, i1) G3 there
that nodes i1 and i3 are connected by an arc of type
is also an arc running in the opposite direction
a1 running from node i1 to node i3. Note that nodes
from i3 to i1. Also, note that in a homogeneous
directed network, while connections have
Networks and Stability, Fig. 3 Homogeneous directed network G3
Networks and Stability, Fig. 4 Heterogeneous directed network G4
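To make these set-theoretic definitions concrete, the following sketch (hypothetical Python, not part of the entry) encodes the example networks G3 and G4 as sets of arcs and labeled arcs and computes the arc-type-specific in- and outdegrees used in the next paragraph.

```python
from itertools import product

N = {"i1", "i2", "i3"}   # nodes
A = {"a1", "a2", "a3"}   # arc types (labels)

# Homogeneous directed network: a subset of N x N (loops such as (i1, i1) allowed).
G3 = {("i1", "i1"), ("i1", "i2"), ("i1", "i3"), ("i3", "i1")}
assert G3 <= set(product(N, N))

# Heterogeneous directed network: a subset of A x (N x N), i.e. labeled arcs.
G4 = {("a2", ("i1", "i1")), ("a1", ("i1", "i2")), ("a1", ("i1", "i3")),
      ("a1", ("i3", "i1")), ("a3", ("i1", "i3"))}
assert G4 <= set(product(A, product(N, N)))

def outdegree(G, node, arc_type=None):
    """Arcs of the given type (or any type) leaving `node` in a labeled network."""
    return sum(1 for (a, (i, j)) in G if i == node and arc_type in (None, a))

def indegree(G, node, arc_type=None):
    """Arcs of the given type (or any type) entering `node` in a labeled network."""
    return sum(1 for (a, (i, j)) in G if j == node and arc_type in (None, a))

# Consistent with the counts cited below: i2 initiates no arcs and receives one.
assert outdegree(G4, "i2") == 0 and indegree(G4, "i2") == 1
```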
i1 and i3 are also connected by an arc of type a3 indegree of node i for arc type a in network G.
running from node i1 to node i3. Thus, in addi- For directed network G4, we have for example,
tion to having direction, connections are hetero-
geneous. Also, note that arc type a1 is used three
times in network G4: once in describing the
|G4+(i2)| = 0 and |G4−(i2)| = 1,
club network G, there is an arc of type a running A sequence of arcs {(i, i0)k}k in G G P(N
from node (player) d to node (club) c. Thus, in this N) constitutes a path if the beginning node i of arc
example the feasible set is a set of bipartite directed {i, i0 })k coincides with the ending node i0 of pre-
networks. ceding arc (i, i′)k−1. A circuit is a finite path
We remark that the basic model of club forma- {(i, i′)k} (k = 1, . . ., h) in G such that node i of arc (i, i′)1
tion underlying this example has a long history in and node i0 of arc (i, i0 )h are the same node. The
the literature, going back to economies with essen- length of a path is the number of arcs in the path.
tially homogeneous agents modeled as games in Finally, a sequence of labeled arcs {(a, (i, i0))k}k
characteristic function form (Shubik (bridge game) in G G P(N N) constitutes a path if
(Shubik 1971)) and serves as an example of several the beginning node i of labeled arc (a, {i, i0 })k,
models in more recent literature on coalitional coincides with the ending node i0 of preceding
games (cf., Banerjee et al. 2001), Bogomolnaia arc (a,(i, i0 ))k1. A circuit is a finite path
and Jackson (2002), Diamantoudi and Xue (2003),
{(a, (i, i′))k} (k = 1, . . ., h) in G such that node i of labeled
and in economies with clubs (cf., Arnold and
arc (a,(i, i0 ))1 and node i0 of labeled arc (a,(i, i0 ))h are
Wooders 2006 and Allouch and Wooders 2007).
the same node. The length of a path is the number
As in Konishi et al. (1998) and Demange (1994),
of labeled arcs in the path.
for example, we allow “free entry” into clubs.
In Fig. 5, {(a1, (i3, i1))1, (a2, (i1, i1))2, (a1, (i1,
i2))3} is a path in G4 of length 3, while {(a1, (i3,
Paths and Circuits i1))1, (a2, (i1, i1))2, (a3, (i1, i3))3} is a circuit in G4 of
A sequence of links {{i, i0}k}k in G G length 3.
P(P2(N)) constitutes a path if each link {i, i0 }k,
has one node in common with the preceding link
Players’ Preferences
{i, i0 }k1 and the other node in common with the
For the remainder of this entry we will assume
succeeding link {i, i0 }k+1. A circuit is a finite path
0 h that the set of players is given by the set of
fi, i gk k¼1 in G which begins at a node i and nodes N. Thus, henceforth the nodes represent
returns the same node. The length of a path is the players in the game of network formation.
number of links in the path. Let G(N) denote the collection of all coalitions
A sequence of labeled links {(a, {i, i0})k}k in of players (i.e., nonempty subsets of N) with typ-
G G P(A P2(N)) constitutes a path if each ical element denoted by S.
labeled link (a, {i, i0 })k has one node in common For each player i N let i be an irreflexive
with the preceding labeled link (a, {i, i0 })k1 and the binary relation on G( = P(P2(N)) or P(N N))
other node in common with the succeeding link (a,
{i, i0 })k+1. A circuit is a finite path {(a, {i, i′})k} (k = 1, . . ., h)
in G which begins at a node i and ends at the same flexive, G 6i G for all networks G G. Coalition
node. The length of a path is the number of labeled S0 G(N) prefers network G0 to network G,
links in the path. written G0 S0 G, if G0 6i G for all players i S0 .
Networks and Stability, Fig. 5 (panel labels: Path, Circuit)
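The path and circuit tests just defined are purely mechanical checks on arc sequences; a minimal sketch (hypothetical names) applied to the two labeled-arc sequences from G4 cited above:

```python
def is_path(arcs):
    """Each labeled arc (a, (i, j)) must begin where the preceding arc ends."""
    return all(arcs[k][1][0] == arcs[k - 1][1][1] for k in range(1, len(arcs)))

def is_circuit(arcs):
    """A nonempty path whose first arc begins at the node where its last arc ends."""
    return bool(arcs) and is_path(arcs) and arcs[0][1][0] == arcs[-1][1][1]

path = [("a1", ("i3", "i1")), ("a2", ("i1", "i1")), ("a1", ("i1", "i2"))]
circuit = [("a1", ("i3", "i1")), ("a2", ("i1", "i1")), ("a3", ("i1", "i3"))]

assert is_path(path) and not is_circuit(path)      # a path in G4 of length 3
assert is_circuit(circuit) and len(circuit) == 3   # a circuit in G4 of length 3
```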
Note that because players’ preferences {i}i N Definition 5 (Coalitional Preference Super-
are irreflexive, coalitional preferences, {S}S networks, Page et al. (2005)) Given feasible
G(N), are also irreflexive. set G ( = P(P2(N)) or P(N N)), a coalitional
preference supernetwork P is a subset of P
A Remark on Weak Preferences (G G) such that (pS0 , ðG, G0 Þ ) is contained in
Players are said to have weak preferences on P if and only if G0 S0 G.
G( = P(P2(N)) or P(N N)) denoted by ≳i if G0
≳i G means that player i either strongly prefers G0 The Rules of Network Formation
to G (denoted G0 i G) or is indifferent between G0 The rules of network formation are specified via a
and G (denoted G0 ~i G). If coalitional preferences collection of coalitional effectiveness relations
are based on weak preference, then we say that {!S}S G(N) defined on the feasible set of net-
coalition S0 P(N) weakly prefers network G0 to works G ( = P(P2(N)) or P(N N)). Each effec-
network G, written G0 wS0 G, if for all players i tiveness relation !S represents what a coalition
S0 , G0 ≳i G and if for at least one player i0 S0 , S can do. Thus, if G !S G0 this means that under
G0 i0 G. Note that if preferences are weak and G0 the rules of network formation coalition S G(N)
wS0 G where S0 consists of a single player i, so that can change network G G to network G0 G by
S0 = {i0 }, then G0 i0 G . Finally, note that weak adding, subtracting, or replacing connections in
coalitional preferences {wS}S G (N), are irreflex- G (where, depending on the feasible set, a con-
ive (i.e., G 6wS G for all G G and S G(N)). nection is a link or an arc).
denote the set of arc labels for preference arcs To illustrate, consider Fig. 6 depicting two
(or p-arcs for short). homogeneous linking networks G and G0 .
i2 i3 i2 i3
i2 i3 i2 i3
unilateral). G″ →{i1,i2,i3} G.
Thus, the Jackson-van den Nouweland rules
are the Jackson-Wolinsky rules without the Note that under the one-link-at-a-time restric-
one-link-at-a-time restriction. Note that if link tion, it is not possible under the Jackson-Wolinsky
rules to move directly from network G to network arc to player i0 without regard to the preferences of
G00 or directly from network G00 to network player i0 and can add and/or subtract arcs to sev-
G (i.e., G and G00 are not related under the effec- eral players simultaneously and can do so without
tiveness relations {!S}S G (N)). Instead, under the regard to those players’ preferences. Thus in gen-
Jackson-Wolinsky rules, the change from G to G00 eral under noncooperative rules, effectiveness
or from G00 to G requires two moves. For example, relations display a type of symmetry, and in par-
ticular, if G !{i}G0 then G0 !{i} G.
first G →{i2,i3} G′, and then, G′ →{i3} G″ or G′ →{i1} G″;
homogeneous directed networks G, G0 , and G00 .
Under the effectiveness relations implied by
or
noncooperative rules for networks G and G0 in
Fig. 8 we have
first G″ →{i1,i3} G′, and then, G′ →{i3} G or G′ →{i2} G.
G →{i1} G′, G′ →{i1} G.
Bala-Goyal Rules (Bala and Goyal 2000)
(Noncooperative Rules – Unilateral-Unilateral Note that under noncooperative rules, net-
Rules) works G and G00 in Fig. 8 are not related under
Now assume that the feasible set of networks is the effectiveness relations {!{i}}i N. However,
equal to the set of homogeneous directed net- under the noncooperative rules we have, for
works P(N N). Translating Bala and Goyal example, the following effectiveness relations
rules into our notation and terminology,
G →{i1} G′, G′ →{i2} G″
1. adding an arc from player i to player i0 requires
only that player i agree to add the arc (i.e., arc and
addition is unilateral and can be carried out
only by the initiator, player i); G″ →{i2} G′, G′ →{i1} G.
2. subtracting an arc from player i to player i0
requires only that player i agree to subtract Rules Supernetworks
the arc (i.e., arc subtraction is unilateral Again by viewing each network G in feasible set
and can be carried out only by the initiator, G as a node in a larger network, we can represent
player i); the rules of network formation as a heterogeneous
3. G !SG0 implies that |S| = 1 (i.e., only network directed network. To begin, let
changes brought about by individual players
are allowed). M := {mS : S ∈ G(N)}
We shall also refer to rules (i)-(iii) as noncoop- denote the set of arc labels for move arcs
erative. Note that a player i can add or subtract an (or m-arcs for short).
Networks and Stability, Fig. 8 (a) Network G. (b) Network G′. (c) Network G″
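Read this way, the Bala-Goyal effectiveness relation can be checked mechanically: a single player i is effective for the change from G to G′ exactly when every arc that is added or subtracted begins at i. A minimal sketch under that reading (function and network names hypothetical):

```python
def bg_effective(G, G_prime, i):
    """True if G -->_{i} G' under the noncooperative (Bala-Goyal) rules:
    every added or subtracted arc is initiated by player i."""
    changed = (G - G_prime) | (G_prime - G)   # arcs that differ between G and G'
    return bool(changed) and all(start == i for (start, _) in changed)

G  = {("i1", "i2")}
G1 = {("i1", "i2"), ("i1", "i3")}             # i1 adds the arc (i1, i3)

assert bg_effective(G, G1, "i1")              # unilateral addition by i1
assert bg_effective(G1, G, "i1")              # the reverse move is also open to i1 (symmetry)
assert not bg_effective(G, G1, "i2")          # i2 cannot add an arc that i1 initiates
```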
Definition 6 (Rules Supernetworks, Page et al. (pS′, (G, G′)) ∈ Gr and (mS′, (G, G′)) ∈ Gr for
(2005)) Given feasible set G ( = P(P2(N)) or some coalition S0 .
P(N N)), a rules supernetwork Rr is a subset
of M (G G) such that (mS′, (G, G′)) is
Indirect Dominance
contained in Rr if and only if G!S0 G0 , where r
Given feasible set G (= P(P2(N)) or P(N N)),
denotes the name of the network formation rules
coalitional preferences {S}S G (N) and coa-
in force. We shall adopt the convention that r = jw
litional effectiveness relations {!S}S G (N) net-
if the rules are Jackson-Wolinsky, r = jn if the
work G0 G indirectly dominates network
rules are Jackson-van den Nouweland, and r = bg
G G, written G0 ▷ ▷ G, if there is a finite
if the rules are Bala-Goyal or noncooperative.
sequence of networks,
Supernetworks G0 , G1 , . . . , Gh ,
Given feasible set G( = P(P2(N)) or P(N N)), with G = G0, G0 = Gh, and Gk G for k = 0, 1,
coalitional preferences {S}S G (N) coalitional . . . h, and a corresponding sequence of coalitions,
effectiveness relations {!S}S G (N) can be
represented by a heterogeneous directed network S1 , S2 , . . . , Sh ,
called a supernetwork (see Page and Kamat 2005; such that for k = 1, 2, . . ., h
Page et al. 2005). In particular, given preference
supernetwork P and rules supernetwork Rr the Gk−1 →Sk Gk, and Gk−1 ≺Sk Gh.
corresponding supernetwork is given by
Note that if network G0 indirectly dominates
Gr := P ∪ Rr
the initially deviating coalition S1, as well as all
Letting A := P ∪ M (i.e., the union of all the coalitions along the way, is that the ultimate
preference arcs and move arcs), then network outcome G0 = Gh be preferred. Thus, for
example, the initially deviating coalition S1 will
Gr ⊆ A × (G × G).
network G1 even if network G1 is not preferred
to network G = G0, as long as the ultimate net-
Dominance Relations
work outcome G0 = Gh is preferred to G0 that is, as
long as G0 ≺S1 Gh . Finally, note that indirect dom-
Direct Dominance
inance is irreflexive but not in general transitive.
Given feasible set G( = P(P2(N)) or P(N N)),
In order to capture the idea of farsightedness in
coalitional preferences {S}S G (N) and
strategic behavior, Chwe (1994) analyzed abstract
coalitional effectiveness relations {!S}S G (N),
games equipped with indirect dominance rela-
network G0 G directly dominates network
tions in great detail, introducing the equilibrium
G G, written G0 ▷ G, if for some coalition
notions of consistency and largest consistent set.
S0 G (N),
The basic idea of indirect dominance goes back to
the work of Guilbaud (1949) and Harsanyi (1974).
G ≺S′ G′ and G →S′ G′
Given the supernetwork representation of pref-
erences and rules, Gr, we can write, G0 ▷ ▷ G if
Thus, network G0 directly dominates network
there is a finite sequence of networks,
G if some coalition S0 prefers G0 to G and if under
the rules of network formation coalition S0 has the
power to change G to G0 . G0 , G1 , . . . , Gh ,
Note that direct dominance is irreflexive but
not in general transitive. Also note that if Gr is the with G = G0, G0 = Gh, and Gk G for k = 0, 1,
supernetwork, then G0 ▷ G if and only if . . . h, and a corresponding sequence of coalitions,
S1, S2, . . ., Sh, such that for k = 1, 2, . . ., h,
(mSk, (Gk−1, Gk)) ∈ Gr,
and
(pSk, (Gk−1, Gh)) ∈ Gr.
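Both dominance relations can be computed directly from the primitives on a small feasible set. The sketch below (toy networks, coalitions, and preferences, all hypothetical) encodes effectiveness and preferences as dictionaries and searches for the move sequence that indirect dominance requires.

```python
from itertools import product

networks   = ["Ga", "Gb", "Gc"]
coalitions = [("i1",), ("i2",), ("i1", "i2")]

# Toy primitives: effective[(S, G)] lists networks S can change G to;
# prefers[(S, G, H)] is True if coalition S prefers H to G.
effective = {(("i1",), "Ga"): ["Gb"], (("i2",), "Gb"): ["Gc"]}
prefers   = {(("i1",), "Ga", "Gc"): True, (("i2",), "Gb", "Gc"): True}

def directly_dominates(H, G):
    """H directly dominates G: some coalition prefers H to G and can change G to H."""
    return any(H in effective.get((S, G), []) and prefers.get((S, G, H), False)
               for S in coalitions)

def indirectly_dominates(H, G, seen=None):
    """H indirectly dominates G: a finite sequence of moves reaches H, every moving
    coalition preferring the final network H to the network it leaves."""
    seen = seen or {G}
    for S, nxt in product(coalitions, networks):
        if nxt in seen or nxt not in effective.get((S, G), []):
            continue
        if prefers.get((S, G, H), False) and (nxt == H or indirectly_dominates(H, nxt, seen | {nxt})):
            return True
    return False

assert not directly_dominates("Gc", "Ga")   # no single move from Ga reaches Gc
assert indirectly_dominates("Gc", "Ga")     # but a farsighted two-step sequence does
```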
Path Dominance
Any irreflexive dominance relation > on
G ( = P(P2(N)) or P(N N)) – for example, direct
or indirect – induces a path dominance relation
on the set of networks (sometimes referred to
as the transitive closure of >). In particular,
corresponding to dominance relation > on net- Networks and Stability, Fig. 9 > -Supernetwork D>
works G there is a corresponding path dominance
relation p on G specified as follows: Network subset of G G and where > -arc (G, G0 )
G0 G path dominates network G G with D> if and only if G0 > G (i.e., if and only if G0 >
respect to > (i.e., with respect to the underlying -dominates G). We call such a homogeneous
dominance relation > ), written G0 pG, if G0 = directed network (or directed graph)
G or if there exists a finite sequence of networks a > -supernetwork. For example, suppose G =
{Gk} (k = 0, . . ., h) in G with Gh = G′ and G0 = G such that
for k = 1, 2, . . ., h relation > on G is a direct dominance relation and
has the supernetwork representation given in
Gk > Gk−1.
Note that network G5 is > -reachable through
We refer to such a finite sequence of networks D> from network G1 by the domination path
as a finite domination path and we say network G0 given by the > -arc sequence
is > – reachable from network G if there exists a
finite domination path from G to G′. Thus,
{(G1, G6)1, (G6, G2)2, (G2, G3)3, (G3, G4)4, (G4, G5)5}.
G′ p G if and only if
G′ is >-reachable from G, or Thus, G5 path dominates G1. Note that network
G′ = G. G2 is > -reachable from network G2 by the dom-
ination circuit given by the > -arc sequence
Note that, even though the underlying domi-
nance relation > is irreflexive and intransitive or {(G2, G3)1, (G3, G4)2, (G4, G5)3, (G5, G2)4}.
transitive, the induced path dominance relation p
on G is both reflexive (G pG) and transitive and that network G3 is > -reachable from network
(G0 p G and G00 p G0 implies that G00 p G). G3 by two domination circuits given by the > -arc
sequences
> -Supernetworks
Let > denote the irreflexive dominance relation {(G3, G4)1, (G4, G5)2, (G5, G2)3, (G2, G3)4}
on G. It is often useful to represent > as a homo-
geneous directed network, D>, where D> is a and
{(G3, G4)1, (G4, G3)2}. direct or indirect (i.e., > is equal to ▷ or ▷▷),
induced by coalitional preferences and network
Because networks G2 and G5 are on the same formation rules; and (ii) games where the feasible
circuit, G5 is > -reachable from G2 and G2 set of networks is equipped with a path dominance
is > -reachable from G5. Thus, G5 path dominates relation p induced by such an irreflexive domi-
G2 (i.e., G5 pG2) and G2 path dominates G5 (i.e., nance relation
G2 pG5). The same cannot be said of networks
G1 and G5. In particular, while G5 pG1 it is not Network Formation Games with Respect to
true that G1 pG5 because G1 is not > -reachable Irreflexive Dominance
from G5. Finally, note that network G0 is isolated In this section we consider the abstract game with
in D>. In particular, G0 is not reachable through respect to irreflexive dominance given by the pair
D> from any network in G and no network in G is
reachable through D> from G In general, a net- ðG, >Þ:
work G G is isolated if there does not exist a
network G0 with G0 p G or G p G0 . Throughout this section we will assume that
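Since path dominance is just reachability in the >-supernetwork plus reflexivity, it can be computed as a reflexive-transitive closure. The sketch below uses only the >-arcs of Fig. 9 that can be recovered from the domination paths quoted above (the figure may contain arcs not listed here), so treat the arc set as illustrative.

```python
nets = [f"G{k}" for k in range(8)]
# >-arc (G, H) in the supernetwork D> means H > G (reconstructed, possibly incomplete).
arcs = {("G1", "G6"), ("G6", "G2"), ("G2", "G3"), ("G3", "G4"),
        ("G4", "G5"), ("G5", "G2"), ("G4", "G3")}

def path_dominates(H, G):
    """H path dominates G: H = G or H is >-reachable from G along arcs of D>."""
    seen, frontier = {G}, {G}
    while frontier:
        frontier = {b for (a, b) in arcs if a in frontier} - seen
        seen |= frontier
    return H in seen

assert path_dominates("G5", "G1") and not path_dominates("G1", "G5")
assert path_dominates("G2", "G5") and path_dominates("G5", "G2")   # same circuit
assert path_dominates("G0", "G0")                                  # reflexivity
```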
Note that if the direct dominance relation primitives are represented by supernetwork Gr: =
with > -supernetwork depicted in Fig. 9 has P [ Rr (where r is equal to jw, jn, or bg) and
underlying coalitional preferences {S}S G (N) >-supernetwork D> (where > is equal to ▷ or ▷▷),
and coalitional effectiveness relations {!S}S G (N) and that the feasible set of networks G is equal to
then the > – arc from network G3 to network G4 in the set of homogeneous linking networks
Fig. 9 means that for some coalition S, G4 is P(P2(N)) or the set of homogeneous directed net-
preferred to G3 and more importantly, that coali- works P(N N).
tion S has the power to change network G3 to
network G4. Thus, G3≺sSG4 and G3!SG4. But Quasi-Stability and Stability
because there is a > -arc in the opposite direction, We define the > -distance from G0 to G1 in D> to
from network G4 to network G3, G3 also be the length of the shortest > -path from G0 to G1
directly dominates G4. Thus for some coalition if G1 is > -reachable from G0 in D>, and + 1 if
S0 disjoint from coalition S(S0 \ S = Ø), G4 ≺sS0 G1 is not reachable from G0 in D>. We denote the
G3 and G4 !S0 G3 – Finally, note that if coalitional distance from G0 to G1 in D> by dD> ðG0 , G1 Þ.
preferences over networks are weak (i.e., are Thus,
based on weak preferences), then the statement
‘for some coalition S′ disjoint from coalition S’ can
be weakened to ‘for some coalition S′ not equal to
coalition S.’ With this weakening, the requirement
that the intersection of S and S′ be empty is no
longer needed.
dD>(G0, G1) = the length of the shortest >-path from G0 to G1 in D>, if G1 is reachable from G0 in D>;
dD>(G0, G1) = +∞, if G1 is not reachable from G0 in D>.
2. Q is externally quasi-stable, that is, given In fact, it follows from the Theorem due to
any G0 2 = Q, there exists G1 Q with dD> Chvatal and Lovasz (1972) that any finite set
ðG0 , G1 Þ 2. Z equipped with an irreflexive binary relation ≺
2. A subset S of networks in G is said to be stable has a ≺ -quasi-stable set. Moreover, if the relation
for network formation game (G, > ) if, ≺ is transitive, then any ≺ -quasi-stable set is ≺
1. S is internally stable and -stable.
2. S is externally stable, that is, given any Next we state two results on the existence of
G0 2 = S, there exists G1 S with stable sets.
d D> ðG0 , G1 Þ 1.
Theorem 2 (Existence of Stable Sets for Net-
Thus, if Q is externally quasi-stable, a path work Formation Games, Page and Kamat
of length at most 2 is required to get from any (2005))1. If D> contains no > -circuits, then
network outside of Q to a network in Q, there exists a unique stable set S for
whereas, if Q is externally stable a path of (G, > ).
length at most 1 is required. 2. If D> contains no > -circuits of odd length,
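The distance d_D> is an ordinary shortest-path length, so the internal and external stability tests reduce to distance checks. A sketch (hypothetical helper names; the arc set is the same reconstruction of Fig. 9 used above, so the concrete stable set shown is only illustrative):

```python
from collections import deque

nets = [f"G{k}" for k in range(8)]
arcs = {("G1", "G6"), ("G6", "G2"), ("G2", "G3"), ("G3", "G4"),
        ("G4", "G5"), ("G5", "G2"), ("G4", "G3")}

def distance(G0, G1):
    """Length of the shortest >-path from G0 to G1 in D>, or infinity if unreachable."""
    dist, queue = {G0: 0}, deque([G0])
    while queue:
        cur = queue.popleft()
        for (a, b) in arcs:
            if a == cur and b not in dist:
                dist[b] = dist[cur] + 1
                queue.append(b)
    return dist.get(G1, float("inf"))

def is_stable_set(S):
    """Internally stable: no member is within distance 1 of another member.
    Externally stable: every outside network is within distance 1 of some member.
    (Quasi-stability relaxes the external bound from 1 to 2, as in the text.)"""
    internal = all(distance(a, b) > 1 for a in S for b in S if a != b)
    external = all(any(distance(g, s) <= 1 for s in S) for g in nets if g not in S)
    return internal and external

assert is_stable_set({"G0", "G3", "G5", "G6", "G7"})   # one stable set for this arc set
```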
Letting then there exists a stable set S for (G, >).
network G0 F and any mS1-deviation to network the unique stable set with respect to path domi-
G1 G by coalition S1 (via adding, subtracting, nance induced by this new transitive indirect dom-
or replacing arcs in accordance with Rr there inance relation is contained in the largest
exists further deviations leading to some network farsightedly consistent set – and in this way
G2 F where the initially deviating coalition S1 show that the largest farsightedly consistent set
is not better off – and possibly worse off. is nonempty and externally stable.
A network G G is said to be farsightedly
consistent if G F where F is a farsightedly
Network Formation Games with Respect to
consistent set.
Path Dominance
If (i) G = P(P2(N)), (ii) coalitional preferences
In this section we consider the network formation
are weak, denoted by {wS}S G(N), so that indi-
game with respect to path dominance given by the
rect dominance is weak (denoted by ⊳ ⊳w and
pair
defined in the obvious way), and (iii) coalitional
effectiveness relations are determined by Jackson- (G, p).
Wolinsky rules, then the notion of farsighted con-
sistency above (essentially due to Chwe (1994)) is Throughout this section we will assume that
closely related to the notion of pairwise farsighted the underlying primitives,
stability introduced in Herings et al. (2006).
For any game (G, ⊳⊳), there can be many far- (G, {≻S}S∈G(N), {→S}S∈G(N), >),
sightedly consistent sets. We shall denote by F* the
largest farsightedly consistent set. Thus, if F is far- are such that G is equal to the set of homogeneous
sightedly consistent, then F F
. Unlike quasi- linking networks P(P2(N)) or the set of homoge-
stable sets and stable sets where existence implies neous directed networks P(N N), that the dom-
nonemptiness, in considering farsightedly consistent inance relation > on G is given by either direct
sets two critical questions arise: (i) does there exist a dominance ⊳ or indirect dominance ⊳⊳, and that
largest farsightedly consistent set of networks for primitives are represented by supernetwork Gr ≔
(G, ⊳⊳), and (ii) is it nonempty? Our next result P [ Rr (with r equal to jw, jn, or bg)
provides a positive answer to both questions. and > -supernetwork D>.
We will present three notions of stability intro-
Theorem 3 (Existence, Uniqueness, and Non- duced in (Page and Wooders 2005, 2008) for
emptiness of F*, Page et al. (2005)) There exists abstract games of network formation with respect
a unique, nonempty largest farsightedly consistent to path dominance over heterogeneous directed
set F* for network formation game (G, ⊳⊳). networks: (i) strategic basins of attraction,
Moreover, F* is externally stable; that is, if net- (ii) path dominance stable sets, and (iii) the path
work G is not contained in F*, then there exists a dominance core.
network G0 contained in F* that indirectly domi-
nates G (i.e., G0 ⊳⊳G).
The method of proving existence and unique- Preliminaries
ness is a straightforward, supernetwork rendition
of Chwe’s (1994) method and is similar to the Networks Without Descendants If G1pG0
method introduced by Roth (1975, 1977), Page and G0 pG1, networks G1 and G0 are equivalent,
and Kamat (2005) provide an alternative proof written G1 pG0. If networks G1 and G0 are
(to that of Chwe and of (Page et al. 2005)) of the equivalent then either networks G1 and G0 coin-
nonemptiness and external stability of the largest cide or G1 and G0 are on the same circuit (see
consistent set (with respect to indirect domi- Fig. 9 above for a picture of a circuit). If G1 p G0
nance). In particular, Page and Kamat modify the but G1 and G0 are not equivalent (i.e., not
indirect dominance relation so as to make it tran- G1 pG0), then network G1 is a descendant of
sitive as well as irreflexive. They then show that network G0 and we write
Theorem 4 (All Path Dominance Network For- 1. A is a basin of attraction for (G, p).
mation Games Have Networks Without 2. There exists a network without descendants,
Descendants, Page and Wooders (2005, G ϵ Z, such that
2008)) In network formation game (G, p)
every network G ϵ G is path dominated by a
network G0 ϵ G without descendants (i.e., G0pG A ¼ G0 Z : G0 p G :
and G0 has no descendants).
By Theorem 4, in network formation game In light of Theorem 5, we conclude that in any
(G, p), corresponding to any network G ϵ G network formation game (G, p), G contains a
there is a network G0 ϵ G without descendants unique, finite, disjoint collection of basins of
which is > -reachable from G. Thus, in any net- attraction, say {A1, A2, . . ., Am}, where for each
work formation game the set of networks without k = 1,2, . . ., m (m 1 )
descendants is nonempty. Referring to Fig. 9, the
set of networks without descendants is given by Ak = AG := {G′ ∈ Z : G′ p G}
{G0, G2, G3, G4, G5, G7}.
for some network G ϵ Z. Note that for networks
We shall denote by Z the set of networks with- G′ and G in Z such that G′ p G, AG′ = AG
out descendants. (i.e., the basins of attraction AG′ and AG coincide).
Also, note that if network G ϵ G is isolated, then
Basins of Attraction G ϵ Z and
Stated loosely, a basin of attraction is a set of
equivalent networks to which the strategic net- AG :¼ G0 Z : G0 p G ¼ fGg
work formation process represented by the game
might tend and from which there is no escape. is, by definition, a basin of attraction – but a very
Formally, we have the following definition. uninteresting one.
Z = {G0, G2, G3, G4, G5, G7}. {A1, A2, . . ., Am},
Even though there are six networks without where basin of attraction Ak contains |Ak| many
descendants, because networks G2, G3, G4, and networks (i.e., |Ak| is the cardinality of Ak). Then
G5 are equivalent, there are only three basins of the following statements are true:
attraction:
1. V G is a stable set for (G, p) if and only if
A1 = {G0}, A2 = {G2, G3, G4, G5}, and A3 = {G7}.
each basin of attraction, that is, if and only if
Moreover, because G2, G3, G4, and G5 are V is of the form
equivalent,
V = {G1, G2, . . ., Gm},
AG2 = AG3 = AG4 = AG5 = {G2, G3, G4, G5}.
where Gk ϵ Ak for k = 1, 2, . . ., m.
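Networks without descendants and the basins of attraction can be computed from the same reachability closure: keep only the networks from which every path-dominating network path-dominates back, then group them into equivalence classes. A sketch on the reconstructed Fig. 9 arc set (again only the arcs recoverable from the quoted paths, so illustrative):

```python
nets = [f"G{k}" for k in range(8)]
arcs = {("G1", "G6"), ("G6", "G2"), ("G2", "G3"), ("G3", "G4"),
        ("G4", "G5"), ("G5", "G2"), ("G4", "G3")}

def reach(G):
    """All networks >-reachable from G (every H that path-dominates G)."""
    seen, frontier = {G}, {G}
    while frontier:
        frontier = {b for (a, b) in arcs if a in frontier} - seen
        seen |= frontier
    return seen

def equivalent(G, H):
    """G ~p H: each path dominates the other (same network or same circuit)."""
    return H in reach(G) and G in reach(H)

# Networks without descendants: everything they lead to leads back to them.
Z = [G for G in nets if all(equivalent(G, H) for H in reach(G))]
basins = {frozenset(H for H in Z if equivalent(G, H)) for G in Z}

assert set(Z) == {"G0", "G2", "G3", "G4", "G5", "G7"}
assert basins == {frozenset({"G0"}), frozenset({"G7"}),
                  frozenset({"G2", "G3", "G4", "G5"})}
```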
Under the classical notion of Pareto efficiency, stable networks (Jackson and Wolinsky 1996), the
a network G is said to be Pareto efficient if and set of strongly stable networks (Dutta and
only if there does not exist another network G0 Mutuswami 1997; Jackson and van den
such that G≺iG0 for all players i N, regardless Nouweland 2005), or the set of Nash networks
of whether or not some coalition S can change (Bala and Goyal 2000). We also present results on
network G to network G0 . Letting PE denote the the relationships between basins of attraction, the
set of all classically Pareto efficient networks, it is path dominance core, and the largest farsightedly
easy to see that PE ⊆ E. Note, however, that if consistent set (Chwe 1994). All of these results
under the rules of network formation, any network follow immediately from more general results in
G can be changed to any other network G0 via Page and Wooders (2005, 2008) where the notions
the actions of some coalition S, then the notions of strong stability, pairwise stability, Nash stabil-
of constrained Pareto efficiency and classical ity, and farsighted consistency are all extended to
Pareto efficiency are equivalent. Thus, if the heterogeneous directed networks.
collection of coalitional effectiveness relations
{!S}S G(N) on G is complete, that is, if for any Strongly Stable Homogeneous Networks
pair of networks G and G0 in G, G!SG0 for some We begin with a formal definition of strong sta-
coalition S G(N), then PE = E, and we have bility based on that of Jackson-van den
C ⊆ PE = E. Nouweland (2005),
related to that given by Dutta-Mutuswami (Dutta another network by any coalition, or can only
and Mutuswami 1997). be changed by coalitions of size greater than 2,
We now have our main result on the path is pairwise stable.
dominance core and strong stability. Denote the Let PS denote the set of pairwise stable net-
set of strongly stable networks by SS. works. It follows from the definitions of strong
stability and pairwise stability that SS ⊆
Theorem 8 (The Path Dominance Core and PS. Moreover, if the full set of Jackson-Wolinsky
Strong Stability, Page and Wooders (2005, rules are in force, then SS = PS. Jackson-van den
2008)) Assume that G is equal to the set of all Nouweland (Jackson and van den Nouweland
homogeneous linking networks, P(P2(N)), and that 2005) provide two examples of the potential for
the Jackson-van den Nouweland rules are in force. strong stability to refine pairwise stability (i.e.,
two examples where SS is a strict subset of PS).
1. If the path dominance core C of (G, p) is However, under Jackson-Wolinsky rules because
nonempty, then SS is nonempty and C ⊆ SS. However, under Jackson-Wolinsky rules because
2. If the dominance relation > underlying p is a and because deviations by coalitions of more than
direct dominance relation ⊳, then C = SS and two players are not possible such refinements are
SS is nonempty if and only if there exists a two players are not possible, such refinements are
basin of attraction containing a single network. not possible, driving SS and PS to equality.
dominance core and pairwise stability.
Note that the set of strongly stable homoge-
neous linking networks is contained in the set of Theorem 9 (The Path Dominance Core and
constrained Pareto efficient homogeneous linking Pairwise Stability, Page and Wooders (2005,
networks. Thus, C ⊆ SS ⊆ E. 2008)) Assume that G is equal to the set of all
homogeneous linking networks, P(P2(N)), and
Pairwise Stable Networks that the Jackson-Wolinsky rules are in force.
The following definition of pairwise stability is a
translation of the Jackson-Wolinsky definition 1. If the path dominance core C of (G, p) is
(Jackson and Wolinsky 1996). nonempty, then PS is nonempty and C ⊆ PS.
2. If the dominance relation > underlying p is a
Definition 13 (Pairwise Stability, Page and direct dominance relation ⊳, then C = PS and
Wooders (2005, 2008)) Assume that G is equal PS is nonempty if and only if there exists a
to the set of all homogeneous linking networks, basin of attraction containing a single net-
P(P2(N)), and that the Jackson-Wolinsky rules are work. Theorem 9 can be viewed as an exten-
in force. Network G G is said to be pairwise sion of a result due to Jackson and Watts (2002)
stable in (G, p) if for all {i, i0} P2(N), on the existence of pairwise stable homoge-
neous linking networks for network formation
1. G →{i,i′} G ∪ {i, i′} implies that G ⊀{i,i′} G ∪ {i, i′};
2. G →{i} G \ {i, i′} implies that G ⊀{i} G \ {i, i′}, and
3. G →{i′} G \ {i, i′} implies that G ⊀{i′} G \ {i, i′}.
Thus, a homogeneous linking network is
pairwise stable if there is no incentive for any exists a pairwise stable network. Our notion
pair of players to add a link to the existing of a strategic basin of attraction containing
network and there is no incentive for any multiple networks corresponds to their notion
player who is party to a link in the existing of a closed cycle of networks. Thus, stated in
network to dissolve or remove the link. Note our terminology, Jackson and Watts show that
that under our definition of pairwise stability a for this class of network formation games, if
network G G that cannot be changed to there does not exist a basin of attraction
containing multiple networks, then there exists addition the dominance relation is direct, then
a pairwise stable network. Following our C = SS = NE ⊆ E.
approach, by part 2 of Theorem 9 the existence
of at least one strategic basin containing a Farsightedly Consistent Networks
single network is both necessary and sufficient Our final result summarizes the relationships
for the existence of a pairwise stable network. between basins of attraction, the path dominance
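In the payoff-based form of Jackson and Wolinsky (1996), pairwise stability amounts to two checks: no player gains by deleting one of his links, and no unlinked pair can add their link with one gaining strictly and the other not losing. A sketch with toy payoffs (hypothetical; the entry itself states the conditions through effectiveness and preference relations):

```python
from itertools import combinations

players = ["i1", "i2", "i3"]

def Y(i, links):
    """Toy allocation: each link is worth 1 to each endpoint, at a per-link cost of 0.6."""
    return sum(1 - 0.6 for l in links if i in l)

def pairwise_stable(links):
    for i, j in combinations(players, 2):
        l = frozenset({i, j})
        if l in links:                                   # no endpoint wants to sever l
            if Y(i, links - {l}) > Y(i, links) or Y(j, links - {l}) > Y(j, links):
                return False
        else:                                            # no profitable joint addition of l
            plus = links | {l}
            if (Y(i, plus) > Y(i, links) and Y(j, plus) >= Y(j, links)) or \
               (Y(j, plus) > Y(j, links) and Y(i, plus) >= Y(i, links)):
                return False
    return True

complete = {frozenset(p) for p in combinations(players, 2)}
assert pairwise_stable(complete)    # each link's benefit (1) exceeds its cost (0.6)
assert not pairwise_stable(set())   # any two players would jointly add a link
```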
core, and the largest farsightedly consistent set.
Nash Networks
Theorem 11 (Basins of Attraction, The Path
The following definition of Nash networks is a var-
Dominance Core, and the Largest Consistent
iation on the definition from Bala and Goyal (2000).
Set, Page and Wooders (2005, 2008)) Assume
that (i) G is equal to the set of homogeneous
Definition 14 (Nash Networks, Page and
linking networks P(P2(N)) and that the Jackson-
Wooders (2005, 2008)) Assume that G is equal
Wolinsky rules or the Jackson-van den
to the set of all homogeneous directed networks,
Nouweland rules are in force; or (ii) that G is
P(N N), and that the Bala-Goyal rules are in
equal to the set of homogeneous directed net-
force. Network G ϵ G is said to be a Nash network
works P(N N)) and that the Bala-Goyal rules
in (G, p) if for all G′ ∈ G, G →{i′} G′ implies that
are in force.
G ⊀{i′} G′. Given network formation game (G, p), where
Thus, a homogeneous directed network is path dominance is induced by an indirect domi-
Nash if whenever an individual player has the
power to change the network to another network, ality that (G, p) has nonempty largest consistent
the player will have no incentive to do so. We shall set given by F* and basins of attraction given by
denote by NE the set of Nash networks. Note that
under our definition any network that cannot be fA1 , A2 , . . . , Am g:
changed to another network by a coalition of size
1 is a Nash network. Finally, note that the set of Then the following statements are true:
strongly stable networks SS is contained in the set
of Nash networks NE. 1. Each basin of attraction Ak, k = 1,2, . . ., m, has
We now have our main result on the path a nonempty intersection with the largest con-
dominance core and Nash stability. sistent set F *, that is
C ⊆ F*.
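Under the Bala-Goyal rules a player controls exactly the arcs he initiates, so the Nash test enumerates each player's alternative arc sets and checks that none is strictly preferred. A sketch with a toy payoff (hypothetical; benefit 0.5 per distinct neighbor, cost 0.3 per arc formed):

```python
from itertools import combinations

players = ["i1", "i2", "i3"]

def payoff(i, G):
    """Toy payoff: 0.5 per distinct neighbor (arc in either direction, no loops),
    minus 0.3 per arc that i initiates."""
    neighbors = {b for (a, b) in G if a == i} | {a for (a, b) in G if b == i}
    return 0.5 * len(neighbors - {i}) - 0.3 * sum(1 for (a, _) in G if a == i)

def unilateral_alternatives(i, G):
    """Every network player i can reach alone: replace the set of arcs i initiates."""
    kept = {(a, b) for (a, b) in G if a != i}
    targets = [j for j in players if j != i]
    for r in range(len(targets) + 1):
        for combo in combinations(targets, r):
            yield kept | {(i, j) for j in combo}

def is_nash(G):
    return all(payoff(i, G) >= payoff(i, H)
               for i in players for H in unilateral_alternatives(i, G))

assert is_nash({("i1", "i2"), ("i2", "i3"), ("i3", "i1")})   # every pair linked exactly once
assert not is_nash(set())                                     # forming one arc pays 0.5 - 0.3 > 0
```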
for which this is true? In general, if the irreflexive As has been shown by Monderer and Shapley
dominance relation > inducing path dominance (1996), potential games are closely related to con-
p is transitive, then the > -supernetwork D > is gestion games introduced by Rosenthal (1973).
without circuits, and therefore all basins of attrac- Page and Wooders (2007) introduce a club network
tion for the game (G, P) contain a single net- formation game which is a variant of the noncoop-
work. Unfortunately, if the dominance relation is erative network formation game described above –
given by direct or indirect dominance, then tran- but for a class of heterogenous directed networks –
sitivity fails to hold in general. In the next two and using methods similar to those introduced by
sections we identify several classes of network Hollard (2000), show that this game possesses a
formation games having singleton basins. potential function. Prior papers studying potential
games in the context of linking networks include
Qin (1996), Slikker et al. (2000) and Slikker and
Network Formation Games and Potential van den Nouweland (2002). These papers have
Functions focused on providing the strategic underpinnings
Assume that G is equal to the set of homogeneous of the Myerson value (Myerson (1977) and
directed networks, P(N N), and that the Bala- Aumann and Myerson (1988)).
Goyal rules are in force, so that primitives are
S
represented by supernetwork Gbg: = P Rbg. In Jackson-Wolinsky Network Formation Games
addition assume that player preferences over Assume that G is equal to the set of all homoge-
P(N N) are specified via payoff functions neous linking networks, P(P2(N)), and that the
{vi( )}i N and that the dominance Jackson-van den Nouweland rules are in force,
relation > over P(N N) is given by direct so that rules are represented by rules super-
dominance ⊳ . Thus, G0 ⊳ G if and only if network Rjn. In addition assume that player pref-
for some player i0 N, G ! {i0}G0 and erences over P(P2(N)) are weak and therefore that
vi0 ðG0 Þ > vi0 ðGÞ coalitional preferences, {wS}S G(N), are weak.
We say that the noncooperative network for- Finally, assume that the dominance relation > on
mation game (G, p) is a potential game if there P(P2(N)) is given by direct dominance – but
exists a function because coalitional preferences are weak, direct
dominance is weak, denoted by ⊳w.
Pð Þ : G ! R In the Jackson-Wolinsky network formation
game coalitional preferences are specified by
such that for all G and G0 with G ! fi0 gG0 for player payoff functions, {vi( )}i N, and player
some player i0 , payoff functions are in turn specified by a network
value function
vi0 ðG0 Þ > vi0 ðGÞ if and onlyif PðG0 Þ > PðGÞ:
vð Þ : G ! R
It is easy to see that any noncooperative net- together with an allocation rule, Y(G, v) =
work formation game (G, p) possessing a poten- (Yi(G, v))i N R|N| satisfying
tial function (i.e., a potential game) has no
circuits, and thus possesses strategic basins of X
Y i ðG, vÞ ¼ vðGÞ:
attraction each consisting of a single network. iN
Thus, we can conclude from our Theorem 8 that
any noncooperative network formation game Thus in the Jackson-Wolinsky game, each
possessing a potential function has a nonempty player’s payoff function vi.( ) is given by Yi.( , v)
path dominance core. In addition, we know from where v( ) is the network value function. The basic
our Theorem 10 that in this example the path idea here is that given network G, v(G) is the total
dominance core C is equal to the set of Nash value generated by network G and Yi(G, v) is value
networks NE. allocated to player i. Translating Jackson-Wolinsky
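Whether a given function is an (ordinal) potential can be verified by brute force on a small feasible set: every unilateral change must raise the deviator's payoff exactly when it raises P. A sketch with toy payoffs chosen so that such a potential exists (all names hypothetical):

```python
from itertools import combinations

players = ["i1", "i2", "i3"]
all_arcs = [(i, j) for i in players for j in players if i != j]
feasible = [frozenset(s) for r in range(len(all_arcs) + 1)
            for s in combinations(all_arcs, r)]      # all homogeneous directed networks on N

def v(i, G):
    """Toy payoff: 1 per arc received, minus a convex cost of the arcs i forms."""
    received = sum(1 for (a, b) in G if b == i)
    formed = sum(1 for (a, b) in G if a == i)
    return received - 0.3 * formed ** 2

def P(G):
    """Candidate potential: the cost terms, the only part a deviator can change."""
    return sum(-0.3 * sum(1 for (a, b) in G if a == i) ** 2 for i in players)

def is_ordinal_potential(P):
    """v_i(G') > v_i(G) iff P(G') > P(G), for every unilateral change by some player i."""
    for G in feasible:
        for H in feasible:
            movers = {a for (a, b) in G ^ H}
            if len(movers) == 1:                     # a one-player (Bala-Goyal) move
                i = next(iter(movers))
                if (v(i, H) > v(i, G)) != (P(H) > P(G)):
                    return False
    return True

assert is_ordinal_potential(P)
```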
into our abstract game model, if G0⊳wG then one of 9, (G, p) has a basin of attraction containing a
the following is true: single network.
Finally, Jackson (2003) has shown that if the
1. G →{i,i′} G′ where G′ = G ∪ {i, i′} (a link allocation rule is given by the Myerson value (see
between players i and i′ is added) and Aumann and Myerson 1988; Myerson 1977);
Yi(G′, v) ≥ Yi(G, v) and Yi′(G′, v) ≥ Yi′(G, v), that is, if
with strict inequality for at least one of the
players;
2. G →{i} G′ where G′ = G \ {i, i′} (a link between
players i and i′ is subtracted) and
Yi(G′, v) > Yi(G, v);
3. G →{i′} G′ where G′ = G \ {i, i′} (a link between
players i and i′ is subtracted) and Yi′(G′, v) >
Yi′(G, v).
vi(G) = Yi(G, v) = Σ S⊆N\{i} [v(GS∪{i}) − v(GS)] · |S|!(|N| − |S| − 1)!/|N|!,
where
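The allocation in the last display is a Shapley-value computation over link-restricted subnetworks, so it can be evaluated directly (exponentially in |N|). A sketch with a toy network value function (hypothetical):

```python
from itertools import combinations
from math import factorial

players = ["i1", "i2", "i3"]

def restrict(G, S):
    """G_S: the links of G whose endpoints both lie in S."""
    return {l for l in G if set(l) <= set(S)}

def v(G):
    """Toy network value: 1 per link, plus a bonus of 0.5 when all three links are present."""
    return len(G) + (0.5 if len(G) == 3 else 0.0)

def myerson(i, G):
    """Y_i(G, v): sum over S in N\\{i} of [v(G_{S+i}) - v(G_S)] * |S|!(|N|-|S|-1)!/|N|!."""
    others = [j for j in players if j != i]
    n, total = len(players), 0.0
    for r in range(len(others) + 1):
        for S in combinations(others, r):
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            total += weight * (v(restrict(G, set(S) | {i})) - v(restrict(G, S)))
    return total

complete = {frozenset(p) for p in combinations(players, 2)}
shares = {i: myerson(i, complete) for i in players}
assert abs(sum(shares.values()) - v(complete)) < 1e-9       # allocation exhausts v(G)
assert max(shares.values()) - min(shares.values()) < 1e-9   # symmetric players share equally
```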
or networks with a continuum of nodes. As in the Berge C (2001) The theory of graphs. Dover, Mineola
framework of cooperative games (cf., (reprint of the translated French edition published by
Dunod, Paris, 1958)
Kovalenkov and Wooders (2001) and Wooders Bhattacharya A (2005) Stable and efficient networks with
(1983, 2008b) for cores of transferable utility farsighted players: the largest consistent set. University
and nontransferable utility games) does it hold of York, York. Typescript
that some notion of the approximate path domi- Bloch F (1995) Endogenous structures of association in
oligopolies. Rand J Econ 26:537–556
nance core is nonempty if the numbers of players Bloch F (2005) Group and network formation in industrial
is sufficiently large? Do networks tend towards organization: a survey. In: Demange G, Wooders
having some property analogous to the equal M (eds) Group formation in economics: networks,
treatment property (as in, for example, Debreu clubs, and coalitions. Cambridge University Press,
Cambridge, pp 335–353
and Scarf (1963) and Green (1972) for exchange Bloch F, Genicot G, Ray D (2008) Informal insurance in social
economies, (Kovalenkov and Wooders 2001) or networks. J Econ Theory. https://doi.org/10.1016/j.jet.
Wooders (2008a), for cooperative games or 2008.01.008
Gravel and Thoron (2007) for local public goods Blume L (1993) The statistical mechanics of strategic
interaction. Games Econ Behav 5:387–424
economies, or Jackson and Watts (2008) for a Bogomolnaia A, Jackson MO (2002) The stability of hedonic
repeated game approach to a matching model). coalition structures. Games Econ Behav 38:201–230
Then there is the problem of characterizing stra- Bollobas B (1998) Modern graph theory. Springer,
tegic behavior in large networks (as in, for exam- New York
Boorman SA (1975) A combinatorial optimization model
ple, Kalai (2004) or Wooders et al. (2006)). Under for transmission of job information through contact
what conditions and to what extent might Kalai’s networks. Bell J Econ 6:216–249
“ex-post stability” or Wooders, Cartwright and Bramoulle Y, Kranton R (2007a) Public goods in networks.
Selteri’s social conformity continue to hold in J Econ Theory 135:478–494
Bramoulle Y, Kranton R (2007b) Risk-sharing networks.
strategic network formation? J Econ Behav Organ 64:275–294
Calvo-Armengol A (2004) Job contact networks. J Econ
Theory 115:191–206
Acknowledgments This paper was begun while Page and Calvo-Armengol A, Jackson MO (2004) The effects of
Wooders were visiting CERMSEM at the University of social networks on employment and inequality. Am
Paris 1 in June and October of 2007. The authors thank Econ Rev 94:426–454
CERMSEM and Paris 1 for their hospitality. URLs: http:// Calvo-Armengol A, Jackson MO (2007) Social networks
mypage.iu.edu/~lpage. http://www.mymawooders.com. in labor markets: wage and employment dynamics and
inequality. J Econ Theory 132:27–46
Calvo-Armengol A, Ballester C, Zenou Y (2006) Who’s
who in networks. Wanted: the key player.
Bibliography Econometrica 75:1403–1418
Casella A, Rauch J (2002) Anonymous market and group
Allouch N, Wooders M (2007) Price taking equilibrium in ties in international trade. J Int Econ 58:19–47
economies with multiple memberships in clubs and Casella A, Rauch J (2003) Overcoming informational bar-
unbounded club sizes. J Econ Theory. https://doi.org/ riers in international resource allocations: prices and
10.1016/j.jet.2007.07.06 ties. Econ J 113:21–42
Arnold T, Wooders M (2006) Club formation with coordi- Chvatal V, Lovasz L (1972) Every directed graph has a
nation. University of Warwick working paper 640 semi-kernel. In: Hypergraph seminar, lecture notes in
Aumann RJ (1964) Markets with a continuum of traders. Chwe M (1994) Farsighted coalitional stability. J Econ
Econometrica 32:39–50 Chwe M (1994) Farsighted coalitional stability. J Econ
Aumann RJ, Myerson RB (1988) Endogenous formation Chwe M (2000) Communication and coordination in social
of links between players and coalitions: an application Chwe M (2000) Communication and coordination in social
of the Shapley value. In: Roth A (ed) The Shapley networks. Rev Econ Stud 67:1–16
value. Cambridge University Press, Cambridge, Corominas-Bosch M (2004) Bargaining in a network of
pp 175–191 buyers and sellers. J Econ Theory 115:35–77
Bala V, Goyal S (2000) A noncooperative model of net- Currarini S (2007) Group stability of hierarchies in games
work formation. Econometrica 68:1181–1229 with spillovers. Math Soc Sci 54:187–202
Banerjee S, Konishi H, Sonmez T (2001) Core in a simple Currarini S, Morelli M (2000) Network formation with
coalition formation game. Soc Choice Welf 18:135–158 sequential demands. Rev Econ Des 5:229–249
Belleflamme P, Bloch F (2004) Market sharing agreements Debreu G, Scarf H (1963) A limit theorem on the core of an
and collusive networks. Int Econ Rev 45:387–411 economy. Int Econ Rev 4:235–246
van Deemen AMA (1991) A note on generalized stable set. Herings PJ-J, Mauleon A, Vannetelbosch V (2006) Far-
Soc Choice Welf 8:255–260 sightedly stable networks. Meteor Research Memoran-
Demange G (1994) Intermediate preferences and stable dum RM/06/041
coalition structures. J Math Econ 23:45–48 Hojman D, Szeidl A (2006) Endogenous networks, social
Demange G (2004) On group stability and hierarchies in games and evolution. Games Econ Behav 55:112–130
networks. J Political Econ 112:754–778 Hollard G (2000) On the existence of a pure strategy equilib-
Demange G, Henreit D (1991) Sustainable oligopolies. rium in group formation games. Econ Lett 66:283–287
J Econ Theory 54:417–428 Inarra E, Kuipers J, Olaizola N (2005) Absorbing and
Deroian F, Gannon F (2005) Quality improving alliances in generalized stable sets. Soc Choice Welf 24:433–437
differentiated oligopoly. Int J Ind Organ 24:629–637 Jackson MO (2003) The stability and efficiency of eco-
Diamantoudi E, Xue L (2003) Farsighted stability in nomic and social networks. In: Dutta B, Jackson MO
hedonic games. Soc Choice Welf 21:39–61 (eds) Networks and groups: models of strategic forma-
Durlauf S (1997) Statistical mechanics approaches to tion. Springer, Heidelberg, pp 99–141
socioeconomic behavior. In: Arthur WB, Durlauf S, Jackson MO (2005) A survey of models of network for-
Lane DA (eds) The economy as an evolving complex mation: stability and efficiency. In: Demange G,
system II. Addison-Wesley, Reading, pp 81–104 Wooders M (eds) Group formation in economics: net-
Dutta B, Mutuswami S (1997) Stable networks. J Econ works, clubs, and coalitions. Cambridge University
Theory 76:322–344 Press, Cambridge, pp 11–57
Dutta B, Ghosal S, Ray D (2005) Farsighted network Jackson MO, van den Nouweland A (2005) Strongly stable
formation. J Econ Theory 122:143–164 networks. Games Econ Behav 51:420–444
Even-Dar E, Kearns M, Suri S (2007) A network formation Jackson MO, Watts A (2002) The evolution of social and
game for bipartite exchange economies. Computer and economic networks. J Econ Theory 106:265–295
Information Science typescript, University of Jackson MO, Watts A (2008) Social games: matching and
Pennsylvania the play of finitely repeated games. Games Econ Behav.
Furusawa T, Konishi H (2007) Free trade networks. J Int https://doi.org/10.1016/j.geb.2008.02.004
Econ 72:310–335 Jackson MO, Wolinsky A (1996) A strategic model of social
Galeana-Sanchez H, Xueliang L (1998) Semikernels and and economic networks. J Econ Theory 71:44–74
(k, l)-kernels in digraphs. SIAM J Discret Math 14:402–411
11:340–346 72:1631–1665
Galeotti A, Moraga-Gonzalez JL (2007) Segmentation, Kalai E, Schmeidler D (1977) An admissible set occurring
advertising and prices. Int J Ind Organ. https://doi.org/ in various bargaining situations. J Econ Theory
10.1016/i.ijindorg.2007.11.002 14:402–411
Gillies DB (1959) Solutions to general non-zero-sum Kalai E, Pazner A, Schmeidler D (1976) Collective choice
games. In: Tucker AW, Luce RD (eds) Contributions correspondences as admissible outcomes of social
to the theory of games, vol 4. Princeton University bargaining processes. Econometrica 44:233–240
Press, Princeton, pp 47–85 Kirman A (1983) Communication in markets: a suggested
Goyal S (2005) Learning in networks. In: Demange G, approach. Econ Lett 12:101–108
Wooders M (eds) Group formation in economics: net- Kirman A, Herreiner D, Weisbuch G (2000) Market orga-
works, clubs, and coalitions. Cambridge University nization and trading relationships. Econ J 110:411–436
Press, Cambridge, pp 122–167 Konishi H, Ray D (2003) Coalition formation as a dynamic
Goyal S (2007) Connections: an introduction to the eco- process. J Econ Theory 110:1–41
nomics of networks. Princeton University Press, Konishi H, Le Breton M, Weber S (1998) Equilibrium in a
Princeton finite local public goods economy. J Econ Theory
Goyal S, Joshi S (2003) Networks of collaboration in 79:224–244
oligopoly. Games Econ Behav 43:57–85 Kovalenkov A, Wooders M (2001) Epsilon cores of games
Goyal S, Joshi S (2006) Bilateralism and free trade. Int with limited side payments: nonemptiness and equal
Econ Rev 47:749–778 treatment. Games Econ Behav 36:193–218
Goyal S, Moraga-Gonzalez JL (2001) R&D networks. Kovalenkov A, Wooders M (2003) Approximate cores of
Rand J Econ 32:686–707 games and economies with clubs. J Econ Theory
Granovetter M (1973) The strength of weak ties. Am 110:87–120
J Sociol 78:1360–1380 Kranton R, Minehart D (2000) Networks versus vertical
Gravel N, Thoron S (2007) Does endogenous formation of integration. RAND J Econ 31:570–601
jurisdictions lead to wealth stratification? J Econ The- Kranton R, Minehart D (2001) A theory of buyer- seller
ory 132:569–583 networks. Am Econ Rev 91:485–508
Green J (1972) On the inequitable nature of core alloca- Li S (1992) Far-sighted strong equilibrium and oligopoly.
tions. J Econ Theory 4:132–143 Econ Lett 40:39–44
Guilbaud GT (1949) La theorie des jeux. Econ Appl 2:18 Li S (1993) Stability of voting games. Soc Choice Welf
Harsanyi JC (1974) An equilibrium-point interpretation of 10:51–56
stable sets and a proposed alternative definition. Manag Lucas WF (1968) A game with no solution. Bull Am Math
Sci 20:1472–1495 Soc 74:237–239
Luo X (2001) General systems and j-stable sets – a formal Qin C-Z (1996) Endogenous formations of cooperation
analysis of socioeconomic environments. J Math Econ structures. J Econ Theory 69:218–226
36:95–109 Rees A (1966) Information networks in labor markets. Am
Mariotti M, Xue L (2002) Farsightedness in coalition for- Econ Rev 56:218–226
mation. Typescript, University of Aarhus Reny PJ, Wooders M (1996) The partnered core of a game
Maschler M, Peleg B (1967) The structure of the kernel of a without side payments. J Econ Theory 70:298–311
cooperative game. SIAM J Appl Math 15:569–604 Richardson M (1953) Solutions of irreflexive relations.
Maschler M, Peleg B, Shapley LS (1971) The kernel and Ann Math 58:573–590
bargaining set for convex games. Int J Game Theory Rockafellar RT (1984) Network flows and monotropic
1:73–93 optimization. Wiley, New York
Mauleon A, Vannetelbosch V (2004) Farsightedness and Roth AE (1975) A lattice fixed-point theorem with con-
cautiousness in coalition formation games with positive strategy Nash equilibria. Int J Game Theory 2:65–67
spillovers. Theory Decis 56:291–324 Roth AE (1975) A lattice fixed-point theorem with con-
Mauleon A, Sempere-Monerris J, Vannetelbosch V (2008) cooperative games. In: Karamardian S (ed) Fixed
Networks of knowledge among unionized firms. Can Roth AE (1977) A fixed-point approach to stability in
J Econ (to appear) cooperative games. In: Karamardian S (ed) Fixed
Monderer D, Shapley LS (1996) Potential games. Games points: algorithms and applications. Academic,
Econ Behav 14:124–143 New York
Montgomery J (1991) Social networks and labor market Roughgarden T (2005) Selfish routing and the price of
outcomes: toward an economic analysis. Am Econ Rev anarchy. MIT Press, Cambridge
81:1408–1418 Scarf H (1967) The core of an N-person game.
Mutuswami S, Winter E (2002) Subscription mechanisms Econometrica 35:50–69
for network formation. J Econ Theory 106:242–264 Schwartz T (1974) Notes on the abstract theory of collec-
Myerson RB (1977) Graphs and cooperation in games. tive choice. Carnegie-Mellon University, School of
Math Oper Res 2:225–229 Urban and Public Affairs typescript
von Neumann J, Morgenstern O (1944) Theory of games Shapley LS, Shubik M (1969) On market games. J Econ
and economic behavior. Princeton University Press, Theory 1:9–25
Princeton Shenoy PP (1980) A dynamic solution concept for abstract
van den Nouweland A (2005) Models of network forma- games. J Optim Theory Appl 32:151–169
tion in cooperative games. In: Demange G, Wooders Shubik M (1971) The “bridge game” economy: an exam-
M (eds) Group formation in economics: networks, ple of indivisibilities. J Political Econ 79:909–912
clubs, and coalitions. Cambridge University Press, Skyrms B, Pemantle R (2000) A dynamic model of
Cambridge, pp 58–88 social network formation. Proc Nat Acad Sci
Page FH Jr, Kamat S (2005) Farsighted stability in network 97:9340–9346
formation. In: Demange G, Wooders M (eds) Group Slikker M, van den Nouweland A (2001) Social and eco-
formation in economics: networks, clubs, and coali- nomic networks in cooperative game theory. Kluwer,
tions. Cambridge University Press, Cambridge, Boston
pp 89–121 Slikker M, van den Nouweland A (2002) Network forma-
Page FH Jr, Wooders M (1996) The partnered core and the tion, costs, and potential games. In: Borm P, Peters
partnered competitive equilibrium. Econ Lett H (eds) Chapters in game theory. Kluwer, Boston,
52:143–152 pp 223–246
Page FH Jr, Wooders M (2005) Strategic basins of attrac- Slikker M, Dutta B, van den Nouweland A, Tijs S (2000)
tion, the farsighted core, and network formation games. Potential maximizers and network formation. Math Soc
FEEM Working Paper 36.05 Sci 39:55–70
Page FH Jr, Wooders M (2007) Club networks with mul- Tardos E, Wexler T (2007) Network formation games and
tiple memberships and noncooperative stability. the potential function method. In: Nisan N,
Indiana University, Department of Economics type- Roughgarden T, Tardos E, Vazirani V (eds) Algorith-
script (paper presented at the Conference in Honor of mic game theory. Cambridge University Press, Cam-
Ehud Kalai, 16–18 Dec 2007) bridge, pp 487–516
Page FH Jr, Wooders M (2008) Strategic basins of attrac- Tesfatsion L (1997) A trade network game with endoge-
tion, the path dominance core, and network formation nous partner selection. In: Amman HM, Rustem B,
games. Games Econ Behav. https://doi.org/10.1016/j. Whinston AB (eds) Computational approaches to eco-
geb.2008.05.003 nomic problems. Kluwer, Boston, pp 249–269
Page FH Jr, Wooders M, Kamat S (2005) Networks and Tesfatsion L (1998) Preferential partner selection in evo-
farsighted stability. J Econ Theory 120:257–269 lutionary labor markets: a study in agent-based com-
Qin C-Z (1993) A conjecture of Shapley and Shubik on putational economics. In: Porto VW, Saravanan N,
competitive outcomes in the cores of NTU market Waagen D, Eiben AE (eds) Evolutionary program-
games. Int J Game Theory 22:335–344 ming VII. Proceedings of the seventh annual confer-
Qin C-Z (1994) The inner core of an N-person game. ence on evolutionary programming. Springer, Berlin,
Games Econ Behav 6:431–444 pp 15–24
Topa G (2001) Social interactions, local spillovers, and Wooders M (2008b) Small group effectiveness, per
unemployment. Rev Econ Stud 68:261–295 capita boundedness and nonemptiness of approxi-
Vega-Redondo F (2007) Complex social networks. Cam- mate cores. J Math Econ. https://doi.org/10.1016/j.
bridge University Press, Cambridge jmateco.2007.06.006
Wang P, Watts A (2006) Formation of buyer-seller trade Wooders M, Cartwright C, Selten R (2006) Behavioral
networks in a quality differentiated product market. conformity in games with many players. Games Econ
Can J Econ 39:971–1004 Behav 57:347–360
Watts A (2001) A dynamic model of network formation. Xue L (1998) Coalitional stability under perfect foresight.
Games Econ Behav 34:331–341 Econ Theory 11:603–627
Wooders M (1983) The epsilon core of a large replica Xue L (2000) Negotiation-proof Nash equilibrium. Int
game. J Math Econ 11:277–300 J Game Theory 29:339–357
Wooders M (2008a) Competitive markets and market Zissimos B (2005) Why are free trade agreements
games. Rev Econ Design (forthcoming) regional? FEEM Working Paper 67-07
Game Theory and Strategic Complexity

Kalyan Chatterjee1 and Hamid Sabourian2
1Department of Economics, The Pennsylvania State University, University Park, USA
2Faculty of Economics, University of Cambridge, Cambridge, UK

Article Outline

Glossary
Definition
Introduction
Games, Automata, and Equilibrium Concepts
Complexity Considerations in Repeated Games
Complexity and Bargaining
Complexity, Market Games, and the Competitive Equilibrium
Discussion and Future Directions
Bibliography

Glossary

Automata A formal definition of a strategy that captures its complexity.
Continuation Game A description of how the play will proceed in a dynamic game once some part of the game has already occurred.
Equilibrium A solution concept for games in which each player optimizes given his correct prediction of others' behavior.
Equilibrium Path The outcome in terms of the play of the game if every player uses his equilibrium strategy.
Game Theory A formal model of interaction, usually in human behavior.
Repeated Games A series of identical interactions of this kind.
Strategic Complexity A measure of how complex a strategy is to implement.
Strategy A complete specification of how a player will play the game.

Definition

The subject of this entry is at the intersection of economics and computer science and deals with the use of measures of complexity obtained from the study of finite automata to help select among multiple equilibria and other outcomes appearing in game-theoretic models of bargaining, markets, and repeated interactions. The importance of the topic lies in the ability of concepts that employ bounds on available resources to generate more refined predictions of individual behavior in markets.

Introduction

This entry is concerned with the concept of strategic complexity and its use in game theory. There are many different meanings associated with the word "complexity," as the variety of topics discussed in this volume makes clear. In this entry, we shall adopt a somewhat narrow view, confining ourselves to notions that measure, in some way, constraints on the ability of economic agents to behave with full rationality in their interactions with other agents in dynamic environments. This will be made more precise a little later. (A more general discussion is available in Rubinstein (1998).)

Why is it important to study the effect of such constraints on economic decision-making? The first reason could be to increase the realism of the assumptions of economic models; it is evident from introspection and from observing others that we do not have infinite memory and cannot condition our future actions on the entire corpus of what we once knew or, for that matter, unlimited computational power. However, only considering the assumptions of a model would not be considered enough if the increased realism were not to expand our ability to explain or to predict. The second reason therefore is that studying the effects of complexity on human decision-making might help us either to make our predictions more
precise (by selecting among equilibria) or to generate explanations for behavior that is frequently observed, but incompatible with equilibrium in models that have stronger assumptions about the abilities of agents.

A strategy in a game is an entire plan of how to play the game at every possible history/contingency/eventuality at which the player has to make a move. The particular aspect of complexity that we shall focus on is the complexity of a strategy as a function of the history. One representation of the players' strategies in games is often in terms of (finite) automata. The finiteness need not always be assumed; it can be derived. The ideas of complexity, though often most conveniently represented this way, can also be discussed without referring to finite automata at all but purely in terms of how a strategy depends on the past history of the game.

The number of states in the automaton can be used as a measure of complexity. This may be a natural measure of complexity in a stationary repetitive environment such as repeated games. We shall discuss this measure of complexity as well as other aspects of the complexity of a strategy that are particularly relevant in non-stationary frameworks.

Note that the players are not themselves considered automata in this entry, nor in the literature it surveys. Also, we do not place restrictions on the ability of players to compute strategies (see Papadimitriou 1992), only on the strategies that they can implement. The entry is also not intended as a comprehensive survey of the literature on complexity of implementation in games. The main focus of the entry is inevitably on the works that we have been personally associated with.

The remaining part of this entry is organized as follows: In the next section, we discuss strategies in a game, their representation as finite automata, and the basic equilibrium concepts to be used in the entry. Section "Complexity Considerations in Repeated Games" will consider the use of complexity notions in repeated games. Section "Complexity and Bargaining" will focus on extensive form bargaining and the effect of complexity considerations in selecting equilibria. Section "Complexity, Market Games and the Competitive Equilibrium" will extend the analysis of bargaining to markets in which several agents bargain and consider the recent literature that justifies competitive outcomes in market environments by appealing to the aversion of agents to complexity. Section "Discussion and Future Directions" concludes with some thoughts on future research. This entry draws on an earlier survey paper (Chatterjee 2002) for some of the material in sections "Games, Automata and Equilibrium Concepts," "Complexity Considerations in Repeated Games," and "Complexity and Bargaining."

Games, Automata, and Equilibrium Concepts

As mentioned in the introduction, this entry will be concerned with dynamic games. Though the theory of games has diffused from economics and mathematics to several other fields in the last few decades, we include an introduction to the basic concepts to keep this entry as self-contained as possible. A game is a formal model of interaction between individual agents. The basic components of a game are the following: (i) Players or agents, whose choices will, in general, have consequences for each other. We assume a finite set of players, denoted by N. We shall also use N sometimes to represent the cardinality of this set. (ii) A specification of the "rules of the game" or the structure of interaction, described by the sequence of possible events in the game, the order in which the players move, what they can choose at each move, and what they know about previous moves. This is usually modeled as a tree and is called the "extensive form" of the game (and will not be formalized here, though the formalization is standard and found in all the texts on the subject). (iii) Payoffs for each player associated with every path through the tree from the root. It is easier to describe this as a finite tree and ascribe payoffs to the end nodes z. Let u_i(z) be the real-valued payoff to Player i associated with end node z. The payoffs are usually assumed to satisfy conditions that are sufficient to guarantee that the utility of a probability distribution on a subset
of the set of end nodes is the expectation of the utility of the individual end nodes. However, different strands of work on bounded rationality dispense with this assumption. The description above presupposes a tree of finite depth, while many of the applications deal with infinite horizon games. However, the definitions are easily modified by associating payoffs with a play of the game and defining a node as a set of plays. We shall not pursue this further here.

In the standard model of a game, players are assumed to have all orders of knowledge about the preceding description. Work on bounded rationality has also considered relaxing this assumption.

A strategy is a complete plan of action for playing a game, describing the course of action to be adopted in every possible contingency (or every information set of the player concerned). The plan has to be detailed enough so that it can be played by an agent, even if the principal is not himself or herself in town, and the agent could well be a computer, which is programmed to follow the strategy. Without any loss of generality, a strategy can be represented by an automaton (see below for illustration and Osborne and Rubinstein (1994) for a formal treatment in the context of repeated games). Often such a machine description is more convenient in terms of accounting for the complexity of a machine. For example, the works that are based on the use of finite automata or Turing machines to represent strategies for playing a game impose a natural bound on the set of allowable strategies.

For the types of problem that we shall consider here, it is best to think of a multistage game with observable actions, to use the terminology of Fudenberg and Tirole (1991). The game has some temporal structure; let us call each unit of time a period or a stage. In each period, the players choose actions simultaneously and independently. (The actions could include the dummy action.) All the actions taken in a stage are observed, and the players then choose actions again. An example is a repeated normal-form game, such as the famous Prisoners' Dilemma being repeated infinitely or finitely often. In each stage, players choose whether to cooperate or defect. The choices are revealed, payoffs received, and the choices repeated again, and so on. (The reader will recall that in the Prisoners' Dilemma played once, Defect is better than Cooperate for each player, no matter what the other player does, but both players choosing Defect is strictly worse for each than both choosing Cooperate.) What a strategy for a given player would do would be to specify the choice in a given stage as a function of the history of the game up to that stage (for every stage). A finite automaton represents a particular strategy in the following way: It partitions all possible histories in the game (at which the player concerned has to move) using a finite number of elements. Each of these elements is a state of the machine. Given a state, the automaton prescribes an action (e.g., Cooperate after all histories in which the other party has cooperated). It also specifies how the state of the machine will change as a result of the action taken by the other player. The state-to-action mapping is called the output mapping, and the rule that prescribes the state in the next period as a function of today's state and the action of one's opponent in this period is called the transition mapping. The automaton also needs to prescribe what to do in the first stage, when there is no history of past actions to rely on. Thus, for example, the famous "tit-for-tat" strategy in the repeated Prisoners' Dilemma can be represented by the following automaton:

1. Play Cooperate in the first stage. The initial state is denoted as q1, and in this state, the action prescribed is Cooperate.
2. As long as the other player cooperates, stay in state q1.
3. If the other player defects in a state, go to state q2. The action specified in q2 is Defect.
4. Stay in q2 as long as the other player defects. If the other player cooperates in a stage, go to q1.

Denoting the output mapping by l(.), we get l(q1) = C and l(q2) = D. The transition mapping m(.,.) is as follows: m(q1,C) = q1, m(q1,D) = q2, m(q2,C) = q1, and m(q2,D) = q2. Here, of course, C and D denote Cooperate and Defect, respectively. The machine described above has two states and is an instance of a Moore machine in computer science terminology.
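To see how compact such a description is, the following sketch (ours, not part of the original treatment; Python is used purely for illustration) writes the tit-for-tat automaton down as a table and lets two copies of it play the repeated stage game. The state names q1 and q2 and the mappings simply follow the four rules above.

    # A minimal Moore-machine encoding of the tit-for-tat strategy described
    # above: the output depends only on the current state, and the transition
    # depends on the current state and the opponent's observed action.
    TIT_FOR_TAT = {
        "initial": "q1",
        "output": {"q1": "C", "q2": "D"},                      # l(q1) = C, l(q2) = D
        "transition": {("q1", "C"): "q1", ("q1", "D"): "q2",   # m(q1, C) = q1, ...
                       ("q2", "C"): "q1", ("q2", "D"): "q2"},
    }

    def play(machine_1, machine_2, periods):
        """Let two Moore machines play the repeated Prisoners' Dilemma."""
        s1, s2 = machine_1["initial"], machine_2["initial"]
        history = []
        for _ in range(periods):
            a1, a2 = machine_1["output"][s1], machine_2["output"][s2]
            history.append((a1, a2))
            s1 = machine_1["transition"][(s1, a2)]   # transition on the opponent's action
            s2 = machine_2["transition"][(s2, a1)]
        return history

    print(play(TIT_FOR_TAT, TIT_FOR_TAT, 4))   # [('C', 'C'), ('C', 'C'), ('C', 'C'), ('C', 'C')]

The number of entries in the output table, two here, is exactly the number of states, the measure of complexity discussed next.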
The use of a Moore machine to represent a strategy rules out strategies in which histories are arbitrarily finitely partitioned or arbitrarily complex. In fact, the number of states in the machine is a popular measure of the complexity of the machine and the strategy it represents.

Another kind of finite automaton used in the literature is a Mealy machine. The main difference between this and the Moore machine is that now the output is a function both of the state and of an input, unlike the Moore machine where it is only a function of the state. One can always transform a Mealy machine to a Moore machine by making transitions depend on the input and having state transitions after every input. The Mealy machine representation is more convenient for the extensive form game we shall consider in section "Complexity and Bargaining." We shall briefly address why in that section.

The aim of using the machine framework to describe strategies is to take into account explicitly the cost of complexity of strategies. There is the belief, for instance, that short-term memory (see Miller 1956) is capable of keeping seven things in mind at any given time, and if five of them are occupied by how to play the Prisoners' Dilemma, there might be less left over for other important activities.

The standard equilibrium concept in game theory is the concept of Nash equilibrium. This requires each player to choose a best strategy (in terms of payoff) given his or her conjectures about other players' strategies, and, of course, in equilibrium the conjectures must be correct. Thus, a Nash equilibrium is a profile of strategies, one for each player, such that every player is choosing the best response strategy given the Nash equilibrium strategies of the other players. In dynamic games Nash equilibrium strategies may not be credible (sequentially rational). In multistage games, to ensure credibility, the concept of Nash equilibrium is refined by requiring the strategy of each player to be a best response to the strategies of the others at every well-defined history (subgame) within the game. This notion of equilibrium was introduced by Selten (1965) and is called subgame perfect equilibrium. The difference between this concept and that of Nash, which it refines, is that players must specify strategies that are best responses to each other even at nodes in the game tree that would never be reached if the prescribed equilibrium were being played. The Nash concept does not require this. The notion of histories off the equilibrium path therefore refers to those that do not occur if every player follows his or her equilibrium strategy. Another useful concept to mention here is that of payoff in the continuation game. This refers to the expected payoff from the prescribed strategies in the part of the game remaining to be played after some moves have already taken place. The restriction of the prescribed strategies to the continuation game is referred to here as continuation strategies.

Rubinstein (1986), Abreu and Rubinstein (1988), and others have modified the standard equilibrium concepts to account for complexity costs. This approach is somewhat different from that adopted, for example, by Neyman (1985), who restricted strategies to those of bounded complexity. We shall next present the Abreu-Rubinstein definition of Nash equilibrium with complexity (often referred to as NEC in the rest of the entry).

The basic idea is a very simple extension of Nash equilibrium. Complexity enters the utility function lexicographically. A player first calculates his or her best response to the conjectured strategies of the other players. If there are alternative best responses, the player chooses the less complex one. Thus, a Nash equilibrium with complexity has two aspects. First, the strategies chosen by any player must be the best response given his or her conjectures about other players' strategies, and, of course, in equilibrium the conjectures must be correct. Second, there must not exist an alternative strategy for a player such that his or her payoff is the same as in the candidate equilibrium strategy, given what other players do, but the alternative strategy is less complex.

In Abreu and Rubinstein (1988), the measure of complexity is the number of states in the Moore machine that represents the strategy. The second part of their equilibrium definition restricts the extent to which punishments can be used off the equilibrium path. For example, there is a famous
strategy that, if used by all players, gives cooperation in the infinitely repeated Prisoners' Dilemma (for sufficiently high discount factors), namely, the "grim" strategy. This strategy can be described by the following machine: Start with Cooperate. Play Cooperate as long as the other players all cooperate. If in the last period any player has used Defect, then switch to playing Defect forever (i.e., never play Cooperate again, no matter what the other players do in succeeding periods). This strategy profile (each player uses the grim strategy) gives an outcome path consisting solely of players cooperating. No one defects because, if anyone did, from then until the end of time all the players would be punishing one another.

However, this strategy profile is not a Nash equilibrium with complexity; the grim strategy is a two-state machine in which one state (the one in which a player chooses Defect) is never used given that everyone else cooperates on the equilibrium path. Some player can do better, even if lexicographically, by switching to a one-state machine in which he or she cooperates no matter what. Thus, even the weak lexicographic requirement has some bite.

Note that the complexity restriction we are considering is on the complexity of implementation, not the complexity of computation. We know that even a Turing machine, which has potentially infinite memory, might be unable to calculate best responses to all possible strategy profiles of other players in the game (see Anderlini 1990; Binmore 1987).

To return to the question of defining equilibrium in the machine game, the Abreu-Rubinstein approach is described by them as "buying" states in the machine at the beginning of the game. The complexity cost is therefore a fixed cost per state used. Some recent papers have taken the fixed-cost approach further by requiring NEC strategies to be credible. The idea is that players pay an initial fixed cost for the complexity of his/her strategy (the notion of complexity in some of these papers differs from the counting-the-states approach), and then the game is played with strategies being optimal at every contingency as in standard game theory. Chatterjee and Sabourian (2000a, b) model this by considering Nash equilibrium with complexity costs in (bargaining) games in which machines/strategies can make errors/trembles in output/action. The introduction of errors ensures that the equilibrium strategies are optimal after every history. As the error goes to zero, we are left with subgame perfect equilibria of the underlying game. Chatterjee and Sabourian (2000a), Sabourian (2003), Gale and Sabourian (2005), and Lee and Sabourian (2007) adopt a more direct method of introducing credibility into the equilibrium concept with complexity costs by restricting NEC strategies to be subgame perfect equilibria in the underlying game with no complexity costs. We refer to such equilibria as perfect equilibria with complexity costs (PEC).

In contrast to the fixed-cost interpretation of complexity cost, Rubinstein in his 1986 paper considers a different approach, namely, the choice of "renting" states in the machine for every period the game is played. Formally, the Rubinstein notion of semi-perfect equilibrium requires the strategy chosen to have the minimal number of states necessary to play the game at every node on the (candidate) equilibrium outcome path. A state could therefore be dropped if it is not going to be used on the candidate equilibrium path after some period. Thus, to be in the equilibrium machine, it is not sufficient that a state be used on the path; it has to be used in every possible future. Rubinstein called this notion of equilibrium semi-perfect, because the complexity of a strategy could be changed in one direction (it could be decreased) after every period. If states could be added as well as deleted every period, we would have yet another definition of equilibrium with complexity, machine subgame perfect equilibrium. (See Neme and Quintas 1995.) In contrast, both the NEC and PEC concepts we use here entail a single choice of automaton or strategy by players at the beginning of the game.

In all these models, complexity analysis has been facilitated by considering the "machine games." Each player chooses among machines, and the complexity of a machine is taken to be the number of states of the machine.
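A small sketch may help fix ideas about the machine game and the lexicographic role of complexity. The code below is illustrative only: it counts states as the complexity measure, computes (truncated) discounted payoffs for the grim machine and for the one-state machine that always cooperates, each facing a grim opponent, and confirms that the two payoffs coincide while the one-state machine is simpler, which is the comparison that rules out the grim profile as an NEC. The stage payoffs are the Prisoners' Dilemma payoffs used in the next section; the truncation horizon and discount factor are assumptions of the example.

    # Illustrative machine-game comparison: payoff first, complexity second.
    STAGE_PAYOFF = {("C", "C"): 3, ("C", "D"): -1, ("D", "C"): 4, ("D", "D"): 0}

    GRIM = {"initial": "q1",
            "output": {"q1": "C", "q2": "D"},
            "transition": {("q1", "C"): "q1", ("q1", "D"): "q2",
                           ("q2", "C"): "q2", ("q2", "D"): "q2"}}

    ALWAYS_C = {"initial": "q1",
                "output": {"q1": "C"},
                "transition": {("q1", "C"): "q1", ("q1", "D"): "q1"}}

    def complexity(machine):
        return len(machine["output"])            # number of states

    def discounted_payoff(me, opponent, delta, horizon=500):
        s_me, s_op, total = me["initial"], opponent["initial"], 0.0
        for t in range(horizon):
            a_me, a_op = me["output"][s_me], opponent["output"][s_op]
            total += (delta ** t) * STAGE_PAYOFF[(a_me, a_op)]
            s_me = me["transition"][(s_me, a_op)]
            s_op = opponent["transition"][(s_op, a_me)]
        return total

    delta = 0.9
    same_payoff = abs(discounted_payoff(GRIM, GRIM, delta)
                      - discounted_payoff(ALWAYS_C, GRIM, delta)) < 1e-9
    # Equal payoff against the grim opponent, but fewer states: the grim
    # profile fails the second (complexity) part of the NEC definition.
    print(same_payoff, complexity(ALWAYS_C) < complexity(GRIM))   # True True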
In fact, the counting-the-number-of-states measure of complexity has an equivalent measure stated in terms of the underlying strategies that the machine could implement. Kalai and Stanford (1988) define the complexity of a strategy by the number of continuation strategies that the strategy induces at different periods/histories of the game and establish that such a measure is equal to the number of states of the smallest machine that implements the strategy. Thus, one could equivalently describe any result either in terms of underlying strategies and the cardinality of the set of continuation strategies that they induce or in terms of machines and the number of states in them. The same applies to other measures of complexity discussed in this entry; they can be defined either in terms of the machine specification or in terms of the underlying strategy. In the rest of this entry, to simplify the exposition, we shall at times go from one exposition to the other without further explanation.

With this preamble on the concepts of equilibrium used in this literature, we turn to a discussion of a specific game in the next section, the infinitely repeated Prisoners' Dilemma. We will discuss mainly the approach of Abreu and Rubinstein in this section but contrast it with the literature following from Neyman. We also note that the suggestion for using finite automata in games of this kind came originally from Aumann (1981).

Complexity Considerations in Repeated Games

Endogenous Complexity
In this subsection, we shall first concentrate on the Prisoners' Dilemma and discuss the work of Abreu and Rubinstein, which was introduced briefly in the last section. For concreteness, consider the following Prisoners' Dilemma payoffs:

         C2       D2
C1      3, 3    -1, 4
D1     4, -1     0, 0

This is the "stage game"; each of the two players chooses an action in each stage; their actions are revealed at the end of the stage, and then the next stage begins. The game is repeated infinitely often, and future payoffs are discounted with a common discount factor d.

The solution concept to be used was introduced in the last section, NEC or Nash equilibrium with complexity. Note that here complexity is endogenous. A player has a preference for less complex strategies. This preference comes into play lexicographically; that is, for any strategies or machines that give the same payoff against the opponent's equilibrium strategy, a player will choose the one with lowest complexity. Thus, the cost of complexity is infinitesimal. One could also consider positive but small costs of more complex strategies, but results will then depend on how large the cost of additional complexity is compared to the additional payoff obtained with a more complex strategy.

We saw in the last section that the "grim trigger" strategy, which is a two-state automaton, is not an NEC. The reason is that if Player 2 uses such a strategy, Player 1 can be better off by deviating to a one-state strategy in which she always cooperates. (This will give the same payoff with a less complex strategy.) One-state strategies where both players cooperate clearly do not constitute an NEC (deviating and choosing a one-state machine that always plays D is strictly better for a player). However, if both players use a one-state machine that always generates an action of D, this is an NEC.

The question obviously arises whether the cooperative outcome in each stage can be sustained as an NEC, and the preceding discussion makes clear that the answer is no. Punishments have to be used on the equilibrium path, but we can get arbitrarily close to the cooperative outcome for a high enough discount factor. For example, consider the following two-state machine:

Q = {q1, q2}; l(q1) = D, l(q2) = C,
m(q1, C) = q1, m(q1, D) = q2, m(q2, D) = q1, m(q2, C) = q2.
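The behavior of this machine, and the deviation calculations carried out below, can be checked directly. The following sketch is ours and only illustrative: it plays the machine against itself to reproduce the path (D,D), (C,C), (C,C), ... and then evaluates, for one assumed discount factor above 1/3, the closed-form payoff comparisons derived in the text.

    # The two-state machine defined above, with output l and transition m.
    M = {"initial": "q1",
         "output": {"q1": "D", "q2": "C"},
         "transition": {("q1", "C"): "q1", ("q1", "D"): "q2",
                        ("q2", "C"): "q2", ("q2", "D"): "q1"}}

    def joint_path(machine, periods):
        """Action pairs when both players use the same machine."""
        s1 = s2 = machine["initial"]
        pairs = []
        for _ in range(periods):
            a1, a2 = machine["output"][s1], machine["output"][s2]
            pairs.append((a1, a2))
            s1 = machine["transition"][(s1, a2)]
            s2 = machine["transition"][(s2, a1)]
        return pairs

    print(joint_path(M, 4))      # [('D', 'D'), ('C', 'C'), ('C', 'C'), ('C', 'C')]

    d = 0.9                      # any discount factor above 1/3 will do
    conform = 3 / (1 - d)                      # keep playing C: 3, 3, 3, ...
    deviate_once = 4 + d ** 2 * 3 / (1 - d)    # one defection: 4, then 0, then 3, 3, ...
    assert abs((deviate_once - conform) - (1 - 3 * d)) < 1e-9 and 1 - 3 * d < 0

    equilibrium_path = 3 * d / (1 - d)         # (D,D) first, then (C,C) forever
    always_defect = 4 * d / (1 - d ** 2)       # a one-state D machine earns 4 in periods 2, 4, 6, ...
    assert equilibrium_path > always_defect    # so the one-state deviation is also worse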
Here both players play the same strategy, which starts out playing D. If both players do as they are supposed to, each plays C in the next period and thereafter, so the sequence of actions is (D,D), (C,C), (C,C), .... If either player plays C in the first period, the other player keeps playing D in the next period. The transition rule prescribes that if one plays C and one's opponent plays D, one goes back to playing D, so the sequence with the deviation will be (D,C), (D,D), (C,C), (C,C), ....

Suppose both players use this machine. First, we check it is a Nash equilibrium in payoffs. We only need to check what happens when a player plays C. If Player 2 deviates and plays D, she will get an immediate payoff of 4 followed by payoffs of 0, 3, 3, ... if she thereafter sticks to her strategy, for a total payoff of 4 + d^2(3/(1 - d)) as opposed to 3/(1 - d) if she had not deviated. The net gain from deviation is 1 - 3d, which is negative for d > 1/3. One can check that more complicated deviations are also worse. The second part of the definition needs to be checked as well, so we need to ensure that a player cannot do as well in terms of payoff by moving to a less complex strategy, namely, a one-state machine. A one-state machine that always plays C will get the worst possible payoff, since the other machine will keep playing D against it. A one-state machine that plays D will get a payoff of 4 in periods 2, 4, 6, ..., or a total payoff of 4d/(1 - d^2), as against 3d/(1 - d). The second is strictly greater for d > 1/3.

This machine gives a payoff close to 3 per stage for d close to 1. As d → 1, the payoff of each player goes to 3, the cooperative outcome.

The paper by Abreu and Rubinstein obtains a basic result on the characterization of payoffs obtained as NEC in the infinitely repeated Prisoners' Dilemma. We recall that the "Folk Theorem" for repeated games tells us that all outcome paths that give a payoff per stage strictly greater for each player than the minmax payoff for that player in the stage game can be sustained by Nash equilibrium strategies. Using endogenous complexity, one can obtain a refinement; now only payoffs on a so-called cross are sustainable as NEC. This result is obtained from two observations. First, in any NEC of a two-player game, the number of states in the players' machines must be equal. This follows from the following intuitive reasoning (we refer readers to the original paper for the proofs). Suppose we fix the machine used by one of the players (say Player 1), so that to the other player it becomes part of the "environment." For Player 2 to calculate the best response or an optimal strategy to Player 1's given machine, it is clearly not necessary to partition past histories more finely than the other player has done in obtaining her strategy; therefore, the number of states in Player 2's machine need not (and therefore will not, if there are complexity costs) exceed the number in Player 1's machine in equilibrium. The same holds true in the other direction, so the number of states must be equal. (This does not hold for more than two players.) Another way of interpreting this result is that it restates the result from Markov decision processes on the existence of an optimal "stationary" policy (i.e., depending only on the states of the environment, which are here the same as the states of the other player's machine). See also Piccione (1992).

Thus, there is a one-to-one correspondence between the states of the two machines. (Since the number of states is finite and the game is infinitely repeated, the machine must visit at least one of the states infinitely often for each player.) One can strengthen this further to establish a one-to-one correspondence between actions. Suppose Player 1's machine has a_t^1 = a_s^1, where these denote the actions taken by Player 1 at two distinct periods and states, with a_t^2 ≠ a_s^2 for Player 2. Since the states in t and s are distinct for Player 1 and the actions taken are the same, the transitions must be different following the two distinct states. But then Player 1 does not need two distinct states; he can drop one and condition the transition after, say, s on the different action used by Player 2. (Recall the transition is a function of the state and the opponent's action.) But then Player 1 would be able to obtain the same payoff with a less complex machine; so the original one could not have been an NEC machine.

Therefore, the actions played must be some combination of (C,C) and (D,D) (the correspondence is between the two Cs and the two Ds) or some combination of (C,D) and (D,C). (By combination, we mean combination over time. For example, (C,C) is played, say, 10 times for every 3 plays of (D,D). In the payoff space, these combinations trace out the two line segments that make up the "cross" referred to above.)
cooperative state, the transition from q2 to the defect state will take place no matter whether the other player plays C or D. However, automata with three states violate the constraint that the number of states be no more than 2, so the profitable deviation is out of reach.

While this is easy to see, it is not clear what happens when the complexity bound is high. Neyman shows the following result: For any integer k, there exists a T0 such that for T ≥ T0 and T^(1/k) ≤ m1, m2 ≤ T^k, there is a mixed strategy equilibrium of G^T(m1, m2) in which the expected average payoff to each player is at least 3 - 1/k.

The basic idea is that rather than playing (C,C) at each stage, players are required to play a complex sequence of C and D, and keeping track of this sequence uses up a sufficient number of states in the automaton so that profitable deviations again hit the constraint on the number of states. But since D cannot be avoided on the equilibrium path, only something close to (C,C) each period can be obtained rather than (C,C) all the time.

Zemel's paper adds a clever little twist to this argument by introducing communication. In his game, there are two actions each player chooses at each stage, either C or D as before and a message to be communicated. The message does not directly affect payoffs as the choice of C or D does. The communication requirements are now made sufficiently stringent, and deviation from them is considered a deviation, so that once again the states "left over" to count up to N are inadequate in number, and (C,C) can once again be played in each stage/period. This is an interesting explanation of the rigid "scripts" that many have observed to be followed, for example, in negotiations.

Neyman (1997) surveys his own work and that of Ben Porath (1986, 1993). He also generalizes his earlier work on the finitely repeated Prisoners' Dilemma to show how small the complexity bounds would have to be in order to obtain outcomes outside the set of (unconstrained) equilibrium payoffs in the finitely repeated, normal-form game (just as (C,C) is not part of an unconstrained equilibrium outcome path in the Prisoners' Dilemma). Essentially, if the complexity permitted grows exponentially or faster with the number of repetitions, the equilibrium payoff sets of the constrained and the unconstrained games will coincide. For sub-exponential growth, a version of the Folk theorem is proved for two-person games. The first result says:

For every game G in strategic form and with m_i being the bound on the complexity of i's strategy and T the number of times the game G is played, there exists a constant c such that if m_i ≥ exp(cT), then E(G^T) = E(G^T(m1, m2)), where E(.) is the set of equilibrium payoffs in the game concerned.

The second result, which generalizes the Prisoners' Dilemma result already stated, considers a sequence of triples (m1(n), m2(n), T(n)) for a two-player strategic form game, with m2 ≥ m1, and shows that the lim inf of the set of equilibrium payoffs of the automata game as n → ∞ includes essentially the strictly individually rational payoffs of the stage game if m1(n) → ∞ and (log m1(n))/T(n) → 0 as n → ∞. Thus, a version of the Folk theorem holds provided the complexity of the players' machines does not grow too fast with the number of repetitions.

Complexity and Bargaining

Complexity and the Unanimity Game
The well-known alternating offers bargaining model of Rubinstein has two players alternating in making proposals and responding to proposals. Each period or unit of time consists of one proposal and one response. If the response is "reject," the player who rejects makes the next proposal, but in the following period. Since there is discounting with discount factor d per period, a rejection has a cost. The unanimity game we consider is a multi-person generalization of this bargaining game, with n players arranged in a fixed order, say 1, 2, 3, ..., n. Player 1 makes a proposal on how to divide a pie of size unity among the n people; players 2, 3, ..., n respond sequentially, either accepting or rejecting. If everyone accepts, the game ends. If someone rejects, Player 2 now gets to make a proposal, but in the next period. The responses to Player 2's proposal are made sequentially by Players 3, 4, 5, ..., n, 1. If Player i gets a share x_i in an eventual agreement at time t, his payoff is d^(t-1) x_i.
Avner Shaked had shown in 1986 that the unanimity game had the disturbing feature that all individually rational (i.e., nonnegative payoffs for each player) outcomes could be supported as subgame perfect equilibria. Thus, the sharp result of Rubinstein (1982), who found a unique subgame perfect equilibrium in the two-player game, stood in complete contrast with the multiplicity of subgame perfect equilibria in the multiplayer game.

Shaked's proof had involved complex changes in expectations of the players if a deviation from the candidate equilibrium were to be observed. For example, in the three-player game with common discount factor d, the three extreme points (1,0,0), (0,1,0), and (0,0,1) sustain one another in the following way. Suppose Player 1 is to propose (0,1,0), which is not a very sensible offer for him or her to propose, since it gives everything to the second player. If Player 1 deviates and proposes, say, ((1 - d)/2, d, (1 - d)/2), then it might be reasoned that Player 2 would have no incentive to reject because in any case he or she can't get more than 1 in the following period, and Player 3 would surely prefer a positive payoff to 0. However, there is a counterargument. In the subgame following Player 1's deviation, Player 3's expectations have been raised so that he (and everyone else, including Player 1) now expects the outcome to be (0,0,1), instead of the earlier expected outcome. For a sufficiently high discount factor, Player 3 would reject Player 1's insufficiently generous offer. Thus, Player 1 would have no incentive to deviate. Player 1 is thus in a bind; if he offers Player 2 less than d and offers Player 3 more in the deviation, the expectation that the outcome next period will be (0,1,0) remains unchanged, so now Player 2 rejects his offer. So no deviation is profitable, because each deviation generates an expectation of future outcomes, an expectation that is confirmed in equilibrium. (This is what equilibrium means.) Summarizing, (0,1,0) is sustained as follows: Player 1 offers (0,1,0), and Player 2 accepts any offer of at least 1 and Player 3 any offer of at least 0. If one of them rejects Player 1's offer, the next player in order offers (0,1,0) and the others accept. If any proposer, say Player 1, deviates from the offer (0,1,0) to (x1, x2, x3), the player with the lower of {x2, x3} rejects. Suppose it is Player i who rejects. In the following period, the offer made gives 1 to Player i and 0 to the others, and this is accepted.

Various attempts were made to get around the continuum of equilibria problem in bargaining games with more than two players; most of them involved changing the game. (See Chatterjee and Sabourian 2000a, b for a discussion of this literature.) An alternative to changing the game might be to introduce a cost for this additional complexity, in the belief that players who value simplicity will end up choosing simple, that is, history-independent strategies. This seems to be a promising approach because it is clear from Shaked's construction that the large number of equilibria results from the players choosing strategies that are history dependent. In fact, if the strategies are restricted to those that are history independent (also referred to as stationary or Markov), then it can be shown (see Herrero 1985) that the subgame perfect equilibrium is unique and induces equal division of the pie as d → 1.

The two papers (Chatterjee and Sabourian 2000a, b) in fact seek to address the issue of complex strategies with players having a preference for simplicity, just as in Abreu and Rubinstein. However, now we have a game of more than two players and a single extensive form game rather than a repeated game as in Abreu-Rubinstein. It was natural that the framework had to be broadened somewhat to take this into account.

For each of n players playing the unanimity game, we define a machine or an implementation of the strategy as follows.

A stage of the game is defined to be n periods, such that if a stage were to be completed, each player would play each role at most once. A role could be as proposer or (n-1)th responder or (n-2)th responder ... up to the first responder (the last role would occur in the period before the player concerned had to make another proposal). An outcome of a stage is defined as a sequence of offers and responses, for example, e = (x,A,A,R; y,R; z,A,R; b,A,A,A) in a four-player game, where (x, y, z, b) are the proposals made in the four periods and (A, R) refer to accept and reject, respectively.
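To make the stage notation concrete, here is a small illustrative encoding (ours, with an assumed tuple format) of the four-player stage outcome e just given, together with the partial histories s it generates within the stage.

    # Illustrative encoding of a stage outcome in the four-player unanimity
    # game.  Each period within the stage is a proposal followed by the
    # responses actually made; "A" = accept, "R" = reject.
    e = (("x", "A", "A", "R"),   # period 1: Player 1 proposes x; 2 and 3 accept, 4 rejects
         ("y", "R"),             # period 2: Player 2 proposes y; first responder 3 rejects
         ("z", "A", "R"),        # period 3: Player 3 proposes z; 4 accepts, 1 rejects
         ("b", "A", "A", "A"))   # period 4: Player 4 proposes b; everyone accepts

    def partial_history(outcome, periods_completed):
        """The partial history s formed by the first few periods of the stage."""
        return tuple(move for period in outcome[:periods_completed] for move in period)

    print(partial_history(e, 1))          # ('x', 'A', 'A', 'R'): what Player 2 sees before offering y

    def period_agreed(period, n=4):
        """A period produces agreement when its proposal collects n-1 acceptances."""
        return len(period) == n and all(r == "A" for r in period[1:])

    print([period_agreed(p) for p in e])  # [False, False, False, True]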
From the point of view of the first player to propose (for convenience, let's call him Player 1), he makes an offer x, which is accepted by Players 2 and 3 but rejected by Player 4. Now it is Player 2's turn to offer, but this offer, y, is rejected by the first responder, Player 3. Player 1 gets to play as second responder in the next period, where he rejects Player 3's proposal. In the last period of this stage, a proposal b is made by Player 4 and everyone accepts (including Player 1 as first responder). Any partial history within a stage is denoted by s. For example, when Player 2 makes an offer, he does so after a partial history s = (x,A,A,R). Let the set of possible outcomes of a stage be denoted by E and the set of possible partial histories by S. Let Qi denote the set of states used in the ith player's machine Mi. The output mapping is given by li: S × Qi → L, where L is the set of possible actions (i.e., the set of possible proposals, plus accept or reject). The transition between states now takes place at the end of each stage, so the transition mapping is given as mi: E × Qi → Qi. As before, in the Abreu-Rubinstein setup, there is an initial state q_initial,i specified for each player. There is also a termination state F, which is supposed to indicate agreement. Once in the termination state, players will play the null action and make transitions to this state.

Note that our formulation of a strategy naturally uses a Mealy machine. The output mapping li(.,.) has two arguments, the state of the machine and the input s, which lists the outcomes of previous moves within the stage. The transitions take place at the end of the stage. The benefit of using this formulation is that the continuation game is the same at the beginning of each stage. In Chatterjee and Sabourian (2000b), we investigate the effects of modifying this formulation, including studying the effects of having a submachine to play each role. The different formulations can all implement the same strategies, but the complexities in terms of various measures could differ. We refer the reader to that paper for details but emphasize that in the general unanimity game, the results from other formulations are similar to the one developed here, though they could differ for special cases, like three-player games.

We now consider a machine game, where players first choose machines and then the machines play the unanimity game in analogy with Abreu-Rubinstein. Using the same lexicographic utility, with complexity coming after bargaining payoffs, what do we find for Nash equilibria of the machine game?

As it turns out, the addition of complexity costs in this setting has some bite but not much. In particular, any division of the pie can be sustained in some Nash equilibrium of the machine game. Perpetual disagreement can, in fact, be sustained by a stationary machine, that is, one that makes the same offers and responses each time, irrespective of past history. Nor can we prove, for general n-player games, that the equilibrium machines will be one state. (A three-player counterexample exists in Chatterjee and Sabourian (2000b); it does not appear possible to generate such an example in games lasting less than thirty periods.) For two-player games, the result that machines must be one state in equilibrium can be shown neatly (Chatterjee and Sabourian 2000b); another illustration that in this particular area, there is a substantial increase of analytical difficulty in going from two to three players.

One reason why complexity does not appear important here is that the definition of complexity used is too restrictive. Counting the number of states is fine, so long as we don't consider how complex a response might be for partial histories within a stage. The next attempt at a solution is based on this observation.

We devise the following definition of complexity: Given the machine and the states, if a machine made the same response to different partial stage histories in different states and another machine made different responses, then the second one was more complex (given that the machines were identical in all other respects). We refer to this notion as response complexity. (In Chatterjee and Sabourian (2000a), the concept of response complexity is in fact stated in terms of the underlying strategy rather than in terms of machines.) It captures the intuition that counting states is not enough; two machines could have the same number of states, for example, because each generated the same number of distinct offers, but the complexity of responses in one machine could be much lower than that in the other.
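A hypothetical illustration of the comparison: the two response rules below are equally small by the counting-states measure, but the second conditions its response on the partial history within the stage while the first does not, and it is therefore more complex in the sense of response complexity. The acceptance threshold and the history format are assumptions made only for the example.

    def simple_responder(own_share, partial_stage_history, threshold=0.3):
        # The same response after every partial stage history.
        return "A" if own_share >= threshold else "R"

    def history_sensitive_responder(own_share, partial_stage_history, threshold=0.3):
        # Rejects an otherwise acceptable offer whenever a rejection has
        # already occurred earlier in the current stage.
        if "R" in partial_stage_history:
            return "R"
        return "A" if own_share >= threshold else "R"

    s1 = ("x", "A")                  # partial history without a rejection
    s2 = ("x", "A", "A", "R")        # partial history containing a rejection
    print(simple_responder(0.4, s1), simple_responder(0.4, s2))                        # A A
    print(history_sensitive_responder(0.4, s1), history_sensitive_responder(0.4, s2))  # A R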
Note that this notion would only arise in extensive-form games. In normal-form games, counting states could be an adequate measure of complexity. Nor is this notion of complexity derivable from notions of transition complexity, due to Banks and Sundaram, for example, which also apply in normal-form games.

The main result of Chatterjee and Sabourian (2000a) is that this new aspect of complexity enables us to limit the amount of delay that can occur in equilibrium and hence to infer that only one-state machines are equilibrium machines. The formal proofs using two different approaches are available in Chatterjee and Sabourian (2000a, b). We mention the basic intuition behind these results. Suppose, in the three-player game, there is an agreement in period 4 (this is in the second stage). Why doesn't this agreement take place in period 1 instead? It must be because if the same offer and responses are seen in period 1, some player will reject the offer. But of course, he or she does not have to do so because the required offer never happens. But a strategy that accepts the offer in period 4 and rejects it off the equilibrium path in period 1 must be more complex, by our definition, than one that always accepts it whenever it might happen, on or off the expected path. Repeated application of this argument by backward induction gives the result. (The details are more complicated but are in the papers cited above.) Note that this uses the definition that two machines might have the same number of states, and yet one could be simpler than the other. It is interesting, as mentioned earlier, that for two players one can obtain an analogous result without invoking the response simplicity criterion, but from three players on this criterion is essential.

The above result (equilibrium machines have one state each and there are no delays beyond the first stage) is still not enough to refine the set of equilibria to a single allocation. In order to do this, we consider machines that can make errors/trembles in output. As the error goes to zero, we are left with perfect equilibria of our game. With one-state machines, the only subgame perfect equilibria are the ones that give equal division of the pie as d → 1. Thus, a combination of two techniques, one essentially recognizing that players can make mistakes and the other that players prefer simpler strategies if the payoffs are the same as those given by a more complex strategy, resolves the problem of multiplicity of equilibria in the multi-person bargaining game.

As we mentioned before, the introduction of errors ensures that the equilibrium strategies are credible at every history. We could also take the more direct (and easier) way of obtaining the uniqueness result with complexity costs by considering NEC strategies that are subgame perfect in the underlying game (PEC), as done in Chatterjee and Sabourian (2000a). Then since a history-independent subgame perfect equilibrium of the game is unique and any NEC automaton profile has one state and hence is history independent, it follows immediately that any PEC is unique and induces equal division as d → 1.

Complexity and Repeated Negotiations
In addition to standard repeated games or standard bargaining games, multiplicity of equilibria often appears in dynamic repeated interactions, where a repeated game is superimposed on an alternating offers bargaining game. For instance, consider two firms, in an ongoing vertical relationship, negotiating the terms of a merger. Such situations have been analyzed in several "negotiation models" by Busch and Wen (1995), Fernandez and Glazer (1991), and Haller and Holden (1990). These models can be interpreted as combining the features of both repeated and alternating-offers bargaining games. In each period, one of the two players first makes an offer on how to divide the total available periodic (flow) surplus; if the offer is accepted, the game ends with the players obtaining the corresponding payoffs in the current and every period thereafter. If the offer is rejected, they play some normal-form game to determine their flow payoffs for that period, and then the game moves on to the next period in which the same play continues with the players' bargaining roles reversed. One can think of the normal-form game played in the event of a rejection as a "threat game" in which a player takes actions that could punish the other player by reducing his total payoffs.
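The payoff structure of such a negotiation model can be summarized in one line of arithmetic. The sketch below is only an illustration with assumed numbers: if agreement on a per-period share is reached in period t, that share is received in period t and in every period thereafter; each earlier period of disagreement yields the flow payoff of the threat game.

    def negotiation_payoff(disagreement_flows, agreed_share, t, d):
        """Discounted payoff: threat-game flows in periods 0..t-1, then the
        agreed per-period share from period t onwards."""
        total = sum(d ** s * disagreement_flows[s] for s in range(t))
        return total + d ** t * agreed_share / (1 - d)

    d = 0.9
    print(negotiation_payoff([], 0.5, 0, d))           # immediate agreement on half the flow surplus: ~5.0
    print(negotiation_payoff([0.1, 0.1], 0.5, 2, d))   # two periods of a 0.1 threat-game flow first: ~4.24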
If the bargaining had not existed, the game would be a standard repeated normal-form game. Introducing bargaining and the prospect of permanent exit, the negotiation model still admits a large number of equilibria, like standard repeated games. Some of these equilibria involve delay in agreement (even perpetual disagreement) and inefficiency, while some are efficient.

Lee and Sabourian (2007) apply complexity considerations to this model. As in Abreu and Rubinstein (1988) and others, the players choose among automata, and the equilibrium notion is that of NEC and PEC. One important difference, however, is that in their paper the authors do not assume the automata to be finite. The paper also introduces a new machine specification that formally distinguishes between the two roles – proposer and responder – played by each player in a given period.

Complexity considerations select only efficient equilibria in the negotiation model if players are sufficiently patient. First, it is shown that if an agreement occurs in some finite period as an NEC outcome, then it must occur within the first two periods of the game. This is because if an NEC induces an agreement beyond the first two periods, then one of the players must be able to drop the last period's state of his machine without affecting the outcome of the game. Second, given sufficiently patient players, every PEC in the negotiation model that induces perpetual disagreement is at least long-run almost efficient; that is, the game must reach a finite date from which the continuation game is almost efficient.

Thus, these results take the study of complexity in repeated games a step further from the previous literature, in which complexity or bargaining alone has produced only limited selection results. While, as we discussed above, many inefficient equilibria survive complexity refinement, Lee and Sabourian (2007) demonstrate that complexity and bargaining in tandem ensure efficiency in repeated interactions. Complexity considerations also allow Lee and Sabourian to highlight the role of transaction costs in the negotiation game. Transaction costs take the form of paying a cost to enter the bargaining stage of the negotiation game. In contrast to the efficiency result in the negotiation game with complexity costs, Lee and Sabourian also show that introducing transaction costs into the negotiation game dramatically alters the selection result from efficiency to inefficiency. In particular, they show that, for any discount factor and any transaction cost, every PEC in the costly negotiation game induces perpetual disagreement if the stage game normal form (after any disagreement) has a unique Nash equilibrium.

Complexity, Market Games, and the Competitive Equilibrium

There has been a long tradition in economics of trying to provide a theory of how a competitive market with many buyers and sellers operates. The concept of competitive (Walrasian) equilibrium (see Debreu 1959) is a simple description of such markets. In such an equilibrium, each trader rationally chooses the amount he wants to trade taking the prices as given, and the prices are set (or adjust) to ensure that the total demanded is equal to the total supplied. The important feature of the setup is that agents assume that they cannot influence (set) the prices, and this is often justified by appealing to the idea that each individual agent is small relative to the market.

There are conceptual as well as technical problems associated with such a justification. First, if no agent can influence the prices, then who sets them? Second, even in a large but finite market, a change in the behavior of a single individual agent may affect the decisions of some others, which in turn might influence the behavior of some other agents and so on and so forth; thus, the market as a whole may end up being affected by the decision of a single individual.

Game-theoretic analyses of markets have tried to address these issues (e.g., see Gale 2000; Sabourian 2003). This has turned out to be a difficult task because the strategic analysis of markets, in contrast to the simple and elegant model of competitive equilibrium, tends to be complex and intractable. In particular, dynamic market games have many equilibria, in which a variety of different kinds of behavior are sustained by threats and counter-threats.
More than 60 years ago, Hayek (1945) noted s and many buyers. Since there are more buyers
that the competitive markets are simple mecha- than sellers, the price of 1, at which the seller
nisms in which economic agents only need to receives all the surplus, is the unique competitive
know their own endowments, preferences and equilibrium; furthermore, since there are no fric-
technologies, and the vector of prices at which tions, p ¼ 1 seems to be the most plausible price.
trade takes place. In such environments, economic RW’s precise result, however, establishes that for
agents maximizing utility subject to constraints make efficient choices in equilibrium. Below we report some recent work, which suggests that the converse might also be true:

If rational agents have, at least at the margin, an aversion to complex behavior, then their maximizing behavior will result in simple behavioral rules and thereby in a perfectly competitive equilibrium. (Gale and Sabourian 2005)

Homogeneous Markets

In a seminal paper, Rubinstein and Wolinsky (1990), henceforth RW, considered a market for a single indivisible good in which a finite number of homogeneous buyers and homogeneous sellers are matched in pairs and bargain over the terms of trade. In their setup, each seller has one unit of an indivisible good, and each buyer wants to buy at most one unit of the good. Each seller's valuation of the good is 0, and each buyer's valuation is 1. Time is divided into discrete periods, and at each date, buyers and sellers are matched randomly in pairs, and one member of the pair is randomly chosen to be the proposer and the other the responder. In any such match, the proposer offers a price p ∈ [0, 1] and the responder accepts or rejects the offer. If the offer is accepted, the two agents trade at the agreed price p, and the game ends with the seller receiving a payoff p and the buyer in the trade obtaining a payoff 1 − p. If the offer is rejected, the pair returns to the market and the process continues. RW further assume that there is no discounting, to capture the idea that there is no friction (cost of waiting) in the market.

Assuming that the number of buyers and sellers is not the same, RW showed that this dynamic matching and bargaining game has, in addition to a perfectly competitive outcome, a large set of other subgame perfect equilibrium outcomes, a result reminiscent of the Folk theorem for repeated games. To see the intuition for this, consider the case in which there is one seller s and many buyers: for any price p* ∈ [0, 1] and any buyer b*, there is a subgame perfect equilibrium that results in s and b* trading at p*. The idea behind the result is to construct an equilibrium strategy profile such that buyer b* is identified as the intended recipient of the good at a price p*. This means that the strategies are such that (i) when s meets b*, whichever is chosen as the proposer offers price p* and the responder accepts; (ii) when s is the proposer in a match with some buyer b ≠ b*, s offers the good at a price of p = 1 and b rejects; and (iii) when a buyer b ≠ b* is the proposer, he offers to buy the good at a price of p = 0 and s rejects. These strategies produce the required outcome. Furthermore, the equilibrium strategies make use of the following punishment strategies to deter deviations. If the seller s deviates by proposing to a buyer b a price p ≠ p*, b rejects this offer, and the play continues with b becoming the intended recipient of the item at a price of zero. Thus, after rejection by b, strategies are the same as those given earlier, with the price zero in place of p* and buyer b in place of buyer b*. Similarly, if a buyer b deviates by offering a price p ≠ 1, then the seller rejects, another buyer b′ ≠ b is chosen to be the intended recipient, and the price at which the unit is traded changes to 1. Further deviations from these punishment strategies can be treated in an exactly similar way.

The strong impression left by RW is that indeterminacy of equilibrium is a robust feature of dynamic market games and, in particular, that there is no reason to expect the outcome to be perfectly competitive. However, the strategies required to support the family of equilibria in RW are quite complex. In particular, when a proposer deviates, the strategies are tailor-made so that the responder is rewarded for rejecting the deviating proposal. This requires coordinating on a large amount of information, so that at every information set the players know (and agree) what constitutes a deviation.
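To fix ideas about the game form underlying this discussion, the following is a minimal simulation sketch of the RW matching-and-bargaining protocol described at the start of this section. It is an illustration only, not RW's analysis: the behavioral rules used here (the proposer always names 1/2 and the responder accepts anything weakly favorable) are invented placeholders, not equilibrium strategies.

```python
import random

def rw_market(n_sellers=1, n_buyers=3, propose=None, accept=None, max_periods=1000, seed=0):
    """One run of the Rubinstein-Wolinsky matching-and-bargaining protocol.

    Sellers value the good at 0, buyers at 1; there is no discounting.
    `propose(role)` returns a price in [0, 1]; `accept(role, price)` returns True/False.
    The default rules below are illustrative placeholders, not equilibrium strategies.
    """
    rng = random.Random(seed)
    propose = propose or (lambda role: 0.5)  # always name the price 1/2
    accept = accept or (lambda role, p: p <= 0.5 if role == "buyer" else p >= 0.5)
    sellers, buyers = list(range(n_sellers)), list(range(n_buyers))
    trades = []
    for _ in range(max_periods):
        if not sellers or not buyers:
            break
        s, b = rng.choice(sellers), rng.choice(buyers)      # random pairwise matching
        proposer = rng.choice(["seller", "buyer"])          # random proposer
        responder = "buyer" if proposer == "seller" else "seller"
        price = propose(proposer)
        if accept(responder, price):                        # trade: seller gets p, buyer gets 1 - p
            trades.append((s, b, price))
            sellers.remove(s)
            buyers.remove(b)
        # on rejection the pair simply returns to the market
    return trades

print(rw_market())  # one trade at price 0.5: the single seller trades with some buyer
```

Even this toy version makes the point in the text visible: supporting a particular (b*, p*) outcome requires strategies that condition on who deviated and when, i.e., on far more information than the current match.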
In fact, RW show that if the amount of information available to the agents is strictly limited, so that the agents do not recall the history of past play, then the only equilibrium outcome is the competitive one. This suggests that the competitive outcome may result if agents use simple strategies. Furthermore, the equilibrium strategies described in RW to support noncompetitive outcomes are particularly unattractive because they require all players, including those buyers who do not end up trading, to follow complex non-stationary strategies in order to support a noncompetitive outcome. But buyers who do not trade and receive zero payoff on the equilibrium path could always obtain at least zero by following a less complex strategy than the ones specified in RW's construction. Thus, RW's construction of noncompetitive equilibria is not robust if players prefer, at least at the margin, a simpler strategy to a more complex one.

Following the above observation, Sabourian (2003), henceforth S, addresses the role of complexity (simplicity) in sustaining a multiplicity of noncompetitive equilibria in RW's model. The concept of complexity in S is similar to that in Chatterjee and Sabourian (2000a). It is defined by a partial ordering on the set of individual strategies (or automata) that, very informally, satisfies the following: if two strategies are otherwise identical except that, in some role, the second strategy uses more information than that available in the current period of bargaining while the first uses only the information available in the current period, then the second strategy is said to be more complex than the first. S also introduces complexity costs lexicographically into the RW game and shows that any PEC is history-independent and induces the competitive outcome, in the sense that all trades take place at the unique competitive price of 1.

Informally, S's conclusions in the case of a single seller s and many buyers follow from the following steps. First, since trading at the competitive price of 1 is the worst outcome for a buyer and the best outcome for the seller, by appealing to complexity-type reasoning, it can be shown that in any NEC a trader's response to a price offer of 1 is always history independent, and thus he either always rejects 1 or always accepts 1. For example, if in the case of a buyer this were not the case, then, since accepting 1 is a worst possible outcome, he could economize on complexity and obtain at least the same payoff by adopting another strategy that is otherwise the same as the equilibrium strategy except that it always rejects 1.

Second, in any noncompetitive NEC in which s receives a payoff of less than 1, there cannot be an agreement at a price of 1 between s and a buyer at any history. For example, if at some history a buyer is offered p = 1 and he accepts, then by the first step the buyer should accept p = 1 whenever it is offered; but this is a contradiction because it means that the seller can guarantee himself an equilibrium payoff of one by waiting until he has a chance to make a proposal to this buyer.

Third, in any noncompetitive PEC, the continuation payoffs of all buyers are positive at every history. This follows immediately from the previous step because, if there is no trade at p = 1 at any history, each buyer can always obtain a positive payoff by offering the seller more than he can obtain in any subgame.

Finally, because of competition between the buyers (there is one seller and many buyers), in any subgame perfect equilibrium there must be a buyer with a zero continuation payoff after some history. To illustrate the basic intuition for this claim, let m be the worst continuation payoff for s at any history, and suppose that there exists a subgame at which s is the proposer in a match with a buyer b and the continuation payoff of s at this subgame is m. Then, if at this subgame s proposes m + ε (ε > 0), b must reject (otherwise s can get more than m). Since the total surplus is 1, b must obtain at least 1 − m − ε in the continuation game in order to reject s's offer, and s gets at least m; this implies that the continuation payoff of every buyer other than b after b's rejection is less than ε. The result follows by making ε arbitrarily small (and by appealing to the finiteness of the set of buyers).

But the last two claims contradict each other unless the equilibrium is competitive. This establishes the result for the case in which there is one seller and many buyers. The case of a market with more than one seller is established by induction on the number of sellers.
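The first step above turns on whether a strategy's response to a given offer varies with the history. As a toy illustration (my own sketch, not the formal ordering in S or in Chatterjee and Sabourian 2000a), one can represent a responder's strategy as a finite automaton and check whether its reply to the price offer of 1 depends on the automaton's state; a strategy whose reply never varies with the state can be replaced by a smaller, and hence less complex, machine with the same equilibrium-path behavior.

```python
from dataclasses import dataclass

@dataclass
class ResponderAutomaton:
    """A responder strategy as a finite automaton (illustrative toy model).

    states: set of state labels; start: initial state
    reply[state][price]  -> "accept" or "reject"
    update[state][price] -> next state after observing an offer
    """
    states: set
    start: str
    reply: dict
    update: dict

    def num_states(self) -> int:          # crude complexity measure: state count
        return len(self.states)

    def responses_to(self, price) -> set:  # all replies the machine can give to `price`
        return {self.reply[q][price] for q in self.states}

# A two-state strategy that accepts an offer of 1 only in its "reward" state:
tailored = ResponderAutomaton(
    states={"normal", "reward"}, start="normal",
    reply={"normal": {1: "reject", 0.5: "accept"},
           "reward": {1: "accept", 0.5: "accept"}},
    update={"normal": {1: "normal", 0.5: "normal"},
            "reward": {1: "reward", 0.5: "reward"}},
)

print(tailored.responses_to(1))  # {'accept', 'reject'} (order may vary): history dependent
# A one-state machine that always rejects an offer of 1 is weakly simpler and does no worse
# on the equilibrium path, which is exactly the kind of deviation the complexity argument exploits.
```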
The matching technology in the above model is random. RW also consider another market game in which the matching is endogenous: at each date, each seller (the short side of the market) chooses his trading partner. Here, they show that noncompetitive outcomes and multiplicity of equilibria survive even when the players discount the future. By strengthening the notion of complexity, S also shows that in the endogenous matching model of RW, the competitive outcome is the only equilibrium if complexity considerations are present.

These results suggest that perfectly competitive behavior may result if agents have, at least at the margin, preferences for simple strategies. Unfortunately, both RW and S have too simple a market setup; for example, it is assumed that the buyers are all identical, similarly for the sellers, and that each agent trades at most one unit of the good. Do the conclusions extend to richer models of trade?

Heterogeneous Markets

There are good reasons to think that it may be too difficult (or even impossible) to establish a similar set of conclusions as in S in a richer framework. For example, consider a heterogeneous market for a single indivisible good, where buyers (and sellers) have a range of valuations of the good, each buyer wants at most one unit of the good, and each seller has one unit of the good for sale. In this case the analysis of S will not suffice. First, in the homogeneous market of RW, except for the special case where the number of buyers is equal to the number of sellers, the competitive equilibrium price is either 0 or 1, and all of the surplus goes to one side of the market. S's selection result crucially uses this property of the competitive equilibrium. By contrast, in a heterogeneous market, in general there will be agents receiving positive payoffs on both sides of the market in a competitive equilibrium. Therefore, one cannot justify the competitive outcome simply by focusing on extreme outcomes in which there is no surplus for one party from trade. Second, in a homogeneous market individually rational trade is by definition efficient. This may not be the case in a heterogeneous market (an inefficient trade between an inframarginal and an extramarginal agent can be individually rational). Third, in a homogeneous market, the set of competitive prices remains constant, independently of the set of agents remaining in the market. In the heterogeneous market, this need not be so, and in some cases the new competitive interval may not even intersect the old one. The change in the competitive interval of prices as a result of trade exacerbates the problems associated with using an induction hypothesis, because here future prices may be conditioned on past trades even if prices are restricted to be competitive ones.

Despite these difficulties associated with a market with a heterogeneous set of buyers and sellers, Gale and Sabourian (2005), henceforth GS, show that the conclusions of S can be extended to the case of a heterogeneous market in which each agent trades at most one unit of the good. GS, however, focus on deterministic sequential matching models in which one pair of agents is matched at each date and they leave the market if they reach an agreement. In particular, they start by considering exogenous matching processes in which the identities of the proposer and responder at each date are an exogenous and deterministic function of the set of agents remaining in the market and the date. The main result is that a PEC is always competitive in such a heterogeneous market, thus supporting the view that competitive equilibrium may arise in a finite market where complex behavior is costly.

The notion of complexity in GS is similar to that in Chatterjee and Sabourian (2000a). However, in the GS setup with heterogeneous buyers and sellers, the set of remaining agents changes depending on who has traded and left the market and who remains, and this affects the market conditions. (In the homogeneous case, only the number of remaining agents matters.) Therefore, the definition of complexity in GS is with reference to a given set of remaining agents. GS also discuss an alternative notion of complexity that is independent of the set of remaining agents; such a definition may be too strong and may result in an empty equilibrium set.

To show their result, GS first establish two very useful restrictions on the strategies that form an
NEC (similar to the no-delay result in Chatterjee and Sabourian (2000a)). First, they show that if along the equilibrium path a pair of agents k and ℓ trades at a price p with k as the proposer and ℓ as the responder, then k and ℓ always trade at p, irrespective of the previous history, whenever the two agents are matched in the same way with the same remaining set of agents. To show this, consider first the case of the responder ℓ. Then it must be that at every history with the same remaining set of agents, ℓ always accepts p from k. Otherwise, ℓ could economize on complexity by choosing another strategy that is otherwise identical to his equilibrium strategy except that it always accepts p from k, without sacrificing any payoff: such a change of behavior is clearly simpler than sometimes accepting and sometimes rejecting the offer, and moreover it results either in agent k proposing p and ℓ accepting, so the payoff to agent ℓ is the same as from the equilibrium strategy, or in agent k not offering p, in which case the change in the strategy is not observed and the play of the game is unaffected by the deviation. Furthermore, it must also be that at every history with the same remaining set of agents, agent k proposes p in any match with ℓ. Otherwise, k could economize on complexity by choosing another strategy that is otherwise identical to his equilibrium strategy except that it always proposes p to ℓ, without sacrificing any payoff on the equilibrium path: such a change of behavior is clearly simpler, and moreover k's payoff is not affected, because either agents k and ℓ are matched and k proposes p and ℓ, by the previous argument, accepts, so the payoff to agent k is the same as from the equilibrium strategy, or agents k and ℓ are not matched with k as the proposer, in which case the change in the strategy is not observed and the play of the game is unaffected by the deviation.

GS show a second restriction, again with the same remaining set of agents, namely, that in any NEC, for any pair of agents k and ℓ, player ℓ's response to k's (on- or off-the-equilibrium-path) offer is always the same. Otherwise, it follows that ℓ sometimes accepts an offer p by k and sometimes rejects it (with the same remaining set of agents). Then, by the first restriction, it must be that if such an offer is made by k to ℓ on the equilibrium path, it is rejected. But then ℓ could economize on complexity by always rejecting p by k without sacrificing any payoff on the equilibrium path: such a change of behavior is clearly simpler, and furthermore ℓ's payoff is not affected because such behavior is the same as what the equilibrium strategy prescribes on the equilibrium path.

By appealing to the above two properties of NEC and to the competitive nature of the market, GS establish, using a complicated induction argument, that every PEC induces a competitive outcome in which each trade occurs at the same competitive price.

The matching model we have described so far is deterministic and exogenous. The selection result of GS, however, extends to richer deterministic matching models. In particular, GS also consider a semi-endogenous sequential matching model in which the choice of partners is endogenous but the identity of the proposer at any date is exogenous. Their results extend to this variation, with an endogenous choice of responders. A more radical departure would be to consider the case where at any date any agent can choose his partner and make a proposal. Such a totally endogenous model of trade generates new conceptual problems. In a recent working paper, Gale and Sabourian (2008) consider a continuous-time version of such a matching model and show that complexity considerations allow one to select a competitive outcome in the case of totally endogenous matching. Since the selection result holds for all the different matching models, we can conclude that complexity considerations inducing a competitive outcome seem to be a robust result in deterministic matching and bargaining market games with heterogeneous agents.

Random matching is commonly used in economic models because of its tractability. The basic framework of GS, however, does not extend to such a framework if either the buyers or the sellers are not identical. This is for two different reasons. First, in general in any random framework there is more than one outcome path that can occur in equilibrium with positive probability; as a result, introducing complexity lexicographically may not
be enough to induce agents to behave in a simple way (they will have to be complex enough to play optimally along all paths that occur with positive probability). Second, in Gale and Sabourian (2006) it is shown that subgame perfect equilibria in Markov strategies are not necessarily perfectly competitive in the random matching model with heterogeneous agents. Since the definition of complexity in GS is such that Markov strategies are the least complex ones, it follows that with random matching the complexity definition used in GS is not sufficient to select a competitive outcome.

Complexity and Off-The-Equilibrium-Path Play

The concept of the PEC (or NEC) used in S, GS, and elsewhere was defined to be such that, for each player, the strategy/automaton has minimal complexity among all strategies/automata that are best responses to the equilibrium strategies/automata of the others. Although these concepts are very mild in their treatment of complexity, it should be noted that there are other ways of introducing complexity into the equilibrium concept. One extension of the above setup is to treat complexity as a (small) positive fixed cost of choosing a more complex strategy and to define a Nash (subgame perfect) equilibrium with fixed positive complexity costs accordingly. All the selection results based on lexicographic complexity in the papers we discuss in this survey also hold for small positive complexity costs. This is not surprising because with positive costs complexity has at least as much bite as in the lexicographic case; there is at least as much refinement of the equilibrium concept with the former as with the latter. In particular, in the case of an NEC (or a PEC), in considering complexity, players ignore any consideration of payoffs off the equilibrium path, and the trade-off is between the equilibrium payoffs of two strategies and the complexity of the two. As a result, these concepts put more weight on complexity costs than on being "prepared" for off-the-equilibrium-path moves. Therefore, although complexity costs are insignificant, they take priority over optimal behavior after deviations. (See Chatterjee and Sabourian 2000b for a discussion.)

A different approach would be to assume that complexity is a less significant criterion than the off-the-equilibrium payoffs. In the extreme case, one would require agents to choose minimally complex strategies among the set of strategies that are best responses on and off the equilibrium path (see Kalai and Neme 1992).

An alternative way of illustrating the differences between the different approaches is by introducing two kinds of vanishingly small perturbations into the underlying game. One perturbation is to impose a small but positive cost of choosing a more complex strategy. The other perturbation is to introduce a small but positive probability of making an error (an off-the-equilibrium-path move). Since a PEC requires each agent to choose a minimally complex strategy within the set of best responses, it follows that the limit points of Nash equilibria of the above perturbed game correspond to the concept of PEC if we first let the probability of making an off-the-equilibrium-path move go to zero and then let the cost of choosing a more complex strategy go to zero (this is what Chatterjee and Sabourian (2000a) do). On the other hand, in terms of the above limiting arguments, if we let the cost of choosing a more complex strategy go to zero and then let the probability of making an off-the-equilibrium-path move go to zero, then any limit corresponds to the equilibrium definition in Kalai and Neme (1992), where agents choose minimally complex strategies among the set of strategies that are best responses on and off the equilibrium path.

Most of the results reported in this entry on refinement and endogenous complexity (e.g., Abreu and Rubinstein (1988), Chatterjee and Sabourian (2000a), Gale and Sabourian (2005), and Lee and Sabourian (2007)) hold only for the concept of NEC and its variations and thus depend crucially on assuming that complexity costs are more important than off-the-equilibrium payoffs. This is because these results always appeal to an argument that involves economizing on complexity if the complexity is not used off the equilibrium path. Therefore, they may be a good predictor of what may happen only if complexity costs are more significant than the perturbations that induce off-the-equilibrium-path behavior.
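The difference between charging for complexity lexicographically and charging a small positive cost per unit of complexity can be mimicked in a tiny numerical sketch. The payoffs and state counts below are made-up numbers used only to illustrate the selection logic, not anything computed from the models above.

```python
# Toy comparison of two ways of charging for complexity.
candidates = {
    # name: (equilibrium payoff, number of automaton states)
    "history-dependent": (0.60, 3),
    "stationary":        (0.60, 1),
    "over-prepared":     (0.59, 5),   # does slightly worse on the equilibrium path
}

def best_lexicographic(cands):
    # complexity matters only as a tie-breaker on equilibrium payoff
    return max(cands, key=lambda k: (cands[k][0], -cands[k][1]))

def best_with_cost(cands, eps):
    # complexity enters the payoff directly as a small positive cost per state
    return max(cands, key=lambda k: cands[k][0] - eps * cands[k][1])

print(best_lexicographic(candidates))         # stationary
print(best_with_cost(candidates, eps=0.001))  # stationary: positive costs bite at least as hard
```

In both cases the simpler strategy is selected whenever it matches the equilibrium payoff, which is the sense in which positive complexity costs refine the equilibrium set at least as much as the lexicographic criterion.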
The one exception is the selection result in S (Sabourian 2003). Here, although the result we have reported is stated for NEC and its variations, it turns out that the selection of competitive equilibrium does not in fact depend on the relative importance of complexity costs and off-the-equilibrium-path payoffs. It remains true even for the case where the strategies are required to be least complex among those that are best responses at every information set. This is because in S's analysis complexity is only used to show that every agent's response to the price offer of 1 is always the same, irrespective of the past history of play. This conclusion holds irrespective of the relative importance of complexity costs and off-the-equilibrium payoffs, because trading at the price of 1 is the best outcome that any seller can achieve at any information set (including those off the equilibrium path) and a worst outcome for any buyer. Therefore, irrespective of the order, the strategy of sometimes accepting a price of 1 and sometimes rejecting it cannot be an equilibrium strategy for a buyer (a similar argument applies for a seller), because the buyer can economize on complexity by always rejecting the offer without sacrificing any payoff off or on the equilibrium path (accepting p = 1 is a worst possible outcome).

Discussion and Future Directions

The use of finite automata as a model of players in a game has been criticized as inadequate, especially because, as the number of states becomes lower, it becomes more and more difficult for the small automaton to do routine calculations, let alone the best-response calculations necessary for game-theoretic equilibria. Some of the papers we have explored address other aspects of complexity that arise from the concrete nature of the games under consideration. Alternative models of complexity are also suggested, such as computational complexity and communication complexity.

While our work and the earlier work on which it builds focus on equilibrium, an alternative approach might seek to see whether simplicity evolves in some reasonable learning model. Maenner (2008) has undertaken such an investigation with the infinitely repeated Prisoners' Dilemma (studied in the equilibrium context by Abreu and Rubinstein). Maenner provides an argument for "learning to be simple." On the other hand, there are arguments for increasing complexity in competitive games (Robson 2003). It is an open question, therefore, whether simplicity could arise endogenously through learning, though it seems to be a feature of most human preferences and aesthetics (see Birkhoff 1933).

The broader research program of explicitly considering complexity in economic settings might be a very fruitful one. Auction mechanisms are designed with an eye towards how complex they are – simplicity is a desideratum. The complexity of contracting has given rise to a whole literature on incomplete contracts, where some models postulate a fixed cost per contingency described in the contract. All this is apart from the popular literature on complexity, which seeks to understand complex, adaptive systems from biology. The use of formal complexity measures such as those considered in this survey and the research we describe might throw some light on whether incompleteness of contracts, or simplicity of mechanisms, is an assumption or a result (of explicitly considering the choice of level of complexity).

Acknowledgments We wish to thank an anonymous referee and Jihong Lee for valuable comments that improved the exposition of this entry. We would also like to thank St. John's College, Cambridge, and the Pennsylvania State University for funding Dr. Chatterjee's stay in Cambridge at the time this entry was written.

Bibliography

Abreu D, Rubinstein A (1988) The structure of Nash equilibria in repeated games with finite automata. Econometrica 56:1259–1282
Anderlini L (1990) Some notes on Church's thesis and the theory of games. Theory Decis 29:19–52
Anderlini L, Sabourian H (1995) Cooperation and effective computability. Econometrica 63:1337–1369
Aumann RJ (1981) Survey of repeated games. In: Essays in game theory and mathematical economics in honor of
Oskar Morgenstern. Bibliographisches Institut, Mannheim/Vienna/Zurich, pp 11–42
Banks J, Sundaram R (1990) Repeated games, finite automata and complexity. Games Econ Behav 2:97–117
Ben Porath E (1986) Repeated games with bounded complexity. Mimeo, Stanford University, Stanford, Calif.
Ben Porath E (1993) Repeated games with finite automata. J Econ Theory 59:17–32
Binmore KG (1987) Modelling rational players I. Econ Philos 3:179–214
Binmore KG, Samuelson L (1992) Evolutionary stability in repeated games played by finite automata. J Econ Theory 57:278–305
Binmore KG, Piccione M, Samuelson L (1998) Evolutionary stability in alternating-offers bargaining games. J Econ Theory 80:257–291
Birkhoff GD (1933) Aesthetic measure. Harvard University Press, Cambridge, MA
Bloise G (1998) Strategic complexity and equilibrium in repeated games. Unpublished doctoral dissertation, University of Cambridge
Busch L-A, Wen Q (1995) Perfect equilibria in a negotiation model. Econometrica 63:545–565
Chatterjee K (2002) Complexity of strategies and multiplicity of Nash equilibria. Group Decis Negot 11:223–230
Chatterjee K, Sabourian H (2000a) Multiperson bargaining and strategic complexity. Econometrica 68:1491–1509
Chatterjee K, Sabourian H (2000b) N-person bargaining and strategic complexity. Mimeo, University of Cambridge and the Pennsylvania State University, Cambridge, UK and University Park, Pa., USA
Debreu G (1959) Theory of value. Yale University Press, New Haven/London
Fernandez R, Glazer J (1991) Striking for a bargain between two completely informed agents. Am Econ Rev 81:240–252
Fudenberg D, Maskin E (1990) Evolution and repeated games. Mimeo, Harvard University, Cambridge, Mass
Fudenberg D, Tirole J (1991) Game theory. MIT Press, Cambridge, MA
Gale D (2000) Strategic foundations of general equilibrium: dynamic matching and bargaining games. Cambridge University Press, Cambridge
Gale D, Sabourian H (2005) Complexity and competition. Econometrica 73:739–770
Gale D, Sabourian H (2006) Markov equilibria in dynamic matching and bargaining games. Games Econ Behav 54:336–352
Gale D, Sabourian H (2008) Complexity and competition II: endogenous matching. Mimeo, New York University, New York, USA/University of Cambridge, Cambridge, UK
Haller H, Holden S (1990) A letter to the editor on wage bargaining. J Econ Theory 52:232–236
Hayek F (1945) The use of knowledge in society. Am Econ Rev 35:519–530
Herrero M (1985) A strategic theory of market institutions. Unpublished doctoral dissertation, London School of Economics
Kalai E, Neme A (1992) The strength of a little perfection. Int J Game Theory 20:335–355
Kalai E, Stanford W (1988) Finite rationality and interpersonal complexity in repeated games. Econometrica 56:397–410
Klemperer P (ed) (2000) The economic theory of auctions. Elgar, Northampton
Lee J, Sabourian H (2007) Coase theorem, complexity and transaction costs. J Econ Theory 135:214–235
Maenner E (2008) Adaptation and complexity in repeated games. Games Econ Behav 63:166–187
Miller GA (1956) The magical number seven plus or minus two: some limits on our capacity to process information. Psychol Rev 63:81–97
Neme A, Quintas L (1995) Subgame perfect equilibrium of repeated games with implementation cost. J Econ Theory 66:599–608
Neyman A (1985) Bounded complexity justifies cooperation in the finitely-repeated Prisoners' Dilemma. Econ Lett 19:227–229
Neyman A (1997) Cooperation, repetition and automata. In: Hart S, Mas-Colell A (eds) Cooperation: game-theoretic approaches. NATO ASI series F, vol 155. Springer, Berlin, pp 233–255
Osborne M, Rubinstein A (1990) Bargaining and markets. Academic, New York
Osborne M, Rubinstein A (1994) A course in game theory. MIT Press, Cambridge, MA
Papadimitriou CH (1992) On games with a bounded number of states. Games Econ Behav 4:122–131
Piccione M (1992) Finite automata equilibria with discounting. J Econ Theory 56:180–193
Piccione M, Rubinstein A (1993) Finite automata play a repeated extensive game. J Econ Theory 61:160–168
Robson A (2003) The evolution of rationality and the Red Queen. J Econ Theory 111:1–22
Rubinstein A (1982) Perfect equilibrium in a bargaining model. Econometrica 50:97–109
Rubinstein A (1986) Finite automata play the repeated Prisoners' Dilemma. J Econ Theory 39:83–96
Rubinstein A (1998) Modeling bounded rationality. MIT Press, Cambridge, MA
Rubinstein A, Wolinsky A (1990) Decentralized trading, strategic behaviour and the Walrasian outcome. Rev Econ Stud 57:63–78
Sabourian H (2003) Bargaining and markets: complexity and the competitive outcome. J Econ Theory 116:189–228
Selten R (1965) Spieltheoretische Behandlung eines Oligopolmodells mit Nachfrageträgheit. Z gesamte Staatswiss 12:201–324
Shaked A (1986) A three-person unanimity game. In: The Los Angeles national meetings of the Institute of Management Sciences and the Operations Research Society of America. Mimeo, University of Bonn, Bonn, Germany
Zemel E (1989) Small talk and cooperation: a note on bounded rationality. J Econ Theory 49:1–9
Part II
Agent-Based Models
Agent-Based Modeling and Simulation, Introduction to

Filippo Castiglione
Istituto Applicazioni del Calcolo (IAC), Consiglio Nazionale delle Ricerche (CNR), Rome, Italy

Keywords
Discrete mathematical modeling · Simulation · Complex systems

Agent-based modeling (ABM) is a computational modeling paradigm that is markedly useful in studying complex systems composed of a large number of interacting entities with many degrees of freedom. Other names for ABM are individual-based modeling (IBM) or multi-agent systems (MAS). Physicists often use the term micro-simulation or interaction-based computing.

The basic idea of ABM is to construct the computational counterpart of a conceptual model of a system under study on the basis of discrete entities (agents) with defined properties and behavioral rules, and then to simulate them in a computer to mimic the real phenomena.
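As a bare-bones illustration of this idea (a generic sketch not tied to any particular platform; the movement rule, the grid, and the "energy" variable are invented for the example), an ABM is essentially a collection of agent objects, each with its own state and behavioral rule, plus a loop that repeatedly lets every agent perceive and act:

```python
import random

class World:
    """Minimal environment: a toroidal grid of given size."""
    def __init__(self, size):
        self.size = size

class Agent:
    """A discrete entity with its own internal state and a behavioral rule."""
    def __init__(self, ident, position, rng):
        self.ident, self.position, self.rng = ident, position, rng
        self.energy = 10                      # example internal state variable

    def step(self, world):
        # Behavioral rule (purely illustrative): move to a random neighboring cell.
        x, y = self.position
        dx, dy = self.rng.choice([(-1, 0), (1, 0), (0, -1), (0, 1)])
        self.position = ((x + dx) % world.size, (y + dy) % world.size)
        self.energy -= 1                      # acting consumes energy

def run_model(n_agents=50, size=20, steps=10, seed=42):
    rng = random.Random(seed)
    world = World(size)
    agents = [Agent(i, (rng.randrange(size), rng.randrange(size)), rng)
              for i in range(n_agents)]
    for _ in range(steps):
        for agent in rng.sample(agents, len(agents)):  # randomized activation order
            agent.step(world)
    return agents

agents = run_model()
# Every agent is individually represented, so aggregate questions can be asked at any time:
print(sum(a.energy for a in agents) / len(agents))
```

The point of the sketch is only structural: the model is the population of explicitly represented agents plus the rules by which they act in an environment, and the simulation is the repeated execution of those rules.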
The definition of agent is somewhat fuzzy, as witnessed by the fact that the models found in the literature adopt an extremely heterogeneous rationale. The agent is an autonomous entity having its own internal state reflecting its perception of the environment and interacting with other entities according to more or less sophisticated rules. In practice, the term agent is used to indicate entities ranging all the way from simple pieces of software to "conscious" entities with learning capabilities. For example, there are "helper" agents for web retrieval, robotic agents to explore inhospitable environments, buyer/seller agents in an economy, and so on. Roughly speaking, an entity is an "agent" if it has some degree of autonomy, that is, if it is distinguishable from its environment by some kind of spatial, temporal, or functional attribute: an agent must be identifiable. Moreover, it is usually required that an agent must have some autonomy of action and that it must be able to engage in tasks in an environment without direct external control.

From simple agents, which interact locally with simple rules of behavior, merely responding befittingly to environmental cues and not necessarily striving for an overall goal, we observe a synergy which leads to a higher-level whole with much more intricate behavior than the component agents (holism, meaning all, entire, total). Agents can be identified on the basis of a set of properties that must characterize an entity, in particular: autonomy (the capability of operating without intervention by humans, and a certain degree of control over its own state); social ability (the capability of interacting by employing some kind of agent communication language); reactivity (the ability to perceive an environment in which it is situated and respond to perceived changes); and pro-activeness (the ability to take the initiative, starting some activity according to internal goals rather than as a reaction to an external stimulus). Moreover, it is also conceptually important to define what the agent "environment" in an ABM is.

In general, given the relative immaturity of this modeling paradigm and the broad spectrum of disciplines in which it is applied, a clear-cut and widely accepted definition of high-level concepts of agents, environment, interactions, and so on is still lacking. Therefore a real ABM ontology is needed to address the epistemological issues related to the agent-based paradigm of modeling of complex systems, in order to attempt to reach a more general comprehension of emergent properties which, though ascribed to the definition of a specific application domain, are also universal (see chapter ▶ "Agent-Based Modeling and Simulation").

Historically, the first simple conceptual form of agent-based models was developed in the late 1940s, and it took the advent of the computer to show its modeling power. This is the von Neumann machine, a theoretical machine capable of reproduction. The device von Neumann proposed would follow precisely detailed instructions to
produce an identical copy of itself. The concept was then improved by Stanislaw Ulam, who suggested that the machine be built on paper, as a collection of cells on a grid. This idea inspired von Neumann to create the first of the models later termed cellular automata (CA). John Conway then constructed the well-known "Game of Life." Unlike von Neumann's machine, Conway's Game of Life operated by simple rules in a virtual world in the form of a two-dimensional checkerboard.
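The rules Conway chose are short enough to state in a few lines of code. The following is a standard, minimal implementation of one Game of Life update on a wrapped grid (a generic sketch added for illustration, not taken from any reference of this entry):

```python
def life_step(grid):
    """One synchronous update of Conway's Game of Life on a toroidal grid.

    `grid` is a list of lists of 0/1. A live cell survives with 2 or 3 live
    neighbors; a dead cell becomes alive with exactly 3 live neighbors.
    """
    rows, cols = len(grid), len(grid[0])
    def live_neighbors(r, c):
        return sum(grid[(r + dr) % rows][(c + dc) % cols]
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0))
    return [[1 if (grid[r][c] and live_neighbors(r, c) in (2, 3))
                  or (not grid[r][c] and live_neighbors(r, c) == 3) else 0
             for c in range(cols)]
            for r in range(rows)]

# A "blinker": a vertical line of three live cells that oscillates with period 2.
board = [[0] * 5 for _ in range(5)]
for r in (1, 2, 3):
    board[r][2] = 1
board = life_step(board)
print(board[2])   # [0, 1, 1, 1, 0] -- the line is now horizontal
```

Note that this update is synchronous and grid-bound, which is exactly the feature that, as discussed below, distinguishes classical CA from the potentially asynchronous, not necessarily grid-based interactions of ABMs.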
Conway’s Game of Life has become a paradig- theoretical biology to describe the aggregation of
matic example of models concerned with the emer- cells or microorganisms in normal or pathological
gence of order in nature. How do systems self- conditions (see chapter ▶ “Cellular Automaton
organize themselves and spontaneously achieve a Modeling of Tumor Invasion”).
higher-ordered state? These and other questions Returning to the concept of agent in the ABM
have been deeply addressed in the first workshop paradigm, an agent may represent a particle, a
on Artificial Life (ALife) held in the late 1980s in financial trader, a cell in a living organism, a
Santa Fe. This workshop shaped the ALife field of predator in a complex ecosystem, a power plant,
research. Agent-based modeling is historically an atom belonging to a certain material, a buyer in
connected to ALife because it has become a dis- a closed economy, customers in a market model,
tinctive form of modeling and simulation in this forest trees, cars in large traffic vehicle system,
field. In fact, the essential features of ALife models etc. Once the level of description of the system
are translated into computational algorithms under study has been defined, the identification of
through agent-based modeling (see chapter such entities is quite straightforward. For exam-
▶ “Agent-Based Modeling and Artificial Life”). ple, if one looks at the world economy, then the
Agent-based models can be seen as the natural correct choice of agents are nations, rather than
extension of the CA-like models, which have been individual companies. On the other hand, if one is
very successful in the past decades in shedding interested in looking at the dynamics of a stock,
light on various physical phenomena. One impor- then the entities determining the price evolution
tant characteristic of ABMs, which distinguishes are the buyers and sellers.
them from cellular automata, is the potential asyn- This example points to a field where ABM
chrony of the interactions among agents and provides a very interesting and valuable instru-
between agents and their environments. In ment of research. Indeed, mainstream economic
ABMs, agents typically do not simultaneously models typically make the assumption that an
perform actions at constant time-steps, as in CAs entire group of agents, for example, “investors,”
or Boolean networks. Rather, their actions follow can be modeled with a single “rational represen-
discrete-event cues or a sequential schedule of tative agent.” While this assumption has proven
interactions. The discrete-event setup allows for extremely useful in advancing the science of eco-
the cohabitation of agents with different environ- nomics by yielding analytically tractable models,
mental experiences. Also ABMs are not necessar- it is clear that the assumption is not realistic:
ily grid based nor do agents “tile” the environment. people differ in their tastes, beliefs, and sophisti-
Physics investigation is based on building cation, and as many psychological studies have
models of reality. It is a common experience shown, they often deviate from rationality in sys-
that, even using simple “building blocks,” one tematic ways. Agent-based computational econom-
usually obtains systems whose behavior is quite ics (ACE) is a framework allowing economics to
complex. This is the reason why CA-like, and expand beyond the realm of the “rational represen-
therefore agent-based models, have been used tative agent.” By modeling and simulating the
extensively among physicists to investigate behavior of each agent and interactions among
agents, agent-based simulation allows us to investigate the dynamics of complex economic systems with many heterogeneous (and not necessarily fully rational) agents. Agent-based computational economics complements the traditional analytical approach and is gradually becoming a standard tool in economic analysis (see chapter ▶ "Agent-Based Computational Economics").

Because the paradigm of agent-based modeling and simulation can handle richness of detail in the agents' description and behavior, this methodology is very appealing for the study and simulation of social systems, where the behavior and the heterogeneity of the interacting components are not safely reducible to some stylized or simple mechanism. Social phenomena simulation, in the area of agent-based modeling and simulation, concerns the emulation of the individual behavior of a group of social entities, typically including their cognition, actions, and interaction. This field of research aims at "growing" an artificial society following a bottom-up approach.

Historically, the birth of the agent-based model as a model for social systems can be primarily attributed to a computer scientist, Craig Reynolds. He tried to model the reality of lively biological agents, known as artificial life, a term coined by Christopher Langton. In 1996 Joshua M. Epstein and Robert Axtell developed the first large-scale agent model, the Sugarscape, to simulate and explore the role of social phenomena such as seasonal migrations, pollution, sexual reproduction, combat, transmission of disease, and even culture (see chapter ▶ "Social Phenomena Simulation").

In the field of artificial intelligence, the collective behavior of agents that, without central control, collectively carry out tasks normally requiring some form of "intelligence" constitutes the central concept in the field of swarm intelligence. The term "swarm intelligence" first appeared in 1989. As the use of the term swarm intelligence has increased, its meaning has broadened to a point in which it is often understood to encompass almost any type of collective behavior. Technologically, the importance of "swarms" is mainly based on potential advantages over centralized systems. The potential advantages are economy (the swarm units are simple and hence, in principle, mass producible, modularizable, interchangeable, and disposable); reliability (due to the redundancy of the components, the destruction or death of some units has a negligible effect on the accomplishment of the task, as the swarm adapts to the loss of a few units); and the ability to perform tasks beyond those of centralized systems, for example, escaping enemy detection. From this initial perspective on potential advantages, the actual application of swarm intelligence has extended to many areas and inspired potential future applications in defense and space technologies (for example, control of groups of unmanned vehicles on land, in water, or in the air), flexible manufacturing systems, advanced computer technologies (bio-computing), medical technologies, and telecommunications (see chapter ▶ "Swarm Intelligence").

Similarly, robotics has adopted the ABM paradigm to study, by means of simulation, the crucial features of adaptation and cooperation in the pursuit of a global goal. Adaptive behavior concerns the study of how organisms develop their behavioral and cognitive skills through a synthetic methodology, consisting of designing artificial agents which are able to adapt to their environment autonomously. These studies are important both from a modeling point of view (that is, for better understanding intelligence and adaptation in natural beings) and from an engineering point of view (that is, for developing artifacts displaying effective behavioral and cognitive skills) (see chapter ▶ "Embodied and Situated Agents, Adaptive Behavior in").

What makes ABM a novel and interesting paradigm of modeling is the idea that agents are individually represented and "monitored" in the computer's memory. One can, at any time during the simulation, ask a question such as "what is the age distribution of the agents?" or "how many stocks have accumulated buyers following that specific strategy?" or "what is the average velocity of the particles?" "Large-scale" simulations in the context of agent-based modeling are not only simulations that are large in terms of size (number of agents simulated) but are also complex. Complexity is inherent in agent-based models, as they are usually composed of dynamic, heterogeneous, interacting agents. Large-scale agent-based models have also been referred to as
[...] compositionality, types and high-order calculi, which have proved so fruitful in computer science, to be applied in the domain of ABM and simulation (see chapter ▶ "Logic and Geometry of Agents in Agent-Based Modeling").

The appeal of ABM methodology in science increases manifestly with advances in the computational power of modern computers. However, it is important to bear in mind that increasing the complexity of a model does not necessarily bring more understanding of the fundamental laws governing the overall dynamics. Actually, beyond a certain level of model complexity, the model loses its ability to explain or predict reality and reduces to a mere surrogate of reality, where things happen with a surprisingly good adherence to reality although we are unable to explain why this happens. Therefore, model construction must proceed incrementally, step by step, possibly validating the model at each stage of development before adding more details. ABM technology is very powerful but, if badly used, could reduce science to a mere exercise consisting of mimicking reality.
Agent-Based Modeling and Simulation

Stefania Bandini, Sara Manzoni and Giuseppe Vizzari
Complex Systems and Artificial Intelligence Research Center, University of Milan-Bicocca, Milan, Italy

Article Outline

Glossary
Definition of the Subject
Introduction
Agent-Based Models for Simulation
Platforms for Agent-Based Simulation
Future Directions
Bibliography

Glossary

Agent The definition of the term agent is controversial even inside the restricted community of computer scientists dealing with research on agent models and technologies (Franklin and Graesser 1997). A weak definition, which could suit the extremely heterogeneous approaches in the agent-based simulation context, is "an autonomous entity, having the ability to decide the actions to be carried out in the environment and the interactions to be established with other agents, according to its perceptions and internal state".
Agent architecture The term agent architecture (Russel and Norvig 1995) refers to the internal structure that is responsible for effectively selecting the actions to be carried out, according to the perceptions and internal state of an agent. Different architectures have been proposed in order to obtain specific agent behaviors, and they are generally classified into deliberative and reactive (respectively, hysteretic and tropistic, according to the classification reported in Genesereth and Nilsson (1987)).
Autonomy The term autonomy has different meanings, for it represents (in addition to the control of an agent over its own internal state) different aspects of the possibility of an agent to decide about its own actions. For instance, it may represent the possibility of an agent to decide (i) about the timing of an action, (ii) whether or not to fulfill a request, (iii) to act without the need of an external trigger event (also called pro-activeness or pro-activity), or even (iv) to base its decisions on its personal experience instead of hard-wired knowledge (Russel and Norvig 1995). It must be noted that different agent models do not generally embody all the above notions of autonomy.
Interaction "An interaction occurs when two or more agents are brought into a dynamic relationship through a set of reciprocal actions" (Ferber 1999).
Environment "The environment is a first-class abstraction that provides the surrounding conditions for agents to exist and that mediates both the interaction among agents and the access to resources" (Weyns et al. 2007).
Platform for agent-based simulation A software framework specifically aimed at supporting the realization of agent-based simulation systems; this kind of framework often provides abstractions and mechanisms for the definition of agents and their environments, to support their interaction, but also additional functionalities like the management of the simulation (e.g., setup, configuration, turn management), its visualization, monitoring, and the acquisition of data about the simulated dynamics.

Definition of the Subject

Agent-Based Modeling and Simulation – an approach to the modeling and simulation of a system in which the overall behavior is determined by
the local action and interaction of a set of agents situated in an environment. Every agent chooses the action to be carried out on the basis of its own behavioral specification, internal state, and perception of the environment. The environment, besides enabling perceptions, can regulate agents' interactions and constrain their actions.

Introduction

Computer simulation represents a way to exploit a computational model to evaluate designs and plans without actually bringing them into existence in the real world (e.g., architectural designs, road networks and traffic lights), but also to evaluate theories and models of complex systems (e.g., biological or social systems) by envisioning the effect of the modeling choices, with the aim of gaining insight into their functioning. The use of these "synthetic environments" is sometimes necessary because the simulated system cannot actually be observed, since it is actually being designed, or also for ethical or practical reasons. A general schema (based on several elaborations, such as those described in Edmonds (2001) and Gilbert and Troitzsch (2005)) describing the role of simulation as a predictive or explanatory instrument is shown in Fig. 1.

Agent-Based Modeling and Simulation, Fig. 1 A general schema describing the usage of simulation as a predictive or explanatory instrument

Several situations are characterized by the presence of autonomous entities whose actions and interactions determine (in a non-trivial way) the evolution of the overall system. Agent-based models are particularly suited to represent these situations, and to support the study and analysis of topics like decentralized decision making, local-global interactions, self-organization, emergence, and the effects of heterogeneity in the simulated system. The interest in this relatively recent approach to modeling and simulation is demonstrated by the number of scientific events focused on this topic (see, to make some examples rooted in the computer science context, the Multi Agent Based Simulation workshop series (Davidsson et al. 2005; Hales et al. 2003; Moss and Davidsson 2001; Sichman and Antunes 2006; Sichman et al. 1998; Sichman et al. 2003), the IMA workshop on agent-based modeling (http://www.ima.umn.edu/complex/fall/agent.html) and the Agent-Based Modeling and Simulation symposium (Bandini et al. 2006a)). Agent-based models and multi-agent systems (MAS) have been adopted to simulate complex systems in very different contexts, ranging from social and economical simulation (see, e.g., Dosi et al. 2006) to logistics optimization (see, e.g., Weyns et al. 2006b), from biological systems (see, e.g., Bandini et al. 2006b) to traffic (see, e.g., Balmer and Nagel 2006; Bazzan et al. 1999; Wahle and Schreckenberg 2001) and crowd simulation (see, e.g., Batty 2001).

This heterogeneity in the application domains also reflects the fact that, especially in this context of agent-focused research, influences come from most different research areas. Several traffic and crowd agent models, to make a relevant example, are deeply influenced by physics, and the related models provide agents that are modeled as particles subject to forces generated by the environment as
well as by other agents (i.e., active walker models, such as Helbing et al. 1997). Other approaches to crowd modeling and simulation build on experiences with Cellular Automata (CA) approaches (see, e.g., Schadschneider et al. 2002) but provide a clearer separation between the environment and the entities that inhabit, act, and interact in it (see, e.g., Bandini et al. 2004; Henein and White 2005). This line of research leads to the definition of models for situated MASs, a type of model that was also defined and successfully applied in the context of (reactive) robotics and control systems (Weyns and Holvoet 2006; Weyns et al. 2005). Models and simulators defined and developed in the context of the social sciences (Gilbert and Troitzsch 2005) and economics (Pyka and Fagiolo 2007) are instead based on different theories (often non-classical ones) of human behavior, in order to gain further insight into it and to help build and validate new theories.

The common standpoint of all the above-mentioned approaches, and of many other ones that describe themselves as agent-based, is the fact that the analytical unit of the system is represented by the individual agent, acting and interacting with other entities in a shared environment: the overall system dynamic is not defined in terms of a global function, but rather as the result of individuals' actions and interactions. On the other hand, it must also be noted that in most of the above introduced application domains, the environment plays a prominent role because:

• It deeply influences the behaviors of the simulated entities, in terms of perceptions and allowed actions for the agents.
• The aim of the simulation is to observe some aggregate-level behavior (e.g., the density of a certain type of agent in an area of the environment, the average length of a given path for mobile agents, the generation of clusters of agents), which can actually only be observed in the environment.

Besides these common elements, the above introduced approaches often dramatically differ in the way agents are described, both in terms of properties and behavior. A similar consideration can be made for their environment.

In light of these considerations, the aim of this article is not to present a specific technical contribution but rather to introduce an abstract reference model that can be applied to analyze, describe, and discuss different models, concrete simulation experiences, and platforms that, from different points of view, legitimately claim to adopt an agent-based approach. The reference model is illustrated in Fig. 2.

Agent-Based Modeling and Simulation, Fig. 2 An abstract reference model to analyze, describe and discuss different models, concrete simulation experiences, platforms legitimately claiming to adopt an agent-based approach

In particular, the following are the main elements of this reference model:
Reactive agents are elementary (and often memory-less) agents with a defined position in the environment. Reactive agents perform their actions as a consequence of the perception of stimuli coming either from other agents or from the environment; generally, the behavioral specification of this kind of agent is a set of condition-action rules, with the addition of a selection strategy for choosing an action to be carried out whenever more than one rule could be activated. In this case, the motivation for an action derives from a triggering event detected in the environment; these agents cannot be pro-active.
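A condition-action specification of this kind is easy to make concrete. The sketch below is a generic illustration (the rule contents and the "first matching rule wins" selection strategy are choices made for the example, not prescribed by the text):

```python
class ReactiveAgent:
    """A memory-less agent driven by condition-action rules.

    `rules` is an ordered list of (condition, action) pairs, where a condition
    is a predicate over the current percept; the selection strategy used here
    simply fires the first rule whose condition holds.
    """
    def __init__(self, position, rules):
        self.position = position
        self.rules = rules

    def step(self, percept):
        for condition, action in self.rules:
            if condition(percept):
                return action(self, percept)
        return None          # no rule triggered: the agent stays idle

# Example rules for a toy evacuation setting (invented for illustration):
rules = [
    (lambda p: p.get("exit_visible"),           lambda a, p: "move_towards_exit"),
    (lambda p: p.get("crowd_density", 0) > 0.8, lambda a, p: "wait"),
    (lambda p: True,                            lambda a, p: "random_walk"),
]

agent = ReactiveAgent(position=(0, 0), rules=rules)
print(agent.step({"exit_visible": False, "crowd_density": 0.9}))  # 'wait'
```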
Deliberative or cognitive agents, instead, are characterized by a more complex action selection mechanism, and their behavior is based on so-called mental states, on facts representing agent knowledge about the environment and, possibly, also on memories of past experiences. Deliberative agents, for every possible sequence of perceptions, try to select a sequence of actions allowing them to achieve a given goal. Deliberative models, usually defined within the planning context, provide a symbolic and explicit representation of the world within agents, and their decisions are based on logic reasoning and symbol manipulation. The BDI model (belief, desire, intention (Rao and Georgeff 1991; Rao and Georgeff 1995)) is perhaps the most widespread model for deliberative agents. The internal state of agents is composed of three "data structures" concerning agent beliefs, desires, and intentions. Beliefs represent agent information about its surrounding world, desires are the agent goals, while intentions represent the desires an agent has effectively selected and to which it has, to some extent, committed.

Hybrid architectures can also be defined by combining the previous ones. Agents can have a layered architecture, where deliberative layers are based on a symbolic representation of the surrounding world, generate plans, and take decisions, while reactive layers perform actions as an effect of the perception of external stimuli. Both vertical and horizontal architectures have been proposed in order to structure layers (Brooks 1986). In a horizontal architecture no priorities are associated with layers, and the results of the different layers must be combined to obtain the agent's behavior. When layers are instead arranged in a vertical structure, reactive layers have higher priority over deliberative ones, which are activated only when no reactive behavior is triggered by the perception of an external stimulus.

A MAS can be composed of cognitive agents (generally a relatively low number of deliberative agents), each one possessing its own knowledge model determining its behavior and its interactions with other agents and the environment. By contrast, there could be MAS made only of reactive agents. This type of system is based on the idea that it is not necessary for a single agent to be individually intelligent for the system to demonstrate complex (intelligent) behaviors. Systems of reactive agents are usually more robust and fault tolerant than other agent-based systems (e.g., an agent may be lost without any catastrophic effect for the system). Other benefits of reactive MAS include flexibility and adaptability, in contrast to the inflexibility that sometimes characterizes systems of deliberative agents (Brooks 1990). Finally, a system might also present a heterogeneous composition of reactive and deliberative agents.

Environment

Weyns et al. (2007) provide a definition of the notion of environment for MASs (and thus of an environment for an ABM), and also discuss the core responsibilities that can be ascribed to it. In particular, in the specific context of simulation, the environment is typically responsible for the following (a minimal code sketch of these responsibilities follows the list):

• Reflecting/reifying/managing the structure of the physical/social arrangement of the overall system;
• Embedding, and supporting regulated access to, objects and parts of the system that are not modeled as agents;
• Supporting agent perception and situated action (it must be noted that agent interaction should be considered a particular kind of action);
• Maintaining internal dynamics (e.g., spontaneous growth of resources, dissipation of signals emitted by agents);
• Defining/enforcing rules.
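The following is a minimal sketch of an environment object exposing the responsibilities listed above. All class and method names are invented for illustration; a real platform would refine each of them considerably.

```python
class SimulatedEnvironment:
    """Toy environment covering the responsibilities listed above."""

    def __init__(self, size, resources=None, rules=None):
        self.size = size                      # structure: a bounded 2D space
        self.positions = {}                   # agent id -> (x, y): spatial arrangement
        self.resources = resources or {}      # passive objects, not modeled as agents
        self.rules = rules or []              # predicates that regulate actions

    def place(self, agent_id, pos):
        self.positions[agent_id] = pos        # manage the spatial arrangement

    def percept(self, agent_id, radius=1):
        # support agent perception: what lies within the agent's neighborhood
        x, y = self.positions[agent_id]
        return {other: p for other, p in self.positions.items()
                if other != agent_id and abs(p[0] - x) <= radius and abs(p[1] - y) <= radius}

    def attempt_move(self, agent_id, new_pos):
        # support situated action, but only if every environmental rule allows it
        if all(rule(self, agent_id, new_pos) for rule in self.rules):
            self.positions[agent_id] = new_pos
            return True
        return False

    def update(self):
        # internal dynamics, e.g. spontaneous regrowth of a resource
        for key in self.resources:
            self.resources[key] = min(self.resources[key] + 1, 10)

# Example rule: the destination cell must lie inside the environment.
inside = lambda env, agent_id, pos: 0 <= pos[0] < env.size and 0 <= pos[1] < env.size
env = SimulatedEnvironment(size=10, resources={"food": 3}, rules=[inside])
env.place("a1", (0, 0))
print(env.attempt_move("a1", (0, 1)), env.attempt_move("a1", (0, -1)))  # True False
```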
In order to exemplify this schema, we will now consider agent-based models and simulators that are based on a physical approach; the latter generally consider agents as particles subject to, and generating, forces. In this case, the environment comprises the laws regulating these influences and the relevant elements of the simulated system that are not agents (e.g., points of reference that generate attraction/repulsion forces). It is the environment that determines the overall dynamics, combining the effects that influence each agent and applying them, generally, in discrete time steps. In this cycle it captures all the above introduced responsibilities, the role of agents is minimal (according to some definitions they should not be called agents at all), and running a simulation is essentially reduced to iteratively computing a set of equations (see, e.g., Balmer and Nagel 2006; Helbing et al. 1997). In situated ABM approaches, agents have a higher degree of autonomy and control over their actions, since they evaluate their perceptions and choose their actions according to their behavioral specification. The environment retains a very relevant role, since it provides agents with their perceptions, which are generated according to the current structure of the system and to the arrangement of the agents situated in it. Socioeconomic models and simulations provide various approaches to the representation of the simulated system, but are generally similar to situated ABMs.
how the notion of environment in the context of nisms and abstractions.
MAS-based simulation can be turned into a soft- SeSAm, for instance, offers a general simula-
ware architecture. Klügl et al. (2005) argue that tion infrastructure but relies on plugins (Klügl
the notion of environment in multi-agent simula- et al. 2005). Those plugins, for example, could
tion is actually made up of two conceptually dif- be used to define and manage the spatial features
ferent elements: the simulated environment and of the simulated environment, including the asso-
the simulation environment. The former is a part ciated basic functions supporting agent movement
of the computational model that represents the and perception in that kind of environment. With
reality or the abstraction that is the object of the reference to Fig. 3b, such a plugin would be
simulation activity. The simulation environment, associated to the application environment module,
on the other hand, is a software infrastructure for in the ABM application layer. However, these
executing the simulation. In this framework, to aspects represent just some of the features of the
make an explicit decoupling between these levels simulated environment, that can actually com-
is a prerequisite for good engineering practice. It prise rules and laws that extend their influence
must be noted that also a different work (Gouaich over the agents and the outcomes of their attempts
et al. 2005), non-specifically developed in the to act in the environment.
Agent-Based Modeling and Simulation, Fig. 3 A schema introduced in (Klügl et al. 2005) to show differences and relationships between simulated and simulation environment (a), and a three-layer deployment model for situated MAS introduced in (Weyns et al. 2006a) highlighting the crosscutting abstractions agent and environment (b)
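As a rough illustration of the decoupling advocated in (Klügl et al. 2005), the following sketch keeps the simulated environment (part of the model: the represented space and its rules) separate from the simulation environment (infrastructure: scheduling and execution). All class names and the trivial random-walk agent are assumptions made for this example and do not reproduce the API of SeSAm or of any other platform.

```python
import random

class SimulatedEnvironment:
    """Part of the *model*: the represented space and its rules (illustrative)."""
    def __init__(self, width, height):
        self.width, self.height = width, height
        self.positions = {}                      # agent id -> (x, y)

    def perceive(self, agent_id):
        x, y = self.positions[agent_id]
        return [a for a, (ax, ay) in self.positions.items()
                if a != agent_id and abs(ax - x) <= 1 and abs(ay - y) <= 1]

    def move(self, agent_id, dx, dy):
        x, y = self.positions[agent_id]
        # The model's own rules live here (e.g., stay inside the grid).
        self.positions[agent_id] = (min(max(x + dx, 0), self.width - 1),
                                    min(max(y + dy, 0), self.height - 1))

class RandomWalker:
    """Trivial agent used only to exercise the two environments."""
    def __init__(self, agent_id):
        self.agent_id = agent_id

    def act(self, env):
        env.perceive(self.agent_id)              # perception (ignored by this agent)
        env.move(self.agent_id, random.choice((-1, 0, 1)), random.choice((-1, 0, 1)))

class SimulationEnvironment:
    """Part of the *infrastructure*: scheduling and execution (illustrative)."""
    def __init__(self, model_env, agents):
        self.env, self.agents, self.time = model_env, agents, 0

    def run(self, steps):
        for _ in range(steps):
            for agent in self.agents:            # the update discipline lives here,
                agent.act(self.env)              # not inside the model
            self.time += 1

world = SimulatedEnvironment(10, 10)
world.positions = {0: (2, 2), 1: (7, 5)}
SimulationEnvironment(world, [RandomWalker(0), RandomWalker(1)]).run(20)
```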
most of them are related to the issue of agent acquaintance. The way ACL-based agent interaction models deal with this issue is the subject of another dimension of the taxonomy, covering direct a priori acquaintance among agents, the adoption of middle agents for information discovery, and the development of more complex acquaintance models that tackle issues related to the representation and maintenance of acquaintance information, but also to the robustness and scalability of the agent infrastructure. However, there are other agent interaction models that provide indirect communication among agents. Some of these approaches provide for the creation and exploitation of artifacts that represent a medium for agents' interaction. Other indirect interaction models are more focused on modeling the agent environment as the place where agent interactions take place, thus influencing interactions and agent behavior (Fig. 4).

Agent-Based Modeling and Simulation, Fig. 4 The proposed taxonomy of agent interaction models

Direct Interaction Models
The first and most widely adopted kind of agent interaction model provides a direct information exchange between communication partners. This approach ignores the details related to the communication channel that allows the interaction, and does not include it as an element of the abstract interaction model. Generally, the related mechanisms provide a point-to-point message-passing protocol regulating the exchange of messages among agents. There are various aspects of the communicative act that must be modeled (ranging from low-level technical considerations on message format to conceptual issues related to the formal semantics of messages and conversations), but generally this approach provides the definition of suitable languages to cover these aspects. While this approach is generally well understood and can be implemented in a very effective way (especially as it is substantially based on the vast experience of computer network protocols), in the agent context it requires specific architectural and conceptual solutions to tackle issues related to agent acquaintance/discovery and ontological issues.

Intuitively, an Agent Communication Language (ACL) provides agents with a means of exchanging information and knowledge. This vague definition inherently includes the point of view on the conception of the term agent, which assumes that an agent is an intelligent autonomous entity whose features include some sort of social ability (Wooldridge and Jennings 1995). According to some approaches this kind of feature is the one that ultimately defines the essence of agency (Genesereth and Ketchpel 1994). Leaving aside the discussion on the definition and conception of agency, this section will focus on what the expression "social ability" effectively means. To do so, we will briefly compare, with respect to some basic issues, what ACLs share with those approaches that allow the exchange of information among distributed components (e.g., in legacy systems; with this expression we mean pieces of software which are not designed to interact with agents and agent-based systems): in particular, the definition of a communication channel allowing the reliable exchange of messages over a computer network (i.e., the lower-level aspects of the communication). What distinguishes ACLs from such systems are the objects of
discourse and their semantic complexity; in particular, there are two aspects which distributed computing protocols and architectures do not have to deal with (Fig. 5):

• Autonomy of interacting components: modern systems' components (even though they can be quite complex and can be considered as self-sufficient with reference to supplying a specific service) have a lower degree of autonomy than the one that is generally associated with agents.
• The information conveyed in messages does not generally require a comprehensive ontological approach, as structures and categories can be considered to be shared by system components.

Regarding autonomy, while traditional software components offer services and generally perform the required actions as a reaction to external requests, agents may decide not to carry out a task that was required by some other system entity. Moreover, agents are generally considered temporally continuous and proactive, while this is not generally true for common software components.

As for the second point, components generally have specific interfaces which assume an agreement on a set of shared data structures. The semantics of the related information, and the semantics of messages/method invocations/service requests, is generally given in some kind of (more or less formally specified) modeling language, but is tightly related to component implementation. For agent interaction, a more explicit and comprehensive view of domain concepts must be specified. In order to be able to effectively exchange knowledge, agents must share an ontology (see, e.g., Gruber 1995), that is, a representation of a set of categories of objects, concepts, entities, properties and relations among them. In other words, the same concept, object or entity must have a uniform meaning and set of properties across the whole system.

Indirect Interaction Models
From a strictly technical point of view, agent communication is generally indirect even in direct agent interaction models. In fact, most of these approaches adopt some kind of communication infrastructure supplying a reliable end-to-end message passing mechanism. Nonetheless, the adoption of a conceptually direct agent interaction model brings the specific issues that were previously introduced. The remainder of this section will focus on models providing the presence of an intermediate entity mediating (allowing and regulating) agent interaction. This communication abstraction is not merely a low-level implementation detail, but a first-class concept of the model (Fig. 6).

Agent interaction models which provide indirect mechanisms of communication will be classified into artifact-mediated and spatially grounded models. The distinction is based on the inspiration and metaphor on which these models are rooted. The former provide the design and implementation of an artifact which emulates concrete objects of the agents' environment and whose goal is the communication of autonomous entities. Spatially grounded agent interaction models bring the metaphor of modeling the agent environment to the extreme, recognizing that there are situations in
which spatial features and information represent a key factor and cannot be neglected in analyzing and modeling a system.

Both of these approaches provide interaction mechanisms that are deeply different from point-to-point message exchange among entities. In fact, the media which enable the interaction intrinsically represent a context influencing agent communication.

In the real world a number of physical agents interact sharing resources, by having competitive access to them (e.g., cars in streets and crossroads), but also collaborating in order to perform tasks which could not be carried out by single entities alone, due to insufficient competencies or abilities (e.g., people that carry a very heavy burden together). Very often, in order to regulate the interactions related to these resources, we build concrete artifacts, such as traffic lights on the streets, or neatly placed handles on large heavy boxes. Exploiting this metaphor, some approaches to agent interaction tend to model and implement abstractions allowing the cooperation of entities through a shared resource, whose access is regulated according to precisely defined rules. Blackboard-based architectures are the first examples of this kind of model. A blackboard is a shared data repository that enables cooperating software modules to communicate in an indirect and anonymous way (Englemore and Morgan 1988). In particular, the concept of tuple space, first introduced in Linda (Gelernter 1985), represents a pervasive modification to the basic blackboard model.

The Linda (Gelernter 1985) coordination language probably represents the most relevant blackboard-based model. It is based on the concept of tuple space, that is, an associative blackboard allowing agents to share and retrieve data (i.e., tuples) through some data-matching mechanism (such as pattern matching or unification) integrated within the blackboard. Linda also defines a very simple language providing mechanisms for accessing the tuple space.

The rationale of this approach is to keep computation and coordination contexts separated as much as possible (Gelernter and Carriero 1992), by providing specific abstractions for agent interaction. With respect to direct interaction models, part of the burden of coordination is in fact moved from the agent to the infrastructure. The evolution of this approach has basically followed two directions: the extension of the coordination language and infrastructure in order to increase its expressiveness or usability, and the modeling and implementation of distributed tuple spaces (Cabri et al. 2000; Omicini and Zambonelli 1999; Picco et al. 1999).

While the previously described indirect approaches define artifacts for agent interaction taking inspiration from actual concrete objects of the real world, other approaches bring the metaphor of the agent environment to the extreme by taking into account its spatial features.

In these approaches agents are situated in an environment whose spatial features are represented, possibly in an explicit way, and have an influence on their perception, interaction and thus on their behavior. The concept of perception, which is really abstract and metaphoric in direct interaction models and has little to do with the physical world (agents essentially perceive their state of mind, which includes the effect of received messages, like new facts in their knowledge base), is here related to a more direct modeling of what is often referred to as the "local point of view". In fact, these approaches provide the implementation of an infrastructure for agent communication which allows agents to perceive the state of the environment in their position (and possibly in nearby locations). They can also cause local modifications to the state of the environment, generally through the emission of signals emulating some kind of physical phenomenon (e.g., pheromones (Brueckner 2000), or fields (Bandini et al. 2002; Mamei et al. 2002)), or also by simply observing the actions of other agents and reacting to this perception, in a "behavioral implicit communication" schema (Tummolini et al. 2004) (Fig. 7).

In all these cases, however, the structuring function of the environment is central, since it actually defines what can be perceived by an agent in its current position, how it can actually modify the environment, and to what extent its actions can be noted by other agents and thus interact with them.
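A minimal sketch in the spirit of the Linda tuple space described above may help illustrate indirect, anonymous interaction. The method names (out, rd, inp) and the wildcard-based matching are simplifications assumed for this example; they do not reproduce the actual Linda language or any distributed tuple-space system.

```python
class TupleSpace:
    """Very small associative blackboard: agents communicate by writing and
    retrieving tuples, never by addressing each other directly (illustrative)."""

    def __init__(self):
        self._tuples = []

    def out(self, tup):
        # Write a tuple into the shared space.
        self._tuples.append(tup)

    def _match(self, template, tup):
        # None in the template acts as a wildcard (a stand-in for pattern matching).
        return len(template) == len(tup) and all(
            t is None or t == v for t, v in zip(template, tup))

    def rd(self, template):
        # Non-destructive read of the first matching tuple (None if absent).
        return next((t for t in self._tuples if self._match(template, t)), None)

    def inp(self, template):
        # Destructive read: remove and return the first matching tuple.
        for i, t in enumerate(self._tuples):
            if self._match(template, t):
                return self._tuples.pop(i)
        return None

# Indirect, anonymous interaction: a producer posts a task, any consumer may take it.
space = TupleSpace()
space.out(("task", "deliver", (3, 4)))
task = space.inp(("task", None, None))
```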
Agent-Based Modeling and Simulation, Fig. 8 A screenshot of a Netlogo simulation applet (a), and a Repast simulation model (b)
simplifies the integration with external and existing libraries. Repast, in its current version, can be easily connected to instruments for statistical analysis, data visualization, reporting and also geographic information systems.

While the above-mentioned functionalities are surely important in simplifying the development of an effective simulator, and even if in principle it is possible to adapt frameworks belonging to the previously described categories, it must be noted that their neutrality with respect to the specific adopted agent model leads to a necessary preliminary phase of adaptation of the platform to the specific features of the model that is being implemented. If the latter defines specific abstractions and mechanisms for agents, their decision-making activities, their environment and the way they interact, then the modeler must in general develop proper computational supports to be able to fruitfully employ the platform. These platforms, in fact, are not endowed with specific support for the realization of agent deliberation mechanisms or infrastructures for interaction models, either direct or indirect (even if it must be noted that all the above platforms generally provide some form of support for agent environment definition, such as grid-like or graph structures).

A third category of platforms represents an attempt to provide higher-level linguistic support trying to reduce the distance between agent-based models and their implementations. The latest version of Repast, for instance, is characterized by the presence of a high-level interface for "point-and-click" definition of agents' behaviors, which is based on a set of primitives for the specification of agents' actions. SimSesam (http://www.simsesam.de/) (Klügl et al. 2003) defines a set of primitive functions as basic elements for describing agents' behaviors, and it also provides visual tools supporting model implementation. At the extreme border of this category, we can mention efforts that are specifically aimed at supporting the development of simulations based on a precise agent model and approach, and sometimes even for a specific area of application, such as the ones described in (Bandini et al. 2007; Weyns et al. 2006b).

Future Directions

Agent-based modeling and simulation is a relatively young yet already widely diffused approach to the analysis, modeling and simulation of complex systems. The heterogeneity of the approaches, modeling styles and applications that legitimately claim to be "agent-based", as well as the fact that different disciplines are involved in the related research efforts, are all factors that have hindered the definition of a generally recognized view of the field. A higher-level framework for this kind of activity would be desirable in order to relate different efforts by means of a shared schema. Moreover, it could represent the first step in effectively facing some of the
epistemological issues related to this approach to the modeling and analysis of complex systems.

The future directions in this broad research area are thus naturally aimed at obtaining vertical analytical results on specific application domains, but they must also include efforts aimed at "building bridges" between the single disciplinary results in the attempt to reach a more general and shared understanding of how these bottom-up modeling approaches can be effectively employed to study, explain and (maybe) predict the overall behavior of complex systems.

Bibliography

Adami C (1998) Introduction to artificial life. Springer, New York
Agha G (1986) Actors: a model of concurrent computation in distributed systems. MIT Press, Cambridge
Alfi V, Galla T, Marsili M, Pietronero L (eds) (2007) Interacting agents, complexity and interdisciplinary applications (IACIA)
Balmer M, Nagel K (2006) Shape morphing of intersection layouts using curb side oriented driver simulation. In: van Leeuwen JP, Timmermans HJ (eds) Innovations in design & decision support systems in architecture and urban planning. Springer, Dordrecht, pp 167–183
Bandini S, Manzoni S, Simone C (2002) Heterogeneous agents situated in heterogeneous spaces. Appl Artif Intell 16:831–852
Bandini S, Manzoni S, Vizzari G (2004) Situated cellular agents: a model to simulate crowding dynamics. IEICE Trans Inf Syst E87-D, pp 669–676. Special Issues on Cellular Automata
Bandini S, Petta P, Vizzari G (eds) (2006a) International symposium on agent based modeling and simulation (ABModSim 2006). Cybernetics and systems. Austrian Society for Cybernetic Studies, 18th European meeting on cybernetics and systems research (EMCSR 2006)
Bandini S, Celada F, Manzoni S, Puzone R, Vizzari G (2006b) Modelling the immune system with situated agents. In: Apolloni B, Marinaro M, Nicosia G, Tagliaferri R (eds) Proceedings of WIRN/NAIS 2005. Lecture notes in computer science, vol 3931. Springer, Berlin, pp 231–243
Bandini S, Federici ML, Vizzari G (2007) Situated cellular agents approach to crowd modeling and simulation. Cybern Syst 38:729–753
Bar-Yam Y (1997) Dynamics of complex systems. Addison-Wesley, Reading
Batty M (2001) Agent based pedestrian modeling. Env Plan B: Plan Des 28:321–326
Bazzan ALC, Wahle J, Klügl F (1999) Agents in traffic modelling – from reactive to social behaviour. In: Burgard W, Christaller T, Cremers AB (eds) KI-99: advances in artificial intelligence, 23rd annual German conference on artificial intelligence, Bonn, 13–15 Sept 1999. Lecture notes in computer science, vol 1701. Springer, Berlin, pp 303–306
Brooks RA (1986) A robust layered control system for a mobile robot. IEEE J Robot Autom 2:14–23
Brooks RA (1990) Elephants don't play chess. Robot Auton Syst 6:3–15
Brueckner S (2000) An analytic approach to pheromone-based coordination. In: ICMAS. IEEE Computer Society, pp 369–370
Cabri G, Leonardi L, Zambonelli F (2000) MARS: a programmable coordination architecture for mobile agents. IEEE Inter Comp 4:26–35
Davidsson P, Logan B, Takadama K (eds) (2005) Multi-agent and multi-agent-based simulation, joint workshop MABS 2004, New York, 19 July 2004. Revised selected papers. Lecture notes in computer science, vol 3415. Springer, Berlin
Dosi G, Fagiolo G, Roventini A (2006) An evolutionary model of endogenous business cycles. Comput Econ 27:3–34
Edmonds B (2001) The use of models – making MABS more informative. In: Multi-agent-based simulation, second international workshop MABS 2000, Boston, July 2000. Revised and additional papers. Lecture notes in computer science, vol 1979. Springer, Berlin, pp 15–32
Englemore RS, Morgan T (eds) (1988) Blackboard systems. Addison-Wesley, Reading
Epstein JM, Axtell R (1996) Growing artificial societies. MIT Press, Boston
Ferber J (1999) Multi-agent systems. Addison-Wesley, London
Ferber J, Muller J (1996) Influences and reaction: a model of situated multiagent systems. In: Proceedings of the 2nd international conference on multiagent systems
Fikes RE, Nilsson NJ (1971) STRIPS: a new approach to the application of theorem proving to problem solving. Artif Intell 2:189–208
Franklin S, Graesser A (1997) Is it an agent, or just a program?: a taxonomy for autonomous agents. In: Müller JP, Wooldridge M, Jennings NR (eds) Intelligent agents III, agent theories, architectures, and languages, ECAI'96 workshop (ATAL), Budapest, 12–13 Aug 1996. Lecture notes in computer science, vol 1193. Springer, Berlin, pp 21–36
Gelernter D (1985) Generative communication in Linda. ACM Trans Program Lang Syst 7:80–112
Gelernter D, Carriero N (1992) Coordination languages and their significance. Commun ACM 35:97–107
Genesereth MR, Ketchpel SP (1994) Software agents. Commun ACM 37(7):48–53
Genesereth MR, Nilsson N (1987) Logical foundations of artificial intelligence. Morgan Kaufmann, San Mateo
Georgeff M (1984) A theory of action in multi-agent planning. In: Proceedings of AAAI-84, pp 121–125
Gilbert N, Troitzsch KG (2005) Simulation for the social scientist, 2nd edn. Open University Press, Maidenhead
Goles E, Martinez S (1990) Neural and automata networks, dynamical behavior and applications. Kluwer, Norwell
Gouaich A, Michel F, Guiraud Y (2005) MIC: a deployment environment for autonomous agents. In: Environments for multi-agent systems, first international workshop (E4MAS 2004). Lecture notes in computer science, vol 3374. Springer, Berlin, pp 109–126
Gruber TR (1995) Toward principles for the design of ontologies used for knowledge sharing. Int J Hum Comput Stud 43:907–928
Hales D, Edmonds B, Norling E, Rouchier J (eds) (2003) Multi-agent-based simulation III, 4th international workshop MABS 2003, Melbourne, 14 July 2003. Revised papers. Lecture notes in computer science, vol 2927. Springer, Berlin
Hassas S, Serugendo GDM, Phan D (eds) (2007) Multi-agents for modelling complex systems (MA4CS). http://bat710.univ-lyon1.fr/~farmetta/MA4CS07
Helbing D, Schweitzer F, Keltsch J, Molnár P (1997) Active walker model for the formation of human and animal trail systems. Phys Rev E 56:2527–2539
Henein CM, White T (2005) Agent-based modelling of forces in crowds. In: Davidsson P, Logan B, Takadama K (eds) Multi-agent and multi-agent-based simulation, joint workshop MABS 2004, New York, 19 July 2004. Revised selected papers. Lecture notes in computer science, vol 3415. Springer, Berlin, pp 173–184
Klügl F, Herrler R, Oechslein C (2003) From simulated to real environments: how to use SeSAm for software development. In: Schillo M, Klusch M, Müller JP, Tianfield H (eds) MATES. Lecture notes in computer science, vol 2831. Springer, Berlin, pp 13–24
Klügl F, Fehler M, Herrler R (2005) About the role of the environment in multi-agent simulations. In: Weyns D, Parunak HVD, Michel F (eds) Environments for multi-agent systems, first international workshop E4MAS 2004, New York, 19 July 2004. Revised selected papers, vol 3374, pp 127–149
Langton C (1995) Artificial life: an overview. MIT Press, Cambridge
Latombe JC (1991) Robot motion planning. Kluwer, Boston
Luck M, McBurney P, Sheory O, Willmott S (eds) (2005) Agent technology: computing as interaction. University of Southampton, Southampton
Mamei M, Zambonelli F, Leonardi L (2002) Co-fields: towards a unifying approach to the engineering of swarm intelligent systems. In: Engineering societies in the agents world III: third international workshop (ESAW 2002). Lecture notes in artificial intelligence, vol 2577. Springer, Berlin, pp 68–81
Moss S, Davidsson P (eds) (2001) Multi-agent-based simulation, second international workshop MABS 2000, Boston, July 2000. Revised and additional papers. Lecture notes in computer science, vol 1979. Springer, Berlin
Murata T (1989) Petri nets: properties, analysis and applications. Proc IEEE 77:541–580
North MJ, Collier NT, Vos JR (2006) Experiences creating three implementations of the Repast agent modeling toolkit. ACM Trans Model Comp Sim 16:1–25
Omicini A, Zambonelli F (1999) Coordination for internet application development. Auton Agents Multi-Agent Syst 2:251–269. Special issue: Coordination Mechanisms for Web Agents
Picco GP, Murphy AL, Roman GC (1999) Lime: Linda meets mobility. In: Proceedings of the 21st international conference on software engineering (ICSE 99). ACM Press, Los Angeles, pp 368–377
Pyka A, Fagiolo G (2007) Agent-based modelling: a methodology for neo-Schumpeterian economics. In: Hanusch H, Pyka A (eds) Elgar companion to neo-Schumpeterian economics. Edward Elgar Publishing, pp 467–487
Rao A, Georgeff M (1991) Modeling rational agents within a BDI-architecture. In: Proceedings of knowledge representation and reasoning (KR&R 1991)
Rao A, Georgeff M (1995) BDI agents: from theory to practice. In: Proceedings of the international conference on multi-agent systems
Russell S, Norvig P (1995) Artificial intelligence: a modern approach. Prentice Hall, Upper Saddle River
Schadschneider A, Kirchner A, Nishinari K (2002) CA approach to collective phenomena in pedestrian dynamics. In: Bandini S, Chopard B, Tomassini M (eds) Cellular automata, 5th international conference on cellular automata for research and industry ACRI 2002. Lecture notes in computer science, vol 2493. Springer, Berlin, pp 239–248
Sichman JS, Antunes L (eds) (2006) Multi-agent-based simulation VI, international workshop MABS 2005, Utrecht, The Netherlands, 25 July 2005. Revised and invited papers. Lecture notes in computer science, vol 3891. Springer, Berlin
Sichman JS, Conte R, Gilbert N (eds) (1998) Multi-agent systems and agent-based simulation, first international workshop MABS'98, Paris, 4–6 July 1998. Proceedings. Lecture notes in computer science, vol 1534. Springer, Berlin
Sichman JS, Bousquet F, Davidsson P (eds) (2003) Multi-agent-based simulation, third international workshop MABS 2002, Bologna, 15–16 July 2002. Revised papers. Lecture notes in computer science, vol 2581. Springer, Berlin
Torrens P (2002) Cellular automata and multi-agent systems as planning support tools. In: Geertman S, Stillwell J (eds) Planning support systems in practice. Springer, London, pp 205–222
Tummolini L, Castelfranchi C, Ricci A, Viroli M, Omicini A (2004) "Exhibitionists" and "voyeurs" do it better: a shared environment approach for flexible coordination with tacit messages. In: Weyns D, Parunak HVD, Michel F (eds) 1st international workshop on environments for multi-agent systems (E4MAS 2004), pp 97–111
Wahle J, Schreckenberg M (2001) A multi-agent system for online simulations based on real-world traffic data. In: Annual Hawaii international conference on system sciences (HICSS-34). IEEE Computer Society, Los Alamitos
Weyns D, Holvoet T (2006) From reactive robots to situated multi-agent systems: a historical perspective on the role of environment in multi-agent systems. In: Dikenelli O, Gleizes MP, Ricci A (eds) Engineering societies in the agents world VI, 6th international workshop ESAW 2005. Lecture notes in computer science, vol 3963. Springer, Berlin, pp 63–88
Weyns D, Schelfthout K, Holvoet T, Lefever T (2005) Decentralized control of E'GV transportation systems. In: AAMAS industrial applications. ACM Press, Utrecht, pp 67–74
Weyns D, Vizzari G, Holvoet T (2006a) Environments for situated multi-agent systems: beyond infrastructure. In: Weyns D, Parunak HVD, Michel F (eds) Environments for multi-agent systems II, second international workshop E4MAS 2005, Utrecht, 25 July 2005. Selected revised and invited papers. Lecture notes in computer science, vol 3830. Springer, Berlin, pp 1–17
Weyns D, Boucké N, Holvoet T (2006b) Gradient field-based task assignment in an AGV transportation system. In: AAMAS'06: proceedings of the fifth international joint conference on autonomous agents and multiagent systems. ACM Press, Hakodate, pp 842–849
Weyns D, Omicini A, Odell J (2007) Environment as a first class abstraction in multiagent systems. Auton Agents Multi-Agent Syst 14:5–30
Weyns D, Brueckner SA, Demazeau Y (eds) (2008) Engineering environment-mediated multi-agent systems: international workshop, EEMMAS 2007, Dresden, Oct 2007. Selected revised and invited papers. Lecture notes in computer science, vol 5049. Springer, Berlin
Wolfram S (1986) Theory and applications of cellular automata. World Press, Singapore
Wooldridge MJ, Jennings NR (1995) Intelligent agents: theory and practice. Knowl Eng Rev 10:115–152
Zambonelli F, Parunak HVD (2003) Signs of a revolution in computer science and software engineering. In: Petta P, Tolksdorf R, Zambonelli F (eds) Engineering societies in the agents world III, third international workshop, ESAW 2002, Madrid, Sept 2002. Revised papers. Lecture notes in computer science, vol 2577. Springer, Berlin, pp 13–28
Agent-Based Modeling, Mathematical Formalism for

Reinhard Laubenbacher1, Abdul S. Jarrah2, Henning S. Mortveit1 and S. S. Ravi3
1 Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Virginia, USA
2 Department of Mathematics and Statistics, American University of Sharjah, Sharjah, United Arab Emirates
3 Department of Computer Science, University at Albany – State University of New York, New York, USA

Article Outline

Finite dynamical system A finite dynamical system is a time-discrete dynamical system on a finite state set. That is, it is a mapping from a Cartesian product of finitely many copies of a finite set to itself. This finite set is often considered to be a field. The dynamics is generated by iteration of the mapping.
Mathematical framework A mathematical framework for agent-based simulation consists of a collection of mathematical objects that are considered mathematical abstractions of agent-based simulations. This collection of objects should be general enough to capture the key features of most simulations, yet specific enough to allow the development of a mathematical theory with meaningful results and algorithms.
interacting in some manner that involves one or more of the complexity components just mentioned, that is, with a large number of agents, heterogeneity in agent character and interactions, and possibly stochastic aspects to all these parts. The global properties of complex systems, such as their global dynamics, emerge from the totality of local interactions between individual agents over time. While these local interactions are well understood in many cases, little is known about the emerging global behavior arising through interaction. Thus, it is typically difficult to construct global mathematical models such as systems of ordinary or partial differential equations, whose properties one could then analyze. Agent-based simulations are one way to create computational models of complex systems that take their place.

An agent-based simulation, sometimes also called an individual-based or interaction-based simulation (which we prefer), of a complex system is in essence a computer program that realizes some (possibly approximate) model of the system to be studied, incorporating the agents and their rules of interaction. The simulation might be deterministic (i.e., the evolution of agent states is governed by deterministic rules) or stochastic. The typical way in which such simulations are used is to initialize the computer program with a particular assignment of agent states and to run it for some time. The output is a temporal sequence of states for all agents, which is then used to draw conclusions about the complex system one is trying to understand. In other words, the computer program is the model of the complex system, and by running the program repeatedly, one expects to obtain an understanding of the characteristics of the complex system.

There are two main drawbacks to this approach. First, it is difficult to validate the model. Simulations for most systems involve quite complex software constructs that pose challenges to code validation. Second, there are essentially no rigorous tools available for an analysis of model properties and dynamics. There is also no widely applicable formalism for the comparison of models. For instance, if one agent-based simulation is a simplification of another, then one would like to be able to relate their dynamics in a rigorous fashion. We are currently lacking a mathematically rich formal framework that models agent-based simulations. This framework should have at its core a class of mathematical objects to which one can map agent-based simulations. The objects should have a sufficiently general mathematical structure to capture key features of agent-based simulations and, at the same time, should be rich enough to allow the derivation of substantial mathematical results. This entry presents one such framework, namely, the class of time-discrete dynamical systems over finite state sets.

The building blocks of these systems consist of a collection of variables (mapping to agents), a graph that captures the dependence relations of agents on other agents, a local update function for each agent that encapsulates the rules by which the state of each agent evolves over time, and an update discipline for the variables (e.g., parallel or sequential). We will show that this class of mathematical objects is appropriate for the representation of agent-based simulations and, therefore, complex systems, and is rich enough to pose and answer relevant mathematical questions. This class is sufficiently rich to be of mathematical interest in its own right and much work remains to be done in studying it. We also remark that many other frameworks such as probabilistic Boolean networks (Shmulevich et al. 2002a) fit inside the framework described here.

Introduction

Computer simulations have become an integral part of today's research and analysis methodologies. The ever-increasing demands arising from the complexity and sheer size of the phenomena studied constantly push computational boundaries, challenge existing computational methodologies, and motivate the development of new theories to improve our understanding of the potential and limitations of computer simulation. Interaction-based simulations are being used to simulate a variety of biological systems such as ecosystems and the immune system, social
systems such as urban populations and markets, and infrastructure systems such as communication networks and power grids.

To model or describe a given system, one typically has several choices in the construction and design of agent-based models and representations. When agents are chosen to be simple, the simulation may not capture the behavior of the real system. On the other hand, the use of highly sophisticated agents can quickly lead to complex behavior and dynamics. Also, use of sophisticated agents may lead to a system that scales poorly. That is, a linear increase in the number of agents in the system may require a nonlinear (e.g., quadratic, cubic, or exponential) increase in the computational resources needed for the simulation.

Two common methods, namely, discrete-event simulation and time-stepped simulation, are often used to implement agent-based models (Bagrodia 1998; Jefferson 1985; Nance 1993). In the discrete-event simulation method, each event that occurs in the system is assigned a time of occurrence. The collection of events is kept in increasing order of their occurrence times. (Note that an event occurring at a certain time may give rise to events which occur later.) When all the events that occur at a particular time instant have been carried out, the simulation clock is advanced to the next time instant in the order. Thus, the differences between successive values of the simulation clock may not be uniform. Discrete-event simulation is typically used in contexts such as queuing systems (Misra 1986). In the time-stepped method of simulation, the simulation clock is always advanced by the same amount. For each value of the simulation clock, the states of the system components are computed using equations that model the system. This method of simulation is commonly used for studying, e.g., fluid flows or chemical reactions. The choice of model (discrete event vs. time stepped) is typically guided by an analysis of the computational speed they can offer, which in turn depends on the nature of the system being modeled (see, e.g., Guo et al. 2000).

Toolkits for general purpose agent-based simulations include Swarm (Ebeling and Schweitzer 2001; Minar et al. 1996) and Repast (North et al. 2006). Such toolkits allow one to specify more complex agents and interactions than would be possible using, e.g., ordinary differential equations models. In general, it is difficult to develop a software package that is capable of supporting the simulation of a wide variety of physical, biological, and social systems.

Standard or classical approaches to modeling are often based on continuous techniques and frameworks such as ordinary differential equations (ODEs) and partial differential equations (PDEs). For example, there are PDE-based models for studying traffic flow (Gupta and Katiyar 2005; Keyfitz 2004; Whitham 1999). These can accurately model the emergence of traffic jams for simple road/intersection configurations through, for example, the formation of shocks. However, these models fail to scale to the size and the specifications required to accurately represent large urban areas. Even if they hypothetically were to scale to the required size, the answers they provide (e.g., car density on a road as a function of position and time) cannot answer questions pertaining to specific travelers or cars. Questions of this type can be naturally described and answered through agent-based models. An example of such a system is TRANSIMS (see section "TRANSIMS (Transportation Analysis and Simulation System)"), where an agent-based simulation scheme is implemented through a cellular automaton model. Another well-known example of the change in modeling paradigms from continuous to discrete is given by lattice gas automata (Frish et al. 1986) in the context of fluid dynamics. Stochastic elements are inherent in many systems, and this usually is reflected in the resulting models used to describe them. A stochastic framework is a natural approach in the modeling of, for example, noise over a channel in a simulation of telecommunication networks (Barrett et al. 2002). In an economic market or a game theoretic setting with competing players, a player may sometimes decide to provide incorrect information. The state of such a player may therefore be viewed and modeled by a random variable. A player may make certain probabilistic assumptions about other players' environment. In biological systems, certain
features and properties may only be known up to the level of probability distributions. It is only natural to incorporate this stochasticity into models of such systems.

Since applications of stochastic discrete models are common, it is desirable to obtain a better understanding of these simulations both from an application point of view (reliability, validation) and from a mathematical point of view. However, an important disadvantage of agent-based models is that there are few mathematical tools available at this time for the analysis of their dynamics.

Examples of Agent-Based Simulations
In order to provide the reader with some concrete examples that can also be used later on to illustrate theoretical concepts, we describe here three examples of agent-based descriptions of complex systems, ranging from traffic networks to the immune system and voting schemes.

TRANSIMS (Transportation Analysis and Simulation System)
TRANSIMS is a large-scale computer simulation of traffic on a road network (Nagel and Wagner 2006; Nagel et al. 1997; Rickert et al. 1996). The simulation works at the resolution level of individual travelers and has been used to study large US metropolitan areas such as Portland, OR, Washington, DC, and Dallas/Fort Worth. A TRANSIMS-based analysis of an urban area requires (i) a population, (ii) a location-based activity plan for each individual for the duration of the analysis period, and (iii) a network representation of all transportation pathways of the given area. The data required for (i) and (ii) are generated based on, e.g., extensive surveys and other information sources. The network representation is typically very close to a complete description of the real transportation network of the given urban area.

TRANSIMS consists of two main modules: the router and the cellular automaton-based micro-simulator. The router maps each activity plan for each individual (obtained typically from activity surveys) into a travel route. The micro-simulator executes the travel routes and sends each individual through the transportation network so that its activity plan is carried out. This is done in such a way that all constraints imposed on individuals from traffic driving rules, road signals, fellow travelers, and public transportation schedules are respected. The time scale is typically 1 s.

The micro-simulator is the part of TRANSIMS responsible for the detailed traffic dynamics. Its implementation is based on cellular automata, which are described in more detail in section "Cellular Automata." Here, for simplicity, we focus on the situation where each individual travels by car. The road network representation is in terms of links (e.g., road segments) and nodes (e.g., intersections). The network description is turned into a cell-network description by discretizing each lane of every link into cells. A cell corresponds to a 7.5 m lane segment and can have up to four neighbor cells (front, back, left, and right).

The vehicle dynamics is specified as follows. Vehicles travel with discrete velocities 0, 1, 2, 3, 4, or 5, which are constant between time steps. Each update time step brings the simulation one time unit forward. If the time unit is 1 s, then the maximum speed of v_max = 5 cells per time unit corresponds to an actual speed of 5 × 7.5 m/s = 37.5 m/s, which is 135 km/h or approximately 83.9 mph.

Ignoring intersection dynamics, the micro-simulator executes three functions for each vehicle in every update: (a) lane-changing, (b) acceleration, and (c) movement. These functions can be implemented through four cellular automata, one each for lane change decision and execution, one for acceleration, and one for movement. For instance, the acceleration automaton works as follows. A vehicle in TRANSIMS can increase its speed by at most 1 cell per second, but if the road ahead is blocked, the vehicle can come to a complete stop in the same time. The function that is applied to each cell that has a car in it uses the gap ahead and the maximal speed to determine if the car will increase or decrease its velocity. Additionally, a car may have its velocity decreased by one unit as determined by a certain deceleration probability. The random deceleration is an important element of producing realistic
traffic flow. A major advantage of this representation is that it leads to very lightweight agents, a feature that is critical for achieving efficient scaling.

C-ImmSim
Next, we discuss an interaction-based simulation that models certain aspects of the human immune system. Comprised of a large number of interacting cells whose motion is constrained by the body's anatomy, the immune system lends itself very well to agent-based simulation. In particular, these models can take into account three-dimensional anatomical variations as well as small-scale variability in cell distributions. For instance, while the number of T cells in the human body is astronomical, the number of antigen-specific T cells, for a specific antigen, can be quite small, thereby creating many spatial inhomogeneities. Also, little is known about the global structure of the system to be modeled.

The first discrete model to incorporate a useful level of complexity was ImmSim (Celada and Seiden 1992a, b), developed by Seiden and Celada as a stochastic cellular automaton. The system includes B cells, T cells, antigen-presenting cells (APCs), antibodies, antigens, and antibody-antigen complexes. Receptors on cells are represented by bit strings, and antibodies use bit strings to represent their epitopes and peptides. Specificity and affinity are defined by using bit string similarity. The bit string approach was initially introduced in Farmer et al. (1986). The model is implemented on a regular two-dimensional grid, which can be thought of as a slice of a lymph node, for instance. It has been used to study various phenomena, including the optimal number of human leukocyte antigens in human beings (Celada and Seiden 1992a), the autoimmunity and T lymphocyte selection in the thymus (Morpurgo et al. 1995), antibody selection and hyper-mutation (Celada and Seiden 1996), and the dependence of the selection and maturation of the immune response on the antigen-to-receptor's affinity (Bernaschi et al. 2000). The computational limitations of the Seiden-Celada model have been overcome by a modified model, C-ImmSim (Castiglione et al. 1997), implemented on a parallel architecture. Its complexity is several orders of magnitude larger than that of its predecessor. It has been used to model hypersensitivity to chemotherapy (Castiglione and Agur 2003) and the selection of escape mutants from immune recognition during HIV infection (Bernaschi and Castiglione 2002). In Castiglione et al. (2007), the C-ImmSim framework was applied to the study of mechanisms that contribute to the persistence of infection with the Epstein-Barr virus.

A Voting Game
The following example describes a hypothetical voting scheme. The voting system is constructed from a collection of voters. For simplicity, it is assumed that only two candidates, represented by 0 and 1, contest in the election. There are N voters represented by the set {v_1, v_2, . . ., v_N}. Each voter has a candidate preference or a state. We denote the state of voter v_i by x_i. Moreover, each voter knows the preferences or states of some of his or her friends (fellow voters). This friendship relation is captured by the dependency graph, which we describe later in section "Definitions, Background, and Examples." Informally, the dependency graph has as vertices the voters, with an edge between each pair of voters that are friends. Starting from an initial configuration of preferences, the voters cast their votes in some order. The candidate that receives the most votes is the winner. A number of rules can be formulated to decide how each voter chooses a candidate. We will provide examples of such rules later, and as will be seen, the outcome of the election is governed by the order in which the votes are cast as well as the structure of the dependency graph.

Existing Mathematical Frameworks

The field of agent-based simulation currently places heavy emphasis on implementation and computation rather than on the derivation of formal results. Computation is no doubt a very useful way to discover potentially interesting behavior and phenomena. However, unless the simulation has been set up very carefully, its outcome does
not formally validate or guarantee the observed phenomenon. It could simply be caused by an artifact of the system model, an implementation error, or some other uncertainty.

A first step in a theory of agent-based simulation is the introduction of a formal framework that, on the one hand, is precise and computationally powerful and, on the other hand, is natural in the sense that it can be used to effectively describe large classes of both deterministic and stochastic systems. Apart from providing a common basis and a language for describing the model using a sound formalism, such a framework has many advantages. At a first level, it helps to clarify the key structure in a system. Domain-specific knowledge is crucial to deriving good models of complex systems, but domain specificity is often confounded by domain conventions and terminology that eventually obfuscate the real structure.

A formal, context-independent framework also makes it easier to take advantage of existing general theory and algorithms. Having a model formulated in such a framework also makes it easier to establish results. Additionally, expressing the model using a general framework is more likely to produce results that are widely applicable. This type of framework also supports implementation and validation. Modeling languages like UML (Booch et al. 2005) serve a similar purpose but tend to focus solely on software implementation and validation issues and very little on mathematical or computational analysis.

Cellular Automata
used as models for phenomena ranging from lattice gases (Frish et al. 1986) and flows in porous media (Rothman 1988) to traffic analysis (Fukś 2004; Nagel and Schreckenberg 1992; Nagel et al. 1995).

A cellular automaton is typically defined over a regular grid. An example is a two-dimensional grid such as Z^2. Each grid point (i, j) is referred to as a site or node. Each site has a state x_{i,j}(t), which is often taken to be binary. Here, t denotes the time step. Furthermore, there is a notion of a neighborhood for each site. The neighborhood N of a site is the collection of sites that can influence the future state of the given site. Based on its current state x_{i,j}(t) and the current states of the sites in its neighborhood N, a function f_{i,j} is used to compute the next state x_{i,j}(t + 1) of the site (i, j). Specifically, we have

x_{i,j}(t + 1) = f_{i,j}( x̄_{i,j}(t) ),    (1)

where x̄_{i,j}(t) denotes the tuple consisting of all the states x_{i′,j′}(t) with (i′, j′) ∈ N. The tuple consisting of the states of all the sites is the CA configuration and is denoted x(t) = (x_{i,j}(t))_{i,j}. Equation 1 is used to map the configuration x(t) to x(t + 1). The cellular automaton map or dynamical system is the map F that sends x(t) to x(t + 1).

A central part of CA research is to understand how configurations evolve under iteration of the map F and what types of dynamical behavior can be generated. A general introduction to CA can be found in Ilachinsky (2001).
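To connect Eq. 1 with the traffic example, the sketch below iterates a single-lane, ring-road cellular automaton whose local rule follows the acceleration, gap-limited braking, random-deceleration, and movement steps described in the TRANSIMS section (essentially a Nagel-Schreckenberg-type rule). The braking probability and the initial configuration are assumptions chosen for illustration; this is not the TRANSIMS implementation.

```python
import random

V_MAX = 5       # maximum speed in cells per time step (as in the text)
P_BRAKE = 0.2   # assumed random-deceleration probability (illustrative value)

def step(road, n):
    """One synchronous update of a circular single-lane road with n cells.
    `road` maps cell index -> current velocity of the vehicle in that cell."""
    new_road = {}
    for cell, v in road.items():
        # Gap: number of empty cells in front of the vehicle.
        gap = 0
        while gap < n - 1 and (cell + gap + 1) % n not in road:
            gap += 1
        v = min(v + 1, V_MAX)            # accelerate by at most one unit
        v = min(v, gap)                  # never drive into the car ahead
        if v > 0 and random.random() < P_BRAKE:
            v -= 1                       # random deceleration
        new_road[(cell + v) % n] = v     # movement
    return new_road

n_cells = 100
road = {i * 10: 0 for i in range(8)}     # eight stopped cars on an empty ring road
for _ in range(50):
    road = step(road, n_cells)
```

Because each vehicle's new velocity is bounded by the gap computed from the old configuration, the synchronous update never places two vehicles on the same cell, which is the property the text relies on when it calls these agents "lightweight."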
Hopfield Networks

f_i(t) = sgn( t_i + Σ_{v_j ∈ N_i} w_{i,j} x_j(t) ),

where sgn is the map from R to {+1, −1} defined by

sgn(x) = +1 if x ≥ 0, and −1 otherwise.

Now, the state of v_i at time t + 1 is

x_i(t + 1) = f_i(t).

Many references on Hopfield networks (see, e.g., Hopfield 1982; Russell and Norvig 2003) assume that the underlying undirected graph is complete; that is, there is an edge between each pair of nodes. In the definition presented above, the graph need not be complete. However, this does not cause any difficulties, since the missing edges can be assigned weight 0. As a consequence, such edges will not play any role in determining the dynamics of the system. Both synchronous and asynchronous update models of Hopfield neural networks have been considered in the literature. For theoretical results concerning Hopfield networks, see Orponen (1994, 1996) and the references cited therein. Reference Russell and Norvig (2003) presents a number of applications of neural networks. In Macy et al. (2003), a Hopfield model is used to study polarization in dynamic networks.

Communicating Finite-State Machines
The model of communicating finite-state machines (CFSM) was proposed to analyze protocols used in computer networks. In some of the literature, this model is also referred to as "concurrent transition systems" (Gouda and Chang 1986).

In the CFSM model, each agent is a process executing on some node of a distributed computing system. Although there are minor differences among the various CFSM models proposed in the literature (Brand and Zafiropulo 1983; Gouda and Chang 1986), the basic setup models each process as a finite-state machine (FSM). Thus, each agent is in a certain state at any time instant t. For each pair of agents, there is a bidirectional channel through which they can communicate. The state of an agent at time t + 1 is a function of the current state and the input (if any) on one or more of the channels. When an agent (FSM) undergoes a transition from one state to another, it may also choose to send a message to another agent or receive a message from an agent. In general, such systems can be synchronous or asynchronous. As can be seen, CFSMs are a natural formalism for studying protocols used in computer networks. The CFSM model has been used extensively to prove properties (e.g., deadlock freedom, fairness) of a number of protocols used in practice (see Brand and Zafiropulo 1983; Gouda and Chang 1986 and the references cited therein).

Other frameworks include interacting particle systems (Liggett 2005) and Petri nets (Moncion et al. 2006). There is a vast literature on both, but space limitations prevent a discussion here.

Finite Dynamical Systems

Another, quite general, modeling framework that has been proposed is that of finite dynamical systems, both synchronous and asynchronous. Here, the proposed mathematical object representing an agent-based simulation is a time-discrete dynamical system on a finite state set. The description of the systems is modeled after the key components of an agent-based simulation, namely, agents, the dependency graph, local update functions, and an update order. This makes a mapping to agent-based simulations natural. In the remainder of this entry, we will show that finite dynamical systems satisfy our criteria for a good mathematical framework in that they are general enough to serve as a broad computing tool and mathematically rich enough to allow the derivation of formal results.

Definitions, Background, and Examples
Let x_1, . . ., x_n be a collection of variables which take values in a finite set X. (As will be seen, the variables represent the entities in the system being modeled and the elements of X represent their states.) Each variable x_i has associated to it a "local update function" f_i : X^n → X, where "local" refers to the fact that f_i takes inputs from
By abuse of notation, we also let f_i denote the function X^n → X^n which changes the ith coordinate and leaves the other coordinates unchanged. This allows for the sequential composition of the local update functions. These functions assemble to a dynamical system

$$ F = (f_1, \ldots, f_n) : X^n \to X^n, $$

with the dynamics generated by iteration of F. As an example, if X = {0, 1} with the standard Boolean operators AND and OR, then F is a Boolean network.

The assembly of F from the local functions f_i can be done in one of several ways. One can update each of the variables simultaneously, that is,

$$ F(x_1, \ldots, x_n) = (f_1(x_1, \ldots, x_n), \ldots, f_n(x_1, \ldots, x_n)). $$

In this case, one obtains a parallel dynamical system.

Alternatively, one can choose to update the states of the variables according to some fixed update order, for example, a permutation (p_1, p_2, ..., p_n) of the set {1, ..., n}. More generally, one could use a word on the set {1, ..., n}, that is, p = (p_1, ..., p_t) where t is the length of the word. The function composition

$$ F_p = f_{p_t} \circ f_{p_{t-1}} \circ \cdots \circ f_{p_1} \qquad (2) $$

is called a sequential dynamical system (SDS), and as before, the dynamics of F_p is generated by iteration. The case when p is a permutation on {1, ..., n} has been studied extensively (Barrett and Reidys 1999; Barrett et al. 2000, 2001b, 2003c). It is clear that using a different permutation or word s may result in a different dynamical system F_s. Using a word rather than a permutation allows one to capture the case where some vertices have states that are updated more frequently than others.

Note that each such composed map can be written in the form F_p(x_1, ..., x_n) = (g_1(x_1, ..., x_n), ..., g_n(x_1, ..., x_n)), where g is a parallel update dynamical system. However, the maps g_i are not local functions.

The dynamics of F is usually represented as a directed graph on the vertex set X^n, called the phase space of F. There is a directed edge from v ∈ X^n to w ∈ X^n if and only if F(v) = w. A second graph that is usually associated with a finite dynamical system is its dependency graph Y(V, E). In general, this is a directed graph, and its vertex set is V = {1, ..., n}. There is a directed edge from i to j if and only if x_i appears in the function f_j. In many situations, the interaction relationship between pairs of variables is symmetric; that is, variable x_i appears in f_j if and only if x_j appears in f_i. In such cases, the dependency graph can be thought of as an undirected graph. We recall that the dependency graphs mentioned in the context of the voting game (see section "A Voting Game") and Hopfield networks (see section "Hopfield Networks") are undirected graphs. The dependency graph plays an important role in the study of finite dynamical systems and is sometimes listed explicitly as part of the specification of F.

Example 2 Let X = {0, 1} (the Boolean case). Suppose we have four variables and the local Boolean update functions are

f_1 = x_1 + x_2 + x_3 + x_4,
f_2 = x_1 + x_2,
f_3 = x_1 + x_3,
f_4 = x_1 + x_4,

where "+" represents sum modulo 2. The dynamics of the function F = (f_1, ..., f_4) : X^4 → X^4 is the directed graph in Fig. 1a, while the dependency graph is in Fig. 1b.

Agent-Based Modeling, Mathematical Formalism for, Fig. 1 The phase space of the parallel system F (a) and dependency graph of the Boolean functions from Example 2 (b)

Example 3 Consider the local functions in Example 2 and let p = (2, 1, 3, 4). Then, by Eq. (2), F_p = f_4 ∘ f_3 ∘ f_1 ∘ f_2; the corresponding phase spaces are shown in Fig. 2.

Agent-Based Modeling, Mathematical Formalism for, Fig. 2 The phase spaces from Example 3: Fp (a) and Fid (b)
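As a concrete illustration of the two assembly modes, a minimal Python sketch might compute both the parallel system F and the sequential system F_p for the local functions of Example 2 and the update order p = (2, 1, 3, 4). The function and variable names below are illustrative only and are not taken from the entry.

from itertools import product

n = 4
f = [
    lambda x: (x[0] + x[1] + x[2] + x[3]) % 2,  # f1 = x1+x2+x3+x4 (mod 2)
    lambda x: (x[0] + x[1]) % 2,                # f2 = x1+x2
    lambda x: (x[0] + x[2]) % 2,                # f3 = x1+x3
    lambda x: (x[0] + x[3]) % 2,                # f4 = x1+x4
]

def parallel(x):
    # synchronous update: all coordinates are replaced simultaneously
    return tuple(f[i](x) for i in range(n))

def sequential(x, order):
    # apply the local functions one at a time, in the given order
    x = list(x)
    for i in order:            # order uses 1-based variable indices
        x[i - 1] = f[i - 1](x)
    return tuple(x)

pi = (2, 1, 3, 4)
for x in product((0, 1), repeat=n):
    print(x, "->", parallel(x), "(parallel)", sequential(x, pi), "(sequential)")

Comparing the two output columns over all of X^4 makes the dependence on the update schedule directly visible.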
Notice that the phase space of any function is a directed graph in which every vertex has out-degree one; this is a characteristic property of deterministic functions.

Making use of Boolean arithmetic is a powerful tool in studying Boolean networks, which is not available in general. In order to have an enhanced set of tools available, it is often natural to make an additional assumption regarding X, namely, that it is a finite number system, a finite field (Lidl and Niederreiter 1997). This amounts to the assumption that there are "addition" and "multiplication" operations defined on X that satisfy the same rules as ordinary addition and multiplication of numbers. Examples include Z_p, the integers modulo a prime p. This assumption can be thought of as the discrete analog of imposing a coordinate system on an affine space.

When X is a finite field, it is easy to show that for any local function g, there exists a polynomial h such that g(x_1, ..., x_n) = h(x_1, ..., x_n) for all (x_1, ..., x_n) ∈ X^n. To be precise, suppose X is a finite field with q elements. Then

$$ g(x_1, \ldots, x_n) = \sum_{(c_1, \ldots, c_n) \in X^n} g(c_1, \ldots, c_n) \prod_{i=1}^{n} \big(1 - (x_i - c_i)^{q-1}\big). \qquad (3) $$

This observation has many useful consequences, since polynomial functions have been studied extensively and many analytical tools are available.

Notice that cellular automata and Boolean networks, both parallel and sequential, are special classes of polynomial dynamical systems. In fact, it is straightforward to see that

$$ x \wedge y = x \cdot y, \qquad x \vee y = x + y + xy, \qquad \neg x = x + 1. \qquad (4) $$

Therefore, the modeling framework of finite dynamical systems includes that of cellular automata discussed earlier. Also, since a Hopfield network is a function X^n → X^n, which can be represented through its local constituent functions, it follows that Hopfield networks also are special cases of finite dynamical systems.

Stochastic Finite Dynamical Systems
The deterministic framework of finite dynamical systems can be made stochastic in several different ways, making one or more of the system's defining data stochastic. For example, one could use one or both of the following criteria.

• Assume that each variable has a nonempty set of local functions assigned to it, together with a probability distribution on this set, and each time a variable is updated, one of these local functions is chosen at random to update its state. We call such systems probabilistic finite dynamical systems (PFDS), a generalization of probabilistic Boolean networks (Shmulevich et al. 2002b).
• Fix a subset of permutations T ⊆ S_n together with a probability distribution. When it is time for the system to update its state, a permutation p ∈ T is chosen at random, and the agents are updated sequentially using p. We call such systems stochastic finite dynamical systems (SFDS); see the sketch following this list.
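The second criterion can be illustrated by a minimal Python sketch of one update step of a stochastic finite dynamical system. It reuses the local functions f and the helper sequential() from the earlier sketch; the particular set T of permutations and the probabilities chosen here are illustrative assumptions, not values given in the entry.

import random

T = [(1, 2, 3, 4), (2, 1, 3, 4)]   # a subset of permutation update orders
probs = [0.5, 0.5]                 # probability distribution on T

def sfds_step(x):
    # stochastic finite dynamical system: draw an update order at random,
    # then apply the corresponding sequential system F_pi for one iteration
    pi = random.choices(T, weights=probs, k=1)[0]
    return sequential(x, pi)

state = (1, 0, 0, 0)
for _ in range(5):
    state = sfds_step(state)
    print(state)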
Remark 4 As noted in the remark above, each system F_p is a parallel system. Hence a SFDS is nothing but a set of parallel dynamical systems {F_p : p ∈ T}, together with a probability distribution. When it is time for the system to update its state, a system F_p is chosen at random and used for the next iteration.

To describe the phase space of a stochastic finite dynamical system, a general method is as follows. Let Ω be a finite collection of systems F_1, ..., F_t, where F_i : X^n → X^n for all i, and consider the probabilities p_1, ..., p_t which sum to 1. We obtain the stochastic phase space

$$ G_\Omega = p_1 G_1 + p_2 G_2 + \cdots + p_t G_t, \qquad (5) $$

where G_i is the phase space of F_i. The associated probability space is F = (Ω, 2^Ω, μ), where the probability measure μ is induced by the probabilities p_i. It is clear that the stochastic phase space can be viewed as a Markov chain over the state space X^n. The adjacency matrix of G_Ω directly encodes the Markov transition matrix. This is of course not new and has been done in, e.g., (Dawson 1974; Shmulevich et al. 2002b; Vasershtein 1969). But we emphasize the point that even though SFDS give rise to Markov chains, our study of SFDS is greatly facilitated by the rich additional structure available in these systems. To understand the effect of structural components such as the topology of the dependency graph or the stochastic nature of the update, it is important to study them not as Markov chains but as SFDS.

Example 5 Consider F_p and F_g from Example 3 and let G_p and G_g be their phase spaces as shown in Fig. 2. Let p_1 = p_2 = 1/2. The phase space (1/2)G_p + (1/2)G_g of the stochastic sequential dynamical system obtained from F_p and F_g (with equal probabilities) is presented in Fig. 3.

Agent-Based Modeling, Mathematical Formalism for, Fig. 3 The stochastic phase space for Example 5 induced by the two deterministic phase spaces of F_p and F_g from Fig. 2. For simplicity, the weights of the edges have been omitted
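A minimal sketch of the construction in Eq. (5) is given below: it assembles the weighted edge set p_1 G_1 + ... + p_t G_t, which is also the Markov transition matrix over X^n, from any deterministic systems given as Python functions. The systems plugged in at the end (the parallel and sequential maps from the earlier sketches) and the equal probabilities follow Example 5; everything else is an illustrative assumption.

from itertools import product
from collections import defaultdict

def stochastic_phase_space(systems, probs, n, X=(0, 1)):
    # returns a dict (v, w) -> transition probability, i.e. the weighted
    # edge set p1*G1 + ... + pt*Gt of Eq. (5), which also encodes the
    # Markov transition matrix over the state space X^n
    edges = defaultdict(float)
    for F, p in zip(systems, probs):
        for v in product(X, repeat=n):
            edges[(v, F(v))] += p
    return dict(edges)

# two systems with equal probability, as in Example 5
G = stochastic_phase_space(
    [parallel, lambda x: sequential(x, (2, 1, 3, 4))],
    [0.5, 0.5], n=4)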
Agent-Based Simulations as Finite Dynamical Systems
In the following, we describe the generic structure of the systems typically modeled and studied through agent-based simulations. The central notion is naturally that of an agent.

Each agent carries a state that may encode its preferences, internal configuration, perception of its environment, and so on. In the case of TRANSIMS, for instance, the agents are the cells making up the road network. The cell state contains information about whether or not the cell is occupied by a vehicle as well as the velocity of the vehicle. One may assume that each cell takes on states from the same set of possible states, which may be chosen to support the structure of a finite field.

The agents interact with each other, but typically an agent only interacts with a small subset of agents, its neighbors. Through such an interaction, an agent may decide to change its state based on the states (or other aspects) of the agents with which it interacts. We will refer to the process where an agent modifies its state through interaction as an agent update. The precise way in which an agent modifies its state is governed by the nature of the particular agent. In TRANSIMS, the neighbors of a cell are the adjacent road network cells. From this adjacency relation, one obtains a dependency graph of the agents. The local update function for a given agent can be obtained from the rules governing traffic flow between cells.

The updates of all the agents may be scheduled in different ways. Classical approaches include synchronous, asynchronous, and event-driven schemes. The choice will depend on system properties or particular considerations about the simulation implementation.

In the case of CImmSim, the situation is somewhat more complicated. Here, the agents are also the spatial units of the system, each representing a small volume of lymph tissue. The total volume is represented as a two-dimensional CA, in which every agent has four neighbors, so that the dependency graph is a regular two-dimensional grid. The state of each agent is a collection of counts for the various immune cells and pathogens that are present in this particular agent (volume). Movement between cells is implemented as diffusion. Immune cells can interact with each other and with pathogens while they reside in the same volume. Thus, the local update function for a given cell of the simulation is made up of the two components of movement between cells and interactions within a cell. For instance, a B cell could interact with the Epstein-Barr virus in a given volume and perform a transition from uninfected to infected by the next time step. Interactions as well as movement are stochastic, resulting in a stochastic finite dynamical system. The update order is parallel.

Example 6 The Voting Game (see section "A Voting Game") The following scenario is constructed to illustrate how implementation choices for the system components have a clear and direct bearing on the dynamics and simulation outcomes.

Let the voter dependency graph be the star graph on 5 vertices with center vertex a and surrounding vertices b, c, d, and e. Furthermore, assume that everybody votes opportunistically using the majority rule: the vote cast by an individual is equal to the preference of the majority of his/her friends with the person's own preference included. For simplicity, assume candidate 1 is preferred in the case of a tie.

If the initial preference is x_a = 1 and x_b = x_c = x_d = x_e = 0, then if voter a goes first, he/she will vote for candidate 0 since that is the choice of the majority of the neighbor voters. However, if b and c go first, they only know a's preference. Voter b therefore casts his/her vote for candidate 1, as does c. Note that this is a tie situation, with an equal number of preferences for candidate 1 (a) and for candidate 0 (b). If voter a goes next, then the situation has changed: the preference of b and c has already changed to 1. Consequently, voter a picks candidate 1. At the end of the day, candidate 1 is the election winner, and the choice of update order has tipped the election!

This example is of course constructed to illustrate our point. However, in real systems, it can be much more difficult to detect the presence of such sensitivities and their implications. A solid mathematical framework can be very helpful in detecting such effects.
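The election scenario of Example 6 is small enough to check directly. The following Python sketch encodes the star graph, the majority rule with ties broken toward candidate 1, and the initial preferences of Example 6, and runs two different update orders; the helper names are illustrative only.

neighbors = {"a": ["b", "c", "d", "e"],
             "b": ["a"], "c": ["a"], "d": ["a"], "e": ["a"]}

def majority_vote(prefs, v):
    # a voter's vote: majority preference among friends plus the voter,
    # with candidate 1 preferred in case of a tie
    group = neighbors[v] + [v]
    ones = sum(prefs[u] for u in group)
    return 1 if 2 * ones >= len(group) else 0

def run(order):
    prefs = {"a": 1, "b": 0, "c": 0, "d": 0, "e": 0}
    for v in order:
        prefs[v] = majority_vote(prefs, v)
    return prefs

print(run(["a", "b", "c", "d", "e"]))  # a votes first: everyone ends at 0
print(run(["b", "c", "a", "d", "e"]))  # b and c vote first: everyone ends at 1

The first order reproduces the outcome in which candidate 0 wins, the second the outcome in which candidate 1 wins, confirming that the update order alone tips the election.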
Finite Dynamical Systems as Theoretical and Computational Tools

If finite dynamical systems are to be useful as a modeling paradigm for agent-based simulations, it is necessary that they can serve as a fairly universal model of computation. We discuss here how such dynamical systems can mimic Turing machines (TMs), a standard universal model for computation. For a more thorough exposition, we refer the reader to the series of papers by Barrett et al. (2003a, b, 2006, 2007a, b). To make the discussion reasonably self-contained, we provide a brief and informal discussion of the TM model. Additional information on TMs can be found in any standard text on the theory of computation (e.g., Sipser 1997).

A Computational View of Finite Dynamical Systems: Definitions
In order to understand the relationship of finite dynamical systems to TMs, it is important to view such systems from a computational standpoint. Recall that a finite dynamical system F : X^n → X^n, where X is a finite set, has an underlying dependency graph Y(V, E). From a computational point of view, the nodes of the dependency graph (the agents in the simulation) are thought of as devices that compute appropriate functions. For simplicity, we will assume in this section that the dependency graph is undirected, that is, all dependency relations are symmetric. At any time, the state value of each node v_i ∈ V is from the specified domain X. The inputs to f_i are the current state of v_i and the states of the neighbors of v_i as specified by Y. The output of f_i, which is also a member of X, becomes the state of v_i at the next time instant. The discussion in this section will focus on sequentially updated systems (SDS), but all results discussed apply to parallel systems as well.

Each step of the computation carried out by an SDS can be thought of as consisting of n "mini steps"; in each mini step, the value of the local transition function at a node is computed and the state of the node is changed to the computed value. Given an SDS F, a configuration C of F is a vector (c_1, c_2, ..., c_n) ∈ X^n. It can be seen that each computational step of an SDS causes a transition from one configuration to another.

Configuration Reachability Problem for SDSs
Based on the computational view, a number of different problems can be defined for SDSs (see, e.g., Barrett et al. 2001a, 2006, 2007b). To illustrate how SDSs can model TM computations, we will focus on one such problem, namely, the configuration reachability (CR) problem: Given an SDS F, an initial configuration C and another configuration C′, will F, starting from C, ever reach configuration C′? The problem can also be expressed in terms of the phase space of F. Since configurations such as C and C′ are represented by nodes in the phase space, the CR problem boils down to the question of whether there is a directed path in the phase space from C to C′. This abstract problem can be mapped to several problems in the simulation of multiagent systems. Consider, for example, the TRANSIMS context. Here, the initial configuration C may represent the state of the system early in the day (when the traffic is very light) and C′ may represent an "undesirable" state of the system (such as heavy traffic congestion). Similarly, in the context of modeling an infectious disease, C may represent the initial onset of the disease (when only a small number of people are infected), and C′ may represent a situation where a large percentage of the population is infected. The purpose of studying computational problems such as CR is to determine whether one can efficiently predict the occurrence of certain events in the system from a description of the system. If computational analysis shows that the system can indeed reach undesirable configurations as it evolves, then one can try to identify steps needed to deal with such situations.

Turing Machines: A Brief Overview
A Turing machine (TM) is a simple and commonly used model for general purpose computational devices. Since our goal is to point out how SDSs can also serve as computational devices, we will present an informal overview of the TM model. Readers interested in a more formal description may consult (Sipser 1997).

A TM consists of a set Q of states, a one-way infinite input tape and a read/write head that can read and modify symbols on the input tape. The input tape is divided into cells and each cell contains a symbol from a special finite alphabet. An input consisting of n symbols is written on the leftmost n cells of the input tape. (The other cells are assumed to contain a special symbol called blank.) One of the states in Q, denoted by q_s, is the designated start state. Q also includes two other special states, denoted by q_a (the accepting state) and q_r (the rejecting state). At any time, the machine is in one of the states in Q. The transition function for the TM specifies, for each combination of the current state and the current symbol under the head, a new state, a new symbol for the current cell (which is under the head), and a movement (i.e., left or right by one cell) for the head.
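For readers who prefer code to prose, the following Python sketch shows one step of a single-tape TM in the sense just described. The transition table delta used at the end is a made-up toy machine, not one discussed in the entry.

def tm_step(state, tape, head, delta, blank="_"):
    # one step of a single-tape TM: look up (state, symbol), write the
    # new symbol, and move the head left or right by one cell
    symbol = tape[head] if head < len(tape) else blank
    new_state, new_symbol, move = delta[(state, symbol)]
    tape = list(tape)
    while head >= len(tape):
        tape.append(blank)
    tape[head] = new_symbol
    head += 1 if move == "R" else -1
    return new_state, "".join(tape), max(head, 0)

# toy machine: scan right over 1s, accept at the first blank
delta = {("qs", "1"): ("qs", "1", "R"),
         ("qs", "_"): ("qa", "_", "R")}

state, tape, head = "qs", "111", 0
while state not in ("qa", "qr"):
    state, tape, head = tm_step(state, tape, head, delta)
print(state, tape)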
The machine starts in state q_s with the head on the first cell of the input tape. Each step of the machine is carried out in accordance with the transition function. If the machine ever reaches either the accepting or the rejecting state, it halts with the corresponding decision; otherwise, the machine runs forever.

A configuration of a TM consists of its current state, the current tape contents, and the position of the head. Note that the transition function of a TM specifies how a new configuration is obtained from the current configuration.

The above description is for the basic TM model (also called the single-tape TM model). For convenience in describing some computations, several variants of the above basic model have been proposed. For example, in a multi-tape TM, there are one or more work tapes in addition to the input tape. The work tapes can be used to store intermediate results. Each work tape has its own read/write head, and the definitions of configuration and transition function can be suitably modified to accommodate work tapes. While such an enhancement to the basic TM model makes it easier to carry out certain computations, it does not add to the machine's computational power. In other words, any computation that can be carried out using the enhanced model can also be carried out using the basic model.

As in the case of dynamical systems, one can define a configuration reachability (CR) problem for TMs: Given a TM M, an initial configuration I_M and another configuration C_M, will the TM starting from I_M ever reach C_M? We refer to the CR problem in the context of TMs as CR-TM. In fact, it is this problem for TMs that captures the essence of what can be effectively computed. In particular, by choosing the state component of C_M to be one of the halting states (q_a or q_r), the problem of determining whether a function is computable is transformed into an appropriate CR-TM problem. By imposing appropriate restrictions on the resources used by a TM (e.g., the number of steps, the number of cells on the work tapes), one obtains different versions of the CR-TM problem which characterize different computational complexity classes (Sipser 1997).

How SDSs Can Mimic Turing Machines
The above discussion points out an important similarity between SDSs and TMs. Under both of these models, each computational step causes a transition from one configuration to another. It is this similarity that allows one to construct a discrete dynamical system F that can simulate a TM. Typically, each step of a TM is simulated by a short sequence of successive iterations of F. As part of the construction, one also identifies a suitable mapping between the configurations of the TM being simulated and those of the dynamical system. This is done in such a way that the answer to the CR-TM problem is "yes" if and only if the answer to the CR problem for the dynamical system is also "yes."

To illustrate the basic ideas, we will informally sketch a construction from (Barrett et al. 2006). For simplicity, this construction produces an SDS F that simulates a restricted version of TMs; the restriction being that for any input containing n symbols, the number of work tape cells that the machine may use is bounded by a linear function of n. Such a TM is called a linear bounded automaton (LBA) (Sipser 1997). Let M denote the given LBA and let n denote the length of the input to M. The domain X for the SDS F is chosen to be a finite set based on the allowed symbols in the input to the TM. The dependency graph is chosen to be a simple path on n nodes, where each node serves as a representative for a cell on the input tape. The initial and final configurations C and C′ for F are constructed from the corresponding configurations of M. The local transition function for each node of the SDS is constructed from the given transition function for M in such a way that each step of M corresponds to exactly one step of F. Thus, there is a simple bijection between the sequence of configurations that M goes through during its computation and the sequence of states that F goes through as it evolves. Using this bijection, it is shown in Barrett et al. (2006) that the answer to the CR-TM problem is "yes" if and only if F reaches C′ starting from C.
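Since the phase space of a deterministic finite dynamical system has out-degree one at every node, configuration reachability for a concrete small system can be checked by simply following the trajectory of C. A minimal Python sketch follows (F stands for any map X^n → X^n given as a Python function, for instance one of the systems constructed earlier; the helper name is illustrative).

def reaches(F, C, C_target):
    # configuration reachability: does the trajectory of C ever hit C_target?
    seen = set()
    v = C
    while v not in seen:
        if v == C_target:
            return True
        seen.add(v)
        v = F(v)          # deterministic: each state has a unique successor
    return False

This brute-force check is only feasible for very small systems; the complexity results cited above concern the general problem.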
When each f_i is a linear polynomial of the form f_i(x_1, ..., x_n) = a_{i1}x_1 + ⋯ + a_{in}x_n, the map F is nothing but a linear transformation on k^n over k, and, by using the standard basis, F has the matrix representation

$$ F\left(\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}\right) = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}, $$

where a_{ij} ∈ k for all i, j.

Linear finite dynamical systems were first studied by Elspas (1959). His motivation came from studying feedback shift register networks and their applications to radar and communication systems and automatic error correction circuits. For linear systems over finite fields of prime cardinality, that is, q is a prime number, Elspas showed that the exact number and length of each limit cycle can be determined from the elementary divisors of the matrix A. Recently, Hernández-Toledo (2005) rediscovered Elspas' results and generalized them to arbitrary finite fields. Furthermore, he showed that the structure of the tree of transients at each node of each limit cycle is the same and can be completely determined from the nilpotent elementary divisors of the form x^a. For affine Boolean networks (i.e., finite dynamical systems over the Boolean field with two elements, whose local functions are linear polynomials which might have constant terms), a method to analyze their cycle length has been developed in Milligan and Wilson (1993). After embedding the matrix of the transition function, which is of dimension n × (n + 1), into a square matrix of dimension n + 1, the problem is then reduced to the linear case. A fast algorithm based on (Hernández-Toledo 2005) has been implemented in Jarrah et al. (2007), using the symbolic computation package Macaulay2.

It is not surprising that the phase space structure of F should depend on invariants of the matrix A = (a_{ij}). The rational canonical form of A is a block-diagonal matrix, and one can recover the structure of the phase space of A from that of the blocks in the rational form of A. Each block represents either an invertible or a nilpotent linear transformation. Consider an invertible block B. If m(x) is the minimal polynomial of B, then there exists s such that m(x) divides x^s − 1. Hence B^s − I = 0, which implies that B^s v = v. That is, every state vector v in the phase space of B is in a cycle whose length is a divisor of s.

Definition 7 For any polynomial l(x) in k[x], the order of l(x) is the least integer s such that l(x) divides x^s − 1.

The cycle structure of the phase space of F can be completely determined from the orders of the irreducible factors of the minimal polynomial of F. The computation of these orders involves in particular the factorization of numbers of the form q^r − 1, which makes the computation of the order of a polynomial potentially quite costly. The nilpotent blocks in the decomposition of A determine the tree structure at the nodes of the limit cycles. It turns out that all trees at all periodic nodes are identical. This generalizes a result in Martin et al. (1984) for additive cellular automata over the field with two elements.

While the fact that the structure of the phase space of a linear system can be determined from the invariants associated with its matrix may not be unexpected, it is a beautiful example of how the right mathematical viewpoint provides powerful tools to completely solve the problem of relating the structure of the local functions to the resulting (or emerging) dynamics. Linear and affine systems have been studied extensively in several different contexts and from several different points of view, in particular the case of cellular automata. For instance, additive cellular automata over more general rings as state sets have been studied, e.g., in Chaudhuri (1997). Further results on additive CAs can also be found there. One important focus in Chaudhuri (1997) is on the problem of finding CAs with limit cycles of maximal length for the purpose of constructing pseudorandom number generators.

Unfortunately, the situation is more complicated for nonlinear systems. For the special class of Boolean synchronous systems whose local update functions consist of monomials, there is a polynomial time algorithm that determines whether a given monomial system has only fixed points as periodic points (Colón-Reyes et al. 2004).
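The cycle structure described above can be checked by brute force for small examples. The following Python sketch iterates the linear map x ↦ Ax over F_2 for an illustrative invertible 3 × 3 matrix A (not a matrix discussed in the entry) and reads the cycle lengths off the phase space directly, rather than from the elementary divisors.

from itertools import product

A = [[0, 1, 0],
     [0, 0, 1],
     [1, 1, 0]]          # invertible over F_2

def apply(A, x):
    # matrix-vector product over F_2
    return tuple(sum(a * xi for a, xi in zip(row, x)) % 2 for row in A)

def cycle_lengths(A, n):
    lengths = {}
    for x in product((0, 1), repeat=n):
        # follow the trajectory of x until a state repeats
        seen, v = {}, x
        while v not in seen:
            seen[v] = len(seen)
            v = apply(A, v)
        lengths[x] = len(seen) - seen[v]   # length of the limit cycle reached
    return lengths

print(cycle_lengths(A, 3))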
This question was motivated by applications to the modeling of biochemical networks. The criterion is given in terms of invariants of the dependency graph Y. For a strongly connected directed graph Y (i.e., there is a directed path between any pair of vertices), its loop number is the greatest common divisor of the lengths of all directed loops at a particular vertex. (This number is independent of the vertex chosen.)

Theorem 8 (Colón-Reyes et al. 2004) A Boolean monomial system has only fixed points as periodic points if and only if the loop number of every strongly connected component of its dependency graph is equal to 1.

In Colón-Reyes et al. (2006), it is shown that the problem for general finite fields can be reduced to that of a Boolean system and a linear system over rings of the form Z/p^r Z, p prime. Boolean monomial systems have been studied before in the cellular automaton context (Bartlett and Garzon 1993).

Sequential Update Systems
The update order in a sequential dynamical system has been studied using combinatorial and algebraic techniques. A natural question to ask here is how the system depends on the update schedule. In Barrett et al. (2000, 2001b), Mortveit and Reidys (2001), Reidys (1998), this was answered on several levels for the special case where the update schedule is a permutation. We describe these results in some detail. Results about the more general case of update orders described by words on the indices of the local update functions can be found in Garcia et al. (2006).

Given local update functions f_i : k^n → k and permutation update orders s, p, a natural question is when the two resulting SDS F_s and F_p are identical and, more generally, how many different systems one obtains by varying the update order over all permutations. Both questions can be answered in great generality. The answer involves invariants of two graphs, namely, the acyclic orientations of the dependency graph Y of the local update functions and the update graph of Y. The update graph U(Y) of Y is the graph whose vertex set consists of all permutations of the vertex set of Y (Reidys 1998). There is an (undirected) edge between two permutations s = (s_1, ..., s_n) and t = (t_1, ..., t_n) if they differ by a transposition of two adjacent entries s_i and s_{i+1} such that there is no edge in Y between s_i and s_{i+1}.

The update graph encodes the fact that one can commute two local update functions f_i and f_j without affecting the end result F if i and j are not connected by an edge in Y. That is, f_i ∘ f_j = f_j ∘ f_i if and only if i and j are not connected by an edge in Y.

All permutations belonging to the same connected component in U(Y) give identical SDS maps. The number of (connected) components in U(Y) is therefore an upper bound for the number of functionally inequivalent SDS that can be generated by just changing the update order. It is convenient to introduce an equivalence relation ∼_Y on S_Y by p ∼_Y s if p and s belong to the same connected component in the graph U(Y). It is then clear that if p ∼_Y s, then the corresponding sequential dynamical systems are identical as maps. This can also be characterized in terms of acyclic orientations of the graph Y: each component in the update graph induces a unique acyclic orientation of the graph Y. Moreover, we have the following result:

Proposition 9 (Reidys 1998) There is a bijection

$$ F_Y : S_Y/\!\sim_Y \; \longrightarrow \; \mathrm{Acyc}(Y), $$

where S_Y/∼_Y denotes the set of equivalence classes of ∼_Y and Acyc(Y) denotes the set of acyclic orientations of Y.

This upper bound on the number of functionally different systems has been shown in Reidys (1998) to be sharp for Boolean systems, in the sense that for a given Y one can construct this number of different systems, using appropriate combinations of NOR functions.

For two permutations s and t, it is easy to determine if they give identical SDS maps: one can just compare their induced acyclic orientations.
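Proposition 9 makes the upper bound computable in principle: one only has to count acyclic orientations. The following Python sketch does this by brute force for an illustrative graph Y, a cycle on four vertices (not a graph taken from the entry).

from itertools import product

edges = [(1, 2), (2, 3), (3, 4), (4, 1)]   # the graph Y: a 4-cycle

def is_acyclic(directed):
    # detect a directed cycle by repeatedly removing source vertices
    nodes = {u for e in directed for u in e}
    remaining = set(directed)
    while nodes:
        sources = {v for v in nodes
                   if not any(w == v for (_, w) in remaining)}
        if not sources:
            return False
        nodes -= sources
        remaining = {(u, w) for (u, w) in remaining
                     if u in nodes and w in nodes}
    return True

count = sum(is_acyclic([(u, w) if bit else (w, u)
                        for (u, w), bit in zip(edges, bits)])
            for bits in product((0, 1), repeat=len(edges)))
print(count)   # 14 acyclic orientations for the 4-cycle

For the 4-cycle the count is 14, so at most 14 functionally different SDS maps can be obtained by varying the permutation update order, whatever the local functions are.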
The number of acyclic orientations of the graph Y tells how many functionally different SDS maps one can obtain for a fixed graph and fixed vertex functions. The work of Cartier and Foata (1969) on partially commutative monoids studies a similar question, but their work is not concerned with finite dynamical systems.

Note that permutation update orders have been studied sporadically in the context of cellular automata on circle graphs (Park et al. 1986) but not in a systematic way, typically using the order (1, 2, ..., n) or the even-odd/odd-even orders. As a side note, we remark that this work also confirms our findings that switching from a parallel update order to a sequential order turns the complex behavior found in Wolfram's "class III and IV" automata into much more regular or mundane dynamics (see, e.g., Schönfisch and de Roos 1999).

The work on functional equivalence was extended to dynamical equivalence (topological conjugation) in Barrett et al. (2001b), Mortveit and Reidys (2001). The automorphism group of the graph Y can be made to act on the components of the update graph U(Y) and therefore also on the acyclic orientations of Y. All permutations contained in components of an orbit under Aut(Y) give rise to dynamically equivalent sequential dynamical systems, that is, to isomorphic phase spaces. However, here one needs some more technical assumptions, i.e., the local functions must be symmetric and induced (see Barrett et al. 2003c). This of course also leads to a bound for the number of dynamically inequivalent systems that can be generated by varying the update order alone. Again, this was first done for permutation update orders. The theory was extended to words over the vertex set of Y in Garcia et al. (2006), Reidys (2006).

The structure of the graph Y influences the dynamics of the system. As an example, graph invariants such as the independent sets of Y turn out to be in a bijective correspondence with the invariant set of sequential systems over the Boolean field k = {0, 1} that have nor_t : k^t → k, given by nor_t(x_1, ..., x_t) = (1 + x_1) ⋯ (1 + x_t), as local functions (Reidys 2001). This can be extended to other classes, such as those with order-independent invariant sets as in Hansson et al. (2005). We have already seen how the automorphisms of a graph give rise to equivalence (Mortveit and Reidys 2001). Also, if the graph Y has nontrivial covering maps, we can derive simplified or reduced (in an appropriate sense) versions of the original SDS over the image graphs of Y (see, e.g., Mortveit and Reidys 2004; Reidys 2005).

Parallel and sequential dynamical systems differ when it comes to invertibility. Whereas it is generally computationally intractable to determine if a CA over Z^d is invertible for d ≥ 2 (Kari 2005), it is straightforward to determine this for a sequential dynamical system (Mortveit and Reidys 2001). For example, it turns out that the only invertible Boolean sequential dynamical systems with symmetric local functions are the ones where the local functions are either the parity function or the logical complement of the parity function (Barrett et al. 2001b).

Some classes of sequential dynamical systems, such as the ones induced by the nor-function, have desirable stability properties (Hansson et al. 2005). These systems have minimal invariant sets (i.e., periodic states) that do not depend on the update order. Additionally, these invariant sets are stable with respect to configuration perturbations. If a state c is perturbed to a state c′ that is not periodic, this state will evolve to a periodic state c″ in one step; that is, the system will quickly return to the invariant set. However, the states c and c″ may not necessarily be in the same periodic orbit.
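The invertibility statement can be verified directly on small examples. The following Python sketch builds the sequential system on a path with three vertices whose local functions are the parity of a vertex and its neighbors (one of the two families identified above) and checks by enumeration that the resulting map on {0, 1}^3 is a bijection; the graph and update order are illustrative choices.

from itertools import product

nbrs = {1: [2], 2: [1, 3], 3: [2]}     # path graph 1 - 2 - 3

def parity_sds(x, order=(1, 2, 3)):
    # sequential update with parity local functions over F_2
    x = list(x)
    for v in order:
        x[v - 1] = (x[v - 1] + sum(x[u - 1] for u in nbrs[v])) % 2
    return tuple(x)

images = {parity_sds(x) for x in product((0, 1), repeat=3)}
print(len(images) == 8)    # True: the map is a bijection on {0,1}^3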
The Category of Sequential Dynamical Systems
As a general principle, in order to study a given class of mathematical objects, it is useful to study transformations between them. In order to provide a good basis for a mathematical analysis, the objects and transformations together should form a category, that is, the class of transformations between two objects should satisfy certain reasonable properties (see, e.g., Mac Lane 1998). Several proposed definitions of a transformation of SDS have been published, notably in Laubenbacher and Pareigis (2003) and Reidys (2005). One possible interpretation of a transformation of SDS from the point of view of agent-based simulation is that the transformation represents the approximation of one simulation by another or the embedding/projection of one simulation into/onto another. These concepts have obvious utility when considering different simulations of the same complex system.

One can take different points of view in defining a transformation of SDS. One approach is to require that a transformation be compatible with the defining structural elements of an SDS, that is, with the dependency graph, the local update functions, and the update schedule. If this is done properly, then one should expect to be able to prove that the resulting transformation induces a transformation at the level of phase spaces. That is, transformations between SDS should preserve the local and global dynamic behavior. This implies that transformations between SDS lead to transformations between the associated global update functions.

Since the point of view of SDS is that global dynamics emerges from system properties that are defined locally, the notion of SDS transformation should focus on the local structure. This is the point of view taken in Laubenbacher and Pareigis (2003). The definition given there is rather technical and the details are beyond the scope of this entry. The basic idea is as follows. Let F_p = f_{p(n)} ∘ ⋯ ∘ f_{p(1)} and F_s = g_{s(m)} ∘ ⋯ ∘ g_{s(1)}, with dependency graphs Y_p and Y_g, respectively. A transformation F : F_p → F_s is determined by the following:

• A graph mapping φ : Y_p → Y_g (reverse direction)
• A family of maps k^{φ(v)} → k^v, with v ∈ Y_p
• An order-preserving map s → p of update schedules

These maps are required to satisfy the property that they "locally" assemble to a coherent transformation. Using this definition of transformation, it is shown (Theorem 2.6 in Laubenbacher and Pareigis 2003) that the class of SDS forms a category. One of the requirements, for instance, is that the composition of two transformations is again a transformation. Furthermore, it is shown (Theorem 3.2 in Laubenbacher and Pareigis 2003) that a transformation of SDS induces a map of directed graphs on the phase spaces of the two systems. That is, a transformation of the local structural elements of SDS induces a transformation of global dynamics. One of the results proven in Laubenbacher and Pareigis (2003) is that every SDS can be decomposed uniquely into a direct product (in the categorical sense) of indecomposable SDS.

Another possible point of view is that a transformation

$$ F : (F_p : k^n \to k^n) \longrightarrow (F_g : k^m \to k^m) $$

is a function F : k^n → k^m such that F ∘ F_p = F_g ∘ F, without requiring specific structural properties. This is the approach in Reidys (2005). This definition also results in a category and a collection of mathematical results. Whatever definition is chosen, much work remains to be done in studying these categories and their properties.

Future Directions

Agent-based computer simulation is an important method for modeling many complex systems, whose global dynamics emerges from the interaction of many local entities. Sometimes this is the only feasible approach, especially when available information is not enough to construct global dynamic models. The size of many realistic systems, however, leads to computer models that are themselves highly complex, even if they are constructed from simple software entities. As a result, it becomes challenging to carry out verification, validation, and analysis of the models, since these consist in essence of complex computer programs. This entry argues that the appropriate approach is to provide a formal mathematical foundation by introducing a class of mathematical objects to which one can map agent-based simulations. These objects should capture the key features of an agent-based simulation and should be mathematically rich enough to allow the derivation of general results and techniques. The mathematical setting of dynamical systems is a natural choice for this purpose.
The class of finite dynamical systems over a state set X which carries the structure of a finite field satisfies all these criteria. Parallel, sequential, and stochastic versions of these are rich enough to serve as the mathematical basis for models of a broad range of complex systems. While finite dynamical systems have been studied extensively from an experimental point of view, their mathematical theory should be considered to be in its infancy, providing a fruitful area of research at the interface of mathematics, computer science, and complex systems theory.

Bibliography

Primary Literature
Bagrodia RL (1998) Parallel languages for discrete-event simulation models. IEEE Comput Sci Eng 5(2):27–38
Barrett CL, Reidys CM (1999) Elements of a theory of simulation I: sequential CA over random graphs. Appl Math Comput 98:241–259
Barrett CL, Mortveit HS, Reidys CM (2000) Elements of a theory of simulation II: sequential dynamical systems. Appl Math Comput 107(2–3):121–136
Barrett CL, Hunt HB III, Marathe MV, Ravi SS, Rosenkrantz DJ, Stearns RE, Tosic P (2001) Garden of eden and fixed point configurations in sequential dynamical systems. In: Proceedings of international conference on combinatorics, computation and geometry DM-CCG'. Paris, France, pp 95–110
Barrett CL, Mortveit HS, Reidys CM (2001b) Elements of a theory of simulation III: equivalence of SDS. Appl Math Comput 122:325–340
Barrett CL, Marathe MV, Smith JP, Ravi SS (2002) A mobility and traffic generation framework for modeling and simulating ad hoc communication networks. In: SAC'02. ACM, Madrid, pp 122–126
Barrett CL, Hunt HB III, Marathe MV, Ravi SS, Rosenkrantz DJ, Stearns RE (2003a) On some special classes of sequential dynamical systems. Ann Comb 7(4):381–408
Barrett CL, Hunt HB III, Marathe MV, Ravi SS, Rosenkrantz DJ, Stearns RE (2003b) Reachability problems for sequential dynamical systems with threshold functions. Theor Comput Sci 295(1–3):41–64
Barrett CL, Mortveit HS, Reidys CM (2003c) Elements of a theory of computer simulation. IV. Sequential dynamical systems: fixed points, invertibility and equivalence. Appl Math Comput 134(1):153–171
Barrett CL, Hunt HB III, Marathe MV, Ravi SS, Rosenkrantz DJ, Stearns RE (2006) Complexity of reachability problems for finite discrete sequential dynamical systems. J Comput Syst Sci 72:1317–1345
Barrett CL, Hunt HB III, Marathe MV, Ravi SS, Rosenkrantz DJ, Stearns RE, Thakur M (2007) Computational aspects of analyzing social network dynamics. In: Proceedings of international joint conference on artificial intelligence IJCAI. Paris, France, pp 2268–2273
Barrett CL, Hunt HB III, Marathe MV, Ravi SS, Rosenkrantz DJ, Stearns RE, Thakur M (2007b) Predecessor existence problems for finite discrete dynamical systems. Theor Comput Sci 1–2:3–37
Bartlett R, Garzon M (1993) Monomial cellular automata. Complex Syst 7(5):367–388
Bernaschi M, Castiglione F (2002) Selection of escape mutants from immune recognition during hiv infection. Immunol Cell Biol 80:307–313
Bernaschi M, Succi S, Castiglione F (2000) Large-scale cellular automata simulations of the immune system response. Phys Rev E 61:1851–1854
Booch G, Rumbaugh J, Jacobson I (2005) Unified modeling language user guide, 2nd edn. Addison-Wesley, Reading
Brand D, Zafiropulo P (1983) On communicating finite-state machines. J ACM 30:323–342
Cartier P, Foata D (1969) Problemes combinatoires de commutation et reárrangements, vol 85, Lecture Notes in Mathematics. Springer, Berlin
Castiglione F, Agur Z (2003) Analyzing hypersensitivity to chemotherapy in a cellular automata model of the immune system. In: Preziosi L (ed) Cancer modeling and simulation. Chapman and Hall/CRC, London
Castiglione F, Bernaschi M, Succi S (1997) Simulating the immune response on a distributed parallel computer. Int J Mod Phys C 8:527–545. https://doi.org/10.1142/S0129183197000424
Castiglione F, Duca K, Jarrah A, Laubenbacher R, Hochberg D, Thorley-Lawson D (2007) Simulating Epstein-Barr virus infection with C-ImmSim. Bioinformatics 23(11):1371–1377
Celada F, Seiden P (1992a) A computer model of cellular interactions in the immune system. Immunol Today 13(2):56–62
Celada F, Seiden P (1992b) A model for simulating cognate recognition and response in the immune system. J Theor Biol 158:235–270
Celada F, Seiden P (1996) Affinity maturation and hypermutation in a simulation of the humoral immune response. Eur J Immunol 26(6):1350–1358
Chaudhuri PP (1997) Additive cellular automata. Theory and applications, vol 1. IEEE Computer Society Press, Washington, DC
Colón-Reyes O, Laubenbacher R, Pareigis B (2004) Boolean monomial dynamical systems. Ann Comb 8:425–439
Colón-Reyes O, Jarrah A, Laubenbacher R, Sturmfels B (2006) Monomial dynamical systems over finite fields. Complex Syst 16(4):333–342
Dawson D (1974) Synchronous and asynchronous reversible Markov systems. Canad Math Bull 17(5):633–649
Ebeling W, Schweitzer F (2001) Swarms of particle agents with harmonic interactions. Theor Biosci 120–3(4):207–224
Elspas B (1959) The theory of autonomous linear sequential networks. IRE Trans Circuit Theor 1:45–60
Farmer J, Packard N, Perelson A (1986) The immune system, adaptation, and machine learning. Phys D 2(1–3):187–204
Frish U, Hasslacher B, Pomeau Y (1986) Lattice-gas automata for the Navier-Stokes equations. Phys Rev Lett 56:1505–1508
Fukś H (2004) Probabilistic cellular automata with conserved quantities. Nonlinearity 17:159–173
Garcia LD, Jarrah AS, Laubenbacher R (2006) Sequential dynamical systems over words. Appl Math Comput 174(1):500–510
Gardner M (1970) The fantastic combinations of John Conway's new solitaire game "life". Sci Am 223:120–123
Gouda M, Chang C (1986) Proving liveness for networks of communicating finite-state machines. ACM Trans Program Lang Syst 8:154–182
Guo Y, Gong W, Towsley D (2000) Time-stepped hybrid simulation (TSHS) for large scale networks. In: INFOCOM 2000. Proceedings of nineteenth annual joint conference of the IEEE computer and communications societies, vol 2. IEEE, Washington, DC, pp 441–450
Gupta A, Katiyar V (2005) Analyses of shock waves and jams in traffic flow. J Phys A 38:4069–4083
Hansson AÅ, Mortveit HS, Reidys CM (2005) On asynchronous cellular automata. Adv Complex Syst 8(4):521–538
Hedlund G (1969) Endomorphisms and automorphisms of the shift dynamical system. Math Syst Theory 3:320–375
Hernández-Toledo A (2005) Linear finite dynamical systems. Commun Algebra 33(9):2977–2989
Hopcroft JE, Motwani R, Ullman JD (2000) Automata theory, languages and computation. Addison Wesley, Reading
Hopfield J (1982) Neural networks and physical systems with emergent collective computational properties. Proc Natl Acad Sci U S A 79:2554–2588
Ilachinsky A (2001) Cellular automata: a discrete universe. World Scientific, Singapore
Jarrah A, Laubenbacher R, Stillman M, Vera-Licona P (2007) An efficient algorithm for the phase space structure of linear dynamical systems over finite fields (submitted)
Jefferson DR (1985) Virtual time. ACM Trans Program Lang Syst 7(3):404–425
Kari J (2005) Theory of cellular automata: a survey. Theory Comput Sci 334:3–33
Keyfitz BL (2004) Hold that light! Modeling of traffic flow by differential equations. Stud Math Libr 26:127–153
Kozen DC (1997) Automata and computability. Springer, New York
Laubenbacher R, Pareigis B (2003) Decomposition and simulation of sequential dynamical systems. Adv Appl Math 30:655–678
Lidl R, Niederreiter H (1997) Finite fields. Cambridge University Press, Cambridge
Liggett TM (2005) Interacting particle systems. Classics in mathematics. Springer, Berlin, Reprint of the 1985 original
Lind DA (1984) Applications of ergodic theory and sofic systems to cellular automata. Phys D 10D:36–44
Lindgren K, Moore C, Nordahl M (1998) Complexity of two-dimensional patterns. J Stat Phys 91(5–6):909–951
Mac Lane S (1998) Category theory for the working mathematician, 2nd edn. Springer, New York, No 5. in GTM
Macy MW, Kitts JA, Flache A (2003) Polarization in dynamic networks: a Hopfield model of emergent structure. In: Dynamic social network modeling and analysis. The National Academies Press, Washington, DC, pp 162–173
Martin O, Odlyzko A, Wolfram S (1984) Algebraic properties of cellular automata. Commun Math Phys 93:219–258
Milligan D, Wilson M (1993) The behavior of affine Boolean sequential networks. Connect Sci 5(2):153–167
Minar N, Burkhart R, Langton C, Manor A (1996) The swarm simulation system: a toolkit for building multi-agent simulations. Santa Fe Institute preprint series. http://www.santafe.edu/research/publications/wpabstract/199606042. Accessed 11 Aug 2008
Misra J (1986) Distributed discrete-event simulation. ACM Comput Surv 18(1):39–65
Moncion T, Hutzler G, Amar P (2006) Verification of biochemical agent-based models using petri nets. In: Robert T (ed) International symposium on agent based modeling and simulation, ABModSim'06. Austrian Society for Cybernetics Studies, pp 695–700. http://www.ibisc.univ-evry.fr/pub/basilic/OUT/Publications/2006/MHA06
Morpurgo D, Serentha R, Seiden P, Celada F (1995) Modelling thymic functions in a cellular automaton. Int Immunol 7:505–516
Mortveit HS, Reidys CM (2001) Discrete, sequential dynamical systems. Discret Math 226:281–295
Mortveit HS, Reidys CM (2004) Reduction of discrete dynamical systems over graphs. Adv Complex Syst 7(1):1–20
Nagel K, Schreckenberg M (1992) A cellular automaton model for freeway traffic. J Phys I 2:2221–2229
Nagel K, Wagner P (2006) Traffic flow: approaches to modelling and control. Wiley, Hoboken, NJ
Nagel K, Schreckenberg M, Schadschneider A, Ito N (1995) Discrete stochastic models for traffic flow. Phys Rev E 51:2939–2949
Nagel K, Rickert M, Barrett CL (1997) Large-scale traffic simulation, vol 1215, Lecture notes in computer science. Springer, Berlin, pp 380–402
Nance RE (1993) A history of discrete event simulation programming languages. ACM SIGPLAN Not 28:149–175
North MJ, Collier NT, Vos JR (2006) Experiences creating three implementations of the repast agent modeling toolkit. ACM Trans Model Comput Simul 16:1–25
Orponen P (1994) Computational complexity of neural networks: a survey. Nord J Comput 1:94–110
Orponen P (1996) The computational power of discrete hopfield networks with hidden units. Neural Comput 8:403–415
Park JK, Steiglitz K, Thruston WP (1986) Soliton-like behavior in automata. Phys D 19D:423–432
Reidys C (1998) Acyclic orientations of random graphs. Adv Appl Math 21:181–192
Reidys CM (2001) On acyclic orientations and sequential dynamical systems. Adv Appl Math 27:790–804
Reidys CM (2005) On certain morphisms of sequential dynamical systems. Discret Math 296(2–3):245–257
Reidys CM (2006) Sequential dynamical systems over words. Ann Comb 10(4):481–498
Rickert M, Nagel K, Schreckenberg M, Latour A (1996) Two lane traffic simulations using cellular automata. Phys A 231:534–550
Rothman DH (1988) Cellular-automaton fluids: a model for flow in porous media. Geophysics 53:509–518
Russell S, Norwig P (2003) Artificial intelligence: a modern approach. Prentice-Hall, Upper Saddle River
Schönfisch B, de Roos A (1999) Synchronous and asynchronous updating in cellular automata. BioSystems 51:123–143
Shmulevich I, Dougherty ER, Kim S, Zhang W (2002a) Probabilistic boolean networks: a rule-based uncertainty model for gene regulatory networks. Bioinformatics 18(2):261–274
Shmulevich I, Dougherty ER, Zhang W (2002b) From boolean to probabilistic boolean networks as models of genetic regulatory networks. Proc IEEE 90(11):1778–1792
Sipser M (1997) Introduction to the theory of computation. PWS Publishing Co, Boston
Vasershtein L (1969) Markov processes over denumerable products of spaces describing large system of automata. Probl Peredachi Inf 5(3):64–72
von Neumann J, Burks AW (eds) (1966) Theory of self-reproducing automata. University of Illinois Press, Champaign
Whitham G (1999) Linear and nonlinear waves, reprint edn. Pure and applied mathematics: a Wiley-Interscience series of texts, monographs and tracts. Wiley-Interscience, New York
Wolfram S (1983) Statistical mechanics of cellular automata. Rev Mod Phys 55:601–644
Wolfram S (1986) Theory and applications of cellular automata, vol 1, Advanced series on complex systems. World Scientific Publishing Company, Singapore
Wolfram S (2002) A new kind of science. Wolfram Media, Champaign
Wooldridge M (2002) Introduction to multiagent systems. Wiley, Chichester
Logic and Geometry of Agents in Agent-Based Modeling

Samson Abramsky
Department of Computer Science, University of Oxford, Oxford, UK

Interaction A pattern of actions in a multiagent system. Each action is performed by some agent and may be observed by others.
Linear logic A substructural logic in which the operations of copying and deleting premises are not allowed in general.

Computation in the age of the Internet Instead of isolated systems, with rudimentary interactions with their environment, the standard unit of description or design becomes a process or agent, the essence of whose behavior is how it interacts with its environment.
Interaction Complex behavior arises as the global effect of a system of interacting agents (or processes). The key building block is the agent. The key operation is interaction – plugging agents together so that they interact with each other.
Logic and Geometry of Agents in Agent-Based Modeling, Fig. 2 Tertium non datur?
Logic and Geometry of Agents in Agent-Based Modeling, Fig. 3 How to beat a Grandmaster
from one place to another. Indeed, as shall eventually be seen, such processes are computationally universal.

The geometry of information flow
From a dynamical point of view, the copycat strategy realizes a channel between the two game boards, by performing the actions of copying moves. But there is also some implicit geometry here. Indeed, the very idea of two boards laid out side by side appeals to some basic underlying spatial structure. In these terms, the copycat channel can also be understood geometrically, as creating a graphical link between these two spatial locations. These two points of view are complementary and link the logical perspective to powerful ideas arising in modern geometry and mathematical physics.

Further evidence that the copycat strategy embodies more substantial ideas than might at first be apparent can be obtained by varying the scenario. Consider now the case where we play against Kasparov on three boards; one as Black and two as White.

Kasparov   Kasparov   Kasparov
   B          W          W
   W          B          B

Does the copycat strategy still work here? In fact, it can easily be seen that it does not. Suppose Kasparov makes an opening move m1 in the left-hand board where he plays as White; we copy it to the board where we play as White; he responds with m2; and we copy m2 back to the board where Kasparov opened. So far, all has proceeded as in our original scenario. But now Kasparov has the option of playing a different opening move, m3 say, in the rightmost board. We have no idea how to respond to this move; nor can we copy it anywhere, since the board where we play as White is already "in use." This shows that these simple ideas already lead us naturally to the setting of a resource-sensitive logic, in which in particular the contraction rule, which can be expressed as A → A ∧ A (or equivalently as ¬A ∨ (A ∧ A)), cannot be assumed to be valid.

What about the other obvious variation, where we play on two boards as White and one as Black?

Kasparov   Kasparov   Kasparov
   B          B          W
   W          W          B

It seems that the copycat strategy does still work here, since we can simply ignore one of the boards where we play as White. However, a geometrical property of the original copycat strategy has been lost, namely, a connectedness property that information flows to every part of the system. This at least calls the corresponding logical principle of weakening, which can be expressed as A ∧ A → A (or equivalently as ¬A ∨ ¬A ∨ A), into question.

These remarks indicate that we are close to the realm of linear logic and its variants and, mathematically, to the world of monoidal (rather than Cartesian) categories.

Game Semantics
These ideas find formal expression in game semantics. Games play the role of:

• Interface types for computation modules
• Propositions with dynamic content

In particular, two-person games capture the duality of:

• Player versus opponent
• System versus environment

Agents are Strategies In this setting, agents or processes can be modeled as strategies for playing
Logic and Geometry of Agents in Agent-Based Modeling, Fig. 4 Two composable systems
some move m in C. This is visible only to Nigel, who as a strategy for B⊸C has a response. Suppose this response m1 is in B. This is a move by Nigel as player in B⊥, hence appears to Gary as a move by opponent in B. Gary as a strategy for A⊸B has a response m2 to this move. If this response is again in B, Nigel sees it as a response by the environment to his move and will have a response again and so on. Thus there is a sequence of moves m1, ..., mk in B, ping-ponging back and forth between Nigel and Gary. If, eventually, Nigel responds to Gary's last move by playing in C, or Gary responds to Nigel's last move by playing in A, then this provides the response of the composed strategy Gary and Nigel to the original move m. Indeed, all that is visible to the environment is that it played m, and eventually some response appeared in A or C.
Moreover, if both Nigel and Gary are winning strategies, then so is the composed strategy; and the composed strategy will not get stuck forever in the internal ping-pong in B. To see this, suppose for a contradiction that it did in fact get stuck in B. Then there would be an infinite play in B following the winning strategy Gary for player in B and the same infinite play following the winning strategy Nigel for player in B⊥, hence for opponent in B. Hence the same play would count as a win for both player and opponent. This yields the desired contradiction.
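This composition-by-internal-ping-pong is easy to make concrete. The following minimal Python sketch (an illustration only, not part of the original article; all names are hypothetical, and strategies are taken to be stateless rather than history-dependent for brevity) models a strategy as a callable mapping a visible move, tagged with the component (A, B, or C) in which it is played, to a response. Composition relays moves through the shared component B until a response in A or C emerges; composing two copycat strategies yields a copycat again, which is the "channel" behavior discussed earlier.

```python
def compose(sigma, tau):
    """Compose sigma (a strategy on A -o B) with tau (a strategy on B -o C)
    into a strategy on A -o C. Moves are pairs (component, content)."""
    def composite(env_move):
        # Route the environment's move to whichever strategy can see it.
        if env_move[0] == "A":
            producer, move = "sigma", sigma(env_move)
        else:  # a move in C
            producer, move = "tau", tau(env_move)
        # Moves in B are internal: they ping-pong between the two strategies
        # until one of them answers in A or C.
        while move[0] == "B":
            if producer == "sigma":
                producer, move = "tau", tau(move)
            else:
                producer, move = "sigma", sigma(move)
        return move  # the only thing the environment ever sees
    return composite

# Copycat strategies simply relay a move to the other board.
copycat_ab = lambda m: ("B", m[1]) if m[0] == "A" else ("A", m[1])
copycat_bc = lambda m: ("C", m[1]) if m[0] == "B" else ("B", m[1])

copycat_ac = compose(copycat_ab, copycat_bc)
assert copycat_ac(("A", "e4")) == ("C", "e4")
assert copycat_ac(("C", "d5")) == ("A", "d5")
```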
Discussion

Game semantics in the sense discussed in this section has had an extensive development over the past decade and a half, with a wealth of applications to the semantics of programming languages, type theories, and logics (Abramsky and Jagadeesan 1994b; Abramsky and McCusker 1997, 1999a, b; Abramsky and Mellies 1999; Abramsky et al. 2000; Hyland and Ong 2000). More recently, there has been an algorithmic turn and some striking applications to verification and program analysis (Abramsky 2002; Abramsky et al. 2004; Ghica and McCusker 2000; Murawski et al. 2005).
From the point of view of the general analysis of information, there are the following promising lines of development:

• Game semantics provides a promising arena for exploring the combination of quantitative and qualitative theories of information. In particular, it provides a setting for quantifying information flow between agents. Important quantitative questions can be asked about rate of information flow through a strategy (representing a program or a proof); how can a system gain maximum information from its environment while providing minimal information in return; robustness in the presence of noise, etc.
• As in the discussion of the copycat strategy, there is an intuition of logical principles arising as conservation laws for information flow. (And indeed, in the case of multiplicative linear logic, the proofs correspond exactly to "generalized copycat strategies.") Can this intuition be developed into a full-fledged theory? Can logical principles be characterized as those expressing the conservation principles of this information flow dynamics?
• There is also the hope that the more structured setting of game semantics will usefully constrain the exuberant variety of possibilities offered by process algebra and allow a sharper exploration of the logical space of possibilities for information dynamics. This has already been borne out in part by the success of game semantics in exploring the space of programming language semantics. It has been possible to give crisp characterizations of the "shapes" of computations carried out within certain programming disciplines: including purely functional programming (Abramsky et al. 2000; Hyland and Ong 2000), stateful programming (Abramsky and McCusker 1997, 1999a), general references (Abramsky et al. 1998), programming with nonlocal jumps and exceptions (Laird 1997, 2001), non-determinism (Harmer and McCusker 1999),
probability (Danos and Harmer 2002), concurrency (Ghica and Murawski 2004; Ghica and Murawski 2006), names (Abramsky et al. 2004b), polymorphism (Abramsky and Jagadeesan 2005; Hughes 2000), and more. See (Abramsky and McCusker 1999b) for an overview (now rather out of date).
There has also been a parallel line of development of giving full completeness results for a range of logics and type theories, characterizing the "space of proofs" for a logic in terms of informatic or geometric constraints which pick out those processes which are proofs for that logic (Abramsky and Jagadeesan 1994b; Abramsky and Mellies 1999; Blute et al. 1998, 2005; Devarajan et al. 1999; Loader 1994). This allows a new look at such issues as the boundaries between classical and constructive logic or the fine structure of polymorphism and second-order quantification.
• This also gives some grounds for optimism that what computational processes are can be captured – in a "machine-independent", and moreover "geometrical", noninductive way – without referring back to Turing machines or any other explicit machine model.
• In the same spirit as for computability, can polynomial time computation and other complexity classes be characterized in such terms?

graphical calculi which provide an intuitive and visually appealing window onto the various formalisms to be encountered. These calculi have a substantial mathematical content, founded on the diagrammatic approach to tensor categories; further details can be found in the references.

Logic
Firstly multiplicative linear logic (Girard 1987), the logic of the linear connectives ⊗, ⅋, (_)⊥ which have already been encountered, will be considered as a basic paradigmatic example.
A key insight (Girard 1987) is that the essential information in a proof in this system is given by a pairwise matching of the occurrences of positive and negative literals in the sequent – a proof structure. For example, the two possible proof structures for the sequent a⊥ ⅋ a⊥, a ⊗ a are:

(Diagram: the two proof structures, each linking the occurrences of a⊥ with the occurrences of a.)
answer is geometric (or topological) in character. For each proof structure, a switching graph can be obtained by deleting, for each occurrence of a subformula A ⅋ B, exactly one of the arcs A — A ⅋ B — B connecting it to its immediate subformulas. If all such switching graphs are trees, the proof structure is said to be a proof net.
A second form of characterization is interactive. An orthogonality relation can be defined between permutations on the set of occurrences of literals in the sequent Γ: f ⊥ g if fg is cyclic. The idea is that f is a candidate proof net, while g is an attempted counterproof – a passage through the literals induced by a choice of switching graph. Note that the alternating composition of f and g expresses interaction between f and g, thought of as strategy and counter-strategy. It generates a path along which information flows around the system.
A semantics of MLL proofs can be given by specifying, for each formula A, a set S of permutations on the set of literal occurrences |A|, such that S = S⊥⊥, where S⊥ = {g | ∀f ∈ S: f ⊥ g}. For a literal, the unique permutation (the identity) is specified:

S(A ⊗ B) = {f + g | f ∈ S(A) ∧ g ∈ S(B)}⊥⊥
S(A ⅋ B) = S(A⊥ ⊗ B⊥)⊥

Here f + g is a disjoint union of permutations, expressing the absence of information flow (or information independence).

Theorem 1 (Sequentializability (Girard 1987) and Full Completeness (Abramsky and Jagadeesan 1994b)) Let f be a literal-respecting involution on |Γ|. The following are equivalent: (i) f is the permutation assigned to a sequent proof of Γ; (ii) f is a proof net; (iii) f ∈ S(Γ).

This shows that the geometric and interactive characterizations of the space of proofs coincide.
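The interactive test, f ⊥ g precisely when the composite permutation fg is a single cycle, is easy to experiment with. The short Python sketch below is an illustration only and not part of the original article: the permutations and the labeling of the four literal occurrences are made up for the example, and no particular sequent is claimed.

```python
def compose_perms(f, g):
    """Return the permutation fg (apply g first, then f)."""
    return {x: f[g[x]] for x in g}

def is_cyclic(perm):
    """True if the permutation consists of a single cycle covering its domain."""
    start = next(iter(perm))
    seen, x = set(), start
    while x not in seen:
        seen.add(x)
        x = perm[x]
    return len(seen) == len(perm)

def orthogonal(f, g):
    """f is orthogonal to g when their composite fg is a single cycle."""
    return is_cyclic(compose_perms(f, g))

# f: a candidate matching of four literal occurrences 0..3 (an involution).
# g: a permutation standing in for one induced by a switching (made up here).
f = {0: 1, 1: 0, 2: 3, 3: 2}
g = {0: 2, 2: 1, 1: 3, 3: 0}
print(orthogonal(f, g))   # True: fg is the single 4-cycle 0 -> 3 -> 1 -> 2 -> 0
```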
Further Developments From a computational perspective, the multiplicative connectives embody concurrency and causal independence. The scope of the enterprise is greatly expanded when the other levels of connective in linear logic are incorporated:

Additives: The additive conjunction and disjunction allow causality, conflict, and case analysis/conditionals to be expressed. The interaction between the additive and multiplicative levels is rather subtle. The theory sketched above for the multiplicatives has in large part been lifted to multiplicative additive linear logic (Abramsky and Mellies 1999), but a number of key questions and issues needed for a deeper analysis remain to be investigated.

Exponentials: The multiplicative fragment only allows linear time computation to be expressed (under the Curry–Howard paradigm). For a full analysis of computationally expressive systems, it is necessary to allow for copying and deleting, as regulated by the exponential connectives of linear logic. Existing results have extended the multiplicative theory to various systems of typed λ-calculus (corresponding to various forms of intuitionistic multiplicative exponential linear logic) and have begun to investigate systems which, by constraining the exponential types in certain natural ways, capture significant complexity classes, especially PTIME.

Diagram Algebras
It will now be indicated how this apparently very specialized corner of proof theory in fact connects directly to a broad topic arising in representation theory and knot theory, with connections to mathematical physics. On the one hand, some structure will be lost, by obliterating the distinction
between ⊗ and ⅋; this corresponds to moving from *-autonomous to compact closed categories. This means that the formula tree structure can be dispensed with altogether; it is simply a matter of connecting up literal occurrences, which shall be drawn as "joining up the dots."
compact closed categories show up in many contexts of interest!
On the other hand, rather than one-sided sequents, general arrows or two-sided sequents will be represented diagrammatically. This means arrows of the form

A1 ⊗ ⋯ ⊗ An → B1 ⊗ ⋯ ⊗ Bm

and

(Diagram omitted.)

Temperley–Lieb Algebra
The Temperley–Lieb algebra played a central role in the Jones polynomial invariant of knots (VFR 1985) and ensuing developments. It was originally presented, rather forbiddingly, in terms of abstract generators and relations (VFR 1985). It was recast in beautifully elementary and conceptual terms by Louis Kauffman as a planar diagram algebra (Kauffman 1990).

(Diagram: Motivation and Generators – rows of dots 1, 2, 3, ..., n joined by planar strings; the generators U1, ..., Un−1.)

The general form of an element of the algebra (actually of the basic multiplicative monoid: the algebra is then constructed freely over this as the "monoid algebra") is obtained by "joining up the dots" in a planar fashion. Multiplication xy is defined by identifying the bottom row of x with the top row of y. In general loops may be formed – these are "scalars," which can float freely across these figures, represented symbolically by δ above.
How does this connect to knots? A key conceptual insight is due to Kauffman, who saw how to recast the Jones polynomial in elementary combinatorial form in terms of his bracket polynomial. The basic idea of the bracket polynomial is expressed by the following equation:
Logic and Geometry of Agents in Agent-Based Modeling, Fig. 5 Graphical calculus for monoidal categories
Logic and Geometry of Agents in Agent-Based Modeling, Fig. 7 Linear Conjunction - no information flow
Kets, Bras, and Scalars A special role is played by boxes with either no input or no output, corresponding to states and costates, respectively (cf. Dirac's kets and bras (PAM 1947)), which are depicted by triangles. Scalars then arise naturally by composing these elements (cf. inner product or Dirac's bra-ket):

(Diagram: a costate π composed with a state ψ, yielding the scalar π∘ψ.)

Logic and Geometry of Agents in Agent-Based Modeling, Fig. 11 Teleportation

Bell States and Costates The cups and caps which have already appeared in their various guises as axiom and cut links, or in abstraction and application, now take on the role of Bell states and costates, the fundamental building blocks of quantum entanglement. (Mathematically, they arise as the transpose and co-transpose of the identity, which exist in any finite-dimensional
The formation of names and conames of arrows (i.e., map-state and map-costate duality) is conveniently depicted thus:

(Diagram: the name and coname of an arrow f.)

Further Directions
This article has described what is still very much an emerging area of research, rather than surveying an established field. Based on the progress which has already been made, a number of promising directions for future work are apparent, which may lead to important contributions to the general study of agent-based systems:
Bibliography

Abramsky S (2002) Algorithmic game semantics: a tutorial introduction. In: Proof and system-reliability. Kluwer, Dordrecht
Abramsky S (2004) High-level methods for quantum computation and information. In: Proceedings of the 19th annual IEEE symposium on logic in computer science. IEEE Computer Science Press, Los Alamitos, pp 410–414
Abramsky S (2005) Abstract scalars, loops, and free traced and strongly compact closed categories. In: Proceedings of CALCO 2005. Springer lecture notes in computer science, vol 3629. Springer, Berlin, pp 1–31
Abramsky S (2007) Temperley–Lieb algebras and geometry of interaction. In: Chen G, Kauffman L, Lamonaco S (eds) Mathematics of quantum computing and technology. Chapman and Hall/CRC, Boca Raton, pp 413–458
Abramsky S, Coecke B (2004) A categorical semantics of quantum protocols. In: Proceedings of the 19th Annual IEEE symposium on logic in computer science: LICS 2004. IEEE Computer Society, Los Alamitos, pp 415–425
Abramsky S, Coecke B (2005) Abstract physical traces. Theory Appl Categ 14:111–124
Abramsky S, Jagadeesan R (1994a) New foundations for the geometry of interaction. Inf Comput 111:53–119
Abramsky S, Jagadeesan R (1994b) Games and full completeness for multiplicative linear logic. J Symb Log 59:543–574
Abramsky S, Jagadeesan R (2005) A game semantics for generic polymorphism. Ann Pure Appl Log 133:3–37
Abramsky S, McCusker G (1997) Linearity, sharing and state. In: O'Hearn P, Tennent RD (eds) Algol-like languages. Birkhauser, Basel, pp 317–348
Abramsky S, McCusker G (1999a) Full abstraction for idealized Algol with passive expressions. Theor Comput Sci 227:3–42
Abramsky S, McCusker G (1999b) Game semantics. In: Computational logic: Proceedings of the 1997 Marktoberdorf Summer School. Springer, Berlin, pp 1–56
Abramsky S, Mellies P-A (1999) Concurrent games and full completeness. In: Proceedings of the 14th international symposium on logic in computer science. Computer Society Press of the IEEE, Los Alamitos, pp 431–442
Abramsky S, Honda K, McCusker G (1998) A fully abstract game semantics for general references. In: Proceedings of the 13th international symposium on logic in computer science. Computer Society Press of the IEEE, Los Alamitos, pp 334–344
Abramsky S, Jagadeesan R, Malacaria P (2000) Full abstraction for PCF. Inf Comput 163:409–470
Abramsky S, Ghica DR, Murawski AS, Ong C-HL (2004a) Applying game semantics to compositional software modeling and verification. In: Proceedings of the TACAS'04. LNCS, vol 2988. pp 421–435
Abramsky S, Ghica DR, Murawski AS, Stark IDB, Ong C-HL (2004b) Nominal games and full abstraction for the nu-calculus. In: Proceedings of the LICS'04. IEEE Computer Society Press, Los Alamitos, pp 150–159
Blute R, Scott PJ (1998) The Shuffle Hopf Algebra and noncommutative full completeness. J Symb Log 63(4):1413–1436
Blute R, Hamano M, Scott PJ (2005) Softness of hypercoherences and MALL full completeness. Ann Pure Appl Log 131(1–3):1–63
Danos V, Harmer R (2002) Probabilistic game semantics. ACM Trans Comput Log 3(3):359–382
Devarajan H, Hughes D, Plotkin G, Pratt V (1999) Full completeness of the multiplicative linear logic of Chu spaces. In: Proceedings of the 14th Annual IEEE symposium on logic in computer science. pp 234–242
Ghica DR, McCusker G (2000) Reasoning about idealized algol using regular languages. In: Proceedings of the ICALP'00. LNCS, vol 1853. pp 103–116
Ghica DR, Murawski AS (2004) Angelic semantics of finegrained concurrency. In: Proceedings of the FOSSACS'04. LNCS, vol 2987. pp 211–225
Ghica DR, Murawski AS (2006) Compositional model extraction for higher-order concurrent programs. In: Proceedings of the TACAS'06. LNCS
Girard J-Y (1987) Linear logic. Theor Comput Sci 50(1):1–102
Girard J-Y (1989) Geometry of interaction I: interpretation of system F. In: Ferro R et al (eds) Logic Colloquium '88. Elsevier, Amsterdam, pp 221–260
Harmer R, McCusker G (1999) A fully abstract game semantics for finite nondeterminism. In: Proceedings of the 14th Annual IEEE symposium on logic in computer science. IEEE Computer Society Press, Los Alamitos
Hughes D (2000) Hypergame semantics: full completeness for system F. D Phil mathematical sciences. Oxford University, Oxford
Hyland JME, Ong C-HL (2000) On full abstraction for PCF: i. Models, observables and the full abstraction problem, ii. Dialogue games and innocent strategies, iii. A fully abstract and universal game model. Inf Comput 163:285–408
Kauffman LH (1990) An invariant of regular isotopy. Trans Am Math Soc 318(2):417–471
Kelly GM, Laplaza ML (1980) Coherence for compact closed categories. J Pure Appl Algebra 19:193–213
Laird J (1997) Full abstraction for functional languages with control. In: Proceedings of the 12th Annual symposium on logic in computer science, LICS '97. Extended abstract
Laird J (2001) A fully abstract games semantics of local exceptions. In: Proceedings of the 16th Annual symposium on logic in computer science, LICS '01. Extended abstract
Loader R (1994) Models of Lambda calculi and linear logic. PhD thesis, Oxford University, Oxford
Murawski AS, Ong C-HL, Walukiewicz I (2005) Idealized Algol with ground recursion and DPDA equivalence. In: Proceedings of the ICALP'05. LNCS, vol 3580. pp 917–929
PAM D (1947) Principles of quantum mechanics. Oxford University Press, Oxford
VFR J (1985) A polynomial invariant for links via von Neumann algebras. Bull Am Math Soc 129:103–112

Books and Reviews
Girard J-Y, Lafont Y, Taylor P (1989) Proof and types. Cambridge Tracts in Theoretical Computer Science
Hindley JR, Seldin JP (1986) Introduction to combinators and the λ-calculus. Cambridge University Press, Cambridge
Kauffman LH (1994) Knots in physics. World Scientific Press, Singapore
Troelstra AS (1992) Lectures on linear logic. Center for the Study of Language and Information Lecture Notes No. 29
Agent-Based Modeling and Artificial Life

Charles M. Macal
Center for Complex Adaptive Agent Systems Simulation (CAS2), Decision and Information Sciences Division, Argonne National Laboratory, Argonne, IL, USA

Article Outline
Glossary
Definition of the Subject
Introduction
Artificial Life
ALife in Agent-Based Modeling
Future Directions
Bibliography

Glossary
Adaptation The process by which organisms (agents) change their behavior or by which populations of agents change their collective behaviors with respect to their environment.
Agent-based modeling (ABM) A modeling and simulation approach applied to a complex system or complex adaptive system, in which the model is comprised of a large number of interacting elements (agents).
Ant colony optimization (ACO) A heuristic optimization technique motivated by collective decision processes followed by ants in foraging for food.
Artificial chemistry Chemistry based on the information content and transformation possibilities of molecules.
Artificial life (ALife) A field that investigates life's essential qualities primarily from an information content perspective.
Artificial neural network (ANN) A heuristic optimization and simulated learning technique motivated by the neuronal structure of the brain.
Autocatalytic set A closed set of chemical reactions that is self-sustaining.
Autonomous The characteristic of being capable of making independent decisions over a range of situations.
Avida An advanced artificial life computer program developed by Adami (Adami 1998) and others that models populations of artificial organisms and the essential features of life such as interaction and replication.
Biologically inspired computational algorithm Any kind of algorithm that is based on biological metaphors or analogies.
Cellular automaton (CA) A mathematical construct and technique that models a system in discrete time and discrete space in which the state of a cell depends on transition rules and the states of neighboring cells.
Coevolution A process by which many entities adapt and evolve their behaviors as a result of mutually effective interactions.
Complex system A system comprised of a large number of strongly interacting components (agents).
Complex adaptive system (CAS) A system comprised of a large number of strongly interacting components (agents) that adapt at the individual (agent) level or collectively at the population level.
Decentralized control A feature of a system in which the control mechanisms are distributed over multiple parts of the system.
Digital organism An entity that is represented by its essential information-theoretic elements (genomes) and implemented as a computational algorithm or model.
Downward causation The process by which a higher-order emergent structure takes on its own emergent behaviors and these behaviors exert influence on the constituent agents of the emergent structure.
At about the same time, the introduction of the personal computer suddenly made computing accessible, convenient, inexpensive, and compelling as an experimental tool. The future seemed to have almost unlimited possibilities for the development of ALife computer programs to explore life and its possibilities. Thus, several ALife software programs emerged that sought to encapsulate the essential elements of life through incorporation of ALife-related algorithms into easily usable software packages that could be widely distributed. Computational programs for modeling populations of digital organisms, such as Tierra, Avida, and Echo, were developed along with more general purpose agent-based simulators such as Swarm.
Yet, the purpose of ALife was never restricted to understanding or recreating life as it exists today. According to Langton:

Artificial systems which exhibit lifelike behaviors are worthy of investigation on their own right, whether or not we think that the processes that they mimic have played a role in the development or mechanics of life as we know it to be. Such systems can help us expand our understanding of life as it could be. (p. xvi in Langton 1989a)

The field of ALife addresses lifelike properties of systems at an abstract level by focusing on the information content of such systems independent of the medium in which they exist, whether it be biological, chemical, physical, or in silico. This means that computation, modeling, and simulation play a central role in ALife investigations.
The relationship between ALife and ABM is complex. A case can be made that the emergence of ALife as a field was essential to the creation of agent-based modeling. Computational tools were both required and became possible in the 1980s for developing sophisticated models of digital organisms and general purpose artificial life simulators. Likewise, a case can be made that the possibility for creating agent-based models was essential to making ALife a promising and productive endeavor. ABM made it possible to understand the logical outcomes and implications of ALife models and lifelike processes. Traditional analytical means, although valuable in establishing baseline information, were limited in their capabilities to include essential features of ALife. Many threads of ALife are still intertwined with developments in ABM and vice versa. Agent-based models demonstrate the emergence of lifelike features using ALife frameworks; ALife algorithms are widely used in agent-based models to represent agent behaviors. These threads are explored in this entry. In ALife terminology, one could say that ALife and ABM have coevolved to their present states. In all likelihood, they will continue to do so.
This entry covers in a necessarily brief and perhaps superficial, but broad, way these relationships between ABM and ALife and extrapolates to future possibilities. This entry is organized as follows. Section "Artificial Life" introduces artificial life, its essential elements, and its relationship to computing and agent-based modeling. Section "ALife in Agent-Based Modeling" describes several examples of ABM applications spanning many scales. Section "Future Directions" concludes with future directions for ABM and ALife. A bibliography is included for further reading.

Artificial Life

Artificial life was initially motivated by the need to model biological systems and brought with it the need for computation. The field of ALife has always been multidisciplinary and continues to encompass a broad research agenda covering a variety of topics from a number of disciplines, including:

• Essential elements of life and artificial life
• Origins of life and self-organization
• Evolutionary dynamics
• Replication and development processes
• Learning and evolution
• Emergence
• Computation of living systems
• Simulation systems for studying ALife
• Many others

Each of these topics has threads leading into agent-based modeling.

The Essence of ALife
The essence of artificial life is summed up by Langton (p. xxii in Langton (1989a)) with a list of essential characteristics:
• Lifelike behavior on the part of man-made systems
• Semiautonomous entities whose local interactions with one another are governed by a set of simple rules
• Populations, rather than individuals
• Simple rather than complex specifications
• Local rather than global control
• Bottom-up rather than top-down modeling
• Emergent rather than prespecified behaviors

Langton observes that complex high-level dynamics and structures often emerge (in living and artificial systems), developing over time out of the local interactions among low-level primitives. Agent-based modeling has grown up around the need to model the essentials of ALife.

Self-Replication and Cellular Automata
Artificial life traces its beginnings to the work of John von Neumann in the 1940s and investigations into the theoretical possibilities for developing a self-replicating machine (Taub 1961). Such a self-replicating machine carries instructions not only for its operations but also for its replication. The issue is concerned with how to replicate such a machine that contained the instructions for its operation along with the instructions for its replication. Did a machine to replicate such a machine need to contain both the instructions for the machine's operation and replication, as well as instructions for replicating the instructions on how to replicate the original machine? (see Fig. 1.) Von Neumann used the abstract mathematical construct of cellular automata, originally conceived in discussions with Stanislaw Ulam, to prove that such a machine could be designed, at least in theory. Von Neumann was never able to build such a machine due to the lack of sophisticated computers that existed at the time.
Cellular automata (CA) have been central to the development of computing artificial life models. Virtually all of the early agent-based models that required agents to be spatially located were in the form of von Neumann's original cellular automata. A cellular automaton is a finite-state machine in which time and space are treated as discrete rather than continuous, as would be the case, for example, in differential equation models. A typical CA is a two-dimensional grid or lattice consisting of cells. Each cell assumes one of a finite number of states at any time. A cell's neighborhood is the set of cells surrounding a cell, typically a five-cell neighborhood (von Neumann neighborhood) or a nine-cell neighborhood (Moore neighborhood), as in Fig. 2.
Agent-Based Modeling and Artificial Life, Fig. 1 Von Neumann’s self-replication problem
A set of simple state transition rules determines the value of each cell based on the cell's state and the states of neighboring cells. Every cell is updated at each time according to the transition rules. Each cell is identical in terms of its update rules. Cells differ only in their initial states. A CA is deterministic in the sense that the same state for a cell and its set of neighbors always results in the same updated state for the cell. Typically, CAs are set up with periodic boundary conditions, meaning that the set of cells on one edge of the grid boundary is the neighbor cells to the cells on the opposite edge of the grid boundary. The space of the CA grid forms a surface on a toroid, or donut-shape, so there is no boundary per se. It is straightforward to extend the notion of cellular automata to two, three, or more dimensions.
Von Neumann solved the self-replication problem by developing a cellular automaton in which each cell had 29 possible states and five neighbors (including the updated cell itself). In the von Neumann neighborhood, neighbor cells are in the north, south, east, and west directions from the updated cell.

The Game of Life
Conway's Game of Life, or Life, developed in the 1970s, is an important example of a CA (Berlekamp et al. 2003; Gardner 1970; Poundstone 1985). The simplest way to illustrate some of the basic ideas of agent-based modeling is through a CA. The Game of Life is a two-state, nine-neighbor cellular automaton with three rules that determine the state (either On, i.e., shaded, or Off, i.e., white) of each cell:

1. A cell will be On in the next generation if exactly three of its eight neighboring cells are currently On.
2. A cell will retain its current state if exactly two of its neighbors are On.
3. A cell will be Off otherwise.

Initially, a small set of On cells is randomly distributed over the grid. The three rules are then applied repeatedly to all cells in the grid. After several updates of all cells on the grid, distinctive patterns emerge, and in some cases these patterns can sustain themselves indefinitely throughout the simulation (Fig. 3). The state of each cell is based only on the current state of the cell and the cells touching it in its immediate neighborhood. The nine-neighbor per neighborhood assumption built into Life determines the scope of the locally available information for each cell to update its state.
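These three rules translate almost directly into code. The following minimal Python sketch (an illustration, not part of the original entry; grid size and initial density are arbitrary choices) applies the rules synchronously to a small grid with the periodic, toroidal boundary conditions described above.

```python
import random

def step(grid):
    """Apply the three Life rules synchronously; the grid wraps around (toroid)."""
    n = len(grid)
    new = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Count the On cells among the eight surrounding neighbors.
            on = sum(grid[(i + di) % n][(j + dj) % n]
                     for di in (-1, 0, 1) for dj in (-1, 0, 1)
                     if (di, dj) != (0, 0))
            if on == 3:
                new[i][j] = 1            # Rule 1: exactly three neighbors On
            elif on == 2:
                new[i][j] = grid[i][j]   # Rule 2: keeps its current state
            else:
                new[i][j] = 0            # Rule 3: Off otherwise
    return new

# A small random initial population of On cells, updated repeatedly.
N = 20
grid = [[1 if random.random() < 0.2 else 0 for _ in range(N)] for _ in range(N)]
for _ in range(50):
    grid = step(grid)
```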
Conway showed that, at least in theory, the structures and patterns that can result during a Life computation are complex enough to be the basis for a fully functional computer that is complex enough to spontaneously generate self-replicating structures (see the section below on universal computation). Two observations are important about the Life rules:

• As simple as the state transition rules are, by using only local information, structures of arbitrarily high complexity can emerge in a CA.
• The specific patterns that emerge are extremely sensitive to the specific rules used. For example, changing Rule 1 above to "A cell will be
On in the next generation if exactly four of its eight neighboring cells are currently On" results in the development of completely different patterns.
• The Game of Life provides insights into the role of information in fundamental life processes.

Cellular Automata Classes Wolfram investigated the possibilities for complexity in cellular automata across the full range of transition rules and initial states, using one-dimensional cellular automata (Wolfram 1984). He categorized four distinct classes for the resulting patterns produced by a CA as it is solved repeatedly over time. These are:

• Class I: homogeneous state
• Class II: simple stable or periodic structure
• Class III: chaotic (non-repeating) pattern
• Class IV: complex patterns of localized structures

The most interesting of these is Class IV cellular automata, in which very complex patterns of non-repeating localized structures emerge that are often long lived. Wolfram showed that these Class IV structures were also complex enough to support universal computation (Wolfram 2002). Langton (1992) coined the term "life at the edge of chaos" to describe the idea that Class IV systems are situated in a thin region between Class II and Class III systems. Agent-based models often yield Class I, Class II, and Class III behaviors.
Other experiments with CAs investigated the simplest representations that could replicate themselves and produce emergent structures. Langton's loop is a self-replicating two-dimensional cellular automaton, much simpler than von Neumann's (Langton 1984). Although not complex enough to be a universal computer, Langton's loop was the simplest known structure that could reproduce itself. Langton's ant is a two-dimensional CA with a simple set of rules, but complicated emergent behavior. Following a simple set of rules for moving from cell to cell, a simulated ant displays unexpectedly complex behavior. After an initial period of chaotic movements in the vicinity of its initial location, the ant begins to build a recurrent pattern of regular structures that repeats itself indefinitely (Langton 1986). Langton's ant has behaviors complex enough to be a universal computer.

Genotype/Phenotype Distinction
Biologists distinguish between the genotype and the phenotype as hallmarks of biological systems. The genotype is the template – the set of instructions, the specification, and the blueprint – for an organism. DNA is the genotype for living organisms, for example. A DNA strand contains the complete instructions for the replication and development of the organism. The phenotype is the organism – the machine, the product, and the result – that develops from the instructions in the genotype (Fig. 4).
Morphogenesis is the developmental process by which the phenotype develops in accord with the genotype, through interactions with and resources
(Figure labels: Development Process (Genotype Expression); Population of Individuals subject to selection; Phenotype Space.)
obtained from its environment. In a famous paper, Turing (1952) modeled the dynamics of morphogenesis and, more generally, the problem of how patterns self-organize spontaneously in nature. Turing used differential equations to model a simple set of reaction-diffusion chemical reactions. Turing demonstrated that only a few assumptions were necessary to bring about the emergence of wave patterns and gradients of chemical concentration, suggestive of morphological patterns that commonly occur in nature. Reaction-diffusion systems are characterized by the simultaneous processes of attraction and repulsion and are the basis for the agent behavioral rules (attraction and repulsion) in many social agent-based models.
More recently, Bonabeau extended Turing's treatment of morphogenesis to a theory of pattern formation based on agent-based modeling. Bonabeau (1997) states the reason for relying on ABM: "because pattern-forming systems based on agents are (relatively) more easily amenable to experimental observations."

Information Processes One approach to building systems from a genotype specification is based on the methodology of recursively generated objects. Such recursive systems are compact in their specification, and their repeated application can result in complex structures, as demonstrated by cellular automata.
Recursive systems are logic systems in which strings of symbols are recursively rewritten based on a minimum set of instructions. Recursive systems, or term replacement systems, as they have been called, can result in complex structures. Examples of recursive systems include cellular automata, as described above, and Lindenmayer systems, called L-systems (Le Novere and Shimizu 2001). An L-system consists of a formal grammar, which is a set of rules for rewriting strings of symbols. L-systems have been used extensively for modeling living systems, for example, plant growth and development, producing highly realistic renderings of plants, with intricate morphologies and branching structures.
Wolfram (1999) used symbolic recursion as a basis for developing Mathematica, the computational mathematics system based on symbolic processing and term replacement. Unlike numeric programming languages, a symbolic programming language allows a variable to be a basic object and does not require a variable to be assigned a value before it is used in a program.
Any agent-based model is essentially a recursive system. Time is simulated by the repeated application of the agent updating rules. The genotype is the set of rules for the agent behaviors. The phenotype is a set of the patterns and structures that emerge from the computation. As in cellular automata and recursive systems, extremely complex structures emerge in agent-based models that are often unpredictable from examination of the agent rules.
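As a concrete illustration of such string rewriting, the short Python sketch below (not part of the original entry) implements Lindenmayer's classic two-symbol "algae" L-system, in which every A is rewritten to AB and every B to A on each pass; even this minimal grammar produces strings whose lengths grow as the Fibonacci numbers.

```python
def rewrite(s, rules):
    """Apply the production rules to every symbol of s in parallel."""
    return "".join(rules.get(ch, ch) for ch in s)

# Lindenmayer's original "algae" system: A -> AB, B -> A, starting from "A".
rules = {"A": "AB", "B": "A"}
s = "A"
for generation in range(6):
    print(generation, s)
    s = rewrite(s, rules)
# Successive string lengths 1, 2, 3, 5, 8, 13, ... grow as Fibonacci numbers.
```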
Emergence One of the primary motivations for the field of ALife is to understand emergent processes, that is, the processes by which life emerges from its constituent elements. Langton writes: "The 'key' concept in ALife, is emergent behavior" (p. 2 in Langton 1989b). Complex systems exhibit patterns of emergence that are not predictable from inspection of the individual elements. Emergence is described as unexpected, unpredictable, or otherwise surprising. That is, the modeled system exhibits behaviors that are not explicitly built into the model. Unpredictability is due to the nonlinear effects that result from the interactions of entities having simple behaviors. Emergence by these definitions is something of a subjective process.
In biological systems, emergence is a central issue whether it be the emergence of the phenotype from the genotype, the emergence of protein complexes from genomic information networks (Kauffman 1993), or the emergence of consciousness from networks of millions of brain cells.
One of the motivations for agent-based modeling is to explore the emergent behaviors exhibited by the simulated system. In general, agent-based models often exhibit patterns and relationships that emerge from agent interactions. An example is the observed formation of groups of agents that collectively act in coherent and coordinated patterns. Complex adaptive systems, widely investigated by Holland in his agent-based model Echo (Holland 1995), are often structured in hierarchies of emergent structures. Emergent structures can collectively form higher-order structures, using the lower-level structures as building blocks. An emergent structure itself can take on new emergent behaviors. These structures in turn affect the agents from which the structure has emerged in a process called downward causation (Gilbert 2002). For example, in the real world, people organize and identify with groups, institutions, nations, etc. They create norms, laws, and protocols that in turn act on the individuals comprising the group.
From the perspective of agent-based modeling, emergence has some interesting challenges for modeling:

• How does one operationally define emergence with respect to agent-based modeling?
• How does one automatically identify and measure the emergence of entities in a model?
• How do agents that comprise an emergent entity perceived by an observer recognize that they are part of that entity?

Artificial Chemistry
Artificial chemistry is a subfield of ALife. One of the original goals of artificial chemistry was to understand how life could originate from prebiotic chemical processes. Artificial chemistry studies self-organization in chemical reaction networks by simulating chemical reactions between artificial molecules. Artificial chemistry specifies well-understood chemical reactions and other information such as reaction rates, relative molecular concentrations, probabilities of reaction, etc. These form a network of possibilities. The artificial substances and the networks of chemical reactions that emerge from the possibilities are studied through computation. Reactions are specified as recursive algebras and activated as term replacement systems (Fontana 1992).

Hypercycles The emergence of autocatalytic sets, or hypercycles, has been a prime focus of artificial chemistry (Eigen and Schuster 1979). A hypercycle is a self-contained system of molecules and a self-replicating, and thereby self-sustaining, cyclic linkage of chemical reactions. Hypercycles evolve through a process by which self-replicating entities compete for selection. The hypercycle model illustrates how an ALife process can be adapted to the agent-based modeling domain. Inspired by the hypercycle model, Padgett et al. (2003) developed an agent-based model of the coevolution of economic production and economic firms, focusing on skills. Padgett used the model to establish three principles of social organization that provide foundations for the evolution of technological complexity:

• Structured topology (how interaction networks form)
• Altruistic learning (how cooperation and exchange emerge)
• Stigmergy (how agent communication is facilitated by using the environment as a means of information exchange among agents)

Digital Organisms The widespread availability of personal computers spurred the development of ALife programs used to study evolutionary processes in silico. Tierra was the first system devised in which computer programs were successfully able to evolve and adapt (Ray 1991). Avida extended Tierra to account for the spatial distribution of organisms and other features (Ofria and Wilke 2004; Wilke and Adami 2002). Echo is a simulation framework for implementing models to investigate mechanisms that regulate diversity and information processing in complex adaptive systems (CAS), systems comprised of many interacting adaptive agents (Holland 1975, 1995). In implementations of Echo, populations evolve interaction networks, resembling species communities in ecological systems, which regulate the flow of resources.
Systems such as Tierra, Avida, and Echo simulate populations of digital organisms, based on the genotype/phenotype schema. They employ computational algorithms to mutate and evolve populations of organisms living in a simulated computer environment. Organisms are represented as strings of symbols, or agent attributes, in computer memory. The environment provides them with resources (computation time) they need to survive, compete, and reproduce. Digital organisms interact in various ways and develop strategies to ensure survival in resource-limited environments.
Digital organisms are extended to agent-based modeling by implementing individual-based models of food webs in a system called DOVE (Wilke and Chow 2006). Agent-based models allow a more complete representation of agent behaviors and their evolutionary adaptability at both the individual and population levels.

ALife and Computing
Creating lifelike forms through computation is central to artificial life. Is it possible to create life through computation? The capabilities and limitations of computation constrain the types of artificial life that can be created. The history of ALife has close ties with important events in the history of computation.
Alan Turing (1938) investigated the limitations of computation by developing an abstract and idealized computer, called a universal Turing machine (UTM). A UTM has an infinite tape (memory) and is therefore an idealization of any actual computer that may be realized. A UTM is capable of computing anything that is computable, that is, anything that can be derived via a logical, deductive series of statements. Are the algorithms used in today's computers, and in ALife calculations and agent-based models in particular, as powerful as universal computers?
Any system that can effectively simulate a small set of logical operations (such as AND and NOT) can effectively produce any possible computation. Simple rule systems in cellular automata were shown to be equivalent to universal computers (von Neumann 1966; Wolfram 2002) and in principle able to compute anything that is computable – perhaps, even life!
Some have argued that life, in particular human consciousness, is not the result of a logical-deductive or algorithmic process and therefore not computable by a universal Turing machine. This problem is more generally referred to as the mind-body problem (Lucas 1961). Dreyfus (1979) argues against the assumption often made in the field of artificial intelligence that human minds function like general purpose symbol manipulation machines. Penrose (1989) argues that the rational processes of the human mind transcend formal logic systems. In a somewhat different view, biological naturalism contends (Searle 1990) that human behavior might be able to be simulated, but human consciousness is outside the bounds of computation.
Such philosophical debates are as relevant to agent-based modeling as they are to artificial intelligence, for they are the basis of answering the question of what kind of systems and processes agent-based models will ultimately be able, or unable, to simulate.

Artificial Life Algorithms
ALife uses several biologically inspired computational algorithms (Olariu and Zomaya 2006).
Bioinspired algorithms include those based on Darwinian evolution, such as evolutionary algorithms; those based on neural structures, such as neural networks; and those based on decentralized decision-making behaviors observed in nature. These algorithms are commonly used to model adaptation and learning in agent-based modeling or to optimize the behaviors of whole systems.

Evolutionary Computing
Evolutionary computing includes a family of related algorithms and programming solution techniques inspired by evolutionary processes, especially the genetic processes of DNA replication and cell division (Eiben and Smith 2007). These techniques are known as evolutionary algorithms and include the following (Back 1996):

• Genetic algorithms (Goldberg 1989, 1994; Holland 1975; Holland et al. 2000; Mitchell and Forrest 1994)
• Evolution strategies (Rechenberg 1973)
• Learning classifier systems (Holland et al. 2000)
• Genetic programming (Koza 1992)
• Evolutionary programming (Fogel et al. 1966)

Genetic algorithms (GA) model the dynamic processes by which populations of individuals evolve to improved levels of fitness for their particular environment over repeated generations. GAs illustrate how evolutionary algorithms process a population and apply the genetic operations of mutation and crossover (see Fig. 5). Each behavior is represented as a chromosome consisting of a series of symbols, for example, as a series of 0s and 1s. The encoding process establishing correspondence between behaviors and their chromosomal representations is part of the modeling process.
The general steps in a genetic algorithm are as follows:

1. Initialization: Generate an initial population of individuals. The individuals are unique and include specific encoding of attributes in chromosomes that represents the characteristics of the individuals.
2. Evaluation: Calculate the fitness of all individuals according to a specified fitness function.
3. Checking: If any of the individuals has achieved an acceptable level of fitness, stop; the problem is solved. Otherwise, continue with selection.
4. Selection: Select the best pair of individuals in the population for reproduction according to their high fitness levels.
5. Crossover: Combine the chromosomes for the two best individuals through a crossover operation, and produce a pair of offspring.
6. Mutation: Randomly mutate the chromosomes for the offspring.
7. Replacement: Replace the least fit individuals in the population with the offspring.
8. Continue at Step 2.

Steps 5 and 6 above, the operations of crossover and mutation, comprise the set of genetic operators inspired by nature. This series of steps for a GA comprises a basic framework rather than a specific implementation; actual GA implementations include numerous variations and alternative implementations in several of the GA steps (a minimal sketch of the loop is given below).
Evolution strategies (ES) are similar to genetic algorithms but rely on mutation as their primary genetic operator.
Learning classifier systems (LCS) build on genetic algorithms and adaptively assign relative weights to sensor-action sets that result in the most positive outcomes relative to a goal.
Genetic programming (GP) has similar features to genetic algorithms, but instead of using 0s and 1s or other symbols for comprising chromosomes, GPs combine logical operations and directives in a tree structure. In effect, chromosomes in GPs represent whole computer programs that perform a variety of functions with varying degrees of success and efficiencies. GP chromosomes are evaluated against fitness or performance measures and recombined. Better-performing chromosomes are maintained and expand their representation in the population. For example, an application of a GP is to evolve a better-performing rule set that represents an agent's behavior.
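The Python sketch below wires Steps 1 through 8 above into a working loop. It is illustrative only and not part of the original entry: the bit-string encoding, the toy fitness function (count of 1s), and the parameter values are assumptions chosen for brevity.

```python
import random

GENES, POP, GENERATIONS, MUTATION_RATE = 20, 30, 200, 0.01

def fitness(chrom):                      # Step 2: evaluation (toy fitness function)
    return sum(chrom)

def crossover(a, b):                     # Step 5: single-point crossover
    point = random.randrange(1, GENES)
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chrom):                       # Step 6: flip each bit with small probability
    return [1 - g if random.random() < MUTATION_RATE else g for g in chrom]

# Step 1: initialization
population = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]

for generation in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    if fitness(population[0]) == GENES:           # Step 3: checking
        break
    parent1, parent2 = population[0], population[1]   # Step 4: selection
    child1, child2 = crossover(parent1, parent2)      # Step 5: crossover
    child1, child2 = mutate(child1), mutate(child2)   # Step 6: mutation
    population[-2:] = [child1, child2]                # Step 7: replacement
    # Step 8: continue at Step 2 (the loop repeats)

print(fitness(population[0]), "out of", GENES)
```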
Agent-Based Modeling and Artificial Life, Fig. 5 Genetic algorithm (panels: population, selection, chromosomes, crossover and new chromosomes, mutation, replacement, and new population)
ants or particles over a search space. In terms of agent-based modeling, the ants or particles are the agents, and the search space is the environment. Agents have position and state as attributes. In the case of particle swarm optimization, agents also have velocity.
Ant colony optimization (ACO) mimics techniques that ants use to forage and find food efficiently (Bonabeau et al. 1999; Engelbrecht 2006). The general idea of ant colony optimization algorithms is as follows:

1. In a typical ant colony, ants search randomly until one of them finds food.
2. Then they return to their colony and lay down a chemical pheromone trail along the way.
3. When other ants find such a pheromone trail, they are more likely to follow the trail rather than to continue to search randomly.
4. As other ants find the same food source, they return to the nest, reinforcing the original pheromone trail as they return.
5. As more and more ants find the food source, the ants eventually lay down a strong pheromone trail to the point that virtually all the ants are directed to the food source.
6. As the food source is depleted, fewer ants are able to find the food, and fewer ants lay down a reinforcing pheromone trail; the pheromone naturally evaporates, and eventually, no ants proceed to the food source, as the ants shift their attention to searching for new food sources.

In an ant colony optimization computational model, the optimization problem is represented as a graph, with nodes representing places and links representing possible paths. An ant colony algorithm mimics ant behavior with simulated ants moving from node to node in the graph, laying down pheromone trails, etc. The process by which ants communicate indirectly by using the environment as an intermediary is known as stigmergy (Bonabeau et al. 1999) and is commonly used in agent-based modeling.
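A toy version of this scheme fits in a few lines. The Python sketch below is an illustration under simplifying assumptions and is not part of the original entry: only two candidate routes are modeled, ants choose a route with probability proportional to its pheromone, shorter routes are reinforced more strongly, and the pheromone evaporates each iteration.

```python
import random

# Two alternative routes from nest to food, with different lengths.
routes = {"short": 1.0, "long": 2.0}
pheromone = {name: 1.0 for name in routes}
EVAPORATION, N_ANTS = 0.1, 20

for step in range(50):
    deposits = {name: 0.0 for name in routes}
    for _ in range(N_ANTS):
        # An ant follows a route with probability proportional to its pheromone.
        chosen = random.choices(list(routes),
                                weights=[pheromone[n] for n in routes])[0]
        # Shorter routes are reinforced more strongly (ants return sooner).
        deposits[chosen] += 1.0 / routes[chosen]
    # Evaporation plus new deposits: stigmergic communication via the environment.
    for name in pheromone:
        pheromone[name] = (1 - EVAPORATION) * pheromone[name] + deposits[name]

print(pheromone)   # the "short" route accumulates far more pheromone
```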
Particle swarm optimization (PSO) is another decentralized problem-solving technique in which a swarm of particles is simulated as it moves over a search space in search of a global optimum. A particle stores its best position found so far in its memory and is aware of the best positions obtained by its neighboring particles. The velocity of each particle adapts over time based on the locations of the best global and local solutions obtained so far, incorporating a degree of stochastic variation in the updating of the particle positions at each iteration.

Artificial Life Algorithms and Agent-Based Modeling
Biologically inspired algorithms are often used with agent-based models. For example, an agent's behavior and its capacity to learn from experience or to adapt to changing conditions can be modeled abstractly through the use of genetic algorithms or neural networks. In the case of a GA, a chromosome effectively represents a single agent action (output) given a specific condition or environmental stimulus (input). Behaviors that are acted on and enable the agent to respond better to environmental challenges are reinforced and acquire a greater share of the chromosome pool. Behaviors that fail to improve the organism's fitness diminish in their representation in the population.
Evolutionary programming can be used to directly evolve programs that represent agent behaviors. For example, Manson (2006) develops a bounded rationality model using evolutionary programming to solve an agent multi-criteria optimization problem.
Artificial neural networks have also been applied to modeling adaptive agent behaviors, in which an agent derives a statistical relationship between the environmental conditions it faces, its history, and its actions, based on feedback on the success or failures of its actions and the actions of others. For example, an agent may need to develop a strategy for bidding in a market, based on the success of its own and others' previous bids and outcomes.
Finally, swarm intelligence approaches are agent based in their basic structure, as described above. They can also be used for system optimization through the selection of appropriate parameters for agent behaviors.

ALife Summary
Based on the previous discussion, the essential features of an ALife program can be summarized as follows:
Agent-Based Modeling and Artificial Life, Fig. 6 Schelling housing segregation model
Agent-Based Modeling and Artificial Life, Fig. 7 Sugarscape artificial society simulation in the Repast agent-based
modeling toolkit
conflict, and war, as well as externalities such as pollution. As agents interacted with their neighbors as they moved around the grid, the interactions resulted in a contact network, that is, a network consisting of nodes and links. The nodes are agents, and the links indicate the agents that have been neighbors at some point in the course of their movements over the grid. Contact networks were the basis for studying contagion and epidemics in the Sugarscape model. Understanding the agent rules that govern how networks are structured and grow, how quickly information is communicated through networks, and the kinds of relationships that networks embody is an important aspect of modeling agents.

Culture and Generative Social Science
Dawkins, who has written extensively on aspects of Darwinian evolution, coined the term meme as the smallest element of culture that is transmissible between individuals, similar to the notion of the gene as the primary unit for transmitting genetic information (Dawkins 1989). Several social agent-based models are based on a meme representation of culture as shared or collective agent attributes.

In the broadest terms, social agent-based simulation is concerned with social interaction and social processes. Emergence enters into social simulation through generative social science, whose goal is to model social processes as emergent processes and their emergence as the result of social interactions. Epstein has argued that social processes are not fully understood unless one is able to theorize how they work at a deep level and have social processes emerge as part of a computational model (Epstein 2007). More recent work has treated culture as a fluid and dynamic process subject to interpretation by individual agents, more complex than the genotype/phenotype framework would suggest.

ALife and Biology
ALife research has motivated many agent-based computational models of biological systems at all scales, ranging from the cellular level, or even the subcellular molecular level, as the basic unit of agency, to complex organisms embedded in larger structures such as food webs or complex ecosystems.

From Cellular Automata to Cells
Cellular automata are a natural approach to modeling cellular systems (Alber et al. 2003; Ermentrout and Edelstein-Keshet 1993). One approach uses the cellular automata grid and cells to model structures of stationary cells comprising a tissue matrix. Each cell is a tissue agent. Mobile cells consisting of pathogens and antibodies are also modeled as agents. Mobile agents diffuse through tissue and interact with tissue and other colocated mobile cells. This approach is the basis for agent-based models of the immune system. Celada and Seiden (1992) used bit strings to model the cell receptors in a cellular automaton model of the immune system called IMMSIM. This approach was extended to a more general agent-based model and implemented to maximize the number of cells that could be modeled in the CIMMSIM and ParImm systems (Bernaschi and Castiglione 2001). The Basic Immune Simulator uses a general agent-based framework (the Repast agent-based modeling toolkit) to model the interactions between the cells of the innate and adaptive immune system (Folcik et al. 2007). These approaches for modeling the immune system have inspired several agent-based models of intrusion detection for computer networks (see, e.g., Azzedine et al. 2007) and have found use in modeling the development and spread of cancer (Preziosi 2003).

At the more macroscopic level, agent-based epidemic models have been developed using network topologies. These models include people and some representation of pathogens as individual agents for natural (Bobashev et al. 2007) and potentially man-made (Carley et al. 2006) epidemics.

Modeling bacteria and their associated behaviors in their natural environments is another direction of agent-based modeling. Expanding beyond the basic cellular automata structure into continuous space and network topologies, Emonet et al. (2005) developed AgentCell, a multi-scale agent-based model of E. coli bacterial motility (Fig. 8). In this multi-scale agent-based simulation, molecules within a cell are modeled as individual agents. The molecular reactions comprising the signal transduction network for chemotaxis are modeled using an embedded stochastic simulator, StochSim (Le Novere and Shimizu 2001).
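A minimal sketch can make the grid-plus-mobile-agents pattern used by these immune-system models concrete. The grid size, agent counts, and interaction rules below are illustrative placeholders rather than the rules of IMMSIM, CIMMSIM/ParImm, or the Basic Immune Simulator.

```python
import random

SIZE = 20
tissue = [["healthy"] * SIZE for _ in range(SIZE)]   # stationary tissue agents on a grid

class Mobile:
    """A mobile agent (pathogen or antibody) that diffuses over the tissue grid."""
    def __init__(self, kind):
        self.kind = kind
        self.x, self.y = random.randrange(SIZE), random.randrange(SIZE)

    def diffuse(self):
        self.x = (self.x + random.choice((-1, 0, 1))) % SIZE
        self.y = (self.y + random.choice((-1, 0, 1))) % SIZE

agents = [Mobile("pathogen") for _ in range(20)] + [Mobile("antibody") for _ in range(40)]

for step in range(100):
    for a in agents:
        a.diffuse()
        if a.kind == "pathogen":
            tissue[a.x][a.y] = "infected"             # pathogens damage local tissue
    # antibodies remove colocated pathogens (a placeholder interaction rule)
    antibody_sites = {(a.x, a.y) for a in agents if a.kind == "antibody"}
    agents = [a for a in agents
              if not (a.kind == "pathogen" and (a.x, a.y) in antibody_sites)]

print(sum(row.count("infected") for row in tissue), "infected sites,",
      sum(a.kind == "pathogen" for a in agents), "pathogens left")
```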
(Fig. 8 diagram labels: cell attributes (motion: run, tumble; position; orientation; speed), molecule attributes (reactivity; reaction rate), sensors, flagella.)
Agent-Based Modeling and Artificial Life, Fig. 8 AgentCell multi-scale agent-based model of bacterial chemotaxis
This multi-scale approach allows the motile (macroscopic) behavior of colonies of bacteria to be modeled as a direct result of the modeled microlevel processes of protein production within the cells, which are based on individual molecular interactions.

Artificial Ecologies
Early models of ecosystems used approaches adapted from physical modeling, especially models of idealized gases based on statistical mechanics. More recently, individual-based models have been developed to represent the full range of individual diversity by explicitly modeling individual attributes or behaviors and aggregating across individuals for an entire population (DeAngelis and Gross 1992). Agent-based approaches model a diverse set of agents and their interactions based on their relationships, incorporating adaptive behaviors as appropriate. For example, food webs represent the complex, hierarchical network of agent relationships in local ecosystems (Peacor et al. 2006). Agents are individuals or species representatives. Adaptation and learning for agents in such food webs can be modeled to explore diversity, relative population sizes, and resiliency to environmental insult.

Adaptation and Learning in Agent-Based Models
Biologists consider adaptation to be an essential part of the process of evolutionary change. Adaptation occurs at two levels: the individual level and the population level. In parallel with these notions, agents in an ABM adapt by changing their individual behaviors or by changing their proportional representation in the population. Agents adapt their behaviors at the individual level through learning from experience in their modeled environment.

With respect to agent-based modeling, theories of learning by individual agents or collectives of agents, as well as algorithms for how to model learning, become important. Machine learning is a field consisting of algorithms for recognizing patterns in data (such as data mining) through techniques such as supervised learning, unsupervised learning, and reinforcement learning (Alpaydın 2004; Bishop 2007). Genetic algorithms (Goldberg 1989) and related techniques such as learning classifier systems (Holland et al. 2000) are commonly used to represent agent learning in agent-based models. In ABM applications, agents learn through interactions with the simulated environment in which they are embedded as the simulation proceeds through time, and agents modify their behaviors accordingly.
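As a concrete, hedged illustration of the population-level side of this idea, the following sketch uses a simple genetic algorithm in the spirit of Goldberg (1989): candidate behavior rules are encoded as bit strings, and rules that score better on an evaluation function gradually displace worse ones. The fitness function here is an arbitrary toy objective, not a model drawn from this entry.

```python
import random

RULE_LEN, POP, GENERATIONS = 16, 30, 50

def fitness(rule):
    # Toy objective standing in for "how well this behavior rule performs":
    # reward rules with many 1-bits.
    return sum(rule)

def crossover(a, b):
    cut = random.randrange(1, RULE_LEN)
    return a[:cut] + b[cut:]

def mutate(rule, rate=0.02):
    return [bit ^ (random.random() < rate) for bit in rule]

population = [[random.randint(0, 1) for _ in range(RULE_LEN)] for _ in range(POP)]

for gen in range(GENERATIONS):
    def select():
        # tournament selection: rules better suited to the "environment" reproduce
        return max(random.sample(population, 3), key=fitness)
    population = [mutate(crossover(select(), select())) for _ in range(POP)]

print(max(fitness(r) for r in population))
```

Learning classifier systems extend this scheme by evolving populations of condition-action rules whose fitness is updated from the payoff the rules receive as the simulation runs.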
Agents may also adapt collectively at the population level. Those agents having behavioral rules better suited to their environments survive and thrive, and those agents not so well suited are gradually eliminated from the population.

Future Directions

Agent-based modeling continues to be inspired by ALife – in the fundamental questions it is trying to answer, in the algorithms that it employs to model agent behaviors and solve agent-based models, and in the computational architectures that are employed to implement agent-based models. The future of the fields of both ALife and ABM will continue to be intertwined in essential ways in the coming years.

Computational advances will continue at an ever-increasing pace, opening new vistas for computational possibilities in terms of expanding the scale of models that are possible. Computational advances will take several forms, including advances in computer hardware such as new chip designs, multi-core processors, and advanced integrated hardware architectures. Software that takes advantage of these designs, and in particular computational algorithms and modeling techniques and approaches, will continue to provide opportunities for advancing the scale of applications and allow more features to be included in agent-based models as well as ALife applications. These will be opportunities for advancing applications of ABM to ALife in the realms of both scientific research and policy analysis.

Real-world optimization problems routinely solved by business and industry will continue to be solved by ALife-inspired algorithms. The use of ALife-inspired agent-based algorithms for solving optimization problems will become more widespread because of their natural implementation and ability to handle ill-defined problems.

Emergence is a key theme of ALife. ABM offers the capability to model the emergence of order in a variety of complex and complex adaptive systems. Inspired by ALife, identifying the fundamental mechanisms responsible for higher-order emergence and exploring these with agent-based modeling will be an important and promising research area.

Advancing social sciences beyond the genotype/phenotype framework to address the generative nature of social systems in their full complexity is a requirement for advancing computational social models. Recent work has treated culture as a fluid and dynamic process subject to interpretation by individual agents, more complex in many ways than that provided by the genotype/phenotype framework.

Agent-based modeling will continue to be the avenue for exploring new constructs in ALife. If true artificial life is ever developed in silico, it will most likely be done using the methods and tools of agent-based modeling.

Bibliography

Primary Literature
Adami C (1998) Introduction to artificial life. TELOS, Santa Clara
Alber MS, Kiskowski MA, Glazier JA, Jiang Y (2003) On cellular automaton approaches to modeling biological cells. In: Rosenthal J, Gilliam DS (eds) Mathematical systems theory in biology, communication, and finance, IMA volume. Springer, New York, pp 1–39
Alpaydın E (2004) Introduction to machine learning. MIT Press, Cambridge
Axelrod R (1984) The evolution of cooperation. Basic Books, New York
Axelrod R (1997) The complexity of cooperation: agent-based models of competition and collaboration. Princeton University Press, Princeton
Azzedine B, Renato BM, Kathia RLJ, Joao Bosco MS, Mirela SMAN (2007) An agent based and biological inspired real-time intrusion detection and security model for computer network operations. Comput Commun 30(13):2649–2660
Back T (1996) Evolutionary algorithms in theory and practice: evolution strategies, evolutionary programming, genetic algorithms. Oxford University Press, New York
Berlekamp ER, Conway JH, Guy RK (2003) Winning ways for your mathematical plays, 2nd edn. AK Peters, Natick
Bernaschi M, Castiglione F (2001) Design and implementation of an immune system simulator. Comput Biol Med 31(5):303–331
Bishop CM (2007) Pattern recognition and machine learning. Springer, New York
Bobashev GV, Goedecke DM, Yu F, Epstein JM (2007) A hybrid epidemic model: combining the advantages of agent-based and equation-based approaches. In: Henderson SG, Biller B, Hsieh M-H, Shortle J, Tew JD, Barton RR (eds) Proceedings 2007 winter simulation conference, Washington, pp 1532–1537
Bonabeau E (1997) From classical models of morphogenesis to agent-based models of pattern formation. Artif Life 3:191–211
Bonabeau E, Dorigo M, Theraulaz G (1999) Swarm intelligence: from natural to artificial systems. Oxford University Press, New York
Carley KM, Fridsma DB, Casman E, Yahja A, Altman N, Chen LC, Kaminsky B, Nave D (2006) BioWar: scalable agent-based model of bioattacks. IEEE Trans Syst Man Cybern Part A: Syst Hum 36(2):252–265
Celada F, Seiden PE (1992) A computer model of cellular interactions in the immune system. Immunol Today 13(2):56–62
Clerc M (2006) Particle swarm optimization. ISTE Publishing, London
Dawkins R (1989) The selfish gene, 2nd edn. Oxford University Press, Oxford
DeAngelis DL, Gross LJ (eds) (1992) Individual-based models and approaches in ecology: populations, communities and ecosystems. Proceedings of a symposium/workshop, Knoxville, 16–19 May 1990. Chapman & Hall, New York. ISBN 0-412-03171-X
Dorigo M, Stützle T (2004) Ant colony optimization. MIT Press, Cambridge
Dreyfus HL (1979) What computers can't do: the limits of artificial intelligence. Harper & Row, New York
Eiben AE, Smith JE (2007) Introduction to evolutionary computing, 2nd edn. Springer, New York
Eigen M, Schuster P (1979) The hypercycle: a principle of natural self-organization. Springer, Berlin
Emonet T, Macal CM, North MJ, Wickersham CE, Cluzel P (2005) AgentCell: a digital single-cell assay for bacterial chemotaxis. Bioinformatics 21(11):2714–2721
Engelbrecht AP (2006) Fundamentals of computational swarm intelligence. Wiley, Hoboken
Epstein JM (2007) Generative social science: studies in agent-based computational modeling. Princeton University Press, Princeton
Epstein JM, Axtell R (1996) Growing artificial societies: social science from the bottom up. MIT Press, Cambridge
Ermentrout GB, Edelstein-Keshet L (1993) Cellular automata approaches to biological modeling. J Theor Biol 160(1):97–133
Fogel LJ, Owens AJ, Walsh MJ (1966) Artificial intelligence through simulated evolution. Wiley, Hoboken
Folcik VA, An GC, Orosz CG (2007) The basic immune simulator: an agent-based model to study the interactions between innate and adaptive immunity. Theor Biol Med Model 4(39):1–18. http://www.tbiomed.com/content/4/1/39
Fontana W (1992) Algorithmic chemistry. In: Langton CG, Taylor C, Farmer JD, Rasmussen S (eds) Artificial life II: proceedings of the workshop on artificial life, Santa Fe, Feb 1990, Santa Fe Institute studies in the sciences of complexity, vol X. Addison-Wesley, Reading, pp 159–209
Gardner M (1970) The fantastic combinations of John Conway's new solitaire game life. Sci Am 223:120–123
Gilbert N (2002) Varieties of emergence. In: Macal C, Sallach D (eds) Proceedings of the agent 2002 conference on social agents: ecology, exchange and evolution, Chicago, 11–12 Oct 2002, pp 1–11. Available on CD and at www.agent2007.anl.gov
Gilbert N, Troitzsch KG (1999) Simulation for the social scientist. Open University Press, Buckingham
Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Reading
Goldberg DE (1994) Genetic and evolutionary algorithms come of age. Commun ACM 37(3):113–119
Holland JH (1975) Adaptation in natural and artificial systems. University of Michigan, Ann Arbor
Holland J (1995) Hidden order: how adaptation builds complexity. Addison-Wesley, Reading
Holland JH, Booker LB, Colombetti M, Dorigo M, Goldberg DE, Forrest S, Riolo RL, Smith RE, Lanzi PL, Stolzmann W, Wilson SW (2000) What is a learning classifier system? In: Lanzi PL, Stolzmann W, Wilson SW (eds) Learning classifier systems, from foundations to applications. Springer, London, pp 3–32
Kauffman SA (1993) The origins of order: self-organization and selection in evolution. Oxford University Press, Oxford
Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge, 840 pp
Langton CG (1984) Self-reproduction in cellular automata. Physica D 10:135–144
Langton CG (1986) Studying artificial life with cellular automata. Physica D 22:120–149
Langton CG (1989a) Preface. In: Langton CG (ed) Artificial life: proceedings of an interdisciplinary workshop on the synthesis and simulation of living systems, Los Alamos, Sept 1987. Addison-Wesley, Reading, pp xv–xxvi
Langton CG (1989b) Artificial life. In: Langton CG (ed) Artificial life: the proceedings of an interdisciplinary workshop on the synthesis and simulation of living systems, Los Alamos, Sept 1987, Santa Fe Institute studies in the sciences of complexity, vol VI. Addison-Wesley, Reading, pp 1–47
Langton CG (1992) Life at the edge of chaos. In: Langton CG, Taylor C, Farmer JD, Rasmussen S (eds) Artificial life II: proceedings of the workshop on artificial life, Santa Fe, Feb 1990, Santa Fe Institute studies in the sciences of complexity, vol X. Addison-Wesley, Reading, pp 41–91
Le Novere N, Shimizu TS (2001) StochSim: modelling of stochastic biomolecular processes. Bioinformatics 17(6):575–576
Lindenmayer A (1968) Mathematical models for cellular interaction in development. J Theor Biol 18:280–315
Lucas JR (1961) Minds, machines and Gödel. Philosophy 36(137):112–127
Manson SM (2006) Bounded rationality in agent-based models: experiments with evolutionary programs. Int J Geogr Inf Sci 20(9):991–1012
Mehrotra K, Mohan CK, Ranka S (1996) Elements of artificial neural networks. MIT Press, Cambridge
Mitchell M, Forrest S (1994) Genetic algorithms and artificial life. Artif Life 1(3):267–289
Ofria C, Wilke CO (2004) Avida: a software platform for research in computational evolutionary biology. Artif Life 10(2):191–229
Olariu S, Zomaya AY (eds) (2006) Handbook of bioinspired algorithms and applications. Chapman, Boca Raton, p 679
Padgett JF, Lee D, Collier N (2003) Economic production as chemistry. Ind Corp Chang 12(4):843–877
Peacor SD, Riolo RL, Pascual M (2006) Plasticity and species coexistence: modeling food webs as complex adaptive systems. In: Pascual M, Dunne JA (eds) Ecological networks: linking structure to dynamics in food webs. Oxford University Press, New York, pp 245–270
Penrose R (1989) The emperor's new mind: concerning computers, minds, and the laws of physics. Oxford University Press, Oxford
Poundstone W (1985) The recursive universe. Contemporary Books, Chicago, 252 pp
Preziosi L (ed) (2003) Cancer modelling and simulation. Chapman, Boca Raton
Ray TS (1991) An approach to the synthesis of life (Tierra simulator). In: Langton CG, Taylor C, Farmer JD, Rasmussen S (eds) Artificial life II: proceedings of the workshop on artificial life. Addison-Wesley, Redwood City, pp 371–408
Rechenberg I (1973) Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Frommann-Holzboog, Stuttgart
Sakoda JM (1971) The checkerboard model of social interaction. J Math Soc 1:119–132
Schelling TC (1971) Dynamic models of segregation. J Math Soc 1:143–186
Searle JR (1990) Is the brain a digital computer? Presidential Address to the American Philosophical Association
Taub AH (ed) (1961) John von Neumann: collected works. Vol V: Design of computers, theory of automata and numerical analysis (delivered at the Hixon Symposium, Pasadena, Sept 1948). Pergamon Press, Oxford
Turing AM (1938) On computable numbers, with an application to the Entscheidungsproblem. Proc Lond Math Soc 2(42):230–265
Turing AM (1952) The chemical basis of morphogenesis. Philos Trans Royal Soc B 237:37–72
von Neumann J (1966) In: Burks AW (ed) Theory of self-reproducing automata. University of Illinois Press, Champaign
Wilke CO, Adami C (2002) The biology of digital organisms. Trends Ecol Evol 17(11):528–532
Wilke CO, Chow SS (2006) Exploring the evolution of ecosystems with digital organisms. In: Pascual M, Dunne JA (eds) Ecological networks: linking structure to dynamics in food webs. Oxford University Press, New York, pp 271–286
Wolfram S (1984) Universality and complexity in cellular automata. Physica D 10:1–35
Wolfram S (1999) The Mathematica book, 4th edn. Wolfram Media/Cambridge University Press, Champaign
Wolfram S (2002) A new kind of science. Wolfram Media, Champaign

Books and Reviews
Artificial Life (journal) web page (2008) http://www.mitpressjournals.org/loi/artl. Accessed 8 Mar 2008
Banks ER (1971) Information processing and transmission in cellular automata. PhD dissertation, Massachusetts Institute of Technology
Batty M (2007) Cities and complexity: understanding cities with cellular automata, agent-based models, and fractals. MIT Press, Cambridge
Bedau MA (2002) The scientific and philosophical scope of artificial life. Leonardo 35:395–400
Bedau MA (2003) Artificial life: organization, adaptation and complexity from the bottom up. Trends Cognit Sci 7(11):505–512
Copeland BJ (2004) The essential Turing. Oxford University Press, Oxford, 613 pp
Ganguly N, Sikdar BK, Deutsch A, Canright G, Chaudhuri PP (2008) A survey on cellular automata. www.cs.unibo.it/bison/publications/CAsurvey.pdf
Griffeath D, Moore C (eds) (2003) New constructions in cellular automata, Santa Fe Institute studies in the sciences of complexity proceedings. Oxford University Press, New York, 360 pp
Gutowitz H (ed) (1991) Cellular automata: theory and experiment. Special issue of Physica D. 499 pp
Hraber PT, Jones T, Forrest S (1997) The ecology of Echo. Artif Life 3:165–190
International Society for Artificial Life web page (2008) www.alife.org. Accessed 8 Mar 2008
Jacob C (2001) Illustrating evolutionary computation with Mathematica. Academic, San Diego, 578 pp
Fu MC, Glover FW, April J (2005) Simulation optimization: a review, new developments, and applications. In: Proceedings of the 37th conference on winter simulation, Orlando
Miller JH, Page SE (2007) Complex adaptive systems: an introduction to computational models of social life. Princeton University Press, Princeton
North MJ, Macal CM (2007) Managing business complexity: discovering strategic solutions with agent-based modeling and simulation. Oxford University Press, New York
Pascual M, Dunne JA (eds) (2006) Ecological networks: linking structure to dynamics in food webs, Santa Fe Institute studies on the sciences of complexity. Oxford University Press, New York
Simon H (2001) The sciences of the artificial. MIT Press, Cambridge
Sims K (1991) Artificial evolution for computer graphics. ACM SIGGRAPH '91, Las Vegas, July 1991, pp 319–328
Sims K (1994) Evolving 3D morphology and behavior by competition. Artif Life IV:28–39
Terzopoulos D (1999) Artificial life for computer graphics. Commun ACM 42(8):33–42
Toffoli T, Margolus N (1987) Cellular automata machines: a new environment for modeling. MIT Press, Cambridge, 200 pp
Tu X, Terzopoulos D (1994) Artificial fishes: physics, locomotion, perception, behavior. In: Proceedings of SIGGRAPH '94, 24–29 July 1994, Orlando, pp 43–50
Weisbuch G (1991) Complex systems dynamics: an introduction to automata networks, translated from French by Ryckebusch S. Addison-Wesley, Redwood City
Wiener N (1948) Cybernetics, or control and communication in the animal and the machine. Wiley, New York
Wooldridge M (2000) Reasoning about rational agents. MIT Press, Cambridge
Embodied and Situated Agents, Adaptive Behavior in

Stefano Nolfi
Institute of Cognitive Sciences and Technologies, National Research Council (CNR), Rome, Italy

Article Outline
Definition of the Subject
Introduction
Embodiment and Situatedness
Behavior and Cognition as Complex Adaptive Systems
Behavior and Cognition as Phenomena Originating from the Interaction Between Coupled Dynamical Processes
Behavior and Cognition as Phenomena with a Multilevel and Multi-scale Organization
On the Top-Down Effect from Higher to Lower Levels of Organization
Adaptive Methods
Evolutionary Robotics Methods
Developmental Robotics Methods
The Incremental Nature of the Developmental Process
The Social Nature of the Developmental Process
Exploitation of the Interaction Between Concurrent Developmental Processes
Discussion and Conclusion
Bibliography

Glossary
vision sensors). For a more restricted definition, see the concluding section of the entry.
Morphological computation Indicates the ability of the body of an agent (with certain specific characteristics) to control its interaction with the environment so as to produce a given desired behavior.
Ontogenesis Indicates the variations which occur in the phenotypical characteristics of an artificial agent (i.e., in the characteristics of the control system or of the body of the agent) while it interacts with the environment.
Phylogenesis Indicates the variations of the genetic characteristics of a population of artificial agents throughout generations.
Situated agent Indicates an artificial system which is located in a physical environment (simulated or real) with which it interacts on the basis of the laws of physics. For a more restricted definition, see the concluding section of the entry.

Definition of the Subject

Adaptive behavior concerns the study of how organisms develop their behavioral and cognitive skills through a synthetic methodology which consists in designing artificial agents which are able to adapt to their environment autonomously. These studies are important both from a modeling point of view (i.e., for making progress in our understanding of intelligence and adaptation in natural beings) and from an engineering point of view (i.e., for making progress in our ability to develop artifacts displaying effective behavioral and cognitive skills).
to the task they have to fulfill autonomously (i.e., without human intervention). This goal is achieved through a synthetic methodology, i.e., through the synthesis of artificial creatures which (i) have a body, (ii) are situated in an environment with which they interact, and (iii) have characteristics which vary during an adaptation process. In the rest of the entry, we will use the term "agent" to indicate artificial creatures which possess the first two features described above and the term "adaptive agent" to indicate artificial creatures which also possess the third feature.

The agents and the environment might be simulated or real. In the former case, the characteristics of the agents' body, motor, and sensory system, the characteristics of the environment, and the rules that regulate the interactions between all the elements are simulated on a computer. In the latter case, the agents consist of physical entities (mobile robots) situated in a physical environment with which they interact on the basis of the physical laws.

The adaptive process which regulates how the characteristics of the agents (and eventually of the social environment) change might consist of a population-based evolutionary process and/or of a developmental/learning process. In the former case, the characteristics of the agents do not vary during their "lifetime" (i.e., during the time in which the agents interact with the environment) but vary, phylogenetically, while individual agents "reproduce." In the latter case, the characteristics of the agents vary ontogenetically, while they interact with the environment. The criteria which determine how variations are generated and/or whether or not variations are retained can be task dependent and/or task independent, i.e., they might be based on an evaluation of whether the variation increases or decreases the agents' ability to display a behavior which is adapted to the task/environment, or they might be based on task-independent criteria (i.e., general criteria which do not directly reward the exhibition of the requested skill).

The entry is organized as follows. In section "Embodiment and Situatedness," we briefly introduce the notion of embodiment and situatedness and their implications. In section "Behavior and Cognition as Complex Adaptive Systems," we claim that behavior and cognition in embodied and situated adaptive agents should be characterized as a complex adaptive system. In section "Adaptive Methods," we briefly describe the methods which can be used to synthesize embodied and situated adaptive agents. Finally, in section "Discussion and Conclusions," we draw our conclusions.

Embodiment and Situatedness

The notion of embodiment and situatedness has been introduced (Brooks 1991a, b; Clark 1997; Pfeifer and Bongard 2007; Varela et al. 1991) to characterize systems (e.g., natural organisms and robots) which have a physical body and which are situated in a physical environment with which they interact. In this and in the following sections, we will briefly discuss the general implications of these two fundamental properties. This analysis will be further extended in the concluding section, where we will argue for the necessity of distinguishing between a weak and a strong notion of embodiment and situatedness.

One first important implication of being embodied and situated consists in the fact that these agents and their parts are characterized by their physical properties (e.g., weight, dimension, shape, elasticity, etc.), are subjected to the laws of physics (e.g., inertia, friction, gravity, energy consumption, deterioration, etc.), and interact with the environment through the exchange of energy and physical material (e.g., forces, sound waves, light waves, etc.). Their physical nature also implies that they are quantitative in state and time (van Gelder 1998). The fact that these agents are quantitative in state implies, for example, that the joints which connect the parts of a robotic arm can assume any possible position within a given range. The fact that these agents are quantitative in time implies, for example, that the effects of the application of a force to a joint depend on the time duration of its application.

One second important implication is that the information measured by the sensors is not only a function of the environment but also of the
relative position of the agent in the environment. This implies that the motor actions performed by an agent, by modifying the agent/environmental relation or the environment, co-determine the agent's sensory experiences.

One third important implication is that the information extracted by the sensors from the external environment is egocentric (it depends on the current position and orientation of the agent in the environment), local (it only provides information related to the locally observable portion of the environment), incomplete (e.g., due to visual occlusion), and subjected to noise. Similar characteristics apply to the motor actions produced by the agent's effectors.

It is important to notice that these characteristics do not only represent constraints but also opportunities to be exploited. Indeed, as we will see in the next section, the exploitation of some of these characteristics might allow embodied and situated agents to solve their adaptive problems through solutions which are robust and parsimonious (i.e., minimal) with respect to the complexity of the agent's body and control system.

Behavior and Cognition as Complex Adaptive Systems

In embodied and situated agents, behavioral and cognitive skills are dynamical properties which unfold in time and which arise from the interaction between the agents' nervous system, body, and environment (Beer 1995; Chiel and Beer 1997; Keijzer 2001; Nolfi 2005; Nolfi and Floreano 2000) and from the interaction between dynamical processes occurring within the agents' control system, the agents' body, and the environment (Beer 2003; Gigliotta and Nolfi 2008; Tani and Fukumura 1997). Moreover, behavioral and cognitive skills typically display a multilevel and multi-scale organization involving bottom-up and top-down influences between entities at different levels of organization. These properties imply that behavioral and cognitive skills in embodied and situated agents can be properly characterized as complex adaptive systems (Nolfi 2005). These aspects and the complex system nature of behavior and cognition will be illustrated in more detail in the next subsections, also with the help of examples. The theoretical and practical implications of these aspects for developing artificial agents able to exhibit effective behavioral and cognitive skills will be discussed in the forthcoming sections.

Behavior and Cognition as Emergent Dynamical Properties
Behavior and cognition are dynamical properties which unfold in time and which emerge from high-frequency nonlinear interactions between the agent, its body, and the external environment (Chiel and Beer 1997).

At any time step, the environment and the agent/environmental relation co-determine the body and the motor reaction of the agent which, in turn, co-determine how the environment and/or the agent/environmental relation vary. Sequences of these interactions, occurring at a fast time rate, lead to a dynamical process – behavior – which extends over a significantly larger time span than the individual interactions (Fig. 1).

Since interactions between the agent's control system, the agent's body, and the external environment are nonlinear (i.e., small variations in sensory states might lead to significantly different motor actions) and dynamical (i.e., small variations in the action performed at time t might significantly impact later interactions at time t + x), the relation between the rules that govern the interactions and the behavioral and cognitive skills originating from the interactions tends to be very indirect. Behavioral and cognitive skills thus emerge from the interactions between the three foundational elements and cannot be traced back to any of the three elements taken in isolation. Indeed, the behavior displayed by an embodied and situated agent can hardly be predicted or inferred by an external observer even on the basis of a complete knowledge of the interacting elements and of the rules governing the interactions.

A clear example of how behavioral skills might emerge from the interaction between the agents'
Embodied and Situated Agents, Adaptive Behavior in, Fig. 1 A schematic representation of the relation between agent's control system, agent's body, and the environment. The behavioral and cognitive skills displayed by the agent are the emergent result of the bidirectional interactions (represented with full arrows) between the three constituting elements – agent's control system, agent's body, and environment. The dotted arrows indicate that the three constituting elements might be dynamical systems on their own. In this case, agents' behavioral and cognitive skills not only result from the dynamics originating from the agent/body/environmental interactions but also from the combination and the interaction between dynamical processes occurring within the agent's body, within the agent's control system, and within the environment (see section "Behavior and Cognition as Complex Adaptive Systems")
body and the environment is constituted by the passive walking machines developed in simulation by McGeer (1990) – two-dimensional bipedal machines able to walk down a 4° slope with no motors and no control system (Fig. 2). The walking behavior arises from the fact that the physical forces resulting from gravity and from the collision between the machine and the slope produce a movement of the robot, and from the fact that the robot's movements produce a variation of the agent/environmental relation which in turn produces a modification of the physical forces to which the machine will be subjected in the next time step. The sequence of bidirectional effects between the robot's body and the environment can lead to a stable dynamical process – the walking behavior.

The type of behavior which arises from the robot/environmental interaction depends on the characteristics of the environment, the laws of physics which regulate the interaction between the body and the environment, and the characteristics of the body. The first two factors can be considered as fixed, but the third factor, the body structure, can be adapted to achieve a given function. Indeed, in the case of this biped robot, the author carefully selected the leg length, the leg mass, and the foot size to obtain the desired walking behavior. In more general terms, this example shows how the role of regulating the interaction between the robot and the environment in the appropriate way can be played not only by the control system but also by the body itself, provided that the characteristics of the body have been shaped so as to favor the exhibition of the desired behavior. This property, i.e., the ability of the body to control its interaction with the environment, has been named with the term "morphological computation" (Pfeifer et al. 2006).
(Fig. 2 diagram labels: shank, mass, slope.)
term “morphological computation” (Pfeifer discriminating the two objects is far from trivial
et al. 2006). For related work which demonstrates since the two classes of sensory patterns experi-
how effective walking machines can be obtained enced by robots close to a wall and close to
by integrating passive walking techniques with cylindrical objects largely overlap.
simple control mechanisms, see Bongard and The attempt to solve this problem through an
Paul (2001), Endo et al. (2002), and Vaughan evolutionary adaptive method (see section
et al. (2004). For related works which show the “Adaptive Methods”) in which the free parame-
role of elastic material and elastic actuators for ters (i.e., the parameters which regulate the fine-
morphological computing, see Massera grained interaction between the robot and the
et al. (2007) and Schmitz et al. (2007). environment) are varied randomly and in which
To illustrate how behavioral and cognitive variations are retained or discarded on the basis
skills might emerge from agent’s body, agent’s on an evaluation of the overall ability of the robot
control system, and environmental interactions, (i.e., on the basis of the time spent by the robot
we describe a simple experiment in which a close to the cylindrical object) demonstrated how
small-wheeled robot situated in an arena adaptive robots can find solutions which are
surrounded by walls has been evolved to find robust and parsimonious in terms of control
and to remain close to a cylindrical object. The mechanisms (Nolfi 2002). Indeed, in all replica-
Khepera robot (Mondada et al. 1993) is provided tions of this experiment, evolved robot solves the
with eight infrared sensors and two motors con- problem by moving forward, by avoiding walls,
trolling the two corresponding wheels (Fig. 3). and by oscillating back and forth and left and
From the point of view of an external observer, right close to cylindrical objects (Fig. 3, right).
solving this problem requires robots able to: All these behaviors result from sequences of
(a) explore the environment until an obstacle is interactions between the robot and the environ-
detected, (b) discriminate whether the obstacle ment mediated by four types of simple control
detected is a wall or a cylindrical object, and rules which consist in: turning left when the right
(c) approach or avoid objects depending on the infrared sensors are activated, turning right when
object type. Some of these behaviors (e.g., the the left infrared sensors are activated, moving
wall-avoidance behavior) can be obtained back when the frontal infrared sensors are acti-
through simple control mechanisms, but others vated, and moving forward when the frontal
require nontrivial control mechanisms. Indeed, a infrared sensors are not activated.
detailed analysis of the sensory patterns experi- To understand how these simple control rules
enced by the robot indicated that the task of can produce the required behaviors and the
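These four reactive rules can be written down compactly. The sketch below is a hedged illustration only: the sensor ordering, the activation threshold, and the wheel-speed values are assumptions made for readability, not the parameters of the evolved controller reported in Nolfi (2002).

```python
def motor_response(ir):
    """Return (left_wheel_speed, right_wheel_speed) for one control step.

    ir: activations of the 8 infrared sensors in [0, 1]; here assumed ordered
    so that ir[0:2] are left, ir[2:4] frontal, ir[4:6] right, ir[6:8] rear.
    """
    left, front, right = max(ir[0:2]), max(ir[2:4]), max(ir[4:6])
    threshold = 0.5                       # assumed activation threshold
    if front > threshold:
        return (-0.5, -0.5)               # move back when the frontal sensors fire
    if right > threshold:
        return (-0.3, 0.8)                # turn left when the right sensors fire
    if left > threshold:
        return (0.8, -0.3)                # turn right when the left sensors fire
    return (1.0, 1.0)                     # otherwise move forward

# Example: an object sensed on the right produces a left turn.
print(motor_response([0, 0, 0, 0, 0.9, 0.2, 0, 0]))
```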
Embodied and Situated Agents, Adaptive Behavior in, Fig. 3 Left, the agent situated in the environment. The agent is a Khepera robot (Mondada et al. 1993). The environment consists of an arena of 60 × 35 cm containing cylindrical objects placed in a randomly selected location. Right, angular trajectories of an evolved robot close to a wall (top graph) and to a cylinder (bottom graph). The picture was obtained by placing the robot at a random position in the environment, leaving it free to move for 500 time steps each lasting 100 ms, and recording its relative movements with respect to the two types of objects for distances smaller than 45 mm. The x-axis and the y-axis indicate the relative angle (in degrees) and distance (in mm) between the robot and the corresponding object. For the sake of clarity, arrows are used to indicate the relative direction but not the amplitude of movements
To understand how these simple control rules can produce the required behaviors and the required arbitration between behaviors, we should consider that the same motor responses produce different effects in different agent/environmental situations. For example, the execution of a left-turning action close to a cylindrical object and the subsequent modification of the robot/object relative position produce a new sensory state which triggers a right-turning action. Then, the execution of the latter action and the subsequent modification of the robot/object relative position produce a new sensory state which triggers a left-turning action. The combination and the alternation of these left- and right-turning actions over time produce an attractor in the agent/environmental dynamics (Fig. 3, right, bottom graph) which allows the robot to remain close to the cylindrical object. On the other hand, the execution of a left-turning behavior close to a wall and the subsequent modification of the robot/wall position produce a new sensory state which triggers the reiteration of the same motor action. The execution of a sequence of left-turning actions then leads to the avoidance of the object and to a modification of the robot/environmental relation which finally leads to the perception of a sensory state which triggers a move-forward behavior (Fig. 3, right, top graph).

Before concluding the description of this experiment, it is important to notice that although the rough classification of the robot's motor responses into four different types of actions is useful to describe the strategy with which these robots solve the problem qualitatively, the quantitative aspects which characterize the robot's motor reactions (e.g., how sharply a robot turns given a certain pattern of activation of the infrared sensors) are crucial for determining whether the robot will be able to solve the problem or not. Indeed, small differences in the robot's motor response tend to cumulate in time and might prevent the robot from producing successful behavior (e.g., might prevent the robot from producing a behavioral attractor close to cylindrical objects).

This experiment clearly exemplifies some important aspects which characterize all adaptive behavioral systems, i.e., systems which are embodied and situated and which have been designed or adapted so as to exploit the properties that emerge from the interaction between their control system, their body, and the external environment. In particular, it demonstrates how required behavioral and cognitive skills (i.e., object categorization skills) might emerge from the fine-grained interaction between the robot's control system, body, and the external environment without the need of dedicated control mechanisms.
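The quantitative values of these motor responses are exactly what the evolutionary procedure mentioned above has to tune. As a hedged illustration of that procedure (random variation of the free parameters, with a variation retained only if the evaluation does not get worse), the following sketch uses a generic (1+1)-style scheme; evaluate() is a toy stand-in for the real measure (the fraction of time spent close to the cylinder), and nothing here is claimed to reproduce the exact algorithm of Nolfi (2002).

```python
import random

def evaluate(params):
    # Toy objective (placeholder). In a real setup this would run the robot
    # or a simulation and return the fraction of time spent near the cylinder.
    return -sum((p - 0.5) ** 2 for p in params)

def adapt(n_params=20, generations=500, sigma=0.1):
    best = [random.uniform(-1, 1) for _ in range(n_params)]
    best_score = evaluate(best)
    for _ in range(generations):
        candidate = [p + random.gauss(0, sigma) for p in best]   # random variation
        score = evaluate(candidate)
        if score >= best_score:                                  # retain non-harmful variations
            best, best_score = candidate, score
    return best, best_score

print(adapt()[1])
```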
Embodied and Situated Agents, Adaptive Behavior in, Fig. 4 Left, the e-puck robot developed at EPFL, Switzerland (http://www.e-puck.org/). Center, the environment, which has a size of 52 cm by 60 cm. The light produced by the light bulb located on the left side of the central corridor cannot be perceived from the other two corridors. Right, the motor trajectory produced by the robot during a complete lap of the environment
Moreover, it demonstrates how the relation between the control rules which mediate the interaction between the robot's body and the environment and the behavioral skills exhibited by the agents is rather indirect. This means, for example, that an external human observer can hardly predict the behaviors which will be produced by the robot before observing the robot interacting with the environment, even on the basis of a complete description of the characteristics of the body, of the control rules, and of the environment.

Behavior and Cognition as Phenomena Originating from the Interaction Between Coupled Dynamical Processes

Up to this point, we restricted our analysis to the dynamics originating from the interactions between the agent's control system, the agent's body, and the environment. However, the body of an agent, its control system, and the environment might have their own dynamics (dotted arrows in Fig. 1). For the sake of clarity, we will refer to the dynamical processes occurring within the agent's control system, within the agent's body, or within the environment as internal dynamics, and to the dynamics originating from the agent/body/environmental interaction as external dynamics. In cases in which the agents' body, the agents' control system, or the environment have their own dynamics, behavior should be characterized as a property emerging from the combination of several coupled dynamical processes.

The existence of several concurrent dynamical processes represents an important opportunity to exploit emergent features. Indeed, behavioral and cognitive skills might emerge not only from the external dynamics, as we showed in the previous section, but also from the internal dynamical processes or from the interaction between different dynamical processes.

As an example which illustrates how complex cognitive skills can emerge from the interaction between a simple agent/body/environmental dynamic and a simple agent internal dynamic, consider the case of a wheeled robot placed in a maze environment (Fig. 4) which has been trained to (a) produce a wall-following behavior which allows the robot to periodically visit and revisit all environmental areas, (b) identify a target object constituted by a black disk which is placed in a randomly selected position of the environment for a limited time duration, and (c) recognize the location in which the target object was previously found every time the robot revisits the corresponding location (Gigliotta and Nolfi 2008).

The robot has infrared sensors (which provide information about nearby obstacles), light sensors (which provide information about the light gradient generated by the light bulb placed in the central corridor), ground sensors (which detect the color of the ground), two motors (which control the desired speed of the two corresponding wheels), and one additional output unit which should be turned on when the robot revisits the environmental area in which the black disk was previously found.
Embodied and Situated Agents, Adaptive Behavior in, Fig. 5 The state of the two internal neurons (i1 and i2) of the robot recorded for 330 s while the robot performs about five laps of the environment. The s, a, b, c, and d labels indicate the internal states corresponding to five different positions of the robot in the environment shown in Fig. 4. The other labels indicate the position of the fixed point attractors in the robot's internal dynamics corresponding to five types of sensory states experienced by the robot when it detects a light in its frontal side (LF), a light on its rear side (LR), an obstacle on its right and frontal side (OFR), an obstacle on its right side (OR), and no obstacles and no lights (NO)
The robot's controller consists of a three-layer neural network which includes a layer of sensory neurons (which encode the state of the corresponding sensors), a layer of motor neurons (which encode the state of the actuators), and a layer of internal neurons which consist of leaky integrators operating at tunable timescales (Beer 1995; Gigliotta and Nolfi 2008). The free parameters of the robot's neural controller (i.e., the connection weights and the time constants of the internal neurons, which regulate the rate at which these neurons change their state over time) were adapted through an evolutionary technique (Nolfi and Floreano 2000).

By analyzing the evolved robots, the authors observed how they are able to generate a spatial representation of the environment and of their location in the environment while they are situated in the environment itself. Indeed, while the robot travels by performing different laps of the environment (see Fig. 4, right), the states of the two internal neurons converge on a periodic limit cycle in which different states correspond to different locations of the robot in the environment (Fig. 5).

As we mentioned above, the ability to generate this form of representation, which allows the robot to solve its adaptive problem, originates from the coupling between a simple robot internal dynamics and a simple robot/body/environmental dynamics. The former dynamics is characterized by the fact that the state of the two internal neurons tends to move slowly toward different fixed point attractors, in the robot's internal dynamics, which correspond to the different types of sensory states exemplified in Fig. 5. The latter dynamics originates from the fact that different types of sensory states last for different time durations and alternate in a given order while the robot moves in the environment. The interaction between these two dynamical processes leads to a transient dynamics of the agent's internal state which moves slowly toward the current fixed point attractor without ever fully reaching it (thus preserving information about previously experienced sensory states, the time duration of these states, and the order in which they have been experienced).
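Because the internal neurons are leaky integrators, their state carries a decaying trace of past inputs, which is what makes the transient dynamics just described possible. The following hedged sketch shows one common Euler-discretized update of such a neuron (in the spirit of Beer 1995); the exact equations, parameter values, and network structure of the controller in Gigliotta and Nolfi (2008) are not reproduced here.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def step(state, inputs, weights, bias, tau, dt=0.1):
    """One Euler step of a leaky integrator: the state relaxes toward the
    weighted input with time constant tau, so past inputs persist."""
    drive = sum(w * i for w, i in zip(weights, inputs)) + bias
    return state + (dt / tau) * (-state + drive)

# Example: a stimulus that disappears at t = 10 leaves a slowly decaying trace.
s, trace = 0.0, []
for t in range(50):
    inp = [1.0 if t < 10 else 0.0]
    s = step(s, inp, weights=[1.0], bias=0.0, tau=2.0)
    trace.append(s)
print(round(trace[9], 3), round(trace[-1], 3))   # state decays slowly after the stimulus ends
print(round(sigmoid(trace[-1]), 3))              # the neuron's output still reflects the past input
```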
The coupling between the two dynamical processes originates from the fact that the free parameters which regulate the agent/environmental dynamics (e.g., the trajectory and the speed with which the robot moves in the environment) and the agent's internal dynamics (e.g., the direction and the speed with which the internal neurons change their state) have been coadapted and co-shaped during the adaptive process.

For related works which show how navigation and localization skills might emerge from the coupling between the agent's internal and external dynamics, see Tani and Fukumura (1997). For other works addressing other behavioral/cognitive capabilities, see Beer (2003) for what concerns categorization, Goldenberg et al. (2004) and Slocum et al. (2000) for what concerns selective attention, and Sugita and Tani (2005) for what concerns language and compositionality.

Behavior and Cognition as Phenomena with a Multilevel and Multi-scale Organization

Another fundamental feature that characterizes behavior is the fact that it is a multilayer system with different levels of organization extending over different timescales (Baldassarre et al. 2006; Keijzer 2001). More precisely, as exemplified in Fig. 6, the behavior of an agent or of a group of agents involves both lower- and higher-level behaviors which extend over shorter or longer time spans, respectively. Lower-level behaviors arise from few agent/environmental interactions and short-term internal dynamical processes. Higher-level behaviors, instead, arise from the combination and interaction of lower-level behaviors and/or from long-term internal dynamical processes.

The multilevel and multi-scale organization of agents' behavior plays important roles: it is one of the factors which allow agents to produce functionally useful behavior without necessarily developing dedicated control mechanisms (Brooks 1991a, b; Nolfi 2005); it might favor the development of new behavioral and/or cognitive skills, thanks to the recombination and reuse of preexisting capabilities (Marocco and Nolfi 2007); and it allows agents to generalize their skills to new task/environmental conditions (Nolfi 2005).

An exemplification of how the multilevel and multi-scale organization of behavior allows agents to generalize their skills to new environmental conditions is represented by the experiments carried out by Baldassarre et al. (2006), in which the authors evolved the control system of a group of robots assembled into a linear structure (Fig. 7) for the ability to move in a coordinated manner and for the ability to display a coordinated light-approaching behavior.

Each robot (Mondada et al. 2004) consists of a mobile base (chassis) and a main body (turret) that can rotate with respect to the chassis along the vertical axis. The chassis has two drive mechanisms that control the two corresponding tracks and toothed wheels. The turret has one gripper, which allows robots to assemble together and to grasp objects, and a motor controlling the rotation of the turret with respect to the chassis. Robots are provided with a traction sensor, placed at the turret-chassis junction, that detects the intensity and the direction of the force that the turret exerts on the chassis (along the plane orthogonal to the vertical axis), and with light sensors. Given that the orientations of individual robots might vary and given that the target light might be out of sight, robots need to coordinate to choose a common direction of movement and to change their direction as soon as one or a few robots start to detect a light gradient.

Evolved individuals show the ability to negotiate a common direction of movement and to approach light targets as soon as a light gradient is detected. By testing evolved robots in different conditions, the authors observed that they are able to generalize their skills to new conditions and also to spontaneously produce new behaviors which have not been rewarded during the evolutionary process. More precisely, groups of assembled robots display a capacity to generalize their skills with respect to the number of robots which are assembled together and to the shape formed by the assembled robots.
Embodied and Situated Agents, Adaptive Behavior in, Fig. 6 A schematic representation of the multilevel and multi-scale organization of behavior. The behaviors represented in the inner circles represent elementary behaviors which arise from fine-grained interactions between the control system, the body, and the environment and which extend over limited time spans. The behaviors represented in the external circles represent higher-level behaviors which arise from the combination and interaction between lower-level behaviors and which extend over longer time spans. The arrows which go from higher-level behavior toward lower levels indicate the fact that the behaviors currently exhibited by the agents later affect the lower-level behaviors and/or the fine-grained interaction between the constituting elements (agent's control system, agent's body, and the environment)
Embodied and Situated Agents, Adaptive Behavior in, Fig. 7 Left, four robots assembled into a linear structure.
Right, a simulation of the robots shown in the left part of the figure
Moreover, when the evolved controllers are embodied in eight robots assembled so as to form a circular structure and situated in the maze environment shown in Fig. 8, the robots display an ability to collectively avoid obstacles, to rearrange their shape so as to pass through narrow passages, and to explore the environment. The ability to display all these behavioral skills allows the robots to reach the light target even in large maze environments, i.e., even in environmental conditions which are rather different from the conditions that they experienced during the training process (Fig. 8).
The combination and the interaction between these three behaviors produce the following higher-level collective behaviors that extend over a longer time span:

1. A coordinated-motion behavior which consists in the ability of the robots to negotiate a common direction of movement and to keep moving along that direction by compensating further misalignments originating during motion. This behavior emerges from the combination and the interaction of the conformistic behavior (which plays the main role when robots are misaligned) and the move-forward behavior (which plays the main role when robots are aligned).
2. A coordinated-light-approaching behavior which consists in the ability of the robots to coordinately move toward a light target. This behavior emerges from the combination of the conformistic, the move-forward, and the phototaxis behaviors (the phototaxis behavior is triggered when the robots detect a light gradient). The relative importance of the three control rules which lead to the three corresponding behaviors depends both on the strength of the corresponding triggering condition (i.e., the extent of lack of traction forces, the intensity of traction forces, and the intensity of the light gradient, respectively) and on priority relations among behaviors (i.e., the fact that the conformistic behavior tends to play a stronger role than the phototaxis behavior).
3. A coordinated-obstacle-avoidance behavior which consists in the ability of the robots to coordinately turn to avoid nearby obstacles. This behavior arises as the result of the combination of the obstacle-avoidance, the conformistic, and the move-forward behaviors.

The combination and the interaction between these behaviors lead to the following higher-level collective behaviors that extend over still longer time spans:

1. A collective-exploration behavior which consists in the ability of the robots to visit different areas of the environment when the light target cannot be detected. This behavior emerges from the combination of the coordinated-motion behavior and the coordinated-obstacle-avoidance behavior, which ensures that the assembled robots can move in the environment without getting stuck and without entering into limit cycle trajectories.
2. A shape-rearrangement behavior which consists in the ability of the assembled robots to dynamically adapt their shape to the current structure of the environment so as to pass through narrow passages, especially when the passages to be negotiated are in the direction of the light gradient. This behavior emerges from the combination and the interaction between coordinated-motion and coordinated-light-approaching behaviors, mediated by the effects produced by relative differences in motion between robots resulting from the execution of different motor actions and/or from differences in the collisions. The fact that the shape of the assembled robots adapts to the current environmental structure so as to facilitate the overcoming of narrow passages can be explained by considering that collisions produce a modification of the shape which affects in particular the relative position of the colliding robots.

The combination and the interaction of all these behaviors lead to a still higher-level behavior:

1. A collective-navigation behavior which consists in the ability of the assembled robots to navigate toward the light target by producing coordinated movements, exploring the environment, passing through narrow passages, and producing a coordinated-light-approaching behavior (Fig. 8).
other environmental conditions. In particular, the control rules which generate behaviors #5 and #6, for which the evolving robots have been evolved in an environment without obstacles, also produce behavior #7 in an environment with obstacles. The second mechanism consists in the fact that the development of certain behaviors at a given level of organization which extend over a given time span will automatically lead to the exhibition of related higher-level behaviors, extending over longer time spans, which originate from the interactions between the former behaviors (even if these higher-level behaviors have not been rewarded during the adaptation process). In particular, the combination and the interaction of behaviors #5, #6, and #7 (which have been rewarded during the evolutionary process or which arise from the same control rules which lead to the generation of rewarded behaviors) automatically lead to the production of behaviors #8, #9, and #10 (which have not been rewarded). Obviously, there is no guarantee that the new behaviors obtained as a result of these generalization processes will play useful functions. However, the fact that these behaviors are related to the other functional behavioral skills implies that the probability that these new behaviors will play useful functions is significant.

These generalization mechanisms can also be exploited by agents during their adaptive process to generate behavioral skills which play new functions and which emerge from the combination and the interaction between preexisting behavioral skills playing different functions. Indeed, by analyzing the evolutionary course of another series of experiments, we observed how the development of new behavioral capacities often creates the adaptive basis for the development of further skills that are produced by reusing and recombining previously developed skills (De Greef and Nolfi 2010).

On the Top-Down Effect from Higher to Lower Levels of Organization

In the previous sections, we have discussed how the interactions between the agents' body, the agents' control system, and the environment lead to behavioral and cognitive skills and how such skills have a multilevel and multi-scale organization in which the interaction between lower-level skills leads to the emergence of higher-level skills. However, higher-level skills also affect lower-level skills, down to the fine-grained interaction between the constituting elements (agents' body, agents' control system, and environment). More precisely, the behaviors, which originate from the interaction between the agent and the environment and from the interaction between lower-level behaviors, later affect the lower-level behaviors and the interactions from which they originate. These bidirectional influences between different levels of organization can lead to circular causality (Kelso 1995), where high-level processes act as independent entities which constrain the lower-level processes from which they originate.

One of the most important effects of these top-down influences is that the behavior exhibited by an agent constrains the type of sensory patterns that the agent will experience later on (i.e., it constrains the fine-grained agent/environmental interactions which determine the behavior that will later be exhibited by the agent). Since the complexity of the problem faced by an agent depends on the sensory information experienced by the agent itself, these top-down influences can be exploited in order to turn hard problems into simple ones.

One neat demonstration of this type of phenomenon is given by the experiments conducted by Nolfi and Marocco (2002), in which a simulated finger robot with six degrees of freedom, provided with sensors of its joint positions and with rough touch sensors, is asked to discriminate between cubic and spherical objects varying in size. The problem is not trivial since, in general terms, the sensory patterns experienced by the robot do not provide clear regularities for discriminating between the two types of objects. However, the type of sensory states which are experienced by the agent also depends on the behavior previously exhibited by the agent itself – agents exhibiting different behaviors might face simpler or harder problems. By evolving the robots in simulation
for the ability to solve this problem and by analyzing the complexity of the problem faced by robots of successive generations, the authors observed that the evolved robots manage to solve their adaptive problem on the basis of simple control rules which allow the robot to approach the object and to move following the surface of the object from left to right, independently of the object shape. The exhibition of this behavior in interaction with objects characterized by a smooth or irregular surface (in the case of spherical or cubic objects, respectively) ensures that the same control rules lead to two types of behaviors depending on the type of the object. These behaviors consist in following the surface of the object and then moving away from the object, in the case of spherical objects, and in following the surface of the object and getting stuck in a corner, in the case of cubic objects. The exhibition of these two behaviors allows the agent to experience rather different proprioceptive states as a consequence of having interacted with spherical or cubic objects, which nicely encode the regularities that are necessary to differentiate the two types of objects.

For other examples which show how adaptive agents can exploit the fact that behavioral and cognitive processes which arise from the interaction between lower-level behaviors or between the constituting elements later affect these lower-level processes, see Beer (2003), Nolfi (2002), and Scheier et al. (1998).

Adaptive Methods

In this section, we briefly review the methods through which artificial embodied and situated agents can develop their skills autonomously while they interact at different levels of organization with the environment and eventually with other agents. These methods are inspired by the adaptive processes observed in nature: evolution, maturation, development, and learning.

We will focus in particular on self-organized adaptive methodologies in which the role of the experimenter/designer is reduced to the minimum and in which the agents are free to develop their strategy to solve their adaptive problems within a large number of potentially alternative solutions. This choice is motivated by the following considerations:

1. These methods allow agents to identify the behavioral and cognitive skills which should be possessed, combined, and integrated so as to solve the given problem. In other words, these methods can come up with effective ways of decomposing the overall required skill into a collection of simpler lower-level skills. Indeed, as we showed in the previous section, evolutionary adaptive techniques can discover ways of decomposing the high-level requested skill into lower-level behavioral and cognitive skills and find solutions which are effective and parsimonious, thanks to the exploitation of properties emerging from the interaction between lower-level processes and skills and thanks to the recruitment of previously developed skills for performing new functions. In other words, these methods release the designer from the burden of deciding how the overall skill should be divided into a set of simpler skills and how these skills should be integrated. More importantly, these methods can come up with solutions exploiting emergent properties which would be hard to design (Harvey 2000; Nolfi and Floreano 2000).

2. These methods allow agents to identify how a given behavioral and cognitive skill can be produced, i.e., the appropriate fine-grained characteristics of the agents' body structure and of the control rules regulating the agent/environmental interaction. As for the previous aspect, the advantage of using adaptive techniques lies not only in the fact that the experimenter is released from the burden of designing the fine-grained characteristics of the agents but also in the fact that adaptation might prove more effective than human design due to the inability of an external observer to foresee the effects of a large number of nonlinear interactions occurring at different levels of organization.

3. These methods allow agents to adapt to variations of the task, of the environment, and of the social conditions.
Current approaches, in this respect, can be grouped into two families, which will be illustrated in the following subsections and which include evolutionary robotics methods and developmental robotics methods.

Evolutionary Robotics Methods

Evolutionary robotics (Floreano et al. 2008; Nolfi and Floreano 2000) is a method which allows one to create embodied and situated agents able to adapt to their task/environment autonomously through an adaptive process inspired by natural evolution (Holland 1975) and, eventually, through the combination of evolutionary, developmental, and learning processes.

The basic idea goes as follows (Fig. 9). An initial population of different artificial genotypes, each encoding the control system (and possibly the morphology) of an agent, is randomly created. Each genotype is translated into a corresponding phenotype (i.e., a corresponding agent), which is then left free to act (move, look around, manipulate the environment, etc.) while its performance (fitness) with respect to a given task is automatically evaluated. In cases in which this methodology is applied to collective behaviors, agents are evaluated in groups, which might be heterogeneous or homogeneous (i.e., might consist of agents which do or do not differ with respect to their genetic and phenotypic characteristics). The fittest individuals (those having higher fitness) are allowed to reproduce by generating copies of their genotype with the addition of changes introduced by some genetic operators (e.g., mutations, exchange of genetic material). This process is repeated for a number of generations until an individual or a group of individuals is born which satisfies the performance level set by the user.
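As a rough illustration of this generational scheme (a sketch only, not the implementation used in the experiments described in this entry), the following Python fragment evolves a population of fixed-length real-valued genotypes with a purely mutational operator and a stand-in fitness function; all names and parameter values are invented for the example.

```python
import random

GENOTYPE_LENGTH = 20     # e.g., number of free parameters of the controller (illustrative)
POPULATION_SIZE = 100
MUTATION_RATE = 0.05
N_GENERATIONS = 200

def evaluate(genotype):
    """Stand-in fitness. In a real experiment this would decode the genotype
    into a robot (phenotype), let it act in the environment, and score the
    resulting behavior; here genotypes are simply rewarded for being close
    to an arbitrary target vector so that the loop can be run as-is."""
    return -sum((g - 0.5) ** 2 for g in genotype)

def mutate(genotype):
    """A simple genetic operator: perturb each gene with a small probability."""
    return [g + random.gauss(0.0, 0.1) if random.random() < MUTATION_RATE else g
            for g in genotype]

# An initial population of random genotypes is created.
population = [[random.uniform(-1.0, 1.0) for _ in range(GENOTYPE_LENGTH)]
              for _ in range(POPULATION_SIZE)]

for generation in range(N_GENERATIONS):
    fitnesses = [evaluate(g) for g in population]
    ranked = sorted(range(POPULATION_SIZE), key=lambda i: fitnesses[i], reverse=True)
    # The fittest individuals reproduce by leaving mutated copies of their genotype.
    parents = [population[i] for i in ranked[:POPULATION_SIZE // 5]]
    population = [mutate(random.choice(parents)) for _ in range(POPULATION_SIZE)]
    print(generation, "best fitness:", round(fitnesses[ranked[0]], 3))
```

In an actual evolutionary-robotics experiment, evaluate() would embody the genotype in a simulated or physical robot and measure its behavior over one or more trials.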
The process that determines how a genotype (i.e., typically a string of binary values) is turned into a corresponding phenotype (i.e., a robot with a given morphology and control system) might consist of a simple one-to-one mapping or of a complex developmental process. In the former case, many of the characteristics of the phenotypical individual (e.g., the shape of the body, the number and position of the sensors and of the actuators, and in some cases the architecture of the neural controller) are predetermined and
fixed, and the genotype encodes a vector of free parameters (e.g., the connection weights of the neural controller; Nolfi and Floreano 2000). In the latter case, the genotype encodes a set of rules that determine how the body structure and the control system of the individual grow during an artificial developmental process. Through these types of indirect developmental mappings, most of the characteristics of the phenotypical robot can be encoded in the genotype and subjected to the evolutionary adaptive process (Nolfi and Floreano 2000; Pollack et al. 2001). Finally, in some cases, the adaptation process might involve both an evolutionary process that regulates how the characteristics of the robots vary phylogenetically (i.e., throughout generations) and a developmental/learning process which regulates how the characteristics of the robots vary ontogenetically (i.e., during the phase in which the robots act in the environment; Nolfi and Floreano 1999).
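To make the simple one-to-one mapping mentioned above concrete, the sketch below reads a flat genotype directly as the connection weights and biases of a fixed, single-layer sensorimotor controller. The sensor/motor counts and the controller form are assumptions made only for this example and are not taken from the experiments discussed in this entry.

```python
import numpy as np

N_SENSORS, N_MOTORS = 8, 2                 # fixed body/interface, chosen for the example
N_GENES = (N_SENSORS + 1) * N_MOTORS       # one weight per sensor plus a bias, per motor

def decode(genotype):
    """Direct (one-to-one) mapping: the genotype is read as the free parameters
    (connection weights and biases) of a fixed-architecture controller."""
    params = np.asarray(genotype).reshape(N_MOTORS, N_SENSORS + 1)
    weights, biases = params[:, :-1], params[:, -1]
    def controller(sensor_values):
        # One feed-forward step: motor activations from the current sensor readings.
        return np.tanh(weights @ np.asarray(sensor_values) + biases)
    return controller

# Example: decode a random genotype and compute one control step on dummy readings.
genotype = np.random.uniform(-1.0, 1.0, N_GENES)
controller = decode(genotype)
print(controller(np.zeros(N_SENSORS)))
```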
Evolutionary methods can be used to allow agents to develop the requested behavioral and cognitive skills from scratch (i.e., starting from agents which do not have any behavioral or cognitive capability) or in an incremental manner (i.e., starting from pre-evolved robots which already have some behavioral capability, consisting, for example, in the ability to solve a simplified version of the adaptive problem).

The fitness function which determines whether an individual will reproduce or not might also include, in addition to a component that scores the performance of the agent with respect to a given task, additional task-independent components. These additional components, in fact, can lead to the development of behavioral skills which are not necessarily functional but which can favor the development of functional skills later on (Prokopenko et al. 2006).

Evolutionary methods can allow agents to develop low-level behavioral and cognitive skills which have been previously identified by the designer/experimenter, and which might later be combined and integrated in order to realize the high-level requested skill, or to develop the high-level requested skill directly. In the former case, the adaptive process leads to the identification of the fine-grained features of the agent (e.g., number and type of sensors, body shape, architecture, and connection weights of the neural controller) which, by interacting among themselves and with the environment, will produce the required skill. In the latter case, the adaptive process leads to the identification of the lower-level skills (at different levels of organization) which are necessary to produce the required high-level skill, the identification of the way in which these lower-level skills should be combined and integrated, and (as for the former case) the identification of the fine-grained features of the agent which, in interaction with the physical and social environment, will produce the required behavioral or cognitive skills.

Developmental Robotics Methods

Developmental robotics (Asada et al. 2001; Brooks et al. 1998; Lungarella et al. 2003), also known as epigenetic robotics, is a method for developing embodied and situated agents that adapt to their task/environment autonomously through processes inspired by biological developmental and learning processes.

Evolutionary and developmental robotics methods share the same fundamental assumptions but also present differences concerning the way in which they are realized and the type of situations in which they are typically applied. Concerning the former aspect, unlike evolutionary robotics methods, which operate on long phylogenetic timescales, developmental methods typically operate on short ontogenetic timescales. Concerning the latter aspect, unlike evolutionary methods, which are usually used to develop behavioral and cognitive skills from scratch, developmental methods are typically adopted to model the development of complex behavioral and cognitive skills from simpler preexisting skills which represent prerequisites for the development of the required skills.

At the present stage, developmental robotics does not consist of a well-defined methodology (Asada et al. 2001; Lungarella et al. 2003) but rather of a collection of approaches and methods
often addressing complementary aspects, which hopefully will be integrated into a single methodology in the future. The sections below briefly summarize some of the most important methodological aspects of the developmental robotics approach.

The Incremental Nature of the Developmental Process

development of social skills (Breazeal 2003) but also as facilitators for the development of individual cognitive and behavioral skills (Tani et al. 2008). Moreover, other types of social interactions (i.e., alignment processes or social games) might lead to the development of cognitive and/or behavioral skills which are generated by a collection of individuals and which could not be developed by a single individual robot (Steels 2003).
behavioral and cognitive skills displayed by adaptive agents can be properly characterized as complex systems with multilevel and multi-scale properties resulting from a large number of interactions at different levels of organization and involving both bottom-up processes (in which the interaction between elements at lower levels of organization leads to higher-level properties) and top-down processes (in which properties at a certain level of organization later affect lower-level properties or processes).

Finally, we briefly introduced the methods which can be used to synthesize adaptive embodied and situated agents.

The complex-system nature of adaptive agents which are embodied and situated has important implications which constrain the organization of these systems and the dynamics of the adaptive process through which they develop their skills.

Concerning the organization of these systems, it implies that agents' behavioral and/or cognitive skills (at any stage of the adaptive process) cannot be traced back to any one of the three foundational elements (i.e., the body of the agents, the control system of the agents, and the environment) in isolation but should rather be characterized as properties which emerge from the interactions between these three elements and from the interaction between the behavioral and cognitive properties emerging from the former interactions at different levels of organization. Moreover, it implies that complex behavioral or cognitive skills might emerge from the interaction between simple properties and processes.

Concerning the agents' adaptive process, it implies that the development of new complex skills does not necessarily require the development of new complex morphological features or new complex control mechanisms. Indeed, new complex skills might arise from the addition of new simple features or new simple control rules which, in interaction with the preexisting features and processes, might produce the required new behavioral or cognitive skills.

The study of adaptive behavior in artificial agents which has been reviewed in this entry has important implications both from an engineering point of view (i.e., for progressing in our ability to develop effective machines) and from a modeling point of view (i.e., for understanding the characteristics of biological organisms).

In particular, from an engineering point of view, progress in our ability to develop adaptive embodied and situated agents can lead to the development of machines performing useful functions. From a modeling point of view, progress in our ability to model and analyze artificial adaptive agents can improve our understanding of the general mechanisms behind animal and human intelligence. For example, the comprehension of the complex-system nature of behavioral and cognitive skills illustrated in this entry can allow us to better define the notions of embodiment and situatedness, which represent two foundational concepts in the study of natural and artificial intelligence. Indeed, although possessing a body and being in a physical environment certainly represent a prerequisite for considering an agent embodied and situated, a more useful definition of embodiment (or of true embodiment) can be given in terms of the extent to which a given agent exploits its body characteristics to solve its adaptive problem (i.e., the extent to which its body structure is adapted to the problem to be solved, or, in other words, the extent to which its body performs morphological computation). Similarly, a more useful definition of situatedness (or true situatedness) can be given in terms of the extent to which an agent exploits its interaction with the physical and social environment, and the properties originating from this interaction, to solve its adaptive problem. For the sake of clarity, we can refer to the former definition of the terms (i.e., possessing a physical body and being situated in a physical environment) as embodiment and situatedness in a weak sense and to the latter definition as embodiment and situatedness in a strong sense.

Bibliography

Asada M, MacDorman K, Ishiguro H, Kuniyoshi Y (2001) Cognitive developmental robotics as a new paradigm for the design of humanoid robots. Robot Auton Syst 37:185–193
Baldassarre G, Parisi D, Nolfi S (2006) Distributed coordination of simulated robots based on self-organisation. Artif Life 3(12):289–311
Beer RD (1995) A dynamical systems perspective on agent-environment interaction. Artif Intell 72:173–215
Beer RD (2003) The dynamics of active categorical perception in an evolved model agent. Adapt Behav 11:209–243
Berthouze L, Lungarella M (2004) Motor skill acquisition under environmental perturbations: on the necessity of alternate freezing and freeing. Adapt Behav 1(1):47–63
Bongard JC, Paul C (2001) Making evolution an offer it can't refuse: morphology and the extradimensional bypass. In: Keleman J, Sosik P (eds) Proceedings of the sixth European conference on artificial life. Lecture notes in artificial intelligence, vol 2159. Springer, Berlin
Breazeal C (2003) Towards sociable robots. Robot Auton Syst 42(3–4):167–175
Brooks RA (1991a) Intelligence without reason. In: Mylopoulos J, Reiter R (eds) Proceedings of 12th international joint conference on artificial intelligence. Morgan Kaufmann, San Mateo
Brooks RA (1991b) Intelligence without reason. In: Proceedings of 12th international joint conference on artificial intelligence, Sydney, pp 569–595
Brooks RA, Breazeal C, Irie R, Kemp C, Marjanovic M, Scassellati B, Williamson M (1998) Alternate essences of intelligence. In: Proceedings of the fifteenth national conference on artificial intelligence (AAAI-98), Madison, pp 961–976
Chiel HJ, Beer RD (1997) The brain has a body: adaptive behavior emerges from interactions of nervous system, body and environment. Trends Neurosci 20:553–557
Clark A (1997) Being there: putting brain, body and world together again. MIT Press, Cambridge
De Greef J, Nolfi S (2010) Evolution of implicit and explicit communication in a group of mobile robots. In: Nolfi S, Mirolli M (eds) Evolution of communication and language in embodied agents. Springer, Berlin
Endo I, Yamasaki F, Maeno T, Kitano H (2002) A method for co-evolving morphology and walking patterns of biped humanoid robot. In: Proceedings of the IEEE conference on robotics and automation, Washington, DC
Floreano D, Husbands P, Nolfi S (2008) Evolutionary robotics. In: Siciliano B, Oussama K (eds) Handbook of robotics. Springer, Berlin
Gigliotta O, Nolfi S (2008) On the coupling between agent internal and agent/environmental dynamics: development of spatial representations in evolving autonomous robots. Adapt Behav 16:148–165
Goldenberg E, Garcowski J, Beer RD (2004) May we have your attention: analysis of a selective attention task. In: Schaal S, Ijspeert A, Billard A, Vijayakumar S, Hallam J, Meyer J-A (eds) From animals to animats 8: proceedings of the eighth international conference on the simulation of adaptive behavior. MIT Press, Cambridge
Harvey I (2000) Robotics: philosophy of mind using a screwdriver. In: Gomi T (ed) Evolutionary robotics: from intelligent robots to artificial life, vol III. AAI Books, Ontario
Holland J (1975) Adaptation in natural and artificial systems. University of Michigan Press, Ann Arbor
Keijzer F (2001) Representation and behavior. MIT Press, London
Kelso JAS (1995) Dynamic patterns: the self-organization of brain and behaviour. MIT Press, Cambridge
Lungarella M, Metta G, Pfeifer R, Sandini G (2003) Developmental robotics: a survey. Connect Sci 15:151–190
Marocco D, Nolfi S (2007) Emergence of communication in embodied agents evolved for the ability to solve a collective navigation problem. Connect Sci 19(1):53–74
Massera G, Cangelosi A, Nolfi S (2007) Evolution of prehension ability in an anthropomorphic neurorobotic arm. Front Neurorobot 1(4):1–9
McGeer T (1990) Passive walking with knees. In: Proceedings of the IEEE conference on robotics and automation, vol 2, pp 1640–1645
Metta G, Sandini G, Natale L, Panerai F (2001) Development and robotics. In: Proceedings of IEEE-RAS international conference on humanoid robots, pp 33–42
Mondada F, Franzi E, Ienne P (1993) Mobile robot miniaturisation: a tool for investigation in control algorithms. In: Proceedings of the third international symposium on experimental robotics, Kyoto
Mondada F, Pettinaro G, Guignard A, Kwee I, Floreano D, Deneubourg J-L, Nolfi S, Gambardella LM, Dorigo M (2004) Swarm-bot: a new distributed robotic concept. Auton Robot 17(2–3):193–221
Nolfi S (2002) Power and limits of reactive agents. Neurocomputing 49:119–145
Nolfi S (2005) Behaviour as a complex adaptive system: on the role of self-organization in the development of individual and collective behaviour. Complexus 2(3–4):195–203
Nolfi S, Floreano D (1999) Learning and evolution. Auton Robot 1:89–113
Nolfi S, Floreano D (2000) Evolutionary robotics: the biology, intelligence, and technology of self-organizing machines. MIT Press/Bradford Books, Cambridge
Nolfi S, Marocco D (2002) Active perception: a sensorimotor account of object categorization. In: Hallam B, Floreano D, Hallam J, Hayes G, Meyer J-A (eds) From animals to animats 7: proceedings of the VII international conference on simulation of adaptive behavior. MIT Press, Cambridge, pp 266–271
Oudeyer P-Y, Kaplan F, Hafner V (2007) Intrinsic motivation systems for autonomous mental development. IEEE Trans Evol Comput 11(2):265–286
Pfeifer R, Bongard J (2007) How the body shapes the way we think. MIT Press, Cambridge
Pfeifer R, Iida F, Gómez G (2006) Morphological computation for adaptive behavior and cognition. Int Congr Ser 1291:22–29
Pollack JB, Lipson H, Funes P, Hornby G (2001) Three generations of coevolutionary robotics. Artif Life 7:215–223
Prokopenko M, Gerasimov V, Tanev I (2006) Evolving spatiotemporal coordination in a modular robotic system. In: Rocha LM, Yaeger LS, Bedau MA, Floreano D, Goldstone RL, Vespignani A (eds) Artificial life X: proceedings of the tenth international conference on the simulation and synthesis of living systems. MIT Press, Boston
Scassellati B (2001) Foundations for a theory of mind for a humanoid robot. PhD thesis, Department of Electrical Engineering and Computer Science, MIT, Boston
Scheier C, Pfeifer R, Kuniyoshi Y (1998) Embedded neural networks: exploiting constraints. Neural Netw 11:1551–1596
Schmidhuber J (2006) Developmental robotics, optimal artificial curiosity, creativity, music, and the fine arts. Connect Sci 18(2):173–187
Schmitz A, Gómez G, Iida F, Pfeifer R (2007) On the robustness of simple speed control for a quadruped robot. In: Proceedings of the international conference on morphological computation, Venice
Slocum AC, Downey DC, Beer RD (2000) Further experiments in the evolution of minimally cognitive behavior: from perceiving affordances to selective attention. In: Meyer J, Berthoz A, Floreano D, Roitblat H, Wilson S (eds) From animals to animats 6: proceedings of the sixth international conference on simulation of adaptive behavior. MIT Press, Cambridge
Steels L (2003) Evolving grounded communication for robots. Trends Cogn Sci 7(7):308–312
Sugita Y, Tani J (2005) Learning semantic combinatoriality from the interaction between linguistic and behavioral processes. Adapt Behav 13(1):33–52
Tani J, Fukumura N (1997) Self-organizing internal representation in learning of navigation: a physical experiment by the mobile robot Yamabico. Neural Netw 10(1):153–159
Tani J, Nolfi S (1999) Learning to perceive the world as articulated: an approach for hierarchical learning in sensory-motor systems. Neural Netw 12:1131–1141
Tani J, Nishimoto R, Namikawa J, Ito M (2008) Co-developmental learning between human and humanoid robot using a dynamic neural network model. IEEE Trans Syst Man Cybern B Cybern 38:1
Varela FJ, Thompson E, Rosch E (1991) The embodied mind: cognitive science and human experience. MIT Press, Cambridge
van Gelder TJ (1998) The dynamical hypothesis in cognitive science. Behav Brain Sci 21:615–628
Vaughan E, Di Paolo EA, Harvey I (2004) The evolution of control and adaptation in a 3D powered passive dynamic walker. In: Pollack J, Bedau M, Husbands P, Ikegami T, Watson R (eds) Proceedings of the ninth international conference on the simulation and synthesis of living systems. MIT Press, Cambridge
Interaction-Based Computing in Physics

Franco Bagnoli
Department of Physics and Astronomy and CSDC, University of Florence, Florence, Italy

Article Outline

Glossary
Definition
Introduction: Physics and Computers
From Trajectories to Statistics and Back
Artificial Worlds
Future Directions
Bibliography

Glossary

Correlation The correlation between two variables is the difference between the joint probability that the two variables take some values and the product of the two probabilities (which is the joint probability of two uncorrelated variables), summed over all possible values. In an extended system, it is expected that the correlation among parts diminishes with their distance, typically in an exponential manner.
Critical phenomenon A condition for which an extended system is correlated over extremely long distances.
Extended system A system composed of many parts connected by a network of interactions that may be regular (lattice) or irregular (graph).
Graph, lattice, tree A graph is a set of nodes connected by links, oriented or not. If the graph is translationally invariant (it looks the same when changing nodes), it is called a (regular) lattice. A disordered lattice is a lattice with a fraction of removed links or nodes. An ordered set of nodes connected by links is called a path. A closed path not passing on the same links is a loop. A cluster is a set of connected nodes. A graph can be composed of one cluster (a connected graph) or more than one (a disconnected graph). A tree is a connected graph without loops.
Mean field An approximate technique for computing the value of the observables of an extended system, neglecting correlations among parts. If necessary, the dynamics is first approximated by a stochastic process. In its simpler version, the probability of a state of the system is approximated by the product of the probabilities of each component, neglecting correlations. Since the states of two components that depend on a common "ancestor" (that interact with a common node) are in general not uncorrelated, and this situation corresponds to an interaction graph with loops, the simplest mean-field approximation consists in replacing the graph or the lattice of interactions with a tree.
Monte Carlo A method for producing stochastic trajectories in the state space designed in such a way that the time-averaged probability distribution is the desired one.
Nonlinear system A system composed of parts whose combined effects are different from the sum of the effects of each part.
Percolation The appearance of a "giant component" (a cluster that contains essentially all nodes or links) in a graph or a lattice, after adding or removing nodes or links. Below the percolation threshold, the graph is partitioned into disconnected clusters, none of which contains a substantial fraction of nodes or links, in the limit of an infinite number of nodes and links.
Probability distribution The probability of finding a system in a given state, for all the possible states.
State of a system A complete characterization of a system at a given time, assigning or measuring the positions, velocities, and other dynamical variables of all the elements (sometimes called a configuration). For completely discrete systems (cellular automata) of finite size, the state
of the system is just a set of integer numbers, and therefore the state space is numerable.
State space The set of all possible states of a system.
Trajectory A sequence of states of a system, labeled with the time, i.e., a path in the state space.

Definition

Theoretical physics investigation is based on building models of reality. Due to our limited cognitive capabilities, in order to understand a phenomenon, we need to represent it using a limited amount of symbols. Clearly, it is generally impossible to reproduce all aspects of a physical system with a simple model. However, even using simple building blocks, one can obtain systems whose behavior is quite complex. Therefore, most of the job of theoretical physics is that of identifying and investigating simple models that reproduce the main aspects of complex phenomena. Although some progress can be made analytically, most models are also investigated by computer simulations.

Computers have changed the way a physical model is studied. Computers may be used to calculate the properties of a very complicated model representing a real system or to investigate experimentally what the essential ingredients of a complex phenomenon are. In order to carry out these explorations, several basic models have been developed, which are now used as building blocks for performing simulations and designing algorithms in many fields, from chemistry to engineering, from natural sciences to psychology. Rather than being derived from some fundamental law of physics, these blocks constitute artificial worlds still to be completely explored.

In this entry, we shall first present a pathway from Newton's laws to cellular automata and agent-based simulations, showing (some) computational approaches in classical physics. Then, we shall present some examples of artificial worlds.

Introduction: Physics and Computers

Some 60 years ago, shortly after the end of the Second World War, computers became available to scientists. Actually, computers had already been used during the last years of the war for performing computations about a specific field of physics: the atomic bomb (Harlow and Metropolis 1983; Metropolis et al. 1980).

Up to then, the only available tool, except experiments, was paper and pencil. Starting with Newton and Leibnitz, humans discovered that continuous mathematics (i.e., differential and integral calculus) allowed one to derive many consequences of a given hypothesis just by the manipulation of symbols. It seemed natural to express all quantities (e.g., time, space, mass) as continuous variables. Notice however that the idea of a continuous number is not at all "natural": one has to learn how to deal with it, while (small) integer numbers can be used and manipulated (added, subtracted) by illiterate humans and also by many animals. A point worth stressing is that any computation refers to a model of certain aspects of reality considered most important, while others are assumed to be unimportant.

Most investigations in physics carried out using only paper and pencil are limited to almost-linear systems, or to systems whose effective number of variables is quite small. On the other hand, most of the naturally occurring phenomena can be successfully modeled only using nonlinear elements. Therefore, a great deal of precomputer physics is essentially linear physics, although astronomers (like other scientists) used to integrate numerically, by hand, the nonlinear equations of gravitation, in order to compute the trajectories of planets. This computation, however, was so cumbersome that no playing with trajectories was possible.

How can a computer solve physical problems? While analog computers have been used for integrating differential equations, the much more flexible digital computers are deterministic discrete systems. The way of working of a (serial) computer is that of a very fast automaton that manipulates data following a program.

In order to use computers as fast calculators, scientists ported and adapted existing numerical algorithms and developed new ones. This implied the use of techniques able to approximate the
In the following, we shall try to elucidate some aspects of the interplay between computers and physics. In section "From Trajectories to Statistics and Back," we shall illustrate possible logic pathways (in classical mechanics) leading from Newton's equations to research fields that use computers as an investigative tool, like agent-based investigations of human societies. In section "Artificial Worlds," we shall try to present succinctly some examples of artificial systems that are still active research topics in theoretical physics.

From Trajectories to Statistics and Back

The outline of this section is the following. Physics is often denoted the most fundamental science, and one may think that, given powerful enough computers, one should be able to reconstruct any experimental situation simply by implementing the fundamental laws of physics. I would like to show that any investigation is based on models, requiring approximations and additional assumptions, and that any change of scale implies a change of model. However, an important suggestion from physics is that similar models can be used to interpret different situations, and therefore the associated computational techniques can be reused in different contexts. We shall follow this line from Newton's equations to agent-based modeling.

Let us assume that a working model of reality can be built using a set of dynamical equations, for instance, those of classical mechanics. We shall consider the model of a system formed by many particles, like a solid or a fluid. The state of the resulting system can be represented as a point in a high-dimensional space, since it is given by all coordinates and velocities of all particles. The evolution of the system is a trajectory in such a space. Clearly, the visualization and the investigation of such a problem are challenging, even using powerful computers.

Moreover, even if we were able to compute many trajectories (in order to have an idea of fluctuations), this would not imply that we have understood the problem. Let us consider, for instance, meteorology: one is interested in the probability of rain or in the expected wind velocity, not in forecasting the trajectories of all molecules of air. Similarly, in psychology, one is interested in the expected behavior of an individual, not in computing the activity of all neurons in his brain.

Since physics is the oldest discipline that has been quantified into equations, it may be illuminating to follow some of the paths taken by researchers to reduce the complexity of a high-dimensional problem to something more manageable, or at least simpler to simulate on a computer.

In particular, we shall see that many approaches consist in projecting the original space onto a limited number of dimensions, corresponding to the observables that vary in a slow and smooth way, and assuming that the rest of the dynamics is approximated by "noise." The noise can be so small, compared to the macroscopic observables, that it can be neglected. In such cases, one has a deterministic, low-dimensional dynamical system, for instance, the usual models for rigid bodies, planets, etc. In the opposite case, the resulting system is stochastic, and one is interested in computing the average values of observables over the probability distribution of the projected system.

However, the computation of the probability distribution may be hard, and so one seeks a way of producing artificial trajectories, in the projected space, designed in such a way that their probability distribution is the desired one. In so doing, the problem reduces to the computation of the time-averaged values of slow observables. For the rest of this section, please refer to Fig. 1.

Newton's Laws
The success of Newton in describing the motion of one body, subjected to a static field force (say, the gravitational motion of one planet, the oscillation of a body attached to a spring, the motion of the pendulum, etc.), clearly proved the validity of his approach and also the validity of using simple models for dealing with natural phenomena. Indeed, the representation of a body as a point mass, the idea of massless springs and strings,
Interaction-Based Computing in Physics, Fig. 1 Graphical illustration of the logic path followed in this introduction. Boxes with double frames are "starting points", dashed boxes are topics that are not covered by the discussion, and boxes with darker frames mark topics that are investigated in more detail.
and the concept of force fields are all mathematical idealizations of reality.

The natural generalization of this approach was carried out in the eighteenth century by Lagrange, Hamilton, and many others. It brought about the mathematization of mechanics and the derivation of rational mechanics. The resulting standard (or historical) way of modeling physical systems is that of using differential equations, i.e., a continuous description of time, space, and other dynamical quantities.

From an abstract point of view, one is faced with two different options: either concentrate on systems described by a few equations (low-dimensional systems) or try to describe systems formed by many components.

Low Dimensionality
Historically, the most important problem of Newton's times was that of three bodies interacting via gravitational attraction (the Sun, the Earth, and the Moon). By approximating planets with point masses, one gets a small number of coupled differential equations. This reduction of dimensionality is an example of a scale separation: the variables that describe the motion of the planets vary slowly and smoothly in time. Other variables, for instance, those that describe the oscillations of a molecule on the surface of a planet, can be approximated by a noise term so small that it can be safely neglected. This approximation can also be seen as a "mean-field" approach, in which one assumes that variables behave not too differently from their average. Using these techniques, one can develop models of many systems that result in the same mathematical scheme: a few coupled equations. The resulting equations may clearly have a structure quite different from that resulting from Newtonian dynamics (technically, Hamiltonian systems).

However, the reduction of the number of variables does not guarantee the simplicity of the resulting model. The problem of three gravitational bodies cannot be split into smaller pieces, and the computation of an accurate trajectory requires a computer. In general, a nonlinear system in a space with three or more dimensions is chaotic. This implies that it may react to a small
perturbation of the parameters or of the initial conditions with large variations of its trajectory. This sensitivity to variations implies the impossibility of predicting its behavior for long times, unless one is content with a probabilistic description.

The reduction from a high-dimensional system to a small number of equations can in general be considered the result of a projection operation, similar to studying the dynamics of a couple of dancers by looking only at their shadow. The problem is of course that of finding the right projection, for instance, one able to show that the couple is formed by two distinct bodies that only occasionally separate.

High Dimensionality
In many cases, the projection operation results in a system still composed of many parts. For instance, models of nonequilibrium fluids neglect the movement of the individual molecules, but one still has to deal with the values of the pressure, density, and velocity at all points. In these cases, one is left with a high-dimensional problem. Assuming that the noise of the projected dimensions can be neglected, one can either write down a large number of coupled equations (e.g., in modeling the vibrations of a crystal) or use a continuous approach and describe the system by means of partial differential equations (e.g., the model of a fluid).

Linear Systems
In general, high- and low-dimensional approaches can be systematically developed (with paper and pencil) only in the linear approximation. Let us illustrate this point for the case of coupled differential equations: if the system is linear, one can write the equations using matrices and vectors. One can in principle find a (linear) transformation of variables that makes the system diagonal, i.e., that reduces the problem to a set of uncoupled equations. At this point, one is left with (many) one-dimensional independent problems. Clearly, there may be mathematical difficulties, but the path is clear. A similar approach (for instance, using Fourier transforms) can also be used for dealing with partial differential equations.

The variables that result from such operations are called normal modes, because they behave independently of one another (i.e., they correspond to orthogonal or normal directions in the state space). For instance, the linear model of a vibrating string (with fixed ends) predicts that any pattern can be described as a superposition of "modes," which are the standing oscillations with zero, one, two, etc. nodes (the harmonics).

However, linear systems behave in a somewhat strange way from the point of view of thermal physics. Let us consider, for instance, a system composed of two uncoupled oscillators. It is clear that if we excite one oscillator with any amount of energy, it will remain confined to that subsystem. With normal modes, the effect is the same: any amount of energy communicated to a normal mode remains confined to that mode, if the system is completely linear. In other words, the system never forgets its initial conditions.

On the contrary, the long-time behavior of normal systems does not depend strongly on the initial conditions. One example is the timbre or "sound color" of an object. It is given by the simultaneous oscillations on many frequencies, but in general an object emits its characteristic sound regardless of how exactly it is excited. This would not be true for linear systems.

Since the distribution of energy to all available modes is one of the assumptions of equilibrium statistical mechanics, which allows us to understand the usual behavior of matter, we have arrived at an unpleasant situation: linear systems, which are so "easy" to study, cannot be used to ground statistical physics on mechanics.
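A self-contained illustration of the diagonalization just described, for the textbook case of two equal masses attached to walls and coupled by a spring (an example chosen here; it is not taken from this entry):

```python
import numpy as np

k, kc = 1.0, 0.5                  # wall springs and coupling spring (illustrative values)
# Two equal unit masses: the linearized equations of motion are d2x/dt2 = -K x.
K = np.array([[k + kc, -kc],
              [-kc,    k + kc]])

# Diagonalizing K decouples the system: the eigenvectors are the normal modes
# and the eigenvalues are the squared angular frequencies of each mode.
omega2, modes = np.linalg.eigh(K)
print("mode frequencies:", np.sqrt(omega2))   # sqrt(k) and sqrt(k + 2*kc)
print("mode shapes (columns):\n", modes)      # in-phase (1,1) and out-of-phase (1,-1), normalized

# In the mode coordinates q = modes.T @ x, each q_i oscillates on its own, so any
# energy given to one mode stays there as long as the system is strictly linear.
```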
Molecular Dynamics
Nowadays, we have computers at our disposal, and therefore we can simulate systems composed of many parts with complex interactions. One can do this simply by discretizing Newton's equations of motion so that a digital computer can approximately integrate the set of coupled differential equations. This technique is sometimes called molecular dynamics.
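A minimal sketch of such a discretization, using the common velocity-Verlet scheme, is given below; the pairwise harmonic force is only a placeholder for a realistic interaction, and all sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, dim, dt, n_steps = 16, 2, 0.005, 1000      # illustrative sizes and time step
pos = rng.uniform(0.0, 5.0, (n, dim))
vel = rng.normal(0.0, 1.0, (n, dim))
mass = 1.0

def forces(pos):
    """Placeholder pairwise interaction: a weak harmonic attraction between all
    pairs. A realistic simulation would use, e.g., a Lennard-Jones potential."""
    disp = pos[:, None, :] - pos[None, :, :]   # r_i - r_j for every pair (i, j)
    return -0.01 * disp.sum(axis=1)

f = forces(pos)
for _ in range(n_steps):
    # Velocity-Verlet discretization of Newton's equations m a = F(x).
    vel += 0.5 * dt * f / mass
    pos += dt * vel
    f = forces(pos)
    vel += 0.5 * dt * f / mass

# Macroscopic quantities are averages of functions of these microscopic variables,
# e.g., the kinetic energy per particle (proportional to the kinetic temperature).
print("kinetic energy per particle:", 0.5 * mass * (vel ** 2).sum() / n)
```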
One is generally interested in computing macroscopic quantities. These are defined as averages of some function of the microscopic variables (positions, velocities, accelerations, etc.) of the
In random walks, each step of the target particle is independent of the previous steps, due to collisions with the rest of the particles. Collisions, moreover, are supposed to be uncorrelated. A more sophisticated approximation consists in keeping some aspects of the motion, for instance, the influence of inertia or of external forces, while still approximating the rest of the world by noise (which may contain a certain degree of correlation). This is known as the Langevin approach, which includes the random walk as the simplest case. Langevin equations are stochastic differential equations.

The essence of this method relies on the assumption that the behavior of the various parts of the system is uncorrelated. This assumption is vital also for other types of approximations that will be illustrated in the following. Notice that in the statistical mechanics approach, this assumption is not required.

In the Langevin formulation, by averaging over many independent realizations of the process (which in general is not the same as averaging over many particles that move simultaneously, due, for instance, to excluded volumes), one obtains the evolution equation of the probability of finding a particle in a given portion of space. This is the Kolmogorov integro-differential equation, which in many cases can be simplified, giving a differential (Fokker-Planck) equation. The diffusion equation is just the simplest case (Gardiner 1994; van Kampen 1992).
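As an illustration of the Langevin approach just described, the following sketch integrates an overdamped Langevin equation for many independent walkers; the harmonic force and the parameter values are arbitrary choices made for the example, and with the force removed the code reduces to a plain random walk.

```python
import numpy as np

rng = np.random.default_rng(0)

def force(x):
    """Deterministic part retained from the projected dynamics (here a harmonic
    restoring force); setting it to zero recovers the plain unbiased random walk."""
    return -x

D, dt, n_steps, n_walkers = 0.5, 0.01, 5000, 1000   # illustrative values
x = np.zeros(n_walkers)                             # many independent realizations

for _ in range(n_steps):
    noise = rng.normal(0.0, 1.0, n_walkers)
    # Euler-Maruyama step of the overdamped Langevin equation
    #   dx = force(x) dt + sqrt(2 D dt) * xi
    x = x + force(x) * dt + np.sqrt(2.0 * D * dt) * noise

# Averaging over the independent realizations approximates the probability
# distribution whose evolution is governed by the Fokker-Planck equation.
print("mean:", x.mean(), "variance:", x.var())      # close to 0 and D for this force
```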
It is worth noticing that a similar formulation may be developed for quantum systems: the Feynman path-integral approach is essentially a Langevin formulation, and the Schroedinger equation is the corresponding Fokker-Planck equation.

Random walks and stochastic differential equations find many applications in economics, mainly in stock market simulations. In these cases, one is not interested in the average behavior of the market, but rather in computing nonlinear quantities over trajectories (time series of good values, fluctuations, etc.).

Time-Series Data
In practice, a model is never derived ab initio, by projecting the dynamics of all the microscopic components onto a limited number of dimensions, but is constructed heuristically from observations of the behavior of a real system.

It is therefore crucial to investigate how observations are made, i.e., to analyze a series of measurements in time. In particular, a good exercise is that of simulating a dynamical or stochastic system, analyzing the resulting time-series data of a given observable, and seeing whether one is able to reconstruct from them the relationships or the equations ruling the time evolution.

Let us consider the experimental study of a chaotic, low-dimensional system. The measurements on this system give a time series of values that we assume to be discrete (which is actually the case, considering experimental errors). Therefore, the output of our experiment is a series of symbols or numbers, a time series. Let us assume that the system is stationary, i.e., that the sequence is statistically homogeneous in time. If the system is not extremely chaotic, symbols in the sequence are correlated, and one can derive the probability of observing single symbols, couples of symbols, triples of symbols, etc. There is a hierarchy in these probabilities, since the knowledge of the distribution of triples allows the computation of the distribution of couples, and so on.

It can be shown that the knowledge of the probability distribution of the infinite sequence is equivalent to the complete knowledge of the dynamics. However, this would correspond to performing an infinite number of experiments, for all possible initial conditions.

The usual investigation scheme assumes that correlations vanish beyond a certain distance, which is equivalent to assuming that the probabilities of observing sequences longer than that distance factorize. Therefore, one tries to model the evolution of the system by a probabilistic dynamics of symbols, as shown in section "Probabilistic Cellular Automata." Time-series data analysis can therefore be considered as the main experimental motivation for developing probabilistic discrete models. This can be done heuristically, comparing results with observations a posteriori, or by trying to extract the rules directly from data, as in the Markov approach.

Markov Approximation
The Markov approach, either continuous or discrete, also assumes that the memory of the system
vanishes after a certain time interval, i.e., that the correlations in the time series decay exponentially. In discrete terms, one tries to describe the process under study as an automaton, with given transition probabilities. The main problem is: given a sequence of symbols, what is the simplest automaton (hidden Markov chains (Rabiner 1989)) that can generate that sequence with maximum "predictability," i.e., with transition probabilities that are nearest to zero or one? Again, it is possible to derive a completely deterministic automaton, but in general it has a number of states equivalent to the length of the time series, so it is not generalizable and has no predictability (see also section "Probabilistic Cellular Automata"). On the contrary, an automaton with a very small number of nodes will typically have intermediate transition probabilities, so predictability is again low (essentially equivalent to random extraction). Therefore, a good model is the result of an optimization problem that can be studied using, for instance, Monte Carlo techniques.
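A small sketch of the simplest, fully observed version of this approach (not the hidden-Markov optimization mentioned above): estimate one-step transition probabilities from a symbol sequence, here generated by an arbitrary artificial rule, and use them as a probabilistic model of the dynamics.

```python
import random
from collections import Counter, defaultdict

random.seed(0)

# An artificial stationary symbol sequence, standing in for measured time-series
# data: a noisy alternation between two symbols (the generating rule is arbitrary).
seq, state = [], "A"
for _ in range(10000):
    seq.append(state)
    if random.random() < 0.9:
        state = "B" if state == "A" else "A"

# Markov approximation: assume the memory of the process is one step and
# estimate P(next symbol | current symbol) from the observed pair counts.
pair_counts = defaultdict(Counter)
for current, nxt in zip(seq, seq[1:]):
    pair_counts[current][nxt] += 1

transition = {s: {t: c / sum(counts.values()) for t, c in counts.items()}
              for s, counts in pair_counts.items()}
print(transition)   # close to {'A': {'B': 0.9, 'A': 0.1}, 'B': {'A': 0.9, 'B': 0.1}}
```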
Mean Field
Finally, from the probabilities one can compute averages of observables, fluctuations, and other quantities called moments of the distribution. Actually, the knowledge of all moments is equivalent to the knowledge of the whole distribution. Therefore, another approach is that of relating moments at different times or different locations, truncating the recurrences at a certain level. The roughest approximation is that of truncating the relations at the level of averages, i.e., the mean-field approach. It appears so natural that it is often used without realizing the implications of the approximations. For instance, chemical equations are essentially mean-field approximations of a complex phenomenon.

Boltzmann Equation
Another similar approach is that of dividing an extended system into zones and assuming that the behavior of the system in each zone is well described by a probability distribution. By disregarding correlations with other zones, one obtains the Boltzmann equation, with which many transport phenomena may be studied well beyond the elementary kinetic theory. The Boltzmann equation can also be obtained from the truncation of a hierarchy of equations (the BBGKY hierarchy) relating multiparticle probability distributions. Therefore, the Boltzmann equation is similar in spirit to a mean-field analysis.

Equilibrium
One of the biggest successes of the stochastic approach is equilibrium statistical mechanics. The main ingredient of this approach is that of minimum information, which, in other words, corresponds to the assumption: what is not known is not harmful. By supposing that at equilibrium the probability distribution of the system maximizes the information entropy (corresponding to a minimum of information on the system), one is capable of deriving the probability distribution itself and therefore the expected values of observables (ensemble averages, see section "Ising Model"). In this way, using an explicit model, one is able to compute the values of the parameters that appear in thermodynamics. If it were possible to show that the maximum-entropy state is actually the state originated by the dynamics of a mechanical (or quantum) system, one could ground thermodynamics on mechanics. This is a long-investigated subject, dating back to Boltzmann, which has however not yet been clarified. The main drawback in the derivations concerns ergodicity. Roughly speaking, a system is called ergodic if the infinite-time average of an observable over a trajectory coincides with its average over a snapshot of infinitely many replicas. For a system with fixed energy and no other conserved quantities, a sufficient condition is that a generic trajectory passes near all points of the accessible phase space. However, most systems whose behavior is "experimentally" well approximated by statistical mechanics are not ergodic. Moreover, another ingredient, the capability of quickly forgetting the information about the initial conditions, appears to be required; otherwise trajectories are strongly correlated, and averages over different trajectories cannot be "mixed" together. This capability is strongly connected to the chaoticity or unpredictability of extended systems, but unfortunately these ingredients make analytic approaches quite hard.
776 Interaction-Based Computing in Physics
An alternative approach, due to Jaynes (1957), is much more pragmatic. In essence, it says: let us design a model with the ingredients that one thinks are important, and assume that what is not in the model does not affect its statistical properties. Compute the distribution that maximizes the entropy with the given constraints. Then, compare the results (averages of observables) with experiments (possibly, numerical ones). If they agree, one has captured the essence of the problem; otherwise one has to include some other ingredients and repeat the procedure. Clearly, this approach is much more general than the "dynamical" one, since it does not consider trajectories or make assumptions about the energy, which is simply seen as a constraint. But physicists would be much more satisfied by a microscopic derivation of statistical physics.
In spite of this lack of a strong basis, the statistical mechanics approach is quite powerful, especially for systems that can be reduced to the case of almost independent elements. In this situation, the probability distribution of the system (the partition function) factorizes, and many computations may be performed by hand. Notice however that this behavior is in strong contrast to that of truly linear systems: the "almost" attribute indicates that actually the elements interact and therefore share the same "temperature."

Monte Carlo
The Monte Carlo technique was invented for computing, with the aid of a computer, thermal averages of observables of physical systems at equilibrium. Since then, this term has often been used to denote the technique of computing the average values of observables of a stochastic system by computing time averages over artificial trajectories.
In equilibrium statistical physics, one is faced with the problem of computing averages of observables over the probability distribution of the system, and since the phase space is very high dimensional, this is in general not an easy task: one cannot simply draw random configurations, because in general they are so different from those typical of the given value of the temperature that their statistical weight is marginal. And one does not want to revert to the original, still-more-highly dimensional dynamical system, which typically requires powerful computers just to be followed for tiny time intervals.
First of all, one can divide (separate) the model into almost independent subsystems that, due to the small residual interactions (the "almost" independency), are at the same temperature. In the typical example of a gas, the velocity components appear in the formula for the energy as additive terms, i.e., they do not interact with themselves or with other variables. Therefore, they can be studied separately, giving the Maxwell distribution of velocities. The positions of the molecules, however, are linked by the potential energy (except in the case of an ideal gas), and so the hard part of the computation is that of generating configurations. Secondly, statistical mechanics guarantees that the asymptotic probability distribution does not depend on the details of the dynamics. Therefore, one is free to look for the simplest dynamics still compatible with the constraints. The Monte Carlo computation is just a set of recipes for generating such trajectories. In many problems, this approach allows one to reduce the (computational) complexity of the problem by several orders of magnitude, generating artificial trajectories that span statistically significant configurations with small computational effort. In parallel with the generation of the trajectory, one can compute the value of several observables and perform statistical analysis on them, in particular the computation of time averages and fluctuations.
By extension, the same term Monte Carlo is used for the technique of generating sequences of states (trajectories) given the transition probabilities, and computing averages of observables on trajectories, instead of over the probability distribution.
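As a concrete illustration, the following minimal sketch (in Python, not part of the original exposition) generates a Metropolis-style trajectory for a single variable and accumulates time averages along it; the double-well energy function, the proposal width, and the temperature are illustrative assumptions, not prescriptions of the text.

import math, random

def metropolis_trajectory(energy, x0, T, steps, step=0.5, seed=0):
    """Generate a Metropolis trajectory for a scalar variable x.
    Each proposed displacement is accepted with probability
    min(1, exp(-dE/T)), so the visited states sample exp(-E(x)/T)."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    total, total2 = 0.0, 0.0
    for _ in range(steps):
        x_new = x + rng.uniform(-step, step)
        e_new = energy(x_new)
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / T):
            x, e = x_new, e_new            # accept the move
        total += x                          # accumulate the time average
        total2 += x * x
    mean = total / steps
    return mean, total2 / steps - mean ** 2  # time average and fluctuation

if __name__ == "__main__":
    # toy double-well energy, used here only as an example
    E = lambda x: (x * x - 1.0) ** 2
    avg, var = metropolis_trajectory(E, x0=0.0, T=0.3, steps=100_000)
    print("time average:", avg, "fluctuation:", var)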
Stochastic Optimization
One of the most interesting applications of Monte Carlo simulations concerns stochastic optimization via simulated annealing. The idea is that of exploiting an analogy between the state of a system (and its energy) and the coding of a particular procedure with a corresponding cost function. The goal is that of finding the best solution, i.e., the global minimum of the energy given the constraints. Easy systems have a smooth energy landscape, shaped like a funnel, so that usual techniques like that of always choosing the displacement that locally lowers the energy (gradient descent) are successful. However, when the energy landscape is corrugated, there are many local minima where algorithms like gradient descent tend to get trapped. Methods from statistical mechanics (Monte Carlo), on the contrary, are targeted at generating trajectories that quickly explore the relevant parts of the state space, i.e., those that correspond to the largest contributions to the probability distribution, which depends on the temperature, an external or control parameter. If the temperature is high, the probability distribution is broad and the generated trajectory does not "see" the minima of energy that are below the temperature, i.e., it can jump over and off the local minima.
By lowering the temperature, the probability distribution of a system obeying statistical mechanics concentrates around the minima of energy, and the Monte Carlo trajectory does the same. The energy (or cost) function of not extremely complex problems is shaped in such a way that the global optimum belongs to a broad valley, so that this lowering of the temperature increases the probability of finding it.
Therefore, a sufficiently slow annealing should furnish the desired global minimum. Moreover, it is possible to convert constraints into energy terms, which is quite convenient since for many problems it is difficult to generate configurations that satisfy the constraints. Let us think, for instance, of the problem of generating a school timetable, taking into consideration that lessons should not last more than three consecutive hours, that a teacher or students cannot stay in two classes at the same time, that a teacher is not available on Monday, that another prefers the first hours, etc. It is difficult to generate generic configurations that obey all constraints, while it is easy to formulate a Monte Carlo algorithm that generates arbitrary configurations, weighting them with a factor that depends on how many constraints are violated. At high temperature, constraints do not forbid the exploration of the state space and therefore allow trying creative solutions. At low temperature, constraints become important. At zero temperature, the configurations with lower energy are those that satisfy all constraints, if possible, or at least the largest part of them.
In recent years, physicists have dealt with extremely complex problems (e.g., spin glasses (Dotsenko 1994; Mezard et al. 1987)), in which the energy landscape is extremely rough. Special techniques based on nonmonotonous "walks" in temperature have been developed (simulated tempering (Marinari and Parisi 1992)).
state space, i.e., those that correspond to the larg-
est contributions to the probability distribution Critical Phenomena
that depends on the temperature, an external or One of the most investigated topics of statistical
control parameter. If the temperature is high, the mechanics concerns phase transitions. This is a
probability distribution is broad and the generated fascinating subject: in the vicinity of a continuous
trajectory does not “see” the minima of energy phase transitions, correlation lengths diverge, and
that are below the temperature, i.e., it can jump the system behave collectively, in a way which is
over and off the local minima. largely independent of the details of the model.
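The following sketch, under the assumption of simple site percolation on a square lattice, grows clusters by flood fill and reports the relative size of the largest one; the lattice size and the sampled occupation probabilities p are illustrative (the threshold for this geometry is commonly quoted as p of roughly 0.593).

import random
from collections import deque

def largest_cluster(L, p, seed=0):
    """Occupy each site of an L x L square lattice with probability p and
    return the size of the largest connected cluster (4-neighborhood)."""
    rng = random.Random(seed)
    occupied = [[rng.random() < p for _ in range(L)] for _ in range(L)]
    seen = [[False] * L for _ in range(L)]
    best = 0
    for i in range(L):
        for j in range(L):
            if occupied[i][j] and not seen[i][j]:
                size, queue = 0, deque([(i, j)])
                seen[i][j] = True
                while queue:
                    x, y = queue.popleft()
                    size += 1
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nx, ny = x + dx, y + dy
                        if 0 <= nx < L and 0 <= ny < L and occupied[nx][ny] and not seen[nx][ny]:
                            seen[nx][ny] = True
                            queue.append((nx, ny))
                best = max(best, size)
    return best

if __name__ == "__main__":
    for p in (0.3, 0.5, 0.6, 0.7):
        print(p, largest_cluster(100, p) / 100**2)   # largest-cluster fraction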
Equilibrium and nonequilibrium phase transitions occur for a well-defined value of a control parameter. However, in nature one often observes phenomena whose characteristics resemble those of a system near a phase transition (critical dynamics), without any fine-tuned parameter. For such systems, the term self-organized criticality has been coined (Bak et al. 1987), and they are the subject of active research.

Networks
A recent extension of statistical physics is the theory of networks. Networks in physics are often regular, like the lattice of a crystal, or only slightly randomized. Characteristics of these networks are the fixed (or slightly dispersed around the mean) number of connections per node, the high probability of having connected neighbors (number of "triangles"), and the large time needed to cross the network. The opposite of a regular network is a random graph, which, for the same number of connections, exhibits a low number of triangles and a short crossing time. The statistical and dynamical properties of systems whose connections are regular or random are generally quite different.
Watts and Strogatz (1998) argued that social networks are never completely regular. They showed that the simple random rewiring of a small number of links in a regular network may induce the small-world effect: local properties, like the number of triangles, are not affected, but large-distance ones, like the crossing time, quickly become similar to those of random graphs. Also the statistical and dynamical properties of models defined over a rewired network are generally similar to those on random graphs.
After this finding, many social networks were studied, and they revealed a yet different structure: instead of having a well-defined connectivity, many of them present a few highly connected "hubs" and a lot of poorly connected "leaves." The distribution of connectivity is often shaped as a power law (or similar (Newman 2005)), without a well-defined mean (scale-free networks (Albert and Barabasi 2002)). Many phenomenological models are presently being reexamined in order to investigate their behavior over such networks.
Moreover, scale-free networks cannot simply be laid down; they need to be grown following a procedure, similar in this respect to fractals. It is natural therefore to include such a procedure in the model, so that models not only evolve over the network but also evolve the network itself (Boccaletti et al. 2006).
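A possible minimal implementation of the rewiring experiment, assuming a ring lattice with k nearest neighbors and measuring only the average shortest-path length (the "crossing time" above); network size, k, and the rewiring probabilities are illustrative assumptions.

import random
from collections import deque

def small_world(n=200, k=4, p=0.05, seed=0):
    """Watts-Strogatz-style construction: a ring where each node is linked
    to its k nearest neighbors, each link being rewired with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, k // 2 + 1):
            adj[i].add((i + d) % n)
            adj[(i + d) % n].add(i)
    for i in range(n):
        for d in range(1, k // 2 + 1):
            if rng.random() < p:
                j, new = (i + d) % n, rng.randrange(n)
                if new != i and new not in adj[i]:
                    adj[i].discard(j); adj[j].discard(i)
                    adj[i].add(new); adj[new].add(i)
    return adj

def mean_path_length(adj):
    """Average shortest-path length over reachable pairs (breadth-first search)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

if __name__ == "__main__":
    for p in (0.0, 0.05, 1.0):     # regular ring, small world, near-random
        print(p, round(mean_path_length(small_world(p=p)), 2))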
Agents
Many of the described tools are used in so-called agent-based modeling. The idea is that of exploiting the powerful capabilities of present computers to simulate directly a large number of agents that interact among themselves. Traditional investigations of complex systems, like crowds, flocks, traffic, and urban models, have been performed using homogeneous representations: partial differential equations (i.e., mean field), Markov equations, cellular automata, etc. In such an approach, it is supposed that each agent type is present in many identical copies, and therefore agents are simulated as macrovariables (cellular automata) or aggregated like independent random walkers in the diffusion equation. But living elements (cells, organisms) do not behave in such a way: they are often individually unique, carry information about their own past history, and so on. With computers, we are now in the position of simulating large assemblies of individuals, possibly geographically located, like, for instance, humans in an urban simulation.
One of the advantages of such an approach is that of offering the possibility of measuring quantities that are inaccessible to field researchers and also of playing with different scenarios. The disadvantage is the proliferation of parameters, which are often beyond experimental confirmation.
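A schematic example of what "agent-based" means in practice: each agent below carries its own mobility and its own memory of visited sites, quantities that a mean-field description would average away; the model itself (random walkers on a ring) is a deliberately trivial assumption made only for illustration.

import random

class Agent:
    """A minimal agent: unlike a mean-field variable, each one carries its
    own position, its own heterogeneous parameter, and its own history."""
    def __init__(self, rng):
        self.x = rng.randrange(100)
        self.mobility = rng.uniform(0.1, 1.0)    # individual heterogeneity
        self.visited = set()

    def step(self, rng):
        if rng.random() < self.mobility:
            self.x = (self.x + rng.choice((-1, 1))) % 100
        self.visited.add(self.x)                 # private memory of the past

if __name__ == "__main__":
    rng = random.Random(0)
    agents = [Agent(rng) for _ in range(1000)]
    for _ in range(500):
        for a in agents:
            a.step(rng)
    # a quantity easy to measure in silico, hard to measure in the field:
    print("mean number of distinct sites visited:",
          sum(len(a.visited) for a in agents) / len(agents))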
Artificial Worlds
A somewhat alternative approach to that of traditional computational physics is that of studying an artificial model, built with little or no direct connection with reality, trying to include only those aspects that are considered relevant. The goal is to find the simplest system still able to exhibit the relevant features of the phenomena under investigation. The resulting models, though not directly applicable to the interpretation of experiments, may serve as interpretative tools in many different situations. For instance, the Ising model was developed in the context of the modeling of magnetic systems but has been applied to opinion formation, social simulations, etc.

Ising Model
The Ising (better: Lenz-Ising) model is probably one of the best-known models in statistical physics. Its history (Niss 2005) is particularly illuminating in this context, even if it took place well before the advent of computers in physics. It is also a model for which the Monte Carlo and simulated annealing techniques are readily applied.
Let us first illustrate the model schematically. I shall present the traditional version, with the terminology that arises from the physics of magnetic systems. However, it is an interesting exercise to reformulate it in the context, for instance, of opinion formation. Let us simply replace "spin up/down" with "opinion A/B," "magnetization" with "average opinion," "coupling" with "exchange of ideas," "external magnetic field" with "propaganda," and so on.
The Ising model is defined on a lattice that can be in one, two, or more dimensions or even on a disordered graph. We shall locate a cell with an index i, corresponding to the set of spatial coordinates for a regular lattice or a label for a graph. The dynamical variable x_i for each cell is just a binary digit, traditionally named "spin," that takes the values ±1. We shall indicate the whole configuration as x. Therefore, a lattice with N cells has 2^N distinct configurations. Each configuration x has an associated energy

$$E(x) = -\sum_i (H + h_i)\, x_i,$$

where H represents the external magnetic field and h_i is a local magnetic field, generated by neighboring spins, $h_i = \sum_j J_{ij} x_j$. The coupling J_ij for the original Lenz-Ising model is one if i and j are nearest neighbors, and zero otherwise.
The maximum-entropy principle (Jaynes 1957) gives the probability distribution

$$P(x) = \frac{1}{Z} \exp\!\left(-\frac{E(x)}{T}\right)$$

from which averages can be computed. The parameter T is the temperature, and Z, the "partition function," is the normalization factor of the distribution.
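A minimal sketch of single-spin-flip Metropolis sampling of P(x) for the two-dimensional Lenz-Ising model with nearest-neighbor coupling J = 1 and H = 0; lattice size, temperatures, and number of sweeps are illustrative assumptions.

import math, random

def ising_metropolis(L=32, T=2.0, sweeps=200, seed=0):
    """Single-spin-flip Metropolis sampling of P(x) ~ exp(-E(x)/T) for the
    two-dimensional Lenz-Ising model with J = 1, H = 0 and periodic
    boundaries. Returns the average absolute magnetization per spin."""
    rng = random.Random(seed)
    spin = [[rng.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
    m_acc = 0.0
    for _ in range(sweeps):
        for _ in range(L * L):
            i, j = rng.randrange(L), rng.randrange(L)
            # local field felt by spin (i, j)
            h = (spin[(i + 1) % L][j] + spin[(i - 1) % L][j]
                 + spin[i][(j + 1) % L] + spin[i][(j - 1) % L])
            dE = 2 * spin[i][j] * h          # energy change of a single flip
            if dE <= 0 or rng.random() < math.exp(-dE / T):
                spin[i][j] = -spin[i][j]
        m_acc += abs(sum(map(sum, spin))) / (L * L)
    return m_acc / sweeps

if __name__ == "__main__":
    # below, near, and above the critical temperature (about 2.27 for J = 1)
    for T in (1.5, 2.27, 3.5):
        print(T, round(ising_metropolis(T=T), 3))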
The quantity E(x) can be thought of as a landscape, with low-energy configurations corresponding to valleys and high-energy ones to peaks. The distribution P(x) can be interpreted as the density of a gas, each "particle" corresponding to a possible realization (a replica) of the system. This gas concentrates in the valleys at low temperatures and diffuses if the temperature is increased. The temperature is related to the average level of the gas.
In the absence of the local field (J = 0), the energy is minimized if each x_i is aligned (same sign) with H. This ordering is counteracted by thermal noise. In this case, it is quite easy to obtain the average magnetization per spin (order parameter)

$$\langle x \rangle = \tanh(H/T),$$

which is a plausible behavior for a paramagnet. A ferromagnet, however, presents hysteresis; i.e., it may maintain for long times (metastability) a pre-existing magnetization opposed to the external magnetic field.
With the coupling turned on (J > 0), it may happen that the local field is strong enough to "resist" H, i.e., a compact patch of spins oriented against H may be stable, even if the energy could be lowered by flipping all of them, because the flip of a single spin would raise the energy (actually, this flip may happen but is statistically reabsorbed in a short time). The fate of the patch is governed by its boundaries. A spin on the boundary of a patch feels a weaker local field, since some of its neighbors are oriented in the opposite direction. Straight boundaries in two or more dimensions separate spins that "know" the phase they belong to (since most of their neighbors are in that phase); the spins on the edges may flip more freely. Stripes that span the whole lattice are rather stable objects and may resist an opposite external field, since spins that occasionally flip are surrounded by spins belonging to the opposite phase and therefore feel a strong local field that pushes them towards the phase opposed to the external field.
In one dimension with finite-range coupling, a single spin flip is able to create a "stripe" (perpendicular to the lattice dimension) and therefore can destabilize the ordered phase. This is the main reason for the absence of phase transitions in one dimension, unless the coupling extends over very large distances or some coupling is infinite (see the part on directed percolation, section "Probabilistic Cellular Automata").
This model was proposed in the early 1920s by Lenz to Ising for his Ph.D. dissertation as a simple model of a ferromagnet. Ising studied it in one dimension, found that it shows no phase transition, and concluded (erroneously) that the same happened in higher dimensions. Most of his contemporaries rejected the model since it was not based on Heisenberg's quantum mechanical model of ferromagnetic interactions. It was only in the forties that it started gaining popularity as a model of cooperative phenomena, a prototype of order-disorder transitions. Finally, in 1944, Onsager (1944) provided the exact solution of the two-dimensional Lenz-Ising model in zero external field. It was the first (and for many years the only) model exhibiting a nontrivial second-order transition whose behavior could be exactly computed.
Second-order transitions have interested physicists for almost all of the past century. In the vicinity of such transitions, the elements (say, spins) of the system are correlated up to very large distances. For instance, in the Lenz-Ising model (with coupling and more than one dimension), the high-temperature phase is disordered, and the low-temperature phase is almost completely ordered. In both these phases, the connected two-point correlation function

$$G_c(r) = \langle x_i x_{i+r} \rangle - \langle x_i \rangle^2$$

decreases exponentially, $G_c(r) \simeq \exp(-r/\xi)$, with r = |r|. The length ξ is a measure of the typical size of patches of spins pointing in the same direction. Near the critical temperature T_c, the correlation length ξ diverges like $\xi \simeq |T - T_c|^{-\nu}$ (ν = 1 for d = 2 and ν ≃ 0.627 for d = 3, where d is the dimensionality of the system). In such a case the correlation function is described by a power law

$$G_c(r) \simeq r^{-(d-2+\eta)}$$

with η = 1/4 for d = 2 and η ≃ 0.024 for d = 3. This phase transition is an example of a critical phenomenon (Binney et al. 1993); ν and η are examples of critical exponents.
The divergence of the correlation length indicates that there is no characteristic scale (ξ), and therefore fluctuations of all sizes appear. In this case, the details of the interactions are not so important, so that many different models behave in the same way as far as, for instance, the critical exponents are concerned. Therefore, models can be grouped into universality classes, whose details are essentially given by "robust" characteristics like the dimensionality of space and of the order parameter, the symmetries, etc.
The power-law behavior of the correlation function also indicates that if we perform a rescaling of the system, it would appear the same or, conversely, that one is unable to estimate the distance of a pattern by comparing the "typical size" of its details. This scale invariance is typical of many natural phenomena, from clouds (whose height and size are hard to estimate), to trees and other plant elements, lungs, brain, etc. Many examples of power laws and collective behavior can be found in the natural sciences (Sornette 2006). Differently from what happens in the Lenz-Ising model, in these cases there is no parameter (like the temperature) that has to be fine-tuned, so one speaks of self-organized criticality (Bak et al. 1987).
Since the Lenz-Ising model is so simple, exhibits a critical phase, and can be exactly solved (in some cases), it has become the playground for a variety of modifications and applications to various fields. Clearly, most of the modifications do not allow analytical treatment and have to be investigated numerically. The Monte Carlo method allows one to add a temporal dimension to a statistical model (Kawasaki 1972), i.e., to transform stochastic integrals into averages over fictitious trajectories. Needless to say, the Lenz-Ising model is the standard test for every Monte Carlo beginner, and most of the techniques for accelerating the convergence of averages have been developed with this model in mind (Swendsen and Wang 1987).
Near a second-order phase transition, a physical system exhibits critical slowing down, i.e., it reacts to external perturbations with an extremely
slow dynamics, with a convergence time that increases with the system size. One can extend the definition of the correlation function to include the time dimension: in the critical phase the temporal correlation length also diverges (as a power law). This happens also for the Lenz-Ising model under the Monte Carlo dynamics, unless very special techniques are used (Swendsen and Wang 1987). Therefore, the dynamical version of the Lenz-Ising model can also be used to investigate relaxational dynamics and how this is influenced by the characteristics of the energy landscape. In particular, if the coupling J_ij changes sign randomly for every pair of sites (or the field H has a random sign on each site), the energy landscape becomes extremely rugged. When spins flip in order to align with the local field, they may invert the field felt by neighboring ones. This frustration effect is believed to be the basic mechanism behind the large viscosity and memory effects of glassy substances (Dotsenko 1994; Mezard et al. 1987).
The rough energy landscape of glassy systems is also challenging for optimization methods, like simulated annealing (Kirkpatrick et al. 1983) and its "improved" cousin, simulated tempering (Marinari and Parisi 1992). Again, the Lenz-Ising model is the natural playground for these algorithms.
The dynamical Lenz-Ising model can be formulated such that each spin is updated in parallel (Barkema and MacFarland 1994) (with the precaution of dividing cells into sublattices, in order to keep the neighborhood of each cell fixed during updates). In this way, it is essentially a probabilistic cellular automaton, as illustrated in section "Probabilistic Cellular Automata."

Cellular Automata
In the same period in which traditional computation was developed, in the early 1950s, John Von Neumann was interested in the logical basis of life and in particular in self-reproduction, and since the analysis of a self-reproducing automaton following the rules of real physics was too difficult, he designed a playground (a cellular automaton) with just enough "physical rules" to make its analysis possible (von Neumann and Burks 1966). It was however just a theoretical exercise; the automaton was so huge that up to now it has not yet been completely implemented (Von Neumann universal constructor 2008).
The idea of cellular automata is quite simple: take a lattice (or a graph) and put on each cell an automaton (all automata are equal). Each automaton exhibits its "state" (which is one out of a small number) and is programmed to react (change state) according to the states of its neighbors and its present one (the evolution rule). All automata update their state synchronously.
Cellular automata share many similarities with the parallel version of the Lenz-Ising model. Differently from that model, their dynamics is not derived from an energy but is defined in terms of the transition rules. These rules may be deterministic or probabilistic. In the first case (illustrated in this section), cellular automata are fully discrete, extended dynamical systems. Probabilistic cellular automata are illustrated in section "Probabilistic Cellular Automata." The temporal evolution of deterministic cellular automata can be computed exactly (regardless of any approximation) on a standard computer.
Let us illustrate the simplest case, elementary cellular automata, in Wolfram's jargon (1983). The lattice here is one-dimensional, so to identify an automaton it is sufficient to give one coordinate, say i, with i = 1, . . ., N. The state of the automaton on cell i at time t is represented by a single variable, x_i(t), that can take only two values, "dead/live" or "inactive/active" or 0/1. Time is also discrete, so t = 1, 2, . . .
The parallel evolution of each automaton is given by the rule

$$x_i(t+1) = f\big(x_{i-1}(t), x_i(t), x_{i+1}(t)\big).$$

Since x_i = 0, 1, there are only eight possible combinations of the triple {x_{i-1}(t), x_i(t), x_{i+1}(t)}, from {0,0,0} to {1,1,1}. For each of them, f(x_{i-1}(t), x_i(t), x_{i+1}(t)) is either zero or one, so the function f can simply be coded as a vector of eight bits, each position labeled by a different configuration of inputs. Therefore, there are only 2^8 = 256 different elementary cellular automata, and they have been studied carefully (see, for instance, Wolfram (1983)).
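A minimal sketch of an elementary cellular automaton update, where the rule is specified by its Wolfram code (0-255); the choice of rule 110 and of a single-seed initial condition is only illustrative.

def eca_step(cells, rule):
    """One synchronous update of an elementary cellular automaton.
    Bit (4*left + 2*center + right) of the Wolfram code `rule` gives the
    new state of each cell; periodic boundaries are assumed."""
    n = len(cells)
    return [(rule >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
            for i in range(n)]

if __name__ == "__main__":
    cells = [0] * 31
    cells[15] = 1                          # a single active cell
    for _ in range(16):                    # print a short space-time diagram
        print("".join(".#"[c] for c in cells))
        cells = eca_step(cells, rule=110)  # rule 110, a "complex" (class-4) rule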
In spite of their simplicity, elementary cellular automata exhibit a large variety of behaviors. In the language of dynamical systems, they can be "visually" classified (Wolfram 1983) as fixed points (class 1), limit cycles (class 2), and "chaotic" oscillations (class 3). A fourth class, "complex" CA, exhibits areas of repetitive or stable configurations with structures that interact with each other in complicated ways. A classic example is the Game of Life (Berlekamp et al. 1982). This two-dimensional cellular automaton is based on a simple rule. A cell may be either dead (0) or alive (1). A living cell survives if, among its 8 nearest neighbors, there are two or three alive cells; otherwise it dies and disappears. Generation is implemented through a rule for empty cells: they may become alive if surrounded by exactly three living cells. In spite of the simplicity of the rule, this automaton generates complex and long-living patterns, some of them illustrated in Fig. 2.

Interaction-Based Computing in Physics, Fig. 2 Some of the most common "animals" in the game of life, with the probability of encountering them in an asymptotic configuration (Bagnoli et al. 1991)
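A minimal sketch of the Game of Life rule just described, on a small periodic lattice; the "glider" initial condition is an illustrative choice.

def life_step(grid):
    """One synchronous step of the Game of Life on a periodic lattice:
    a live cell survives with 2 or 3 live neighbors, an empty cell is
    born with exactly 3 live neighbors."""
    n, m = len(grid), len(grid[0])
    new = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            alive = sum(grid[(i + di) % n][(j + dj) % m]
                        for di in (-1, 0, 1) for dj in (-1, 0, 1)
                        if (di, dj) != (0, 0))
            new[i][j] = 1 if alive == 3 or (grid[i][j] and alive == 2) else 0
    return new

if __name__ == "__main__":
    grid = [[0] * 10 for _ in range(10)]
    for i, j in ((1, 2), (2, 3), (3, 1), (3, 2), (3, 3)):   # a "glider"
        grid[i][j] = 1
    for _ in range(8):
        grid = life_step(grid)
    print("\n".join("".join(".#"[c] for c in row) for row in grid))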
Complex CA have large transients, during which interesting structures may emerge. They finally relax into class-1 behavior. It has been conjectured that they are capable of computation, i.e., that one can "design" a universal computer using these CA as building blocks, as has been proved to be possible with the Game of Life. Another hypothesis, again confirmed by the Game of Life, is that these automata are "near the edge" of self-organizing complexity. One can slightly "randomize" the Game of Life, allowing sometimes an exception to the rule. Let us introduce a parameter p that measures this randomness, with the assumption that p = 0 is the true "life." It was shown (Nordfalk and Alstrøm 1996) that the resulting model exhibits a second-order phase transition for a value of p very near zero.
Deterministic cellular automata have been investigated as prototypes of discrete dynamical systems, in particular for what concerns the definition of chaos. Visually, one is tempted to use this word also to denote the irregular behavior of "class-3" rules. However, the usual definition of chaos involves the sensitivity to an infinitesimally small perturbation: following the time dynamics of two initially close configurations, one can observe an amplification of their distance. If the initial distance (d_0) is infinitesimal, then the distance grows exponentially for some time, $d(t) \simeq d_0 \exp(\lambda t)$, after which it tends to saturate, since the trajectories are generally bounded inside an attractor or by the dimensions of the accessible space. The exponent λ depends on the initial configuration, and if this behavior is observed over different portions of the trajectory, it fluctuates: a trajectory spends some time in regions of high chaoticity, after which it may pass through "quiet" zones. If one periodically "renormalizes" this distance, considering one system as the "master" and the other as a measuring device, one can accumulate good statistics and define a Lyapunov exponent λ that gives indications about the chaoticity of the trajectory, through a limiting procedure.
The accuracy of computation poses some problems. Since numbers in a computer are always approximate, one cannot follow just "one" trajectory. The small approximations accumulate exponentially, and the computed time series actually jumps among neighboring trajectories. Since the Lyapunov exponent is generally not very sensitive to a change of precision in the computation, one can assume that the chaotic regions are rather compact and uniform, so that in general one associates a Lyapunov exponent with a system, not with an individual trajectory. Nevertheless, this definition cannot be applied to completely discrete systems like cellular automata.
In any case, chaoticity is related to unpredictability. As first observed by Lorenz, and following the definition of the Lyapunov exponent, the precision of an observation over a chaotic system is related to the average time for which predictions are possible. As in weather forecasts, in order to increase the time span of a prediction, one has to increase the precision of the initial measurement. In extended systems, this also implies extending the measurements over a larger area. One can also consider a "synchronization" approach. Take two replicas of a system and let them evolve starting from different initial configurations. With a frequency and a strength that depend on a parameter q, one of these replicas is "pushed" towards the other one, so as to reduce their distance. Suppose that q = 0 is the case of no push and q = 1 is the case of extremal push, for which the two systems synchronize in a very short time. There should be a critical value q_c that separates these two behaviors (actually, the scenario may be more complex, with many phases (Bagnoli and Cecconi 2001)). In the vicinity of q_c, the distance between the two replicas is small, and the distance d grows exponentially. The critical value q_c is such that the exponential growth compensates the shrinking factor and is therefore related to the Lyapunov exponent λ.
Finite-size cellular automata always follow periodic trajectories. Let us consider, for instance, Boolean automata of N cells. The number of possible different states is 2^N and, due to determinism, once a state has been visited twice the automaton has entered a limit cycle (or a fixed point). One may have limit cycles with large basins of transient configurations (configurations that do not belong to the cycle). Many scenarios are possible. The set of different configurations may be divided into many basins, of small size (small transient) and small period, as in class-1 and class-2 automata. Or one may have large basins, with long transients that lead to short cycles, as in class-4 automata. Finally, one may have one or very few large basins, with long cycles that include most of the configurations belonging to the basin (small transients). This is the case of class-3 automata. For them, the typical period of a limit cycle grows exponentially with the system size, like the total number of configurations, so that for moderately large systems it is almost impossible to observe a whole cycle in a finite time. Another common characteristic of class-3 automata is that the configurations quickly decorrelate (in the sense of the correlation function) along a trajectory. If one takes as starting points two configurations that are the same except for a local difference, one observes that this difference amplifies and diffuses in class-3 automata, shrinks or remains limited in class 1 and class 2, and has an erratic transient behavior in class 4, followed by the fate of class 1 and class 2. Therefore, if one considers the possibility of not knowing exactly the initial configuration of an automaton, unpredictability grows with time also for such discrete systems. Actually, also (partially) continuous systems like coupled maps may exhibit this kind of behavior (Bagnoli and Cecconi 2001; Cecconi et al. 1998; Crutchfield and Kaneko 1988; Politi et al. 1993). Along this line, it is possible to define an equivalent of the Lyapunov exponents for CA (Bagnoli et al. 1992). The synchronization procedure can also be applied to cellular automata, and it correlates well with the Lyapunov exponents (Bagnoli and Rechtman 1999).
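A sketch of the damage-spreading idea for cellular automata: two replicas differing in a single cell are evolved with the same rule, and the fraction of differing cells is recorded; the rules compared and the lattice size are illustrative assumptions.

import random

def eca_step(cells, rule):
    # same elementary-automaton update as in the earlier sketch
    n = len(cells)
    return [(rule >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
            for i in range(n)]

def damage_spreading(rule, n=200, steps=100, seed=0):
    """Evolve two replicas that differ initially in a single cell and
    record the fraction of differing cells (the 'damage') at each step."""
    rng = random.Random(seed)
    a = [rng.randint(0, 1) for _ in range(n)]
    b = list(a)
    b[n // 2] ^= 1                          # a single local difference
    damage = []
    for _ in range(steps):
        a, b = eca_step(a, rule), eca_step(b, rule)
        damage.append(sum(x != y for x, y in zip(a, b)) / n)
    return damage

if __name__ == "__main__":
    # "chaotic" (class-3) rules tend to spread the damage, others do not
    for rule in (90, 150, 232):
        print(rule, round(damage_spreading(rule)[-1], 3))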
An "industrial" application of cellular automata is their use for modeling gases. The hydrodynamical equations, like the Navier-Stokes ones, simply reflect the conservation of mass, momentum, and energy (i.e., rotational, translational, and time invariance) of the microscopic collision rules among particles. Since the modeling of a gas via molecular dynamics is rather cumbersome, some years ago it was proposed (Frisch et al. 1986; Hardy et al. 1973) to simplify drastically the microscopic dynamics, using particles that may travel only along certain directions, with a few discrete velocities, jumping in discrete time among the nodes of a lattice: indeed, a cellular automaton. It has been shown that their macroscopic dynamics is described by the usual hydrodynamic laws (with some odd features related to the underlying lattice and the finiteness of velocities) (Rothman and Zaleski 2004; Wolf-Gladrow 2004).
The hope was that these Lattice Gas Cellular Automata (LGCA) could be simulated so efficiently in hardware as to make possible the investigation of turbulence, or, in other words, that they could constitute the Ising model of hydrodynamics. While they are indeed useful for investigating certain properties of gases (for instance, chemical reactions (Lawniczak et al. 1991), or the relationship between chaoticity and equilibrium (Bagnoli and Rechtman 2009)), they turned out to be too noisy and too viscous to be useful for the investigation of turbulence. Viscosity is related to the transport of momentum in a direction perpendicular to the momentum itself. If the collision rule does not "spread" the particles quickly, the viscosity is large. In LGCA there are many limitations on collisions, so that in order to lower the viscosity one has to consider averages over large patches, thus lowering the efficiency of the method.
However, LGCA inspired a very interesting approximation. Let us consider a large assembly of replicas of the same system, each one starting from a different initial configuration, all compatible with the same macroscopic initial conditions. The macroscopic behavior after a certain time would be the average over the status of all these replicas. If one assumes a form of local equilibrium, i.e., applies the mean-field approximation for a given site, one may try to obtain the dynamics of the average distribution of particles, which in principle is the same as "exchanging" particles that happen to stay on the same node among replicas. It is possible to express the dynamics of the average distribution in a simple form: it is the Lattice Boltzmann Equation (LBE) (Chopard et al. 2002; Succi 2001; Wolf-Gladrow 2004). The method retains many properties of LGCA, like the possibility of considering irregular and varying boundaries, and may be simulated very efficiently on parallel machines (Succi 2001). Differently from LGCA, there are numerical stability problems to be overcome.

Probabilistic Cellular Automata
In deterministic automata, given a local configuration, the future state of a cell is univocally determined. However, let us consider the case of measuring experimentally some pattern and trying to analyze it in terms of cellular automata. In time-series analysis, it is common to perform averages over spatial patches and temporal intervals and to discretize the resulting value. For instance, this is the natural result of using a camera to record the temporal evolution of an extended system, for instance the turbulent and laminar regions of a fluid. The resulting pattern symbolically represents the dynamics of the original system, and if it were possible to extract a "rule" out of this pattern, it would be extremely interesting for the construction of a model. In general, however, one observes that sometimes a local configuration is followed by a symbol, and sometimes the same local configuration is followed by another one. One should conclude that the neighborhood (the local configuration) does not univocally determine the following symbol.
One can extend the "range" of the rule, adding more neighbors farther away in space and time (Rabiner 1989). By doing so, the "conflicts" generally reduce, but at the price of increasing the complexity of the rule. At the extreme, one could have an automaton with infinite "memory" in time and space that perfectly reproduces the observed patterns but has almost no predictive power, since it is extremely unlikely that the same huge local configuration will be encountered again.
So, one may prefer to limit the neighborhood to some finite extension and accept that the rule sometimes "outputs" one symbol and sometimes another. One defines a local transition
probability $\tau(x_i(t+1) \mid X_i(t))$ of obtaining a certain symbol x_i at time t + 1 given a local configuration X_i at time t. Deterministic cellular automata correspond to the case in which τ is either zero or one. The parallel version of the Lenz-Ising model can be reinterpreted as a probabilistic cellular automaton.
Starting from the local transition probabilities, one can build up the transition probability T(x|y) of obtaining a configuration x given a configuration y. T(x|y) is given by the product of the local transition probabilities τ (Bagnoli 2000). One can read the configurations x and y as indexes, so that T can be considered a matrix. The normalization of probability corresponds to the constraint $\sum_x T(x|y) = 1$ for all y.
Denoting by P(x, t) the probability of observing a given configuration x at time t, and by P(t) the whole distribution at time t, we have for the evolution of the distribution

$$P(t+1) = T\, P(t),$$

with the usual rules for the product of matrices and vectors. Therefore, the transition matrix T defines a Markov process, and the asymptotic state of the system is given by the eigenvalues and eigenvectors of T. The largest eigenvalue is always 1, due to the normalization of the probability distribution, and the corresponding eigenvector is the asymptotic distribution. The theory of Markov chains says that if T is finite and irreducible, i.e., it cannot be rewritten (by renumbering rows and columns) as blocks of noninteracting subspaces, then the second eigenvalue is strictly less than one and the asymptotic state is unique. In this case, the second eigenvalue determines the convergence time to the asymptotic state. For finite systems, the matrix T is often irreducible. However, in the limit of infinite size, the largest eigenvalue may become degenerate, and therefore there may be more than one asymptotic state. This is the equivalent of a phase transition for Markov processes.
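For a very small lattice, the transition matrix T can be built explicitly and iterated; the sketch below assumes an arbitrary noisy-majority local rule τ (an illustrative choice) and obtains the asymptotic distribution by simple power iteration of P(t + 1) = T P(t).

def build_transition_matrix(n, tau):
    """T[x][y] = probability of configuration x given y, as a product of
    local transition probabilities tau(new_bit | neighborhood)."""
    size = 1 << n
    bits = lambda c: [(c >> i) & 1 for i in range(n)]
    T = [[0.0] * size for _ in range(size)]
    for y in range(size):
        b = bits(y)
        for x in range(size):
            prob = 1.0
            for i, new in enumerate(bits(x)):
                neigh = (b[(i - 1) % n], b[i], b[(i + 1) % n])
                prob *= tau(new, neigh)
            T[x][y] = prob
    return T

def stationary(T, steps=2000):
    """Power iteration: apply P(t+1) = T P(t) until the distribution settles."""
    size = len(T)
    P = [1.0 / size] * size
    for _ in range(steps):
        P = [sum(T[x][y] * P[y] for y in range(size)) for x in range(size)]
    return P

if __name__ == "__main__":
    eps = 0.1                    # noise level of the illustrative rule
    def tau(new, neigh):         # noisy majority rule on three neighbors
        maj = 1 if sum(neigh) >= 2 else 0
        return 1 - eps if new == maj else eps
    P = stationary(build_transition_matrix(4, tau))
    print("normalization:", round(sum(P), 6))
    print("most probable configurations:",
          sorted(range(len(P)), key=P.__getitem__, reverse=True)[:2])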
For the parallel Lenz-Ising model, the elements of the matrix T are given by the product of the local transition rules of the Monte Carlo dynamics. They depend on the choice of the algorithm but essentially have the form of exponentials of the difference in energy, divided by the temperature. Although a definitive proof is still missing, it is plausible that matrices with all elements different from zero correspond to some equilibrium model, whose transition rules can be derived from an energy function (Georges and le Doussal 1989).
Since probabilistic cellular automata are defined in terms of the transition probabilities, one is free to investigate models that go beyond equilibrium. For instance, if some transition probability takes the value zero or one, in the language of equilibrium systems this would correspond to some coupling (like the coupling factor J of the Lenz-Ising model) that becomes infinite. This case is not so uncommon in modeling. The inverse of a transition probability corresponds to the average waiting time for the transition to occur in a continuous-time model (one may think of chemical reactions). Some transitions may have a waiting time so long with respect to the observation interval as to be practically irreversible. Therefore, probabilistic cellular automata (alongside other approaches like, for instance, annihilating random walks) allow the exploration of out-of-equilibrium phenomena.
One such phenomenon is directed percolation, i.e., a percolation process with a special direction (time) along which links can only be crossed one way (Broadbent and Hammersley 1957). Let us think, for instance, of the spreading of an infection on a one-dimensional lattice, with immediate recovery (SIS model). An ill individual can infect one or both of his two neighbors but returns to the susceptible state after one step. The paths of infection (see Fig. 3) can wander in the space direction but are directed in the time direction.
The parallel version of a directed percolation process can be mapped onto probabilistic cellular automata. The simplest case, in one spatial dimension and with just two neighbors, is called the Domany-Kinzel model (Domany and Kinzel 1984). It is even more general than usual directed percolation, allowing "nonlinear" interactions among sites in the neighborhood (e.g., two wet sites may have less probability of percolating than one alone).
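A minimal simulation of the Domany-Kinzel model under one common convention (each site looks at itself and its right neighbor); the quoted location of the transition along the line p1 = p2 is an approximate assumption, not a result derived here.

import random

def domany_kinzel(p1, p2, n=500, steps=400, seed=0):
    """Domany-Kinzel probabilistic cellular automaton: a site becomes wet
    with probability p1 if exactly one of its two predecessors is wet and
    p2 if both are wet (never if none). Returns the final wet density."""
    rng = random.Random(seed)
    state = [1] * n                        # fully wet initial condition
    for _ in range(steps):
        new = []
        for i in range(n):
            wet = state[i] + state[(i + 1) % n]
            p = 0.0 if wet == 0 else (p1 if wet == 1 else p2)
            new.append(1 if rng.random() < p else 0)
        state = new
    return sum(state) / n

if __name__ == "__main__":
    # along the line p2 = p1 the model behaves like directed (site)
    # percolation; the transition to the absorbing, all-dry state is
    # commonly located near p1 of roughly 0.70 for this convention
    for p in (0.6, 0.7, 0.8):
        print(p, domany_kinzel(p, p))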
These processes are interesting because there is an absorbing state (Bagnoli et al. 2001; Hinrichsen 1997), which is the dry state for the wetting phenomenon and the healthy state for the spreading of an infection.
This is sometimes considered a weakness of the whole idea of studying such systems from a quantitative point of view. We have tried to show that actually even the "hardest" discipline, physics, always deals with models that have finally to be simulated on computers, making various assumptions and approximations. Theoretical physics has long been accustomed to extreme simplifications of models, hoping to enucleate the fundamental ingredients of a complex behavior. This approach has proved to be quite rewarding for our understanding of nature.
In recent years, physicists have been studying many fields not traditionally associated with physics: molecular biology, ecology, evolution theory, neurosciences, psychology, sociology, linguistics, and so on. Actually, the word "physics" may refer either to the classical subjects of study (mainly atomic and subatomic phenomena, structure of matter, cosmological and astronomical topics) or to the "spirit" of the investigation, which may apply to almost any discipline. This spirit is essentially that of building simplified quantitative models, composed of many elements, and studying them with theoretical instruments (most of the time, applying some form of mean-field treatment) and with computer simulations.
This approach has been fruitful in chemistry and molecular biology, and nowadays many physics journals have sections devoted to multidisciplinary studies. The interesting thing is that not only have physicists brought some mathematics into fields that are traditionally more qualitative (which often corresponds to linear modeling, plus noise), but they have also discovered many interesting questions to be investigated and new models to be studied. One example is given by the current investigations about the structure of social networks, which were "discovered" by physicists in the nontraditional field of social studies.
Another contribution of physicists to this "new way" of performing investigations is the use of networked computers. For a long time, physicists have used computers for performing computations, storing data, and diffusing information using the Internet. Actually, the concept of what is now the World Wide Web was born at CERN, as a method for sharing information among laboratories (Cailliau 1995). The high-energy physics experiments require a lot of simulations and data processing, and physicists (among others) developed protocols to distribute this load on a grid of networked computers. Nowadays, a European project aims to "open" grid computing to other sciences (European Grid Infrastructure).
It is expected that this combination of quantitative modeling and grid computing will stimulate innovative studies in many fields. Here is a small, sparse list of possible candidates:

• Theory of evolution, especially for what concerns evolutionary medicine
• Social epidemiology, coevolution of diseases and human populations, interplay between sociology and epidemics
• Molecular biology and drug design, again driven by medical applications
• Psychology and neural sciences: it is expected that the "black box" of traditional psychology and psychiatry will be replaced by explicit models based on brain studies
• Industrial and material design
• Earth sciences, especially meteorology, volcanology, and seismology
• Archaeology: simulation of ancient societies, reconstruction of historical and prehistorical climates

Nowadays, the term cellular automata has enlarged its meaning, including any system whose elements do not move (in opposition to agent-based modeling). Therefore, we now have cellular automata on nonregular lattices, nonhomogeneous ones, with probabilistic dynamics (see section "Probabilistic Cellular Automata"), etc. (El Yacouby et al. 2006). They are therefore considered more as a "philosophy" of modeling than as a single tool. In some sense, cellular automata (and agent-based) modeling is opposed to the spirit of describing a phenomenon using differential equations (or partial differential equations). One of the reasons is that the language of automata and agents is simpler and requires less training than that of differential equations. Another reason is that, in the end, any reasonable problem has to be investigated using computers, and while the implementation using discrete elements is straightforward (even if careful planning may speed up the simulation dramatically), the computation of partial differential equations is an art in itself.
However, the final success of this approach is related to the availability of high-quality experimental data that allow one to discriminate among the almost infinite number of models that can be built.
Bibliography

Primary Literature
Albert R, Barabasi AL (2002) Statistical mechanics of complex networks. Rev Mod Phys 74:47–97
Bagnoli F, Rechtman R, Ruffo S (1991) Some facts of life. Physica A 171:249–264
Bagnoli F (2000) Cellular automata. In: Bagnoli F, Ruffo S (eds) Dynamical modeling in biotechnologies. World Scientific, Singapore, p 1
Bagnoli F, Cecconi F (2001) Synchronization of nonchaotic dynamical systems. Phys Lett A 282(1–2):9–17
Bagnoli F, Rechtman R (1999) Synchronization and maximum Lyapunov exponents of cellular automata. Phys Rev E 59(2):R1307–R1310
Bagnoli F, Rechtman R (2009) Thermodynamic entropy and chaos in a discrete hydrodynamical system. Phys Rev E 79:041115
Bagnoli F, Rechtman R, Ruffo S (1992) Damage spreading and Lyapunov exponents in cellular automata. Phys Lett A 172:34
Bagnoli F, Boccara N, Rechtman R (2001) Nature of phase transitions in a probabilistic cellular automaton with two absorbing states. Phys Rev E 63(4):046116
Bak P, Tang C, Weisenfeld K (1987) Self-organizing criticality: an explanation of 1/f noise. Phys Rev A 38:364–374
Barkema GT, MacFarland T (1994) Parallel simulation of the Ising model. Phys Rev E 50(2):1623–1628
Berlekamp E, Conway J, Guy R (1982) What is life? Games in particular, vol 2. Academic, London, chap 25
Binney J, Dowrick N, Fisher A, Newman MEJ (1993) The theory of critical phenomena. Oxford Science/Clarendon Press, Oxford
Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang DU (2006) Complex networks: structure and dynamics. Phys Rep 424(4–5):175–308
Broadbent S, Hammersley J (1957) Percolation processes I. Crystals and mazes. Proc Camb Philos Soc 53:629–641
Cailliau R (1995) A short history of the web. http://www.netvalley.com/archives/mirrors/robert_cailliau_speech.htm. Accessed 10 Apr 2017
Car R, Parrinello M (1985) Unified approach for molecular dynamics and density-functional theory. Phys Rev Lett 55(22):2471–2474
Cecconi F, Livi R, Politi A (1998) Fuzzy transition region in a one-dimensional coupled-stable-map lattice. Phys Rev E 57(3):2703–2712
Chopard B, Luthi P, Masselot A, Dupuis A (2002) Cellular automata and lattice Boltzmann techniques: an approach to model and simulate complex systems. Adv Complex Syst 5(2):103–246
Crutchfield J, Kaneko K (1988) Are attractors relevant to turbulence? Phys Rev Lett 60(26):2715–2718
Daxois T, Peyrard M, Ruffo S (2005) The Fermi-Pasta-Ulam 'numerical experiment': history and pedagogical perspectives. Eur J Phys 26:S3–S11
Domany E, Kinzel W (1984) Equivalence of cellular automata to Ising models and directed percolation. Phys Rev Lett 53(4):311–314
Dotsenko V (1994) An introduction to the theory of spin glasses and neural networks. World Scientific, Singapore
El Yacouby S, Chopard B, Bandini S (eds) (2006) Cellular automata. Lecture notes in computer science, vol 4173. Springer, Berlin
European Grid Infrastructure. https://www.egi.eu/. Accessed 10 Apr 2017
Fermi E, Pasta J, Ulam S (1955) Los Alamos report LA-1940. In: Segré E (ed) Collected papers of Enrico Fermi. University of Chicago Press, Chicago
Frisch U, Hasslacher B, Pomeau Y (1986) Lattice-gas automata for the Navier-Stokes equation. Phys Rev Lett 56(14):1505–1508
Gardiner CW (1994) Handbook of stochastic methods for physics, chemistry, and the natural sciences. Springer series in synergetics, vol 13. Springer, Berlin
Georges A, le Doussal P (1989) From equilibrium spin models to probabilistic cellular automata. J Stat Phys 54(3–4):1011–1064
Hardy J, Pomeau Y, de Pazzis O (1973) Time evolution of a two-dimensional classical lattice system. Phys Rev Lett 31(5):276–279
Harlow H, Metropolis N (1983) Computing & computers – weapons simulation leads to the computer era. Los Alamos Sci 4(7):132
Haw M (2005) Einstein's random walk. Phys World 18:19–22
Hinrichsen H (1997) Stochastic lattice models with several absorbing states. Phys Rev E 55(1):219–226
Jaynes E (1957) Information theory and statistical mechanics. Phys Rev 106(4):620–630
Kaneko K (1985) Spatiotemporal intermittency in coupled map lattices. Progr Theor Phys 74(5):1033–1044
Kawasaki K (1972) Kinetics of Ising model. In: Domb CM, Green MS (eds) Phase transitions and critical phenomena, vol 2. Academic, New York, p 443
Kirkpatrick S, Gelatt CG Jr, Vecchi MP (1983) Optimization by simulated annealing. Science 220:671–680
Lawniczak A, Dab D, Kapral R, Boon JP (1991) Reactive lattice gas automata. Phys D 47(1–2):132–158
Marinari E, Parisi G (1992) Simulated tempering: a new Monte Carlo scheme. Europhys Lett 19:451–458
May R (1976) Simple mathematical models with very complicated dynamics. Nature 261:459–467
Metropolis N, Hewlett J, Rota GC (eds) (1980) A history of computing in the twentieth century. Academic, New York
Mezard M, Parisi G, Virasoro MA (1987) Spin glass theory and beyond. World Scientific lecture notes in physics, vol 9. World Scientific, Singapore
Newman ME (2005) Power laws, Pareto distributions and Zipf's law. Contemp Phys 46:323–351
Niss M (2005) History of the Lenz-Ising model 1920–1950: from ferromagnetic to cooperative phenomena. Arch Hist Exact Sci 59(3):267–318
Nordfalk J, Alstrøm P (1996) Phase transitions near the "game of life". Phys Rev E 54(2):R1025–R1028
Oestreicher C (2007) A history of chaos theory. Dialogues Clin Neurosci 9(3):279–289. Available online: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202497/
Onsager L (1944) Crystal statistics. I. A two-dimensional model with an order-disorder transition. Phys Rev 65:117–149
Politi A, Livi R, Oppo GL, Kapral R (1993) Unpredictable behaviour of stable systems. Europhys Lett 22(8):571–576
Rabiner L (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE 77(2):257–286
Rapaport DC (2004) The art of molecular dynamics simulation. Cambridge University Press, Cambridge
Repast – recursive porous agent simulation toolkit (2008) http://repast.sourceforge.net/. Accessed 10 Apr 2017
Rothman DH, Zaleski S (2004) Lattice-gas cellular automata. Monographs and texts in statistical physics, Collection Alea-Saclay, Paris
Sornette D (2006) Critical phenomena in natural sciences. Springer series in synergetics. Springer, Berlin
Stauffer D, Aharony A (1994) Introduction to percolation theory. Taylor & Francis, London
Succi S (2001) The lattice Boltzmann equation for fluid dynamics and beyond. Numerical mathematics and scientific computation. Oxford University Press, Oxford
Swendsen R, Wang JS (1987) Nonuniversal critical dynamics in Monte Carlo simulations. Phys Rev Lett 58(2):86–88
van Kampen NG (1992) Stochastic processes in physics and chemistry. North-Holland, Amsterdam
von Neumann J, Burks AW (1966) Theory of self-reproducing automata. University of Illinois Press, Urbana/London
Von Neumann universal constructor (2008) http://en.wikipedia.org/wiki/Von_Neumann_Universal_Constructor. Accessed 10 Apr 2017
Watts D, Strogatz SH (1998) Collective dynamics of 'small-world' networks. Nature 393:440–441
Wilensky U (1999) NetLogo. Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston. http://ccl.northwestern.edu/netlogo/. Accessed 10 Apr 2017
Wolf-Gladrow D (2004) Lattice-gas cellular automata and lattice Boltzmann models: an introduction. Lecture notes in mathematics, vol 1725. Springer, Berlin
Wolfram S (1983) Statistical mechanics of cellular automata. Rev Mod Phys 55:601–644

Books and Reviews
Boccara N (2004) Modeling complex systems. Graduate texts in contemporary physics. Springer, Berlin
Bungartz H-J, Mundani R-P, Frank AC (2005) Bubbles, jaws, moose tests, and more: the wonderful world of numerical simulation. Springer VideoMATH. Springer, Berlin (DVD)
Chopard B, Droz M (2005) Cellular automata modeling of physical systems. Collection Alea-Saclay: monographs and texts in statistical physics. Cambridge University Press, Cambridge
Deisboeck S, Kresh JY (eds) (2006) Complex systems science in biomedicine. Topics in biomedical engineering. Springer, New York
Gould H, Tobochnik J, Christian W (2007) An introduction to computer simulation methods: applications to physical systems. Addison-Wesley, New York
Landau RH (2005) A first course in scientific computing: symbolic, graphic, and numeric modeling using Maple, Java, Mathematica, and Fortran90. Princeton University Press, Princeton
Open Source Physics. http://www.opensourcephysics.org/. Accessed 10 Apr 2017
Resnick M (1994) Turtles, termites, and traffic jams: explorations in massively parallel microworlds. Complex adaptive systems. MIT Press, Cambridge
Shalizi C. Cosma's home page. http://www.cscs.umich.edu/~crshalizi/. Accessed 10 Apr 2017
Swarm Intelligence

Gerardo Beni
University of California, Riverside, CA, USA

Article Outline

Glossary
Definition of the Subject and Its Importance
Introduction
Biological Systems
Robotic Systems
Artificial Life Systems
Definition of Swarm
Standard-Mathematics Methods
Swarm Optimization
Particle Swarm Optimization (PSO)
Ant Colony Optimization (ACO)
Nonlinear Differential Equation Methods
Limitations of Standard-Mathematics Methods
Cellular-Computing Methods
Intelligence as Universal Computation
Relations to Standard-Mathematics Methods
Randomness in Swarm Intelligence
The Implicit Assumption of Asynchrony Irrelevance
Asynchronous Swarms
Types of Asynchrony
Modeling Asynchrony by Synchronous Swarms
Local Synchrony and Self-Synchronization
The Natural Asynchrony of Swarms
The Realization of Asynchronous Swarms
Characteristics of Swarm Intelligence
Dynamics in Swarm Intelligence
Unpredictability in Swarm Intelligence
Swarms of Intelligent Units
Future Directions
Bibliography

Glossary

Ant colony optimization Probabilistic optimization algorithm where a colony of artificial ants cooperates in finding solutions to optimization problems.

Cellular automaton A system evolving in discrete time steps, with four properties: a grid of cells, a set of possible states of the cells, a neighborhood, and a function which assigns a new state to a cell given the state of the cell and of its neighborhood.

Cellular-computing architecture Computer design that uses cellular automata and related machines as processors and as storage of instructions and data.

Dynamic cellular-computing system Cellular-computing system whose cells are mobile.

Elementary swarm An ordered set of N units described by the N components v_i (i = 1, 2, ..., N) of a vector v; any unit i may update the vector at any time t_i, using a function f of K_i vector components. ∀ i ∈ N: v_i(t + 1) = f(v_{k∈K(i)}(t)).

Game of life A cellular automaton designed to simulate lifelike phenomena.

Intelligence (working definition for swarm intelligence) Ability to carry out universal computation.

Natural asynchrony Asynchronous updating characterized by three properties: more than one unit may update at each time step, any unit may update more than once in each updating cycle, and the updating order varies randomly for every updating cycle.

Optimization algorithms Algorithms to satisfy a set of constraints and/or optimize (e.g., minimize) a function by systematically choosing the values of the variables from an allowed set.

Particle swarm optimization Probabilistic optimization algorithm where a swarm of potential solutions (particles) cooperates in finding solutions to discrete optimization problems.

Pheromone A chemical that triggers an innate behavioral response in another member of the same animal species.

Stigmergy Indirect communication through modification of the environment.

Swarm intelligence Definition 1 (section "Definition of the Subject and Its Importance"): The intuitive notion of "swarm intelligence" is that of a "swarm" of agents (biological or artificial) which, without central control, collectively (and only collectively) carry out (unknowingly, and in a somewhat-random way) tasks normally requiring some form of "intelligence." Definition 5 (section "Swarms of Intelligent Units"): The capability of universal computation carried out with natural asynchrony by a dynamic cellular-computing system, none of whose cells can predict the computation done by the swarm.

Swarm optimization Ant colony optimization, particle swarm optimization, and related probabilistic optimization algorithms.

Swarm robotics The technology of robotic systems capable of swarm intelligence.

Unpredictable system A system such that complete knowledge of its state and operation at any given time is insufficient to compute the system's future state before the system reaches it.

Von Neumann architecture Computer design that uses one processing unit and one storage unit holding both instructions and data.

Definition of the Subject and Its Importance

The research area identified as "swarm intelligence" (SI) has been evolving now for 30 years. The term "swarm intelligence" first appeared in 1989 (Beni and Wang 1989a, b). By 2007 (when the first version of this entry appeared), "swarm intelligence" was in the title of four books (Bonabeau et al. 1999; Kennedy et al. 2001; Abraham et al. 2006; Engelbrecht 2006), in two series of conference proceedings (Dorigo et al. 2006; IEEE 2007), and in a new technical journal (2007), without mentioning other areas in which the term swarm itself had become popular. At the time of the first update (mid-2013), there were already dozens of books on or related to swarm intelligence. There are now (mid-2019) five journals dedicated strictly to swarm intelligence. A search for "Swarm Intelligence" on the Internet yields about 18 million results.

As the use of the term "swarm intelligence" has spread, its meaning has broadened to a point in which it is often understood to encompass almost any type of collective behavior. And since the term "swarm intelligence" has popular appeal, it is also sometimes used in contexts which have limited scientific or technological content. Some meanings, however, refer rigorously to precise concepts. The following treatment of swarm intelligence is based only on concepts that can be clearly defined and quantified. Hence, it is more restricted than some broader swarm intelligence presentations, but, even so, it describes an interrelated scientific/technical core which forms a solid basis for a well-defined multidisciplinary research area.

Definition 1 The intuitive notion of "swarm intelligence" is that of a "swarm" of agents (biological or artificial) which, without central control, collectively (and only collectively) carry out (unknowingly, and in a somewhat-random way) tasks normally requiring some form of "intelligence."

A more specific definition requires a detailed discussion, and so it is given at the end of the article (sections "Characteristics of Swarm Intelligence" and "Swarms of Intelligent Units").

Although this notion of swarm intelligence might seem vague, we will see in the course of this entry that in fact it has many specific implications. Note that the notion is broad, which partly explains its widespread use, but not so broad as to include any type of collective action of groups of simple entities, as will become clear later.

These characteristics of swarm intelligence are also those of several biological systems, e.g., some insect societies or some components of the
That these elements are not easy to quantify follows from the difficulty of defining several of the key components of the intuitive notion of SI (Definition 1). First, "intelligence" is a notoriously ambiguous concept. Second, "randomly" is also not easily defined and quantified. Third, "only collectively" must be specified in terms of the critical number of agents required for the emergence of SI. In what sense is a unit "simple" or "unintelligent" and the task carried out "complex" or "intelligent"? Fourth, "unknowingly" implies that the global status and the goal of the swarm are, at least to some extent, unknown to the single agents. Which algorithms and communication schemes result in tasks carried out "unknowingly" by the agents?

Because of these difficulties, in this entry, we first use the aforementioned (Definition 1) intuitive notion of SI to describe the current main areas of studies considered to be SI. This will provide an overview of the current status of the field; it will make it possible to quantify the four vague concepts ("intelligence," "randomly," "collectively," "unknowingly") and to reach a more sharply defined concept of SI. From this, we will be able to see more clearly the limitations of SI and so its realistic potential for future applications.

In this entry, the main areas of SI studies are described by making three very broad distinctions: (1) scientific interest versus technological interest (sections "Biological Systems," "Robotic Systems," "Artificial Life Systems," and "Definition of Swarm"); (2) standard mathematics versus cellular computational mathematics (sections "Standard-Mathematics Methods," "Swarm Optimization," "Nonlinear Differential Equation Methods," "Limitations of Standard-Mathematics Methods," "Cellular-Computing Methods"); and (3) synchronous operation versus asynchronous operation (sections "Randomness in Swarm Intelligence," "Asynchronous Swarms," "The Realization of Asynchronous Swarms," "Characteristics of Swarm Intelligence," "Swarms of Intelligent Units"). These distinctions in turn will provide a guide to clarifying the four vague concepts in the intuitive definition of SI (Definition 1) and, thus, a conceptual orientation for future studies and applications; they will also provide criteria for evaluating the promise of SI to solve complex problems that traditional approaches cannot.

Focusing on the first distinction (scientific vs. technological interest), the main scientific interest in SI originated with the work of biologists studying insect societies (Bonabeau et al. 1999). The main technological interest originated with roboticists trying to design distributed robotic systems (Beni and Wang 1989a, b). A valuable reference on the development of SI is Bonabeau et al. (1999), dealing, in parallel, with these two interrelated interests. For a more recent survey, see Tan (2018).

Biological Systems

Probably the best-known and seminal biology experiment in SI is the "double bridge" experiment by Goss et al. (1989). While studying the foraging of ants, they observed that if ants, starting from a point S, could reach food at a point F via two paths of different lengths, the ants would at first choose one of the two paths randomly; but after some ants had returned from F to S, more ants would choose to go from S to F via the shortest path; and eventually practically all the ants would choose the shortest path (see Fig. 1).

Swarm Intelligence, Fig. 1 Illustration of the double-bridge experiments (two paths connect the starting point S to the food source F)

The ants following the shorter path (the lower path) return to the source before the ants which have taken the longer path. In this way, the shorter path has a higher density of pheromones; as a result, ants starting at S will now prefer the shorter path.

The key insight was the realization that the ants were finding the best path via stigmergy, that is, by communicating through modification of the environment. Ants are blind, but they communicate chemically via pheromones. By laying pheromones along the path when returning from the food source F, the ants effectively marked the shortest path by laying more pheromones on it. After that, the ants that would start from S would choose the path marked by more pheromones, i.e., the shortest path.
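A toy numerical illustration of this positive-feedback loop (not a reproduction of the Goss et al. experiment or of any published model) can be written in a few lines of Python; the path lengths, number of ants, and deposit rule below are arbitrary assumptions:

import random

# Two assumed routes from nest S to food F: the short one takes half the time.
lengths = {"short": 1.0, "long": 2.0}
pheromone = {"short": 1.0, "long": 1.0}   # equal trails at the start

def choose_path():
    # An ant picks a path with probability proportional to its pheromone level.
    total = pheromone["short"] + pheromone["long"]
    return "short" if random.uniform(0.0, total) < pheromone["short"] else "long"

random.seed(0)
for ant in range(1000):
    path = choose_path()
    # Ants on the shorter path complete round trips sooner, so per choice they
    # reinforce their trail more strongly; model this as deposit ~ 1/length.
    pheromone[path] += 1.0 / lengths[path]

print(pheromone)   # the "short" trail ends up carrying most of the pheromone

Under these assumptions the short trail is reinforced about twice as fast per trip, and the choice probabilities quickly concentrate on the shorter path, mirroring the observed behavior.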
Thus, a method of self-organization and a method to solve a nontrivial problem by a form of collective intelligence, with many of the elements of the intuitive definition of SI given above, were observed and understood.

Later, Dorigo (1992) realized that this method could be abstracted and generalized to design algorithms that rely on "artificial ants" to solve much more complex problems (see section "Swarm Optimization"). Thus, the close connection between biological studies of SI and its potential for technological application was first clearly demonstrated.

Many other experiments in ants and other social insects have confirmed the potential for developing bioinspired algorithms (Bonabeau et al. 1999; Dorigo and Stutzle 2004; Olariu and Zomaya 2005; Passino 2004). For actual insect societies, various ant algorithms have been applied to model tasks such as division of labor, cemetery organization, brood care, carrying of large objects, constructing bridges, foraging, patrolling, chaining, sorting eggs, nest building, and nest grooming. Social insects constitute 2% of insects, half being ants. Besides ants, termites, bees, and wasps have been observed to exhibit some forms of SI behavior as in the aforementioned tasks.

Apart from insects, many other biological groups exhibit behavior with some of the features of SI, such as flocks of birds and schools of fish. A seminal model of artificial flocks and schools of fish was proposed by Craig Reynolds (1987). It is a computational model for simulating the animation of a group of entities called "boids," i.e., it is intended to represent the group movement of flocks of birds and fish schools. In this model, each boid makes its own decisions on its movement according to a small number of simple rules that react to the neighboring members in the flock and the environment it can sense. The simple local rules of each boid generate complex global behaviors of the entire flock. In addition to being used to simulate group motion in a number of movies and games, this flocking behavior has been used, e.g., for time-varying data visualization (Moere 2004). For more examples, see Hassanien and Emary (2016).
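The flocking behavior can be reproduced qualitatively with the three local rules commonly attributed to this model: separation, alignment, and cohesion. The sketch below is a generic 2D version with invented weights and neighborhood radius, not the parameters of Reynolds' original implementation; for simplicity all boids are updated in parallel:

import math, random

class Boid:
    def __init__(self):
        self.x, self.y = random.uniform(0, 100), random.uniform(0, 100)
        self.vx, self.vy = random.uniform(-1, 1), random.uniform(-1, 1)

def step(boids, radius=15.0, w_sep=0.05, w_ali=0.05, w_coh=0.01):
    new_v = []
    for b in boids:
        nbrs = [o for o in boids
                if o is not b and math.hypot(o.x - b.x, o.y - b.y) < radius]
        if not nbrs:
            new_v.append((b.vx, b.vy))
            continue
        n = len(nbrs)
        # Cohesion: steer toward the average position of the neighbors.
        coh = (sum(o.x for o in nbrs) / n - b.x, sum(o.y for o in nbrs) / n - b.y)
        # Alignment: steer toward the average velocity of the neighbors.
        ali = (sum(o.vx for o in nbrs) / n - b.vx, sum(o.vy for o in nbrs) / n - b.vy)
        # Separation: steer away from crowding neighbors.
        sep = (sum(b.x - o.x for o in nbrs), sum(b.y - o.y for o in nbrs))
        new_v.append((b.vx + w_coh * coh[0] + w_ali * ali[0] + w_sep * sep[0],
                      b.vy + w_coh * coh[1] + w_ali * ali[1] + w_sep * sep[1]))
    for b, (vx, vy) in zip(boids, new_v):
        b.vx, b.vy = vx, vy
        b.x += vx
        b.y += vy

flock = [Boid() for _ in range(50)]
for _ in range(200):
    step(flock)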
In studying these biological systems, several concepts of relevance to SI were recognized. They can be summarized as:

1. Multiple communications (of various types) among units
2. Randomness (random fluctuations)
3. Positive feedback, to reinforce random fluctuations
4. Negative feedback for stabilization

Of the various types of communication, we have already noted stigmergy, i.e., indirect communication by modification of the environment. On the other hand, direct communication may occur unit to unit, contact being a special case (e.g., via antennae or mandibles in insects), or by broadcasting within a certain range (e.g., acoustically or chemically). The type and specific mode of communication has been found to be critical to the task performed, as, e.g., in what types of patterns are formed (Eftimie et al. 2007).

Finally, the most basic lesson from biological studies of SI is that biology has found solutions to hard computational problems and that the design principles used in doing this can be imitated.

Robotic Systems

The actual realization of SI systems as collections of robots is a very hard problem; in fact, it is quite difficult to make even small groups of robots perform useful tasks (Parker et al. 2005; Sahin and Spears 2005). For a review from an engineering perspective, see, e.g., Brambilla et al. (2013). Making even a single mobile, autonomous robot work in a reliable way (even in simplified environments) is a complex project.
Often the technical problems with small groups of robots are quite far from the goal of SI, so there is not much reason to use the term "swarm." Terms such as "collective robotics," "multi-robot systems," and "distributed autonomous robotic systems" are generally, and more appropriately, used. But, whenever the tasks carried out by these robotic systems become scalable to large numbers, the term "swarm robotics" is appropriate, and, in fact, it has come into use. More typically, "swarm robotics" simply describes the design of groups of robotic units performing a collective task. Each robotic unit cannot solve the task alone; and collectively, the robotic units try to accomplish a common task without centralized control.

As for any robotic system in general, each robotic unit, and the group as a whole, requires design of mechanics, control, and communications. The emphasis of current research, in relation to swarm robotics, is primarily on the latter two: (1) effective communication among the robot units (Hamann 2010) and (2) effective control via decentralized algorithms and robustness (Sahin et al. 2007). For a recent update on swarm robotics, see, e.g., Hamann (2018). Research in robotic communication has become important with the growth of wireless communication networking and the lower cost of building robotic units, thus opening a new range of applications for multi-robot systems with networking capabilities, including swarm robotics. In fact, swarm robotics provides the common ground for convergence of information processing, communication theory, and control theory (Hamann 2010, 2018).

1. Research in control of robotic swarms is particularly important to guarantee the stability of the swarm since the swarm does not have a centralized control. The stability of a swarm is a special case of the general problem of distributed control. In fact, after swarm robotics algorithms for task implementation have been devised, the practical realization requires stability and robustness, i.e., proper control. Swarm control presents new challenges to robotics and control engineers: various types of controllers for swarms are currently being investigated, e.g., neural controllers (Sahin et al. 2007). The control theory example coming closest to the problem of swarm control is perhaps that of "formation" control, e.g., the control of multi-robot teams or autonomous aircraft or land or water vehicles. These studies, when extended to decentralized systems, lead to considering problems of asynchronous stability of distributed robotic systems and swarms (Gazi and Passino 2011).

Although much progress has been made in swarm robotics, the application of SI algorithms is still underdeveloped; one reason is that often the SI behavior emerges only above a critical number which is too large to make the construction of a robotic swarm practical, because it is too complex or expensive. Investigations of this type are thus generally carried out by simulation (Pinciroli et al. 2012).

These simulations are specialized methods of swarm robotics. Early examples included "executable models" which can run in simulation or on a mobile robotic unit and can execute all aspects of a robotic unit's behavior (sensing, information processing, actuation, motion), i.e., they fully represent how perception is translated into action by each robotic unit. Executable models were an evolution of early protocols (the so-called "behavior-based" protocols) designed around the subsumption architecture (Brooks 1986). Behavior-based protocols were generalized into Markov-type methods, i.e., protocols where the transitions between the possible states of a robotic unit are specified by a probability transition matrix as in Markov processes (Johnson et al. 1998). More recently, simulators have improved in flexibility and efficiency via highly modular designs (Pinciroli et al. 2012).
centralized control. The stability of a swarm is Looking at applications, swarm robotics has by
a special case of the general problem of dis- now accumulated a collection of standard prob-
tributed control. In fact, after swarm robotics lems which recur often in the literature. One group
algorithms for task implementation have been of problems is based on pattern formation: aggre-
devised, the practical realization requires sta- gation, self-organization into a lattice, deploy-
bility and robustness, i.e., proper control. ment of distributed antennas or distributed arrays
Swarm control presents new challenges to of sensors, covering of areas, mapping of the
robotics and control engineers: various types environment, deployment of maps, creation of
of controllers for swarms are currently being gradients, etc. A second group of problems
investigated, e.g., neural controllers (Sahin focuses on some specific entity in the environ-
et al. 2007). The control theory example ment: finding the source of a chemical plume,
Swarm Intelligence 797
A second group of problems focuses on some specific entity in the environment: finding the source of a chemical plume, homing, goal searching, foraging, prey retrieval, etc. And a third group of problems deals with more complex group behavior: cooperative transport, mining (stick picking), shepherding, flocking, containment of oil spills, etc. This is not an exhaustive list: other generic robotic tasks, such as obstacle avoidance and all-terrain navigation, are also swarm robotics tasks.

One envisioned application of swarm robotics which received considerable media attention in the past was the ANTS (autonomic nanotechnology swarm) project by NASA (Curtis et al. 2000). This project envisioned nanobots (i.e., a swarm of microscopic robots) operating autonomously to form structures for space exploration.

The European Union-sponsored swarm robotics project (Dorigo et al. 2004; Mondada et al. 2004) was completed in 2005 after demonstrating several critical tasks, such as autonomous self-assembly, cooperative obstacle avoidance, and group transport. For this project, a new type of robot called an s-bot was developed. A swarm-bot could transport an object too heavy for a single s-bot (Mondada et al. 2005). The project has continued to progress and evolved into the "Swarmanoid" project which ended in 2011 (www.Swarmanoid.Org). For another example, see, e.g., Arvin et al. (2014). The largest swarms so far realized are those of the kilobot project. The kilobot swarm is a swarm of 1024 units that can be programmed to experiment with swarm robotics in large-scale autonomous self-organization (Rubenstein et al. 2014).

Although swarm robotics could be defined as the robotic implementation of SI (Definition 1), so far, as noted, this implementation remains a distant goal. Meanwhile, concepts from SI can be usefully applied to collections of cooperating robots. Thus, referring to the intuitive notion of SI (Definition 1), the robotic swarm can be characterized by the type of algorithm and of (decentralized) control, the number of units above which new behavior emerges, the communication method (range, topology, bandwidth), the processing and memory capability of each unit, and the heterogeneity of the group.

Swarm robotics, besides the implementation of SI algorithms, includes the material (mechanical and electronic) realization of the units comprising the swarm. This is, as noted, an arduous task which often becomes the emphasis of research in swarm robotics. But, as was emphasized in the early years of SI, even if the material construction of the swarm were accomplished, SI algorithms would remain the most difficult challenge for swarm robotics. This can be easily seen from the fact that a "robot" swarm with very advanced hardware is already available for experimentation: it is a group of human beings. Each person could be limited in a controlled way, e.g., by allowing each person to handle only a specific device according to specific rules. Algorithms to make such a swarm do intelligent tasks are in the province of SI, but they are not simple to devise, as common experience shows. Although the notion of "human swarm" has been discussed in the popular press in connection with the development of social media and crowdsourcing, it has not been translated into quantitative algorithms of the SI type (Brabham 2013).

Artificial Life Systems

The areas of self-organization, complexity, and artificial life (or A-life) are all older and broader fields than SI and overlap with it to various extents.

A-life is conceptually placed somewhere between science and technology and between biology and robotics. During the mid-1980s, attempts at imitating living systems with machines grew rapidly and resulted in the formation of the research field of "artificial life" (Adami et al. 2012; Aguilar et al. 2014). A-life investigates phenomena characteristic of living systems primarily through computational and (to a lesser extent) robotic methods.

Its scope is wide, ranging from investigations of how lifelike properties develop from inorganic components to how cognitive processes emerge in natural or artificial systems. It includes research on any man-made systems that mimic the characteristics of natural living systems.
By this criterion, it includes SI, but actual, current A-life research is not much focused on SI; rather it focuses on origin and synthesis of life, evolutionary robotics, morphogenesis, learning, etc.

The basic theories at the foundation of A-life, and of relevance to SI, are the theories of self-organization and complexity. A-life studies systems which are typically characterized by many strongly coupled degrees of freedom. Systems of this type are more generally investigated within the science of complexity, which began to be an active field of research in the early 1980s. It is multidisciplinary, and it investigates physical, biological, computational, and social science problems, including a vast range of topics (Traub) from environmental sciences to economics, as is clear from the content of this encyclopedia. One basic feature that these systems have in common is the emergence of complex behavior from simple components, a notion we also find in SI.

In regard to self-organization, we note that, like many systems in nature, A-life systems may start disordered and featureless and then spontaneously organize themselves to produce ordered structures, i.e., they self-organize. The theory of self-organization, going back to the 1950s (Nicolis and Prigogine 1977), grew out of a variety of disciplines but mainly from thermodynamics, nonlinear dynamics, and control theory. Self-organization can be defined as the spontaneous creation of a globally coherent (i.e., entropy-lowering) pattern out of local interactions – a concept also relevant to SI.

Because of its distributed character, self-organization tends to be robust, resisting perturbations. The dynamics of a self-organizing system is typically nonlinear, because of feedback relations between the components. Positive feedback leads to fast growth, which ends when all components have been absorbed into the new configuration, leaving the system in a stable, negative feedback state. Nonlinear systems have in general several stable states, and this number tends to increase (bifurcate) as an increasing input of energy forces the system away from its thermodynamic equilibrium. To adapt to a changing environment, the system needs a variety of stable states that is large enough to react to perturbations but not so large as to make its evolution uncontrollably chaotic. The most adequate states are selected according to their fitness, either directly by the environment or by subsystems that have adapted to the environment at an earlier stage.

Formally, the basic mechanism underlying self-organization is the variation (often driven by randomness) which explores different regions in the system's state space until it enters an attractor. This precludes further variation outside the attractor and thus restricts the freedom of the system's components to behave independently. It is equivalent to the decrease of statistical entropy that defines self-organization.

It is useful to keep this brief sketch of self-organization theory in mind as we proceed in describing SI, since the concepts in the theory of SI are evolved from a combination of concepts of self-organization and computation.

Definition of Swarm

After having looked, in the previous three sections, at actual robotic, biological, and A-life systems and at ideas of complexity and self-organization related to SI, we can return to the intuitive definition of SI (Definition 1) and make it more quantitative.

The intuitive notion consists of four elements: SI is "intelligence" achieved "collectively," "randomly," and "unknowingly." An elementary swarm retaining these four elements can be defined as:

Definition 2 (Elementary Swarm) An ordered set of N units described by the N components v_i (i = 1, 2, ..., N) of a vector v; any unit i may update the vector, at any time t_i, using a function f of K_i vector components. ∀ i ∈ N: v_i(t + 1) = f(v_{k∈K(i)}(t)).

The elementary swarm describes an internally driven "collective" action. External input may be added in the function f. "Randomness" is built into the updating times. The evolution occurs "unknowingly" since the units have no processing capability.
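A minimal sketch of an elementary swarm, under the illustrative assumptions that the neighborhoods K(i) are the two adjacent units on a ring and that f is simply the mean of the selected components; each unit updates at its own random times, with no global clock:

import random

N = 10
v = [random.random() for _ in range(N)]                  # the state vector v
K = {i: [(i - 1) % N, (i + 1) % N] for i in range(N)}    # assumed neighborhoods K(i)

def f(neighbor_values):
    # Assumed updating function: the mean of the selected components.
    return sum(neighbor_values) / len(neighbor_values)

# Each unit i updates v_i at its own random times:
# v_i(t + 1) = f(v_{k in K(i)}(t)).
random.seed(1)
for _ in range(1000):
    i = random.randrange(N)                              # a random unit fires
    v[i] = f([v[k] for k in K[i]])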
The elementary swarm can be generalized so that randomness appears also in the parameters of the function f. A further generalization is obtained by letting each vector component be not just one number but a set of parameters.

Hereinafter, we call Swarm (capital S) any system capable of SI. It is worth noting that even in Swarms more general than the elementary swarm, the modeling is assumed restricted in such a way that no unit is capable of computing the Swarm's next global state (see also section "Swarms of Intelligent Units"). Finally, "intelligence" is expected to be achieved by running appropriate algorithms via the updating function f. If and how this is going to be possible requires a more mathematical discussion, which is the subject of sections "Standard-Mathematics Methods," "Swarm Optimization," "Nonlinear Differential Equation Methods," "Limitations of Standard-Mathematics Methods," and "Cellular-Computing Methods."

Standard-Mathematics Methods

The science of biological swarms and the engineering of robotic swarms, as well as research in A-life relevant to SI, have progressed by using a broad range of mathematical techniques. All these techniques can be classified in two main groups: (1) "standard-mathematics" methods and (2) "cellular-computational" methods.

By standard-mathematics methods (SMm), we mean any method that is based on the standard tools of applied mathematics and computations based on standard (von Neumann) computer architectures. Examples are methods in differential equations, stochastic techniques, linear systems, and optimization. By cellular-computing methods (CCm), we mean highly parallel and local computational methods, with simple cells as the basic units of computation, typically carried out on cellular automata (CA) (see "Cellular Automata") (Sipper 1999).

These two mathematical approaches reflect two distinct trends in the evolution of SI research, as described below. We consider first (sections "Swarm Optimization," "Nonlinear Differential Equation Methods," "Limitations of Standard-Mathematics Methods") the approach to SI based on SMm, since the greatest number of significant results in the area of SI has been obtained, so far, by standard-mathematics methods, specifically in the areas of optimization and nonlinear dynamics. We consider them in turn in the next two sections.

Swarm Optimization

Optimization is by far the largest research area associated with SI. This is due, mainly, to two extremely successful optimization methods, whose origin is related to models of SI. The two methods are the ant colony optimization (ACO) (Dorigo 1992; Dorigo and Stutzle 2004) and the particle swarm optimization (PSO) (Kennedy et al. 2001; Kennedy and Eberhart 1995). Both ACO and PSO originated in the early 1990s and have resulted in hundreds of applications based on variations of the original algorithms. So much so that the field of "swarm optimization" could stand alone, apart from its relation to SI, with which it is sometimes even identified. A thorough and recent description of swarm optimization techniques is in Sun et al. (2011). Here, only the key concepts of swarm optimization are reviewed, as they relate to SI. For a review see, e.g., Nayyar et al. (2018).

In PSO and ACO, as in any optimization method, a function must be optimized, e.g., minimized. To find the minimum of the function, the variable is changed in a systematic way – the optimization method. Generally, the variable spans a multidimensional space. The search for the global minimum is nontrivial since the function may have many local minima, and the search could end in one of them (see Fig. 2). Various techniques to avoid this trapping have been developed by using some degree of randomness in the search strategy. For example, simulated annealing (Kirkpatrick et al. 1983) was developed to overcome the limitations of nonrandom methods, e.g., gradient descent (Snyman 2005). PSO and ACO belong to this class of optimization techniques that make use of randomized searches. A recent set of studies on swarm optimization is in Tan et al. (2017).
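Since the PSO algorithm itself is only summarized here, the following is a generic, textbook-style sketch of a global-best PSO loop on a one-dimensional multimodal function; the inertia and acceleration coefficients are common illustrative choices, not values prescribed by the works cited above:

import math, random

def f(x):
    # Rastrigin-like objective: many local minima, global minimum at x = 0.
    return x * x - 10.0 * math.cos(2.0 * math.pi * x) + 10.0

random.seed(2)
n, w, c1, c2 = 20, 0.7, 1.5, 1.5
x = [random.uniform(-10, 10) for _ in range(n)]   # particle positions
v = [0.0] * n                                     # particle velocities
pbest = x[:]                                      # personal best positions
gbest = min(pbest, key=f)                         # swarm's best-known position

for step in range(200):
    for i in range(n):
        r1, r2 = random.random(), random.random()
        # Velocity mixes inertia, attraction to the personal best,
        # and attraction to the swarm's best-known position.
        v[i] = w * v[i] + c1 * r1 * (pbest[i] - x[i]) + c2 * r2 * (gbest - x[i])
        x[i] += v[i]
        if f(x[i]) < f(pbest[i]):
            pbest[i] = x[i]
    gbest = min(pbest, key=f)

print(gbest, f(gbest))

The random coefficients r1 and r2 are what let particles occasionally overshoot a local basin, which is precisely the role of randomness described above.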
Swarm Intelligence, Fig. 2 Simplified illustration of the typical problem encountered in optimization (the function to minimize is plotted against the variable parameter). Starting from the value represented by the open circle and varying the parameter continuously, the algorithm will find one of the nearest local minima (black circles) rather than the global minimum (indicated by the arrow)

The ACO key insight is the application of the concept of stigmergy to stochastic optimization. The ants communicate by modifying the environment (a graph) and act probabilistically on the basis of the modified environment.

Many variations of the basic ACO algorithm have been proposed and implemented. The many variations take advantage of specific knowledge about the specific problem, i.e., they use heuristics, e.g., by setting the a priori propensity of traversing an arc or by setting the evaporation rate. (The basic trail-update mechanism is sketched in code after the list below.)

ACO eventually resulted in a metaheuristic, which is a strategy for designing ACO heuristics. Various ACO-based metaheuristics have been developed. Similarly to PSO, ACO algorithms have been applied to all the basic types of optimization problems: continuous and discrete, constrained and unconstrained, single- and multi-objective, and static and dynamic. The first application of ACO was to the traveling salesman problem, which is an NP-hard combinatorial optimization problem, and it is the most frequently attacked problem using various ACO heuristics. The main classes of other applications are to problems of:

1. Ordering (scheduling, routing)
2. Assignment (neural network training, image segmentation, design)
3. Subsets finding (maximum independent set)
4. Grouping (clustering, bin packing)
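A toy version of the pheromone-trail mechanism on an invented five-node graph is sketched below (deposit inversely proportional to tour length, plus evaporation); real ACO implementations add heuristic "visibility" terms, colonies of ants per iteration, and problem-specific rules:

import random

# Hypothetical graph: two routes from S to F, one short (via A), one long (via B, C).
edges = {("S", "A"): 1.0, ("A", "F"): 1.0,                   # short route, length 2
         ("S", "B"): 1.0, ("B", "C"): 1.0, ("C", "F"): 1.0}  # long route, length 3
nxt = {"S": ["A", "B"], "A": ["F"], "B": ["C"], "C": ["F"]}
tau = {e: 1.0 for e in edges}        # pheromone on each arc
rho = 0.1                            # assumed evaporation rate

def walk():
    node, path, length = "S", [], 0.0
    while node != "F":
        choices = nxt[node]
        weights = [tau[(node, c)] for c in choices]
        nxt_node = random.choices(choices, weights=weights)[0]
        path.append((node, nxt_node))
        length += edges[(node, nxt_node)]
        node = nxt_node
    return path, length

random.seed(3)
for ant in range(500):
    path, length = walk()
    for e in tau:                    # evaporation on every arc
        tau[e] *= (1.0 - rho)
    for e in path:                   # deposit inversely proportional to tour length
        tau[e] += 1.0 / length

print(max(tau, key=tau.get))         # arcs on the shorter route carry the most pheromone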
Clearly, "swarm optimization" successfully uses concepts from the general notion of SI, but optimization is not in itself a necessary characteristic of SI. In fact, many tasks actually or potentially carried out by swarms are not optimal in any sense. See also Monmarché (2016).

Nonlinear Differential Equation Methods

Discrete models of swarming describe the motion of each individual, involving some combination of self-propulsion, random movement, and interaction with neighboring organisms. The models typically take the form of coupled nonlinear difference or differential equations, which may be stochastic or deterministic, depending on the particular features of each model. Numerical simulations have revealed collective behavior. But a main disadvantage of such models is that, for realistic numbers of individuals, analytical results for the collective motion are difficult or impossible to obtain. It is worth mentioning that some progress has been made in obtaining analytical results for stationary groups. In Mogilner et al. (2003), a discrete model was formulated, and a Lyapunov functional was used to successfully predict an equilibrium state of equally spaced organisms. However, analytical (nonstatistical) descriptions of non-equilibrium states in discrete swarm models are few.

Other investigations of swarming have been carried out in a continuum setting, in which relevant quantities are described as scalar or vector fields. This approach goes back to 1980; reviews are provided in Murray (2007). Continuum models may be constructed a priori or by coarse graining a particle model. In general, continuum models provide a convenient setting in which to study large populations, since one may apply machinery from the analysis of partial differential equations. In the context of swarms, the focus has generally been on models in which the population density satisfies a convection-diffusion equation ensuring that the population density is conserved while individuals travel with a set average velocity. Models of this type (Topaz and Bertozzi 2004) can predict, e.g., whether a population aggregates or disperses, the regions of aggregation, and length scales of the density patterns. Many applications to biological and other collective systems have been carried out recently (Topaz et al. 2012; Canizo et al. 2011), and this branch of swarm studies is also expanding (Elamvazhuthi et al. 2018).
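As a rough numerical sketch of such a continuum description (the coefficients, constant advection speed, and periodic boundaries are arbitrary assumptions), a one-dimensional density rho(x, t) obeying d(rho)/dt = D d2(rho)/dx2 - v d(rho)/dx can be advanced with an explicit finite-difference scheme:

# Explicit finite-difference update of a conserved 1D density rho(x, t).
nx, dx, dt = 100, 1.0, 0.1
D, v = 1.0, 0.5                      # assumed diffusion and advection coefficients
rho = [0.0] * nx
rho[nx // 2] = 1.0                   # initial aggregate in the middle

for step in range(500):
    new = rho[:]
    for i in range(nx):
        left, right = rho[(i - 1) % nx], rho[(i + 1) % nx]   # periodic boundaries
        diffusion = D * (left - 2.0 * rho[i] + right) / dx**2
        advection = -v * (right - left) / (2.0 * dx)
        new[i] = rho[i] + dt * (diffusion + advection)
    rho = new

print(sum(rho))                      # total population is (approximately) conserved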
Cellular-Computing Methods

The swarm of Definition 2 can be regarded as performing a cellular computation. And in fact, cellular computing has been used extensively in A-life studies, including systems with strong relation to SI (Adami et al. 2012). Cellular-computing systems offer SI something that the SI systems described in the "standard-mathematics" sections ("Standard-Mathematics Methods," "Swarm Optimization," "Nonlinear Differential Equation Methods," "Limitations of Standard-Mathematics Methods") lack, i.e., a clear characterization of the intelligent task.

Intelligence as Universal Computation

Intelligence is an ambiguous concept, escaping a unique definition (Gottfredson 1997; Legg and Hutter 2007). By identifying "intelligence" with "computation," the concept is restricted, but, at the same time, it can be made precise. In fact, in SI, we define intelligence unambiguously as the ability to carry out universal computation.

Universal computation (or universality) is the property of a computer system (or language) which, with appropriate programming, can be made to perform exactly the same set of tasks as any other computer system (or language). Universal computation (i.e., the ability to emulate a universal computer) is essentially the limit of any model of computation (Church-Turing thesis) (Cooper 2003). It was first proven by Turing in 1936 that no system can ever carry out explicit computations more sophisticated than those carried out by a Turing machine. Subsequently, universality has been found to be a widespread property of many cellular-computing systems (Wolfram 2002).

One of the first cellular-computing systems shown to be capable of universal computation is Conway's game of life (Gardner 1970). This CA is also the prototypical example of A-life systems. And it is also an example of the strong connection between universal cellular computing and bioinspired systems.

More recently, a large number of simple cellular-computing systems have been found to be capable of universal computation (Langton 1984; Dennunzio et al. 2012). Many of these systems are CA, or related systems, using very simple rules of evolution with local interactions. And so they are useful starting points for modeling SI.

In particular, cellular computing is the most appropriate to endow the swarm with the property of unpredictability. The latter property was an original motivation for SI (Beni and Wang 1989a, b), and it is crucial in the task of escaping detection by a predator; it is also of importance in engineering swarms for strategic defense applications. Unpredictability is almost a built-in property of cellular-computing systems because if one observes the rules of evolution in their raw form, it is usually almost impossible to tell much about the overall behavior they will produce. See, e.g., Agapie et al. (2014).

Relations to Standard-Mathematics Methods

SMm cannot provide the swarm with the element of universal computation, which we have taken as the working definition of "intelligence." The only way would be to make each unit a von Neumann (i.e., standard) computing system. In a sense, this violates the notion of Swarm, since in a Swarm, by definition, each unit must be "simple." (This point will be further clarified in section "Swarms of Intelligent Units.") On the contrary, the main advantage of cellular-computing systems over standard-mathematics systems is the possibility of universal computation by simple units.

For this reason, CCm are the natural paradigm for the understanding and designing of SI systems, in spite of the fact that the approach to SI based on cellular computing has so far produced fewer so-called SI applications than the approach based on SMm. Indeed, SMm have basic limitations for modeling SI. This is because the use of SMm tends to restrict the range of tasks performable by the Swarm. And this happens because SMm typically solves problems by specifying constraints, i.e., conditions to be satisfied by the solution, e.g., by specifying equations. But most computational problems cannot be solved in this way.
The optimization methods described in the previous sections (PSO, ACO, etc.) illustrate the point. In these iterative methods, the key issue is what kind of changes should be made at each iteration step. Starting from a random pattern, at each step, a change is made to get the pattern closer to satisfying the constraint(s). Since direct methods (e.g., gradient descent) rarely work as the pattern gets stuck in local minima, randomness in updating is added. In this way, larger portions of the solution space are sampled. The larger the changes made, the faster one can potentially approach a global minimum but the greater the chance of overshooting. The result is that no iteration technique of this type can guarantee a solution to general combinatorial optimization problems.

As we have seen, the swarm optimization methods (e.g., ACO, PSO) rely on heuristics to adjust the search and obtain (nonoptimal but) often satisfactory solutions. But, in general, for the great majority of combinatorial optimization problems (e.g., the traveling salesman problem; Johnson and McGeoch 1997), no polynomial upper bound on the time complexity has been found so far. And this happens in many problems whose solution is sought by using randomness to satisfy the imposed constraints. As an example, a set of identical balls cannot be shaken into an ordered, close-packed configuration. With extremely high probability, they lock into some configuration or another, not the optimal (close-packing) one.

This fact has important implications for SI. What it says is that no matter how much randomness is added to the system, it may never evolve to reach the solution specified by the constraints. Although, ultimately, constraints can be set up as a way of specifying algorithms, and hence computing, it is far simpler to specify algorithms via rules of evolution, as is done in cellular computing.

The conclusion is that methods based on constraints and other SMm are not ideally suited for systems evolving with great complexity, and, in particular, they are not suitable for universal computation. Thus, if SI is to be a framework for (biological or engineered) swarms to carry out "intelligent tasks" with the greatest generality, a methodology that allows for the swarm to carry out universal computation is necessary. To this aim, CCm are the most suitable.

Unfortunately, although CCm have many advantages over SMm for modeling SI, they address only three of the four key elements of the notion (Definition 1) of SI ("intelligence," "collectively," "randomly," "unknowingly"), leaving out the element of "randomness." Generally, CCm operate deterministically and do not include "randomness," as, e.g., "swarm optimization" systems do. But this does not necessarily have to be the case. The issue is addressed in the next section.

Randomness in Swarm Intelligence

Randomness is a key element in the notion of SI (cf. Definitions 1 and 2). Examples from biology justify this requirement. Randomness is not easily quantified precisely, but, whatever the form and measure chosen, the point is that for swarms, some form of randomness is necessary – otherwise, they would fail to be models for analyzing a large class of biological systems. But what kind of randomness is essential to model these biological systems? Randomness in the number and type of agents is not important – the agents could be strictly identical and remain in the same number. Randomness in the initial conditions is not essential either. Many swarms evolve from regular initial conditions into highly complex and random patterns. Randomness of external input from the environment is not always present, and it is certainly not a requirement for biological swarm behavior.

What about the randomness artificially added to the units, as in swarm optimization? The randomness added to the units in PSO or ACO algorithms is modeled as originating from the random behavior of each unit. This is a plausible assumption in relation to biological systems. But the swarms in PSO and ACO are typically updated in an orderly (nonrandom) way, typically sequentially (there are also parallel implementations; Olariu and Zomaya 2005), whereas, in biological systems, the units update in a disordered, random fashion.
And it is this type of randomness that is both necessary in any biologically relevant model of swarms and sufficient to provide many (but not all) of the advantages of randomness in solving swarm engineering problems.

The conclusion is that the only randomness that is truly essential for SI is randomness in the times of operation of the units. Each unit has its own clock, not synchronized with other units' clocks. Other types of randomness in the behavior of the units or the environment may be required to solve specific problems, but randomness in times of operation is necessary for any biologically realistic model.

Interestingly though, many applications so far considered in the area of SI do not yet include this randomness in the models. We have already mentioned that typical optimizing swarms update sequentially and CA systems operate largely in parallel, i.e., synchronously. Synchronous or sequential operations are by far the most common updating modes in either SMm or CCm.

The Implicit Assumption of Asynchrony Irrelevance

As noted, it is a basic fact that biological agents, apart from exceptional cases, do not operate synchronously (or sequentially) in groups. It is also a fact that people in social groups do not operate synchronously or sequentially. If SI is supposed to model biological and social swarms, SI must be based on models that do not operate synchronously or sequentially (Huberman and Glance 1993).

And if biological swarms are capable of solving problems (including optimization) without synchrony (or sequentiality), as they do, then models that imitate those swarms should operate asynchronously (not sequentially).

But, as noted, the main modeling paradigms for bioinspired algorithms, standard mathematics, and cellular computing are either essentially sequential or synchronous.

An example from SMm is the solution of partial differential equations: they operate synchronously on every point (clearly seen in solving them numerically and iteratively). This unrealistic use of differential equations in biological processes has been pointed out, e.g., in the problem of morphogenesis (Liang and Beni 1995). The Turing diffusion-reaction model (Turing 1952), being based on differential equations, implies synchronicity and central control; hence, it is physically not realistic for a scale of the order of 100 cells.

In fact, synchronicity leads to realistic models only whenever the spatiotemporal resolution is high, as, e.g., for phenomena typically studied in physics. But when the units studied are complex or few enough to have a less fine spatiotemporal resolution, as in biology or human societies, synchronicity is not realistic, as is obvious from observation.

Thus, in using synchronous methods for biological or human societies, implicitly a strong assumption is being made, i.e., that the synchronously (or sequentially) and nonsynchronously (and not sequentially) obtained solutions would coincide.

But this assumption has no validity. In fact, it has been shown, for example, that CA, when running in synchronous and nonsynchronous ways, normally produce totally different results (Cornforth et al. 2005). This has been noted already in the 1990s (Bersini and Detour 1994) in A-life studies. In Bersini and Detour (1994), two well-known CA were compared: Conway's "game of life" (Gardner 1970) and the immune network model. The former is a two-dimensional CA capable of universal computation when run synchronously. But the behavior is totally different when run without synchrony: the game of life stops producing complex patterns and converges to a fixed point.

The immune network model is asynchronous and the game of life synchronous. The crucial factor in the different behavior of the two systems was identified as the synchronous versus asynchronous updating. In fact, it was concluded that, in this case, asynchrony induces stability in CA. This agrees qualitatively with studies in standard mathematics (Bersini and Detour 1994).

In conclusion, the assumption that asynchrony makes no difference has been found not to be valid (for an example in PSO, see, e.g., Nor Azlina et al. 2014).
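This qualitative difference is easy to reproduce: the sketch below applies the same Game of Life rule to the same random initial grid, once with synchronous (parallel) updating and once with one common form of asynchronous updating (each cycle, every cell updated once in random order); the grid size, density, and step count are arbitrary:

import random

def neighbors_alive(grid, i, j, n):
    return sum(grid[(i + di) % n][(j + dj) % n]
               for di in (-1, 0, 1) for dj in (-1, 0, 1)
               if not (di == 0 and dj == 0))

def life_rule(alive, k):
    # Conway's rule: a live cell survives with 2 or 3 live neighbors,
    # a dead cell becomes alive with exactly 3.
    return 1 if (alive and k in (2, 3)) or (not alive and k == 3) else 0

def sync_step(grid, n):
    return [[life_rule(grid[i][j], neighbors_alive(grid, i, j, n))
             for j in range(n)] for i in range(n)]

def async_step(grid, n):
    # One updating cycle: every cell updated once, in random order,
    # each update seeing the current (partially updated) grid.
    cells = [(i, j) for i in range(n) for j in range(n)]
    random.shuffle(cells)
    for i, j in cells:
        grid[i][j] = life_rule(grid[i][j], neighbors_alive(grid, i, j, n))
    return grid

n = 20
random.seed(4)
start = [[random.randint(0, 1) for _ in range(n)] for _ in range(n)]

g_sync = [row[:] for row in start]
g_async = [row[:] for row in start]
for _ in range(30):
    g_sync = sync_step(g_sync, n)
    g_async = async_step(g_async, n)

print(g_sync == g_async)   # almost always False: the two update modes diverge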
Hence, asynchronous systems must be studied as such, not by using synchronous models. Moreover, different types of asynchrony yield different results, as discussed in section "Asynchronous Swarms."

Asynchronous Swarms

Several cellular-computing studies in the 1990s (Sipper 1999; Schonfisch and de Roos 1999) led to a variety of results emphasizing the role that different types of asynchrony play in the results. Studying asynchronous systems is complicated because, among other things, deviation from synchronicity, i.e., from the mode of updating all units in parallel at each time step, may occur in several different ways. For example, sequential updating and random updating are both asynchronous but very different.

Types of Asynchrony

Unfortunately, there is no standard vocabulary for the various types of asynchrony. Thus, we use the following classification to describe the possible types of asynchrony.

Consider an updating cycle (UC), i.e., the time interval at the end of which all units have been updated at least once. Eight types of UC can be identified by the presence or absence of any of the following three properties: synchronicity (S), more than one unit may update at each time step; multiplicity (M), any unit may update more than once in each UC; and randomness (R), the updating order varies randomly for every UC (see Fig. 5).

In Fig. 5, four units are represented by rectangles with different patterns, from black (bottom) to white (top). The horizontal axis measures time steps in units equal to the base of a rectangle. The vertical dashed line indicates the end of an updating cycle (i.e., all units have updated at least once). The label below the horizontal axis specifies the type of updating (~ means "not"). The standard "parallel" and "sequential" updating are, respectively, (S, ~M, ~R), Fig. 5e, and (~S, ~M, ~R), Fig. 5a.

These eight basic types of asynchronous updating can be further specialized. For example, if all three properties are absent (~S, ~M, ~R), the updating is sequential. But the sequential updating order of the units can be fixed in different ways. Studies of CA have proven that the behavior differs markedly not only for the eight types of asynchrony but even among different sequential orderings (Sipper 1999; Cornforth et al. 2005).

In Wolfram (2002), the (S, M, ~R) form of updating has been applied to describe processes where each unit has an independent clock; but the clocks have a fixed, nonrandom frequency. This type of asynchrony is considered a good model for forest ecosystems, fire spread, and other natural and artificial systems. The results are very different when updating of the type (~S, M, R), (~S, ~M, R), or sequential (~S, ~M, ~R) is applied to the same system.

In conclusion, the crucial point is that (Cornforth et al. 2005) the exact manner of updating can have a profound effect on overall system behavior. The implication of this is that when comparing models of natural systems or artificial multi-agent systems, it must be stated which updating scheme has been used; otherwise, meaningful comparison between different studies may not be possible.

In particular, returning to swarm optimization, one may ask whether in tasks, such as finding the shortest path, it is realistic to apply to natural systems (such as insect societies) swarm models which are sequential (such as swarm optimization models) or synchronous (such as models based on differential equations). While these models work effectively as artificial swarms, there is no proof that they apply to natural systems, which are asynchronous.

Modeling Asynchrony by Synchronous Swarms

Because of the widespread use of synchronous methods in simulations of SI, one might wonder under what conditions a synchronous but stochastic model could be equivalent to an asynchronous one.
Swarm Intelligence, Fig. 5 Illustration of the eight types of updating, according to synchronicity, multiplicity, and randomness. (a) Asynchrony of type (~S, ~M, ~R). (b) Asynchrony of type (~S, ~M, R). (c) Asynchrony of type (~S, M, ~R). (d) Asynchrony of type (~S, M, R). (e) Asynchrony of type (S, ~M, ~R). (f) Asynchrony of type (S, ~M, R). (g) Asynchrony of type (S, M, ~R). (h) Asynchrony of type (S, M, R)

To answer this question, let us consider the two types of stochastic models most commonly used to model randomness in synchronously updated systems. The randomness may be included in (1) the possible outcomes of the updating function or (2) the choice of the function applied to the updating.

Referring to the definition of elementary swarm (Definition 2), the two cases correspond to generalizing the updating function as follows:

• Case (1) ∀ i ∈ N: v_i(t + 1) = f(v_{k∈K(i)}(t), ζ), where ζ is a random variable.

• Case (2) ∀ i ∈ N: v_i(t + 1) = f(t)(v_{k∈K(i)}(t)), where P[f(t) = f_g] is the probability mass function of choosing f(t) = f_g out of a set of N_f possible functions {f_g; g = 1, ..., N_f}.

Case (1) is typical of probabilistic CA, and it is also the method used in PSO. In these systems, the state vector, at each time step, evolves according to a fixed rule which produces a new state vector from the previous one. The rule is based on the state of the neighbors of each unit and does not change from step to step, but the outcome of the rule is probabilistic.
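In code, the two generalizations can be sketched for a small one-dimensional swarm as follows; the particular functions, the Gaussian noise term standing in for ζ, and the uniform choice among the f_g are illustrative assumptions:

import random

N = 8
v = [random.random() for _ in range(N)]
K = {i: [(i - 1) % N, (i + 1) % N] for i in range(N)}

def case1_step(v, noise=0.1):
    # Case (1): one fixed rule, but its outcome contains a random term (zeta).
    return [sum(v[k] for k in K[i]) / len(K[i]) + random.gauss(0.0, noise)
            for i in range(N)]

f_set = [lambda xs: sum(xs) / len(xs),   # possible updating functions f_g
         lambda xs: max(xs),
         lambda xs: min(xs)]

def case2_step(v):
    # Case (2): at each time step one function f(t) is drawn from the set
    # (with equal probability here) and applied to every unit.
    f = random.choice(f_set)
    return [f([v[k] for k in K[i]]) for i in range(N)]

for t in range(100):
    v = case1_step(v)     # or: v = case2_step(v)

In both sketches the randomness is applied collectively, to all units in the same way, which is exactly the limitation discussed next.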
Case (2) is what is done, for example, in probabilistic iterated function systems (Peruggia 1993). In probabilistic iterated function systems, a vector evolves via a set of maps (a map is a function whose domain and range coincide); at each time step, a map is chosen, probabilistically, from a set of possible maps.

In either case (1) or (2), the updating scheme fails to model the actual time evolution of natural systems not so much because the updating is applied synchronously but because the randomness is applied collectively, i.e., to all the units in the same way. On the other hand, a synchronous algorithm realistically simulating independent random updating can be run as case (2) applied individually to each unit, as follows:

• Case (3) ∀ i ∈ N: v_i(t + 1) = f(t)_i(v_{k∈K(i)}(t)), where P[f(t)_1 = f_{g1}, f(t)_2 = f_{g2}, ..., f(t)_N = f_{gN}], with f(t)_i ∈ {f_g; g = 1, ..., N_f}, is the joint probability mass function of each unit i updating, at time t, according to the function f(t)_i.

In the simplest embodiment of case (3), the set of possible updating functions consists only of the identity and of another function f, with probabilities p and (1 − p), respectively (i.e., a Bernoulli process). In such a case, every unit, at each time step, either does not update, with probability p, or updates according to the function f, with probability (1 − p). Running this algorithm synchronously is equivalent to asynchronous independent updating of the units in a random way – a realistic description of a random swarm. So, under these independently stochastic conditions, running a simulation synchronously represents correctly the physical asynchronous updating of the swarm units. On the other hand, this does not change the fact that different results are obtained when using this random updating (whether simulated with stochastic synchrony or not) instead of synchronous or sequential updating.
The Natural Asynchrony of Swarms

Observations of ant colonies, for example, show that activity and resting periods have an aperiodic pattern for individual ants, but for the whole colony, there are synchronized periodic patterns of active and resting periods.

In spite of the difficulty of finding a clear-cut answer to the question of the natural mode of SI updating, from observations of biological systems and from local synchronization models, it may be plausible to assume that the essential form of asynchrony in SI is the randomness in the working of the individual clocks, as argued in section “Randomness in Swarm Intelligence”; hence, the SI asynchrony must be characterized by the presence of all three asynchrony properties, i.e., SMR. In conclusion, at this stage of our discourse, the Swarm remains defined as in Definition 2, qualified by SMR asynchronous updating, which hereinafter we call natural asynchrony. Note that stochastic synchronous simulations of this model can also be carried out as, e.g., in Case (3) above.

The Realization of Asynchronous Swarms

So far, we have established the importance and type of asynchronous models in SI, but what SI investigations using asynchronous swarms are there?

As noted in section “Asynchronous Swarms,” research in asynchronous models is still very limited, relative to synchronous models, and this in spite of the fact that the very first models of SI were all asynchronous, using SMm based on finite differences (Beni 1992). Explicit updating schemes in finite difference methods can also be regarded as parallel CA, thus belonging to both SMm and CCm. Investigations of asynchronicity in finite difference methods are not common (Beni 2004b). Examples include a nonlinear updating rule based on a linear relation between two neighboring units (Beni and Hackwood 1992). A gradient type of swarm updating was also proposed in modeling morphogenesis (Liang and Beni 1995).

For swarms updating with “natural” asynchrony, i.e., according to SMR, a study (Beni 2004b) gives a proof of convergence to the same state as by using synchronous or sequential iterations. It was also shown that, under certain conditions, the SMR asynchronous updating leads to convergence while synchronous updating does not. This is another example of the advantages of randomness in allowing the swarm to reach a fixed state.

At the end of section “Cellular-Computing Methods,” we concluded that CCm have, for SI modeling, many advantages over SMm. The most crucial advantage is the possibility of universal computation, which we took as the definition of intelligence for SI. We also noted, however, that studies based on CCm which include randomness are scarce. We described a few in section “Randomness in Swarm Intelligence,” especially in discussing the qualitative differences with synchronous CA and in relation to mechanisms of local and self-synchronization.

Generally, these studies model relatively trivial phenomena but cannot model nontrivial phenomena such as universal computation. In fact, there are very few studies of universal computation in asynchronous CA. Significant advances have been made only recently. The first attempts were made by simulating a synchronous CA on an asynchronous CA (Nakamura 1974), after which a synchronous model, as a Turing machine, was simulated on the synchronous CA. However, this asynchronous CA is, in practical realization, synchronous.

Improved asynchronous CA do not rely on global synchronization but conduct asynchronous computation directly by simulating delay-insensitive circuits, i.e., circuits in which delays of signals do not affect the correctness of the circuit operation (Lee et al. 2004). This method essentially uses local synchronization with undetermined exact timing between transitions. In this way, an asynchronous CA, with a hexagonal cell structure, capable of universal computing has been realized (Adachi et al. 2004). Although relying on local synchronization, this type of asynchronous CA can mimic natural phenomena, e.g., phenomena that rely on chemical reactions which occur only when the right molecules are available in the right positions at the right times.
2012; Xu et al. 2019), asynchronous CA do not update “naturally.” A second, more fundamental reason is that in most SI studies (both in natural and technological systems), the units are dynamic. When the units of an asynchronous CA are made mobile, a different and more complex set of problems needs to be solved due to the changing neighborhoods of each cell. These issues of dynamic reconfigurations of cells have not been addressed in asynchronous cellular-computing designs and are likely to remain outside the scope of research aimed at improved computer architectures. The computational problems arising from dynamically reconfiguring cells are central in SI. We address this issue next.

Dynamics in Swarm Intelligence

In the definitions of SI and Swarm given so far (Definitions 1, 2, and 3), there has been no mention of the dynamics of the units. But a general characteristic of the units of a swarm is that almost invariably they are mobile. In fact, we have already discussed the dynamic nature of swarms in sections “Biological Systems” and “Robotic Systems” in relation to biological and robotic swarms. The reason we have so far omitted this dynamic character of the units from the progressively more precise definitions of SI is simply for clarity of exposition: if the dynamics is introduced after all the other elements of SI have been defined and quantified, it is easier to single out its real importance.

From Definition 3, with appropriate specializations, all aspects of SI considered so far can be included in a common core of studies. The most general notion of SI is in fact that of universal computation carried out with “natural asynchrony” by a cellular-computing system. But we should add: whose cells are, in general, mobile units.

The latter qualification would be unnecessary if the description of the dynamic state could be included among the state variables of the cell. But this is not always possible in computing cells since the computation depends on neighbors that change their locations. The fact that dynamic cellular-computing systems (also called cellular robotic systems) are not equivalent to (and very hard to simulate by) cellular computers has been emphasized since the very beginning of SI studies (Beni 1988). Thus, we conclude by stating a definition of SI which, while remaining grounded in the intuitive ideas (Definition 1), includes all the concepts discussed quantitatively in this entry.

Definition 4 SI is the capability of universal computation carried out with “natural” asynchrony by a dynamic cellular-computing system.

As we have seen, studies in the area of SI so far have been concerned with models of collective behavior which, to some limited degree, approach SI as defined above. Even though, to date, no system with SI (Definition 4) has been built (or designed), significant progress has been made, and, from what we have seen, it is reasonable to expect that it can be done.

In fact, although there is not yet proof of universal computation carried out with “natural” asynchrony by a cellular robotic system (i.e., a dynamic cellular-computing system), the recent proofs (Takada et al. 2006; Dennunzio et al. 2012; Xu et al. 2019) for “static” cellular computers indicate that this may be possible in the near future. Other future perspectives for SI are discussed in section “Future Directions.”

Unpredictability in Swarm Intelligence

We are now in a position to consider an aspect of SI which has been inherent to the concept of SI from its inception, i.e., the unpredictability (Beni and Wang 1989a, b) of the Swarm.

The unpredictability of the Swarm agrees with the common intuition that it is usually difficult to predict what a program will do by reading its code, and the more so the lower the level of the language used by the program. More precisely, a Swarm, like other universal computers, may be impossible to predict in the sense that even if one knows the rules of evolution and an initial state, it can still take an irreducible (Wolfram 2002, 1985) amount
of computation to actually predict future states. Furthermore, the unpredictability of the Swarm is of a more general character than that of any universal computer because of the randomness inherent in its evolution and because of its dynamics.

Generally, a system is (at least partially) predictable if, at the present time, one of its future states can be known via a computation shorter than the one performed by the system to reach that future state. This is not possible for many types of asynchronous CA. This notion of predictability considers computational time complexity but not actual time. The latter is affected by the actual construction of the computing system. As such, it is sensitive also to the dynamics of the system.

In fact, the unpredictability of a Swarm by a von Neumann universal computer has been argued in Beni and Wang (1989a) and Beni (2004a) on the basis of its dynamics. The unpredictability of a Swarm by a cellular automaton has also been discussed (Beni and Wang 1989a; Beni 2004a). Although unpredictability is difficult to quantify when including actual system construction parameters, it is often engineered by adding randomness to the system, as in camouflage and cryptography. Also, in animals, randomness and dynamics are methods used by a herd to avoid predators by becoming unpredictable. And so in team sports, such as soccer, unpredictability by the opponent is usually achieved by a combination of randomness and dynamics.

Therefore, although far from proving it, we may intuitively conjecture that among systems capable of universal computation, a Swarm, because it computes universally but asynchronously (with randomness) and dynamically, could be designed to be the least predictable.

Swarms of Intelligent Units

Let us now consider collections of intelligent units, i.e., such that each unit is capable of universal computation. These seem excluded from the definitions of SI given so far – a key characteristic of SI is that intelligence is an emergent property, happening only above a certain critical number of units, and not a property of any of the individual units. On the other hand, some of these groups are often included in broad considerations of SI (Kennedy et al. 2001) as applied to human societies. Under what conditions can these groups be regarded as swarms?

A simple answer to this question runs as follows. Consider the special case of Definition 4, when the cells are not finite-state machines but universal computing units. As long as the task at hand cannot be accomplished by a single unit, but only by more than a critical number of units, the system operates as a swarm. That this can be the case is supported intuitively and from computational considerations, as follows.

Intuitively, we may refer back to the example of the free market model mentioned as an illustration of SI at the beginning of the entry. Each individual contributes only as a trader; the “computation” of the market price is done by the swarm collectively and could not be done by any individual agent. The point is that each unit, albeit “intelligent,” uses only a fraction of its capability, i.e., the trading ability, thus operating, effectively, in a restricted, “non-intelligent” capacity.

Computationally, we have noted in discussing unpredictability that, in spite of theoretical computational equivalence among universal computers, the capability for one universal computer to predict another is limited. And, in fact, at the end of section “Swarms of Intelligent Units,” we put forth the conjecture that a Swarm could be the least predictable universal computer.

But even if this conjecture were not true, it still makes sense to think of a case when a Swarm of universal computers is unpredictable by any of the units comprising it, as has been argued for the case of units capable of universal computation with von Neumann architecture (Beni and Wang 1989a; Beni 2004a). In this sense, we may think of a swarm of universal computers as being more capable than any one of its units, in spite of the fact that the Swarm and any of its units are computationally equivalent. For these reasons, it makes sense to apply, under appropriate conditions, the notion of SI also to human societies.

We may then give a more complete definition of Swarm by adding the characteristic of unpredictability of the Swarm by its units.
practical realization of engineered Swarms, whether robotic, biological, or simply computational.

Practically, the goal of SI will remain twofold: to provide models to explain biological societies and to engineer algorithms and devices with capabilities beyond those of traditional technologies. It will continue to include Swarm robotics and bioinspired algorithms such as swarm optimization methods.

But although commonly regarded as a typical example of bioinspired technology, SI applications are likely to go beyond bioinspired systems. We can see this if we consider how nature-inspired technologies have evolved. Science discovers laws of nature, and technology makes inventions using those laws, often together with design ideas also derived from nature. Thus, for example, laws of physics and designs inspired by crystal structures are now applied to nanotechnologies; similarly, laws of biology and designs inspired by genetic configurations are applied to make artificial organisms in biotechnology. New attempts are also being made (see, e.g., Wolfram 2002) at discovering laws of computing machines as though they were natural systems, and these discoveries are likely to be used to invent new software algorithms and hardware implementations of those algorithms. SI, besides being bioinspired, can be said to be inspired also by this new science, which, in some respects, can be more general than biology.

In this perspective, SI will evolve into the study of what amounts to very powerful computing systems. Designing the simplest of these, i.e., the simplest universal dynamic cellular-computing system updating with natural asynchrony, is an example of a future theoretical and practical challenge for SI. More immediate future applications can be extrapolated from the examples given throughout the entry.

Bibliography

References (Bonabeau et al. 1999; Kennedy et al. 2001; Abraham et al. 2006; Engelbrecht 2006; Keller et al. 2016; Dorigo and Stutzle 2004; Olariu and Zomaya 2005; Passino 2004; Gazi and Passino 2011; Yang et al. 2013) are the main books and reviews. The journal Swarm Intelligence (2007) is the main source for new research results. The following provide additional general material related to Swarm Intelligence.

Primary Literature
Abraham A, Grosan C, Ramos V (2006) Swarm intelligence in data mining. In: Studies in computational intelligence. Springer, Berlin/Heidelberg
Adachi S, Peper F, Lee J (2004) Universality of hexagonal asynchronous totalistic cellular automata. In: 6th international conference on cellular automata for research and industry, ACRI 2004. Lect Notes Comput Sci 3305:91–100
Adami C, Bryson DM, Ofria C, Pennock RT (2012) Artificial life 13. Ebook ISBN 9780262310505
Agapie A, Andreica A, Chira C, Giuclea M (2014) Predictability in cellular automata. PLoS One. https://doi.org/10.1371/Journal.pone.0108177
Aguilar W, Santamaria Bonfil G, Froese T, Gershenson C (2014) The past, present, and future of artificial life. Front Robot AI 1:8
Arvin F, Murray JC, Lichen S, Chun Z, Shigang Y (2014) Development of an autonomous micro robot for swarm robotics. 2014 IEEE Int Conf Mechatron Autom 635(640):3–6
Beni G (1988) The concept of cellular robot. In: Proceedings of the 3rd IEEE symposium on intelligent control, Arlington, pp 57–61
Beni G (1992) Distributed robotic systems and swarm intelligence. J Robot Soc Jpn 10:31–37
Beni G (2004a) From swarm intelligence to swarm robotics: swarm robotics. In: Sahin E, Spears WM (eds) Revised selected papers, SAB 2004 international workshop, Santa Monica, 17 July 2004. Lecture notes in computer science, vol 3342. Springer, pp 1–9
Beni G (2004b) Order by disordered action in swarms. In: Sahin E, Spears WM (eds) Revised selected papers, SAB 2004 international workshop, Santa Monica, 17 July 2004. Lecture notes in computer science, vol 3342. Springer, pp 153–171
Beni G, Hackwood S (1992) Stationary waves in cyclic swarms. In: Proceedings of the IEEE international symposium on intelligent control, Glasgow, 10–13 Aug
Beni G, Wang J (1989a) Swarm intelligence in cellular robotic systems. In: Proceedings of NATO advanced workshop on robots and biological systems, Tuscany, 26–30 June
Beni G, Wang J (1989b) Swarm intelligence. In: Proceedings of the 7th annual meeting of the robotics society of Japan, pp 425–428 (in Japanese)
Bersini H, Detour V (1994) Asynchrony induces stability in CA based models. In: Brooks RA, Maes P (eds) Artificial life, vol IV. MIT Press, Cambridge, pp 382–387
Bonabeau E, Dorigo M, Theraulaz G (1999) Swarm intelligence: from natural to artificial systems. Oxford University Press, New York
Brabham DC (2013) Crowdsourcing. MIT Press, Cambridge
Brambilla M, Ferrante E, Birattari M, Dorigo M (2013) Swarm robotics: a review from the swarm engineering perspective. Swarm Intell 7:1–41
Brooks R (1986) A robust layered control system for a mobile robot. IEEE J Robot Autom RA 2(1):14
Canizo JA, Carrillo JA, Rosado J (2011) A well-posedness theory in measures for some kinetic models of collective motion. Math Models Methods Appl Sci 21:515–539
Clapham N (2002) Emergent synchrony: simple asynchronous update rules can produce synchronous behavior. In: Sarker M, Gen N (eds) Proceedings of the sixth Australia-Japan joint workshop on intelligent and evolutionary systems. Australian National University, pp 41–46
Cooper SB (2003) Computability theory. Chapman Hall/CRC, Boca Raton
Cornforth D, Green D, Newth D (2005) Ordered asynchronous processes in multi-agent systems. Phys D 204(1–2):70–82
Curtis SA, Mica J, Nuth J, Marr G, Rilee ML, Bhat M (2000) Autonomous nano-technology swarm. In: Proceedings of the 51st international aeronautical congress, IAF-00-Q.5.08
Dennunzio A, Formenti E, Manzoni L (2012) Computing issues of asynchronous CA. Fundam Inform 120:165–180
Dorigo M (1992) Optimization, learning and natural algorithms. PhD thesis, Dipartimento di Elettronica, Politecnico di Milano, Milan (in Italian)
Dorigo M, Stutzle T (2004) Ant colony optimization. MIT Press, Cambridge
Dorigo M, Tuci E, Groß R, Trianni V, Labella TH, Nouyan S, Ampatzis C, Deneubourg J-L, Baldassarre G, Nolfi S, Mondada F, Floreano D, Gambardella LM (2004) The SWARM-BOTS project. In: Sahin E, Spears WM (eds) Proceedings of the 1st international workshop on swarm robotics. Lecture notes in computer science, vol 3342. Springer, Berlin, pp 26–40
Dorigo M, Gambardella LM, Birattari M, Martinoli A (eds) (2006) Ant colony optimization and swarm intelligence: 5th international workshop, ANTS 2006, Brussels, 4–7 Sept 2006, proceedings. Lecture notes in computer science. Springer, Berlin
Eftimie R, de Vries G, Lewis MA (2007) Complex spatial group patterns result from different animal communication mechanisms. In: Proceedings of the National Academy of Sciences, 24 Apr 2007, vol 104, no 17
Elamvazhuthi K, Kuiper H, Berman S (2018) PDE-based optimization for stochastic mapping and coverage strategies using robotic ensembles. Automatica 95:356–367. Elsevier
Engelbrecht AP (2006) Fundamentals of computational swarm intelligence. Wiley, New York
Fatès N (2018) Asynchronous cellular automata: a volume in the encyclopedia of complexity and systems science, 2nd edn. https://doi.org/10.1007/978-1-4939-8700-9_671
Gardner M (1970) The fantastic combinations of John Conway's new solitaire game 'life'. Sci Am 223:120–123
Gazi V, Passino KM (2011) Swarm stability and optimization. Springer, New York
Goss S, Aron S, Deneubourg JL, Pasteel JM (1989) Self-organized shortcuts in the argentine ant. Naturwissenschaften 76:579–581
Gottfredson LS (1997) Mainstream science on intelligence: an editorial with 52 signatories, history, and bibliography. Intelligence 24(1):13–23
Hamann H (2010) Space-time continuous models of swarm robotic systems: supporting global-to-local programming. Springer, Berlin
Huberman BA, Glance NS (1993) Evolutionary games and computer simulations. Proc Natl Acad Sci U S A 90:7716–7718
IEEE swarm intelligence symposium, Honolulu, 1–5 Apr 2007. http://www.computelligence.org/sis/2007/?q=node/2
International Journal of Swarm Intelligence Research (IJSIR), Information Resources Management Association (2010) ISSN 1947–9263
Johnson DS, McGeoch LA (1997) The traveling salesman problem: a case study in local optimization. In: Aarts EHL, Lenstra JK (eds) Local search in combinatorial optimization. Wiley, Chichester, pp 215–310
Johnson N, Galata A, Hogg DB (1998) The acquisition and use of interaction behavior models. In: Proceedings, 1998 IEEE computer society conference on computer vision and pattern recognition (cat no 98CB36231), Santa Barbara, pp 866–871
Keller JM, Liu D, Fogel DB (2016) Fundamentals of computational intelligence: neural networks, fuzzy systems, and evolutionary computation. IEEE Press series on computational intelligence. Wiley/IEEE Press, Hoboken
Kennedy J, Eberhart RC (1995) Particle swarm optimization. In: Proceedings of IEEE international conference on neural networks, vol IV. IEEE Service Center, Piscataway, pp 1942–1948
Kennedy J, Eberhart RC, Shi Y (2001) Swarm intelligence. Morgan Kaufmann, San Mateo
Kirkpatrick S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing. Science 220(4598):671–680
Langton CG (1984) Self-reproduction in cellular automata. Phys D 10:135–144
Lee J, Peper F, Adachi S, Morita K (2004) Universal delay-insensitive circuits with bi-directional and buffering lines. IEEE Trans Comput 53(8):1034–1046
Legg S, Hutter M (2007) Universal intelligence: a definition of machine intelligence. Mind Mach 17(4):391–444
Liang P, Beni G (1995) Robotic morphogenesis. Proc Int Conf Robot Autom 2:2175–2180
Mandal JK, Devadutta S (2019) Intelligent computing paradigm: recent trend. Springer, Singapore
Moere AV (2004) Information flocking: time-varying data visualization using boid behaviors. In: Proceedings of the eighth international conference on information visualization, pp 409–414
Mogilner A, Edelstein-Keshet L, Bent L, Spiros A (2003) Mutual interactions, potentials, and individual distance in a social aggregation. J Math Biol 47:353–389
Mondada F, Pettinaro GC, Guignard A, Kwee IV, Floreano D, Deneubourg J-L, Nolfi S, Gambardella LM, Dorigo M (2004) SWARM-BOT: a new distributed robotic concept. Auton Robots 17(2–3):193–221
Mondada F, Gambardella LM, Floreano D, Nolfi S, Deneubourg J-L, Dorigo M (2005) The cooperation of swarm-bots: physical interactions in collective robotics. IEEE Robot Autom Mag 12(2):21–28
Monmarché N (2016) Artificial ants. In: Metaheuristics. Springer, New York
Murray JD (2007) Mathematical biology I: an introduction, 3rd edn. Interdisciplinary applied mathematics. Springer, New York
Nakamura K (1974) Asynchronous cellular automata and their computational ability. Syst Comput Controls 5(5):58–66
Nehaniv CL (2002) Evolution in asynchronous cellular automata. In: Standish RK, Abbass HA, Bedau MA (eds) Proceedings of the eighth conference on artificial life. MIT Press, pp 65–74
Nicolis G, Prigogine I (1977) Self-organization in non-equilibrium systems. Wiley, New York
Nor Azlina AA, Mubin M, Mohamad MS, Kamarulzaman AA (2014) A synchronous-asynchronous particle swarm optimization algorithm. Sci World J 2014:123019. https://doi.org/10.1155/2014/123019
Olariu S, Zomaya AY (2005) Handbook of bioinspired algorithms and applications. Chapman & Hall/CRC Computer & Information Science, Boca Raton
Parker LE, Schneider FE, Schultz AC (2005) Multi-robot systems. From swarms to intelligent automata. In: Proceedings from the 2005 international workshop on multi-robot systems, vol III. Springer
Passino K (2004) Biomimicry for optimization, control, and automation. Springer, London
Peruggia M (1993) Discrete iterated function systems. CRC Press, Wellesley. ISBN 1568810156
Pinciroli C, Trianni V, O'Grady R, Pini G, Brutschy A, Brambilla M, Mathews N, Ferrante E, DiCaro G, Ducatelle F, Birattari M, Gambardella LM, Dorigo M (2012) ARGoS: a modular, parallel, multi-engine simulator for multi-robot systems. Swarm Intell 6:271–295
Reynolds C (1987) Flocks, herds, and schools: a distributed behavioral model. Comput Graph 21(4):25–34
Rubenstein M, Cornejo A, Nagpal R (2014) Programmable self-assembly in a thousand-robot swarm. Science 345:6198
Sahin E, Spears WM (2005) Swarm robotics: SAB 2004 international workshop, Santa Monica, 17 July 2004, revised selected papers. Lecture notes in computer science. Springer
Sahin E, Spears WM, Winfield AFT (eds) (2007) Swarm robotics. Second SAB 2006 international workshop, Rome, 30 Sept–1 Oct 2006, revised selected papers. Lecture notes in computer science, vol 4433. Springer, Berlin/Heidelberg/New York
Schonfisch B, de Roos A (1999) Synchronous and asynchronous updating in cellular automata. Biosystems 51:123–143
Sengupta S, Basak S, Peters RA II (2018) Particle swarm optimization: a survey of historical and recent developments with hybridization perspectives. Mach Learn Knowl Extr 1:157–191
Sipper M (1997) Evolution of parallel cellular machines: the cellular programming approach. Lecture notes in computer science. Springer, New York
Sipper M (1999) The emergence of cellular computing. IEEE Comput 32(7):18–26
Sipper M, Tomassini M, Capcarrere MS (1997) Evolving asynchronous and scalable non-uniform cellular automata. In: Proceedings of the international conference on artificial neural networks and genetic algorithms (ICANNGA97). Springer
Snyman JA (2005) Practical mathematical optimization. An introduction to basic optimization theory and classical and new gradient-based algorithms. Springer, New York
Sun J, Lai C-H, Wu X-J (2011) Particle swarm optimisation: classical and quantum perspectives. Numerical analysis and scientific computing series. Chapman & Hall/CRC, Boca Raton
Swarm Intelligence (2007) Springer. ISSN 1935–3812
Takada Y, Isokawa T, Peper F, Matsui N (2006) Construction universality in purely asynchronous cellular automata. J Comput Syst Sci 72:1368–1385
Tan Y (ed) (2018) Swarm intelligence: principles, current algorithms and methods (control, robotics, and sensors). The Institution of Engineering and Technology, London
Topaz CM, Bertozzi A (2004) Swarming patterns in two-dimensional kinematic model for biological groups. SIAM J Appl Math 65(1):152–174
Topaz CM, D'Orsogna MR, Edelstein-Keshet L, Bernoff AJ (2012) Locust dynamics: behavioral phase change and swarming. PLOS Comput Biol 8(8):e1002642
Traub J (editor in chief) J Complex. Elsevier. http://www.elsevier.com/wps/find/journaldescription.cws_home/622865/description#description
Trianni V, Tuci E, Passino KM, Marshall JAR (2011) Swarm cognition: an interdisciplinary approach to the study of self-organising biological collectives. Swarm Intell 5:3–18
Turing AM (1952) The chemical basis for morphogenesis. Philos Trans R Soc Lond B 237:37–72
von Neumann J (1966) Theory of self-reproducing automata. University of Illinois Press, edited and completed by Burks AW
Weiss G (2000) Multiagent systems: a modern approach to distributed artificial intelligence. MIT Press, Cambridge
Wolfram S (1985) Undecidability and intractability in theoretical physics. Phys Rev Lett 54:735–738
Wolfram S (2002) A new kind of science. Wolfram Media, Champaign
Xu W-L, Lee J, Chen H-H, Isokawa T (2019) Universal computation in a simplified Brownian cellular automaton with von Neumann neighborhood. Fundam Inform 165(2):139–156
Yang X-S, Cui Z, Xiao R, Gandomi AH (2013) Swarm intelligence and bio-inspired computation: theory and applications. Elsevier, Boston

Books and Reviews
Agrawal A, Gans J, Goldfarb A (2018) Prediction machines: the simple economics of artificial intelligence. Harvard Business Review Press, Boston
Camazine S, Deneubourg J-L, Franks NR, Sneyd J, Theraulaz G, Bonabeau E (2001) Self-organization in biological systems. Princeton University Press, Princeton
Deutsch A, Dormann S (2018) Cellular automaton modeling of biological pattern formation: characterization, examples, and analysis, 2nd edn. Birkhäuser, Basel
Dorigo M, Sahin E (2004) Swarm robotics – special issue editorial. Auton Robot 17(2–3):111–113
Engelbrecht AP (2006) Fundamentals of computational swarm intelligence. Wiley, New York
Hamann H (2010) Space-time continuous models of swarm robotic systems: supporting global-to-local programming. Cognitive systems monographs. Springer, Berlin
Hamann H (2018) Swarm robotics: a formal approach. Springer, Cham
Hassanien AE, Emary E (2016) Swarm intelligence: principles, advances, and applications. CRC Press, Boca Raton
Kacprzyk J, Pedrycz W (2015) Springer handbook of computational intelligence. Springer, Berlin
Kruse R, Borgelt C, Klawonn F, Moewes C, Steinbrecher M, Held P (2013) Computational intelligence: a methodological introduction. Springer, New York
Mohanty S (2018) Swarm intelligence methods for statistical regression. Chapman and Hall/CRC, Boca Raton
Nayyar A, Le D-N, Nguyen NG (2018) Advances in swarm intelligence for optimizing problems in computer science. Chapman and Hall/CRC, Boca Raton
Sipper M (2002) Machine nature: the coming age of bio-inspired computing. McGraw-Hill, New York
Solnon C (2010) Ant colony optimization and constraint programming. Wiley-ISTE, Hoboken
Tan Y, Takagi H, Shi Y, Niu B (2017) Advances in swarm intelligence: 8th international conference, ICSI 2017, Fukuoka
Social Phenomena Simulation

Paul Davidsson1 and Harko Verhagen2
1Department of Computer Science, Malmö University, Malmö, Sweden
2Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden

psychology, management science, policy, and some areas of biology. Computer simulation concerns the study of different techniques for simulating phenomena on a computer, e.g., discrete-event, object-oriented, and equation-based simulation.

Introduction

the social behavior of groups of animals or artificial creatures. One example is the Boid model by Reynolds (1987), which simulates coordinated animal motion such as bird flocks and fish schools. With respect to human societies, Epstein and Axtell (1996) developed one of the first agent-based models, called Sugarscape, to explore the role of social phenomena such as seasonal migrations, pollution, sexual reproduction, combat, and transmission of disease. This work is in spirit closely related to one of the best-known and earliest examples of the use of simulation in social science, namely, the Schelling model (1971), in which cellular automata were used to simulate the emergence of segregation patterns in neighborhoods based on a few simple rules expressing the preferences of the agents. Another pioneer worth mentioning is Barricelli (1957), who to some extent used agent-based modeling for simulating biological systems.

To sum up, we can identify two main approaches to social simulation (a toy contrast of the two is sketched below):

• Macrolevel (or equation-based) simulation, which is typically based on mathematical models. It views the set of individuals (the population) as a structure that can be characterized by a number of variables.
• Microlevel (or agent-based) simulation, in which the specific behaviors of specific individuals are explicitly modeled. In contrast to macrolevel simulation, it views the structure as emerging from the interactions between individuals, thus exploring the standpoint that complex effects need not have complex causes.
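The following toy sketch (our illustration, not the authors'; the adoption process, parameter names, and values are invented for the example) contrasts the two approaches on the same phenomenon: a macrolevel difference equation for the fraction of adopters of some opinion versus a microlevel model in which each explicitly represented individual may adopt after meeting a random other individual.

```python
import random

# Macrolevel (equation-based): one aggregate variable with a deterministic update.
def macro_adoption(x, rate=0.3, steps=50):
    """x is the fraction of adopters; logistic-style aggregate dynamics."""
    trajectory = [x]
    for _ in range(steps):
        x = x + rate * x * (1 - x)
        trajectory.append(min(x, 1.0))
    return trajectory

# Microlevel (agent-based): every individual is represented explicitly.
def micro_adoption(n=1000, rate=0.3, steps=50, seed_adopters=10):
    agents = [True] * seed_adopters + [False] * (n - seed_adopters)
    trajectory = [sum(agents) / n]
    for _ in range(steps):
        for i in range(n):
            if not agents[i]:
                contact = random.choice(agents)      # meet a random other agent
                if contact and random.random() < rate:
                    agents[i] = True                 # adoption emerges from interactions
        trajectory.append(sum(agents) / n)
    return trajectory

print(macro_adoption(0.01)[-1], micro_adoption()[-1])
```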
As argued by Van Parunak et al. (1998), agent-based modeling is most appropriate for domains characterized by a high degree of localization and distribution and dominated by discrete decisions. Equation-based modeling, on the other hand, is most naturally applied to systems that can be modeled centrally and in which the dynamics are dominated by physical laws rather than information processing. We will here focus on agent-based models, particularly those that have a richer representation of the individual than the cellular automata.

Why Simulate Social Phenomena?

Simulation of social phenomena can be done for different purposes, e.g.,

• Supporting social-theory building
• Supporting the engineering of systems, e.g., validation, testing, etc.
• Supporting planning, policymaking, and other decision making
• Training, in order to improve a person's skills in a certain domain

It is possible to distinguish between four types of end users: scientists, who use social phenomena simulation in the research process to gain new knowledge; policymakers, who use it for making strategic decisions; managers (of systems), who use it to make operational decisions; and other professionals, such as architects, who use it in their daily work. We will now describe how these types of end users may use simulation of social phenomena for different purposes.

Supporting Social-Theory Building

In the context of social-theory building, agent-based simulation can be seen as an experimental method or as theories in themselves (Sawyer 2003). In the former case, simulations are run to test the predictions of theories, whereas in the latter case, simulations in themselves are formal models of theories. Formalizing the ambiguous, natural language-based theories of the social sciences helps to find inconsistencies and other problems and thus contributes to theory building.

Using agent-based simulation studies as an experimental tool offers great possibilities. Many experiments with human societies are either unethical or even impossible to conduct. Experiments in silico, on the other hand, are fully possible. These can also breathe new life into the ever-present debate in sociology on the micro-macro link (Alexander et al. 1987). Agent-based models mostly focus on the emergence of macrolevel properties from the local interaction of adaptive agents that influence one another (Macy and Willer 2002; Sawyer 2003).
However, simulations in computational organization theory (Carley and Prietula 1994; Prietula et al. 1998), for example, often try to analyze the influence of macrolevel phenomena on individuals. Using agent-based models to simulate the bidirectional relation between micro- and macrolevel concepts would provide tools to analyze the theoretical consequences of the work done by theorists such as Habermas, Giddens, and Bourdieu, to name a few (Sawyer 2003).

Supporting the Engineering of Systems

Many new technical systems are distributed and involve complex interactions between humans and machines. The properties of agent-based simulation make it especially suitable for simulating these kinds of systems. The idea is to model the behavior of human users in terms of software agents. In particular, this seems useful in situations where it is too expensive, difficult, inconvenient, tiresome, or even impossible for real human users to test out a new technical system. Of course, also the technical system, or parts thereof, may be simulated. For instance, if the technical system includes hardware that is expensive and/or special purpose, it is natural to simulate also this part of the system when testing out the control software. An example of such a case is the testing of control systems for “intelligent buildings,” where agents simulate the behavior of the people in the building (Davidsson 2000).

Supporting Planning, Policymaking, and Other Decision Making

and the efficiency of reactions to emergencies (Fiedrich and Burghardt 2007). For a recent overview of work in this area, see Mustaphaa et al. (2013). An example of the use of real-world data is to analyze the effect of different insurance policies on the willingness of agents to pay for a disaster insurance policy (Brouwers and Verhagen 2004). In the related area of flood management, Dawson et al. (2011) describe a model where individual decisions by inhabitants are implemented as probabilistic finite state machines whose parameters are linked to census data.

Another application area for this type of simulation study is disease spreading. Typically, agents are used to represent human beings, and the simulation model is linked to real-world geographical data. One study (Yergens et al. 2006) also included agents that represent towns acting as the epicenter of a disease outbreak. The town agent’s behavior repertoire consisted of different containment strategies. The simulation model can be quickly adapted to local circumstances via the geographical data (given that there is data on the population as well) and is used to determine the effects of different containment strategies.

A third area where agent-based social simulation has been used to support planning and policymaking is traffic and transport (Bazzan and Klügl 2014). An example of this is the simulation of all cars traveling in Switzerland during morning peak traffic (Balmer et al. 2008).

Training
Steve (Rickel and Lewis Johnson 1999; Méndez et al. 2003) was an agent integrated with voice synthesis software and virtual reality software, providing a very realistic training environment for controlling the engine room of a virtual US Navy surface ship.

Another example is the PSI agent (Künzel and Hämmer 2006). Whereas in most cases the simulator training is aimed at training practical skills or decision making, this work focuses on acquiring theoretical insights in the realm of psychological theory. The simulation enables students to explore psychological processes without ethical problems.

Simulating Social Phenomena

One of the first, and simplest, ways of performing microlevel simulation is often called dynamic microsimulation (Gilbert 1999; Gilbert and Troitzsch 2005). It is used to simulate the effect of the passing of time on individuals. Data from a (preferably large) random sample from the population to be simulated is used to initially characterize the simulated individuals. Some examples of sampled features are age, sex, employment status, income, and health status. A set of transition probabilities is used to describe how these features will change over a given time period, e.g., there is a probability that an employed person will become unemployed over the course of a year. The transition probabilities are applied to the population for each individual in turn and then repeatedly reapplied for a number of simulated time periods, as sketched below. Sometimes it is necessary to also model changes in the population, e.g., birth, death, and marriage. This type of simulation can be used to, e.g., predict the outcome of different social policies. However, the quality of such simulations depends on the quality of the following:

• The random sample, which must be representative
• The transition probabilities, which must be valid and complete
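A minimal sketch of this updating loop follows; it is an illustration of the mechanism just described, not code from the entry, and the feature names and probabilities are invented for the example.

```python
import random

# Hypothetical yearly transition probabilities for one feature:
# employment status, conditioned on the current status.
EMPLOYMENT_TRANSITIONS = {
    "employed":   {"employed": 0.95, "unemployed": 0.05},
    "unemployed": {"employed": 0.30, "unemployed": 0.70},
}

def simulate(population, years):
    """Apply the transition probabilities to each individual in turn,
    then reapply them for the requested number of simulated years."""
    for _ in range(years):
        for person in population:
            person["age"] += 1
            probs = EMPLOYMENT_TRANSITIONS[person["employment"]]
            r, cumulative = random.random(), 0.0
            for status, p in probs.items():
                cumulative += p
                if r < cumulative:
                    person["employment"] = status
                    break
    return population

# Initial characterization from a (here fabricated) random sample.
sample = [{"age": 25, "employment": "employed"},
          {"age": 40, "employment": "unemployed"}]
print(simulate(sample, years=10))
```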
In traditional microsimulation, the behavior of each individual is regarded as a “black box.” The behavior is modeled in terms of probabilities, and no attempt is made to justify these in terms of individual preferences, decisions, plans, etc. Also, each simulated individual is considered in isolation, without regard to their interaction with others. Thus, better results may be gained if cognitive processes and communication between individuals are also simulated.

Opening the black box of individual decision making can be done in several ways. The first layer to add is often individual psychology; for instance, the so-called beliefs, desires, and intentions (BDI) model is often used. For an overview of simulations using BDI models, see Adam and Gaudou (2016). Models of individual cognition used in agent-based social simulation include the use of Soar (a computer implementation of Allen Newell’s unified theory of cognition (Newell 1994)), which was used in Steve (discussed in section “Why Simulate Social Phenomena?”).

For the simulation of social behavior, the agents need to be equipped with mechanisms for reasoning at the social level (unless the social level is regarded as emerging from individual behavior and decision making). Several models have been based on theories from economics, social psychology, sociology, etc. An early example of this is provided by Guye-Vuillème (2004), who has developed an agent-based model for simulating human interaction in a virtual reality environment. The model is based on sociological concepts such as roles, values, and norms and motivational theories from social psychology to simulate persons with social identities and relationships. Another example is the Consumat model (Janssen and Jager 1999), a metamodel combining several psychological theories on decision making in a consumer situation, used, for instance, to investigate different flood management policies (Brouwers and Verhagen 2004).

Future Directions

In a study of applications of agent-based simulation (Davidsson et al. 2007), it was concluded that even if agent-based simulation seems a promising
approach to many problems involving the simulation of complex systems of interacting entities such as social phenomena, it seems that the full potential of the agent concept often is not utilized. For instance, most models have very primitive agent cognition, in particular if the number of agents involved is large.

Regarding future applications, the combination of findings from neuroscience and simulation studies as proposed by Epstein (2014) is a new development and an interesting follow-up to the abovementioned Sugarscape book. The wide availability of high-quality graphics opens for agent-based social simulation in combination with sophisticated visualization techniques, such as virtual reality, in the form of “serious games,” which has the potential to provide very powerful training environments; see, e.g., García Carbajal et al. (2015). In the context of military training, McAlinden et al. (2014) provide an interesting application.

Bibliography

Adam C, Gaudou B (2016) BDI agents in social simulations: a survey. Knowl Eng Rev 31(3):207–238
Alexander JC, Giesen B, Münch R, Smelser NJ (eds) (1987) The micro-macro link. University of California Press, Berkeley
Balmer M, Meister K, Nagel K (2008) Agent-based simulation of travel demand: structure and computational performance of MATSim-T. ETH, Eidgenössische Technische Hochschule Zürich, IVT Institut für Verkehrsplanung und Transportsysteme
Barricelli NA (1957) Symbiogenetic evolution processes realized by artificial methods. Methodos 9(35–36):143–182
Bazzan A, Klügl F (2014) A review on agent-based technology for traffic and transportation. Knowl Eng Rev 29(3):375–403
Brouwers L, Verhagen H (2004) Applying the Consumat model to flood management policies. In: Müller J, Seidel M (eds) 4th workshop on agent-based simulation. SCS, Montpellier, pp 29–34
Carley KM, Prietula M (eds) (1994) Computational organization theory. Erlbaum, Hillsdale
Davidsson P (2000) Multi-agent-based simulation: beyond social simulation. In: Moss S, Davidsson P (eds) Multi-agent-based simulation. Lecture notes in computer science, vol 1979. Springer, Berlin, pp 98–107
Davidsson P (2002) Agent-based social simulation: a computer science view. J Artif Soc Soc Simul 5(1)
Davidsson P, Holmgren J, Kyhlbäck H, Mengistu D, Persson M (2007) Applications of multi-agent-based simulation. In: Antunes L, Takadama K (eds) Multi-agent-based simulation VII. Lecture notes in computer science, vol 4442. Springer, Berlin
Dawson RJ, Peppe R, Wang M (2011) An agent-based model for risk-based flood incident management. Nat Hazards 59(1):167–189
Epstein JM (2014) Agent_Zero: toward neurocognitive foundations for generative social science. Princeton University Press
Epstein JM, Axtell RL (1996) Growing artificial societies: social science from the bottom up. MIT Press, Cambridge
Fiedrich F, Burghardt P (2007) Emergency response information systems: emerging trends and technologies: agent-based systems for disaster management. Commun ACM 50(3):41–42
García Carbajal S, Polimeni F, Múgica JL (2015) An emotional engine for behavior simulators. Int J Serious Game 2(2):57–67
Gilbert N (1999) Computer simulation in the social sciences. Sage, Thousand Oaks
Gilbert N, Troitzsch KG (2005) Simulation for the social scientist, 2nd edn. Open University Press, Maidenhead
Guye-Vuillème A (2004) Simulation of nonverbal social interaction and small groups dynamics in virtual environments. PhD thesis, École Polytechnique Fédérale de Lausanne, No 2933
Janssen MA, Jager W (1999) An integrated approach to simulating behavioural processes: a case study of the lock-in of consumption patterns. J Artif Soc Soc Simul 2(2). http://jasss.soc.surrey.ac.uk/2/2/2.html
Künzel J, Hämmer V (2006) Simulation in university education: the artificial agent PSI as a teaching tool. Simulation 82(11):761–768
Macy MW, Willer R (2002) From factors to actors: computational sociology and agent-based modeling. Annu Rev Sociol 28:143–166
McAlinden R, Pynadath D, Hill RW (2014) UrbanSim: using social simulation to train for stability operations. In: Ehlschlaeger CH (ed) Understanding megacities with the reconnaissance, surveillance, and intelligence paradigm. US Army white report
Méndez G, Rickel J, de Antonio A (2003) Steve meets Jack: the integration of an intelligent tutor and a virtual environment with planning capabilities. In: Intelligent virtual agents. Lecture notes on artificial intelligence, vol 2792. Springer, Berlin, pp 325–332
Mustaphaa K, Mcheicka H, Melloulib S (2013) Modeling and simulation agent-based of natural disaster complex systems. Procedia Comput Sci 21:148–155. https://doi.org/10.1016/j.procs.2013.09.021
Newell A (1994) Unified theories of cognition. Harvard University Press, Cambridge
Parunak HVD, Savit R, Riolo RL (1998) Agent-based modeling vs. equation-based modeling: a case study and users’ guide. In: Sichman JS, Conte R, Gilbert N (eds) Multi-agent systems and agent-based simulation. Lecture notes in computer science, vol 1534. Springer, Berlin, pp 10–26
Prietula MJ, Carley KM, Gasser L (eds) (1998) Simulating organizations: computational models of institutions and groups. MIT Press, Cambridge
agent and the interaction among agents, agent-based simulation allows us to investigate the dynamics of complex economic systems with many heterogeneous and not necessarily fully rational agents.

The agent-based simulation approach allows economists to investigate systems that cannot be studied with the conventional methods. Thus, the following key questions can be addressed: How do heterogeneity and systematic deviations from rationality affect markets? Can these elements explain empirically observed phenomena which are considered “anomalies” in the standard economics literature? How robust are the results obtained with the analytical models? By addressing these questions, the agent-based simulation approach complements the traditional analytical analysis and is gradually becoming a standard tool in economic analysis.

Introduction

For solving the dynamics of two bodies (e.g., stars), with some initial locations and velocities and some law of attraction (e.g., gravitation), there is a well-known analytical solution. However, for a similar system with three bodies, there is no known analytical solution. Of course, this does not mean that physicists cannot investigate and predict the behavior of such systems. Knowing the state of the system (i.e., the location, velocity, and acceleration of each body) at time t allows us to calculate the state of the system an instant later, at time t + Δt. Thus, starting with the initial conditions, we can predict the dynamics of the system by simply simulating the “behavior” of each element in the system over time.
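The stepping logic described here can be made concrete with a short sketch. The code below is our illustration (one-dimensional positions, unit masses, a softened force law, and a plain Euler step are all simplifying assumptions): knowing the state at time t, it computes the state at t + Δt and repeats.

```python
# Three bodies on a line: no closed-form solution, but easy to step forward in time.
G, DT = 1.0, 0.001
pos = [0.0, 1.0, 2.5]       # positions
vel = [0.0, 0.0, 0.0]       # velocities
mass = [1.0, 1.0, 1.0]

def accelerations(pos):
    acc = [0.0, 0.0, 0.0]
    for i in range(3):
        for j in range(3):
            if i != j:
                d = pos[j] - pos[i]
                acc[i] += G * mass[j] * d / (abs(d) ** 3 + 1e-9)  # softened attraction
    return acc

for _ in range(10_000):      # state at t -> state at t + DT, repeated
    acc = accelerations(pos)
    vel = [v + a * DT for v, a in zip(vel, acc)]
    pos = [p + v * DT for p, v in zip(pos, vel)]
print(pos)
```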
This powerful and fruitful approach, sometimes called “microscopic simulation,” has been adopted by many other branches of science. Its application in economics is best known as “agent-based simulation” or “agent-based computation.” The advantages of this approach are clear – it allows the researcher to go where no analytical models can go. Yet, despite the advantages, perhaps surprisingly, the agent-based approach was not adopted very quickly by economists. Perhaps the main reason for this is that a particular simulation only describes the dynamics of a system with a particular set of parameters and initial conditions. With other parameters and initial conditions, the dynamics may be different. So economists may ask: what is the value of conducting simulations if we get very different results with different parameter values? While in physics the parameters (like the gravitational constant) may be known with great accuracy, in economics the parameters (like the risk-aversion coefficient or, for that matter, the entire decision-making rule) are typically estimated with substantial error. This is a strong point. Indeed, we would argue that the “art” of agent-based simulations is the ability to understand the general dynamics of the system and to draw general conclusions from a finite number of simulations. Of course, one simulation is sufficient as a counterexample to show that a certain result does not hold, but many more simulations are required in order to convince of an alternative general regularity.

This manuscript is intended as an introduction to agent-based computational economics. An introduction to this field has two goals: (i) to explain and to demonstrate the agent-based methodology in economics, stressing the advantages and disadvantages of this approach relative to the alternative purely analytical methodology, and (ii) to review studies published in this area. The emphasis in this entry will be on the first goal. While section “Some of the Pioneering Studies” does provide a brief review of some of the cornerstone studies in this area, more comprehensive reviews can be found in LeBaron (2000), Levy et al. (2000), Samanidou et al. (2007), and Tesfatsion (2001, 2002), on which part of section “Some of the Pioneering Studies” is based.

A comprehensive review of the many entries employing agent-based computational models in economics will go far beyond the scope of this entry. To achieve goal (i) above, in section “Illustration with the LLS Model,” we will focus on one particular model of the stock market in some detail. Section “Summary and Future Directions” concludes with some thoughts about the future of the field.

Some of the Pioneering Studies

Schelling’s Segregation Model
Schelling’s (Schelling 1978) classical segregation model is one of the earliest models of population
dynamics. Schelling’s model is not intended as a realistic tool for studying the actual dynamics of specific communities, as it ignores economic, real-estate, and cultural factors. Rather, the aim of this very simplified model is to explain the emergence of macroscopic single-race neighborhoods even when individuals are not racists. More precisely, Schelling found that the collective effect of neighborhood racial segregation results even from individual behavior that presents only a very mild preference for same-color neighbors. For instance, even the minimal requirement by each individual of having (at least) one neighbor belonging to one’s own race leads to the segregation effect.

The agent-based simulation starts with a square mesh or lattice (representing a town) which is composed of cells (representing houses). On these cells reside agents which are either “blue” or “green” (the different races). The crucial parameter is the minimal percentage of same-color neighbors that each agent requires. Each agent, in his/her turn, examines the color of all his/her neighbors. If the percentage of neighbors belonging to his/her own group is above the “minimal percentage,” the agent does nothing. If the percentage of neighbors of his/her own color is less than the minimal percentage, the agent moves to the closest unoccupied cell. The agent then examines the color of the neighbors of the new location and acts accordingly (moves if the percentage of neighbors of his/her own color is below the minimal percentage and stays there otherwise). This goes on until the agent is finally located at a site in which the minimal percentage condition holds. After a while, however, it might happen that, following the moves of the other agents, the minimal percentage condition ceases to be fulfilled, and then the agent starts moving again until he/she finds an appropriate cell. As mentioned above, the main result is that even for very mild individual preferences for same-color neighbors, after some time the entire system displays a very high level of segregation.
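A minimal sketch of this procedure follows; it is our illustration, not code from the original model: the lattice size, occupancy densities, and number of sweeps are arbitrary, and a dissatisfied agent here moves to a random (rather than the closest) unoccupied cell.

```python
import random

SIZE, MIN_FRACTION = 20, 0.34          # lattice side; minimal same-color fraction
colors = ["blue", "green", None]       # None marks an empty house
grid = [[random.choices(colors, weights=[0.45, 0.45, 0.10])[0]
         for _ in range(SIZE)] for _ in range(SIZE)]

def satisfied(r, c):
    """True if the agent at (r, c) has at least MIN_FRACTION same-color neighbors."""
    me = grid[r][c]
    neigh = [grid[(r + dr) % SIZE][(c + dc) % SIZE]
             for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    occupied = [x for x in neigh if x is not None]
    if not occupied:
        return True
    return sum(x == me for x in occupied) / len(occupied) >= MIN_FRACTION

for _ in range(50):                    # repeated sweeps over the town
    for r in range(SIZE):
        for c in range(SIZE):
            if grid[r][c] is not None and not satisfied(r, c):
                empties = [(i, j) for i in range(SIZE) for j in range(SIZE)
                           if grid[i][j] is None]
                if empties:            # move to a (here: random) unoccupied cell
                    i, j = random.choice(empties)
                    grid[i][j], grid[r][c] = grid[r][c], None
```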
A more modern, developed, and sophisticated reincarnation of these ideas is the Sugarscape environment described by Epstein and Axtell (1996). The model considers a population of moving, feeding, pairing, procreating, trading, warring agents and displays various qualitative collective events which their populations incur. By employing agent-based simulation, one can study the macroscopic results induced by the agents’ individual behavior.

The Kim and Markowitz Portfolio Insurers Model
Harry Markowitz is very well known for being one of the founders of modern portfolio theory, a contribution for which he has received the Nobel Prize in economics. It is less well known, however, that Markowitz is also one of the pioneers in employing agent-based simulations in economics.

During the October 1987 crash, markets all over the globe plummeted by more than 20% within a few days. The surprising fact about this crash is that it appeared to be spontaneous – it was not triggered by any obvious event. Following the 1987 crash, researchers started to look for endogenous market features, rather than external forces, as sources of price variation. The Kim-Markowitz (Kim and Markowitz 1989) model explains the 1987 crash as resulting from investors’ “Constant Proportion Portfolio Insurance” (CPPI) policy. Kim and Markowitz proposed that market instabilities arise as a consequence of the individual insurers’ efforts to cut their losses by selling once the stock prices are going down.

The Kim-Markowitz agent-based model involves two groups of individual investors: rebalancers and insurers (CPPI investors). The rebalancers aim to keep a constant composition of their portfolio, while the insurers make the appropriate operations to insure that their eventual losses will not exceed a certain fraction of the investment per time period.

The rebalancers act to keep a portfolio structure with (for instance) half of their wealth in cash and half in stocks. If the stock price rises, then the stocks’ weight in the portfolio will increase, and the rebalancers will sell shares until the shares again constitute 50% of the portfolio. If the stock price decreases, then the value of the shares in the portfolio decreases, and the rebalancers will buy shares until the stock again constitutes 50% of the portfolio. Thus, the rebalancers have a stabilizing influence on the market by selling when the market rises and buying when the market falls.
A typical CPPI investor has as his/her main objective not to lose more than (for instance) 25% of his/her initial wealth during a quarter, which consists of 65 trading days. Thus, he/she aims to insure that at each cycle 75% of the initial wealth is out of reasonable risk. To this effect, he/she assumes that the current value of the stock will not fall in one day by more than a factor of 2. The result is that he/she always keeps in stock twice the difference between the present wealth and 75% of the initial wealth (which he/she had at the beginning of the 65-day investing period). This determines the amount the CPPI agent is bidding or offering at each stage. Obviously, after a price fall, the amount he/she wants to keep in stocks will fall and the CPPI investor will sell and further destabilize the market. After an increase in the prices (and personal wealth), the amount the CPPI agent wants to keep in shares will increase: he/she will buy and may support a price bubble.
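In code, the insurer's rule reduces to a one-line target. The sketch below merely restates the numerical example in the text (a 75% floor and the assumed worst one-day drop by a factor of 2); it is not the exact Kim-Markowitz implementation, and the function names are our own:

    def cppi_target_stock_value(current_wealth, initial_wealth, floor_fraction=0.75):
        """Keep in stock twice the cushion above the insured floor (75% of initial wealth)."""
        cushion = current_wealth - floor_fraction * initial_wealth
        return max(0.0, 2.0 * cushion)      # factor 2 reflects the assumed worst one-day halving

    def cppi_order_value(current_wealth, initial_wealth, current_stock_value):
        """Positive result = value of stock to buy this period, negative = value to sell."""
        return cppi_target_stock_value(current_wealth, initial_wealth) - current_stock_value

With this rule, a fall in wealth lowers the target and triggers selling, while a rise in wealth raises it and triggers buying, which is the destabilizing feedback described above.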
The simulations reveal that even a relatively small fraction of CPPI investors (i.e., less than 50%) is enough to destabilize the market, and crashes and booms are observed. Hence, the claim of Kim and Markowitz that the CPPI policy may be responsible for the 1987 crash is supported by the agent-based simulations. Various variants of this model were studied intensively by Egenter et al. (1999), who find that the price time evolution becomes unrealistically periodic for a large number of investors (the periodicity seems related with the fixed 65-day quarter and is significantly diminished if the 65-day period begins on a different date for each investor).

The Arthur, Holland, Lebaron, Palmer, and Tayler Stock Market Model

Palmer et al. (1994) and Arthur et al. (1997) (AHLPT) construct an agent-based simulation model that is focused on the concept of coevolution. Each investor adapts his/her investment strategy such as to maximally exploit the market dynamics generated by the investment strategies of all other investors. This leads to an ever-evolving market, driven endogenously by the ever-changing strategies of the investors.

The main objective of AHLPT is to prove that market fluctuations may be induced by this endogenous coevolution, rather than by exogenous events. Moreover, AHLPT study the various regimes of the system: the regime in which rational fundamentalist strategies are dominating versus the regime in which investors start developing strategies based on technical trading. In the technical trading regime, if some of the investors follow fundamentalist strategies, they may be punished rather than rewarded by the market. AHLPT also study the relation between the various strategies (fundamentals vs. technical) and the volatility properties of the market (clustering, excess volatility, volume-volatility correlations, etc.).

In the first entry quoted above, the authors simulated a single stock and further limited the bid/offer decision to a ternary choice of: (i) bid to buy one share, (ii) offer to sell one share, or (iii) do nothing. Each agent had a collection of rules which described how he/she should behave (i, ii, or iii) in various market conditions. If the current market conditions were not covered by any of the rules, the default was to do nothing. If more than one rule applied in a certain market condition, the rule to act upon was chosen probabilistically according to the "strengths" of the applicable rules. The "strength" of each rule was determined according to the rule's past performance: rules that "worked" became "stronger." Thus, if a certain rule performed well, it became more likely to be used again.
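The rule-selection mechanism lends itself to a compact sketch: applicable rules compete in proportion to their strengths, and strengths are reinforced by realized performance. The dictionary-based rule format and the simple smoothing update below are illustrative assumptions, not the authors' actual classifier-system design:

    import random

    # each rule: a market condition, a ternary action, and a strength
    fundamental_rule = {"condition": lambda s: s["dividend"] / s["price"] > 0.04,
                        "action": "buy", "strength": 1.0}

    def choose_action(rules, market_state):
        applicable = [r for r in rules if r["condition"](market_state)]
        if not applicable:
            return "hold"                              # default when no rule matches
        chosen = random.choices(applicable, weights=[r["strength"] for r in applicable])[0]
        return chosen["action"]

    def reinforce(rule, realized_payoff, rate=0.1):
        # rules that "worked" become "stronger"; this smoothing rule is only one possibility
        rule["strength"] += rate * (realized_payoff - rule["strength"])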
tion. Each investor adapts his/her investment market regimes. For instance, a “fundamental”
strategy such as to maximally exploit the market rule may require market conditions of the type:
dynamics generated by the investment strategies
of all others investors. This leads to an ever- dividend=current price > 0:04
evolving market, driven endogenously by the
ever-changing strategies of the investors. in order to be applied. A “technical” rule may be
The main objective of AHLPT is to prove that triggered if the market fulfills a condition of the
market fluctuations may be induced by this type:
Agent-Based Computational Economics 829
significantly larger than their own and vice versa (the detailed formulae used by Lux and Marchesi are inspired from the exponential transition probabilities governing statistical mechanics physical systems).

The main results of the model are:

• No long-term deviations between the current market price and the fundamental price are observed.
• The deviations from the fundamental price, which do occur, are unsystematic.
• In spite of the fact that the variations of the fundamental price are normally distributed, the variations of the market price (the market returns) are not. In particular, the returns exhibit a frequency of extreme events which is higher than expected for a normal distribution. The authors emphasize the amplification role of the market that transforms the input normal distribution of the fundamental value variations into a leptokurtotic (heavy-tailed) distribution of price variation, which is encountered in the actual financial data.
• Clustering of volatility.

The authors explain the volatility clustering (and as a consequence, the leptokurticity) by the following mechanism. In periods of high volatility, the fundamental information is not very useful to insure profits, and a large fraction of the agents become chartists. The opposite is true in quiet periods, when the actual price is very close to the fundamental value. The two regimes are separated by a threshold in the number of chartist agents. Once this threshold is approached (from below), large fluctuations take place which further increase the number of chartists. This destabilization is eventually dampened by the energetic intervention of the fundamentalists when the price deviates too much from the fundamental value. The authors compare this temporal instability with the on-off intermittency encountered in certain physical systems. According to Egenter et al. (1999), the fraction of chartists in the Lux-Marchesi model goes to zero as the total number of traders goes to infinity, when the rest of the parameters are kept constant.

Illustration with the LLS Model

The purpose of this section is to give a more detailed "hands-on" example of the agent-based approach and to discuss some of the practical dilemmas arising when implementing this approach, by focusing on one specific model. We will focus on the so-called LLS model of the stock market (for more details and various versions of the model, see Hellthaler (1995), Kohl (1997), Levy and Levy (1996), and Levy et al. (1994, 1996, 2000)). This section is based on the presentation of the LLS model in Chap. 7 of Levy et al. (2000).

Background
Real-life investors differ in their investment behavior from the investment behavior of the idealized representative rational investor assumed in most economic and financial models. Investors differ one from the other in their preferences, their investment horizon, the information at their disposal, and their interpretation of this information. No financial economist seriously doubts these observations. However, modeling the empirically and experimentally documented investor behavior and the heterogeneity of investors is very difficult and in most cases practically impossible to do within an analytic framework. For instance, the empirical and experimental evidence suggests that most investors are characterized by Constant Relative Risk Aversion (CRRA), which implies a power (myopic) utility function (see Eq. 2 below). However, for a general distribution of returns, it is impossible to obtain an analytic solution for the portfolio optimization problem of investors with these preferences. Extrapolation of future returns from past returns, biased probability weighting, and partial deviations from rationality are also all experimentally documented but difficult to incorporate in an analytical setting. One is then usually forced to make the assumptions of rationality and homogeneity (at least in some dimension) and to make unrealistic assumptions regarding investors' preferences, in order to obtain a model with a tractable solution. The hope in these circumstances is that the model will capture the essence of the system under investigation
and will serve as a useful benchmark, even though some of the underlying assumptions are admittedly false.

Most homogeneous rational agent models lead to the following predictions: no trading volume, zero autocorrelation of returns, and price volatility which is equal to or lower than the volatility of the "fundamental value" of the stock (defined as the present value of all future dividends; see Shiller (1981)). However, the empirical evidence is very different:

• Trading volume can be extremely heavy (Admati and Pfleiderer 1988; Karpoff 1987).
• Stock returns exhibit short-run momentum (positive autocorrelation) and long-run mean reversion (negative autocorrelation) (Fama and French 1988; Jegadeesh and Titman 1993; Levy and Lim 1998; Poterba and Summers 1988).
• Stock returns are excessively volatile relative to the dividends (Shiller 1981).

As most standard rational-representative-agent models cannot explain these empirical findings, these phenomena are known as "anomalies" or "puzzles." Can these "anomalies" be due to elements of investors' behavior which are unmodeled in the standard rational-representative-agent models, such as the experimentally documented deviations of investors' behavior from rationality and/or the heterogeneity of investors? The agent-based simulation approach offers us a tool to investigate this question. The strength of the agent-based simulation approach is that since it is not restricted to the scope of analytical methods, one is able to investigate virtually any imaginable investor behavior and market structure. Thus, one can study models which incorporate the experimental findings regarding the behavior of investors and evaluate the effects of various behavioral elements on market dynamics and asset pricing.

The LLS model incorporates some of the main empirical findings regarding investor behavior, and we employ this model in order to study the effect of each element of investor behavior on asset pricing and market dynamics. We start out with a benchmark model in which all of the investors are rational, informed, and identical, and then, one by one, we add elements of heterogeneity and deviations from rationality to the model in order to study their effects on the market dynamics.

In the benchmark model all investors are rational, informed, and identical (RII investors). This is, in effect, a "representative agent" model. The RII investors are informed about the dividend process, and they rationally act to maximize their expected utility. The RII investors make investment decisions based on the present value of future cash flows. They are essentially fundamentalists who evaluate the stock's fundamental value and try to find bargains in the market. The benchmark model in which all investors are RII yields results which are typical of most rational-representative-agent models: in this model prices follow a random walk, there is no excess volatility of the prices relative to the volatility of the dividend process, and since all agents are identical, there is no trading volume.

After describing the properties of the benchmark model, we investigate the effects of introducing various elements of investor behavior which are found in laboratory experiments but are absent in most standard models. We do so by adding to the model a minority of investors who do not operate like the RII investors. These investors are efficient market believers (EMBs from now on). The EMBs are investors who believe that the price of the stock reflects all of the currently available information about the stock. As a consequence, they do not try to time the market or to buy bargain stocks. Rather, their investment decision is reduced to the optimal diversification problem. For this portfolio optimization, the ex-ante return distribution is required. However, since the ex-ante distribution is unknown, the EMB investors use the ex-post distribution in order to estimate the ex-ante distribution. It has been documented that, in fact, many investors form their expectations regarding the future return distribution based on the distribution of past returns.

There are various ways to incorporate the investment decisions of the EMBs. This stems from the fact that there are different ways to
estimate the ex-ante distribution from the ex-post distribution. How far back should one look at the historical returns? Should more emphasis be given to more recent returns? Should some "outlier" observations be filtered out? etc. Of course, there are no clear answers to these questions, and different investors may have different ways of forming their estimation of the ex-ante return distribution (even though they are looking at the same series of historical returns). Moreover, some investors may use the objective ex-post probabilities when constructing their estimation of the ex-ante distribution, whereas others may use biased subjective probability weights. In order to build the analysis step-by-step, we start by analyzing the case in which the EMB population is homogeneous, and then introduce various forms of heterogeneity into this population.

An important issue in market modeling is that of the degree of investors' rationality. Most models in economics and finance assume that people are fully rational. This assumption usually manifests itself as the maximization of an expected utility function by the individual. However, numerous experimental studies have shown that people deviate from rational decision-making (Thaler 1993, 1994; Tversky and Kahneman 1981, 1986, 1992). Some studies model deviations from the behavior of the rational agent by introducing a subgroup of liquidity or "noise" traders. These are traders that buy and sell stocks for reasons that are not directly related to the future payoffs of the financial asset – their motivation to trade arises from outside of the market (e.g., a "noise trader's" daughter unexpectedly announces her plans to marry, and the trader sells stocks because of this unexpected need for cash). The exogenous reasons for trading are assumed random and thus lead to random or "noise" trading (see Grossman and Stiglitz 1980). The LLS model takes a different approach to the modeling of noise trading. Rather than dividing investors into the extreme categories of "fully rational" and "noise traders," the LLS model assumes that most investors try to act as rationally as they can, but are influenced by a multitude of factors causing them to deviate to some extent from the behavior that would have been optimal from their point of view. Namely, all investors are characterized by a utility function and act to maximize their expected utility; however, some investors may deviate to some extent from the optimal choice which maximizes their expected utility. These deviations from the optimal choice may be due to irrationality, inefficiency, liquidity constraints, or a combination of all of the above.

In the framework of the LLS model, we examine the effects of the EMBs' deviations from rationality and their heterogeneity, relative to the benchmark model in which investors are informed, rational, and homogeneous. We find that the behavioral elements which are empirically documented, namely, extrapolation from past returns, deviation from rationality, and heterogeneity among investors, lead to all of the following empirically documented "puzzles":

• Excess volatility
• Short-term momentum
• Longer-term return mean reversion
• Heavy trading volume
• Positive correlation between volume and contemporaneous absolute returns
• Positive correlation between volume and lagged absolute returns

The fact that all these anomalies or "puzzles," which are hard to explain with standard rational-representative-agent models, are generated naturally by a simple model which incorporates the experimental findings regarding investor behavior and the heterogeneity of investors leads one to suspect that these behavioral elements and the diversity of investors are a crucial part of the workings of the market, and as such they cannot be "assumed away." As the experimentally documented bounded-rational behavior and heterogeneity are in many cases impossible to analyze analytically, agent-based simulation presents a very promising tool for investigating market models incorporating these elements.

The LLS Model
The stock market consists of two investment alternatives: a stock (or index of stocks) and a bond. The bond is assumed to be a riskless asset, and the stock is a risky asset. The stock serves as a proxy for the market portfolio (e.g., the Standard & Poor's 500 index). The extension from one risky asset to many risky assets is possible; however,
one stock (the index) is sufficient for our present analysis because we restrict ourselves to global market phenomena and do not wish to deal with asset allocation across several risky assets. Investors are allowed to revise their portfolio at given time points, i.e., we discuss a discrete time model.

The bond is assumed to be a riskless investment yielding a constant return at the end of each time period. The bond is in infinite supply and investors can buy from it as much as they wish at a given rate of r_f. The stock is in finite supply. There are N outstanding shares of the stock. The return on the stock is composed of two elements:

(a) Capital gain: If an investor holds a stock, any rise (fall) in the price of the stock contributes to an increase (decrease) in the investor's wealth.
(b) Dividends: The company earns income and distributes dividends at the end of each time period. We denote the dividend per share paid at time t by D_t. We assume that the dividend is a stochastic variable following a multiplicative random walk, i.e., D̃_t = D_{t-1}(1 + z̃), where z̃ is a random variable with some probability density function f(z) in the range [z_1, z_2] (in order to allow for a dividend cut as well as a dividend increase, we typically choose z_1 < 0, z_2 > 0).

The total return on the stock in period t, which we denote by R̃_t, is given by

\tilde{R}_t = \frac{\tilde{P}_t + \tilde{D}_t}{P_{t-1}},    (1)

where P̃_t is the stock price at time t.
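Both ingredients of the return are straightforward to simulate. The sketch below assumes the uniform f(z) and the bounds z_1 = -0.07, z_2 = 0.10 that are adopted later in the benchmark simulation; these defaults are illustrative:

    import random

    def next_dividend(d_prev, z1=-0.07, z2=0.10):
        """Multiplicative random walk: D_t = D_{t-1} * (1 + z), with z ~ U[z1, z2]."""
        return d_prev * (1.0 + random.uniform(z1, z2))

    def total_return(p_t, d_t, p_prev):
        """Total one-period return R_t = (P_t + D_t) / P_{t-1}  (Eq. 1)."""
        return (p_t + d_t) / p_prev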
All investors in the model are characterized by a von Neumann-Morgenstern utility function. We assume that all investors have a power utility function of the form:

U(W) = \frac{W^{1-\alpha}}{1-\alpha},    (2)

where α is the risk-aversion parameter. This form of utility function implies Constant Relative Risk Aversion (CRRA). We employ the power utility function (Eq. 2) because the empirical evidence suggests that relative risk aversion is approximately constant (e.g., see Friend and Blume (1975), Gordon et al. (1972), Kroll et al. (1988), and Levy (1994)), and the power utility function is the unique utility function which satisfies the CRRA condition. Another implication of CRRA is that the optimal investment choice is independent of the investment horizon (Samuelson 1989, 1994). In other words, regardless of investors' actual investment horizon, they choose their optimal portfolio as though they are investing for a single period. The myopia property of the power utility function simplifies our analysis, as it allows us to assume that investors maximize their one-period-ahead expected utility.

We model two different types of investors: rational, informed, and identical (RII) investors and efficient market believers (EMB). These two investor types are described below.

Rational Informed Identical (RII) Investors
RII investors evaluate the "fundamental value" of the stock as the discounted stream of all future dividends and thus can also be thought of as "fundamentalists." They believe that the stock price may deviate from the fundamental value in the short run, but if it does, it will eventually converge to the fundamental value. The RII investors act according to the assumption of asymptotic convergence: if the stock price is low relative to the fundamental value, they buy in anticipation that the underpricing will be corrected, and vice versa. We make the simplifying assumption that the RII investors believe that the convergence of the price to the fundamental value will occur in the next period; however, our results hold for the more general case where the convergence is assumed to occur some T periods ahead, with T > 1.

In order to estimate next period's return distribution, the RII investors need to estimate the distribution of next period's price, P̃_{t+1}, and of next period's dividend, D̃_{t+1}. Since they know the dividend process, the RII investors know that D̃_{t+1} = D_t(1 + z̃), where z̃ is distributed according to f(z) in the range [z_1, z_2]. The RII investors employ Gordon's dividend stream model in order to calculate the fundamental value of the stock:

P^f_{t+1} = \frac{E_{t+1}[\tilde{D}_{t+2}]}{k - g},    (3)

where the superscript f stands for the fundamental value, E_{t+1}[D̃_{t+2}] is the dividend corresponding
to time t + 2 as expected at time t + 1, k is the discount factor or the expected rate of return demanded by the market for the stock, and g is the expected growth rate of the dividend, i.e.,

g = E(\tilde{z}) = \int_{z_1}^{z_2} f(z)\, z\, dz.

The RII investors believe that the stock price may temporarily deviate from the fundamental value; however, they also believe that the price will eventually converge to the fundamental value. For simplification we assume that the RII investors believe that the convergence to the fundamental value will take place next period. Thus, the RII investors estimate P_{t+1} as

P_{t+1} = P^f_{t+1}.

The expectation at time t + 1 of D̃_{t+2} depends on the realized dividend observed at t + 1:

E_{t+1}[\tilde{D}_{t+2}] = D_{t+1}(1 + g).

Thus, the RII investors believe that the price at t + 1 will be given by

P_{t+1} = P^f_{t+1} = \frac{D_{t+1}(1 + g)}{k - g}.

At time t, D_t is known, but D_{t+1} is not; therefore P^f_{t+1} is also not known with certainty at time t. However, given D_t, the RII investors know the distribution of D̃_{t+1}: D̃_{t+1} = D_t(1 + z̃), where z̃ is distributed according to the known f(z). The realization of D̃_{t+1} determines P^f_{t+1}. Thus, at time t, RII investors believe that P_{t+1} is a random variable given by

\tilde{P}_{t+1} = \tilde{P}^f_{t+1} = \frac{D_t(1 + \tilde{z})(1 + g)}{k - g}.

Notice that the RII investors face uncertainty regarding next period's price. In our model we assume that the RII investors are certain about the dividend growth rate g, the discount factor k, and the fact that the price will converge to the fundamental value next period. In this framework the only source of uncertainty regarding next period's price stems from the uncertainty regarding next period's dividend realization. More generally, the RII investors' uncertainty can result from uncertainty regarding any one of the above factors or a combination of several of these factors. Any mix of these uncertainties is possible to investigate in the agent-based simulation framework, but very hard, if not impossible, to incorporate in an analytic framework. As a consequence of the uncertainty regarding next period's price and of their risk aversion, the RII investors do not buy an infinite number of shares even if they perceive the stock as underpriced. Rather, they estimate the stock's next period's return distribution and find the optimal mix of the stock and the bond which maximizes their expected utility. The RII investors estimate next period's return on the stock as

\tilde{R}_{t+1} = \frac{\tilde{P}_{t+1} + \tilde{D}_{t+1}}{P_t} = \frac{\frac{D_t(1+\tilde{z})(1+g)}{k-g} + D_t(1+\tilde{z})}{P_t},    (4)

where z̃, the next year's growth in the dividend, is the source of uncertainty.
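For a given dividend-growth realization z, Eqs. 3 and 4 are simple algebra. A direct transcription, with k and g set to placeholder values that match the benchmark parameters used later:

    def fundamental_price(d_next, k=0.04, g=0.015):
        """Gordon model: P^f_{t+1} = D_{t+1} * (1 + g) / (k - g)  (Eq. 3)."""
        return d_next * (1.0 + g) / (k - g)

    def rii_return(z, d_t, p_t, k=0.04, g=0.015):
        """Return the RII investor expects for next period (Eq. 4), given dividend growth z."""
        d_next = d_t * (1.0 + z)
        return (fundamental_price(d_next, k, g) + d_next) / p_t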
The demands of the RII investors for the stock depend on the price of the stock. For any hypothetical price P_h, investors calculate the proportion of their wealth x they should invest in the stock in order to maximize their expected utility. The RII investor i believes that if he/she invests a proportion x of his/her wealth in the stock at time t, then at time t + 1 his/her wealth will be

\tilde{W}^i_{t+1} = W^i_h \left[ (1 - x)(1 + r_f) + x \tilde{R}_{t+1} \right],    (5)

where R̃_{t+1} is the return on the stock, as given by Eq. 1, and W^i_h is the wealth of investor i at time t given that the stock price at time t is P_h.

If the price in period t is the hypothetical price P_h, the t + 1 expected utility of investor i is the following function of his/her investment proportion in the stock, x:
EU\left[\tilde{W}^i_{t+1}\right] = E\, U\!\left[ W^i_h \left( (1 - x)(1 + r_f) + x \tilde{R}_{t+1} \right) \right].    (6)

Substituting R̃_{t+1} from Eq. 4, using the power utility function (Eq. 2), and substituting the hypothetical price P_h for P_t, the expected utility becomes the following function of x:

EU\left[\tilde{W}^i_{t+1}\right] = \frac{(W^i_h)^{1-\alpha}}{1-\alpha} \int_{z_1}^{z_2} \left[ (1 - x)(1 + r_f) + x \, \frac{\frac{D_t(1+z)(1+g)}{k-g} + D_t(1+z)}{P_h} \right]^{1-\alpha} f(z)\, dz,    (7)

where the integration is over all possible values of z. In the agent-based simulation framework, this expression for the expected utility, and the optimal investment proportion x, can be solved numerically for any general choice of distribution f(z). For the sake of simplicity, we restrict the present analysis to the case where z̃ is distributed uniformly in the range [z_1, z_2]. This simplification leads to the following expression for the expected utility:

EU\left[\tilde{W}^i_{t+1}\right] = \frac{(W^i_h)^{1-\alpha}}{(1-\alpha)(2-\alpha)(z_2 - z_1)} \, \frac{(k-g)\, P_h}{(k+1)\, x\, D_t} \left\{ \left[ (1 - x)(1 + r_f) + \frac{x (k+1)}{P_h (k-g)}\, D_t (1 + z_2) \right]^{2-\alpha} - \left[ (1 - x)(1 + r_f) + \frac{x (k+1)}{P_h (k-g)}\, D_t (1 + z_1) \right]^{2-\alpha} \right\}.    (8)

For any hypothetical price P_h, each investor (numerically) finds the optimal proportion x_h which maximizes his/her expected utility given by Eq. 8. Notice that the optimal proportion, x_h, is independent of the wealth, W^i_h. Thus, if all RII investors have the same degree of risk aversion, α, they will have the same optimal investment proportion in the stock, regardless of their wealth. The number of shares demanded by investor i at the hypothetical price P_h is given by

N^i_h(P_h) = \frac{x^i_h(P_h)\, W^i_h(P_h)}{P_h}.    (9)
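A minimal numerical treatment of Eqs. 7–9 under the uniform f(z): the integral is approximated on a grid of z values and the optimal proportion is found by a coarse grid search. The closed-form Eq. 8 or a proper optimizer could of course be used instead; the grid sizes below are arbitrary choices:

    def rii_expected_utility(x, wealth, d_t, p_h, rf=0.01, k=0.04, g=0.015,
                             z1=-0.07, z2=0.10, alpha=1.5, n_grid=200):
        """Numerical version of Eq. 7 with z uniform on [z1, z2] (midpoint rule)."""
        total = 0.0
        for i in range(n_grid):
            z = z1 + (i + 0.5) * (z2 - z1) / n_grid
            d_next = d_t * (1.0 + z)
            r = (d_next * (1.0 + g) / (k - g) + d_next) / p_h        # Eq. 4
            w_next = wealth * ((1.0 - x) * (1.0 + rf) + x * r)
            total += w_next ** (1.0 - alpha) / (1.0 - alpha)          # power utility, Eq. 2
        return total / n_grid

    def rii_demand(wealth, d_t, p_h, **kwargs):
        """Optimal investment proportion by grid search, then shares demanded (Eq. 9)."""
        xs = [i / 100.0 for i in range(101)]
        x_star = max(xs, key=lambda x: rii_expected_utility(x, wealth, d_t, p_h, **kwargs))
        return x_star * wealth / p_h

As noted in the text, the maximizing proportion does not depend on wealth, so the wealth argument only scales the demand in the last line.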
Efficient Market Believers (EMB)
The second type of investors in the LLS model is EMBs. The EMBs believe in market efficiency – they believe that the stock price accurately reflects the stock's fundamental value. Thus, they do not try to time the market or to look for "bargain" stocks. Rather, their investment decision is reduced to the optimal diversification between the stock and the bond. This diversification decision requires the ex-ante return distribution for the stock, but as the ex-ante distribution is not available, the EMBs assume that the process generating the returns is fairly stable, and they employ the ex-post distribution of stock returns in order to estimate the ex-ante return distribution.

Different EMB investors may disagree on the optimal number of ex-post return observations that should be employed in order to estimate the ex-ante return distribution. There is a trade-off between using more observations for better statistical inference and using a smaller number of only more recent observations, which are probably more representative of the ex-ante distribution. As in reality, there is no "recipe" for the optimal number of observations to use. EMB investor i believes that the m^i most recent returns on the stock are the best estimate of the ex-ante distribution. Investors create an estimation of the ex-ante return distribution by assigning an equal probability to each of the m^i most recent return observations:

\mathrm{Prob}^i\left( \tilde{R}_{t+1} = R_{t-j} \right) = \frac{1}{m^i} \quad \text{for } j = 1, \ldots, m^i.    (10)

The expected utility of EMB investor i is given by

EU\left[ \tilde{W}^i_{t+1} \right] = \frac{(W^i_h)^{1-\alpha}}{1-\alpha}\, \frac{1}{m^i} \sum_{j=1}^{m^i} \left[ (1 - x)(1 + r_f) + x R_{t-j} \right]^{1-\alpha},    (11)

where the summation is over the set of m^i most recent ex-post returns, x is the proportion of wealth invested in the stock, and as before W^i_h is the wealth of investor i at time t given that the stock price at time t is P_h. Notice that W^i_h does not
change the optimal diversification policy, i.e., x. Given a set of m^i past returns, the optimal portfolio for the EMB investor i is an investment of a proportion x*^i in the stock and (1 − x*^i) in the bond, where x*^i is the proportion which maximizes the above expected utility (Eq. 11) for investor i. Notice that x*^i generally cannot be solved for analytically. However, in the agent-based simulation framework, this does not constitute a problem, as one can find x*^i numerically.
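The EMB decision is the same kind of numerical search, with the integral of Eq. 7 replaced by the equal-weight average over the m^i most recent ex-post returns (Eqs. 10 and 11). Again, the grid search below is a simple stand-in for whatever optimizer one prefers:

    def emb_expected_utility(x, wealth, past_returns, m, rf=0.01, alpha=1.5):
        """Eq. 11: equal probability 1/m on each of the m most recent ex-post returns."""
        window = past_returns[-m:]
        values = [(wealth * ((1.0 - x) * (1.0 + rf) + x * r)) ** (1.0 - alpha) / (1.0 - alpha)
                  for r in window]
        return sum(values) / len(values)

    def emb_optimal_proportion(wealth, past_returns, m, **kwargs):
        """x*_i maximizing Eq. 11; as in the text, the result does not depend on wealth."""
        xs = [i / 100.0 for i in range(101)]
        return max(xs, key=lambda x: emb_expected_utility(x, wealth, past_returns, m, **kwargs))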
the total number of outstanding shares is N, the
Deviations from Rationality price of the stock at time t is given by the market
Investors who are efficient market believers, and are clearance condition: Pt is the unique price at
rational, choose the investment proportion x* which which the total demand for shares is equal to the
maximizes their expected utility. However, many total supply, N:
empirical studies have shown that the behavior of
investors is driven not only by rational expected X Xxh ðPt ÞW i ðPt Þ
N ih ðPt Þ ¼ h
¼ N; ð13Þ
utility maximization but by a multitude of other i i
Pt
factors (e.g., see Samuelson (1994), Thaler (1993,
1994), and Tversky and Kahneman (1981, 1986)). where the summation is over all the investors in
Deviations from the optimal rational investment the market, RII investors as well as EMB
proportion can be due to the cost of resources investors.
which are required for the portfolio optimization –
time, access to information, computational power,
Agent-Based Simulation
etc. – or due to exogenous events (e.g., an investor
The market dynamics begin with a set of initial
plans to revise his/her portfolio, but gets distracted
conditions which consist of an initial stock price
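The noise of Eq. 12 can be added, for example, by rejection sampling from a normal distribution. Keeping the perturbed proportion inside [0, 1] is our reading of the truncation, not a detail spelled out here; the default σ = 0.2 anticipates the value used in the simulations below:

    import random

    def noisy_proportion(x_star, sigma=0.2, lo=0.0, hi=1.0):
        """Eq. 12: x_i = x*_i + eps_i, with eps_i drawn from a (truncated) normal, mean 0."""
        while True:
            x = x_star + random.gauss(0.0, sigma)
            if lo <= x <= hi:                  # truncation bounds are an assumption
                return x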
Market Clearance
The number of shares demanded by each investor is a monotonically decreasing function of the hypothetical price P_h (see Levy et al. 2000). As the total number of outstanding shares is N, the price of the stock at time t is given by the market clearance condition: P_t is the unique price at which the total demand for shares is equal to the total supply, N:

\sum_i N^i_h(P_t) = \sum_i \frac{x_h(P_t)\, W^i_h(P_t)}{P_t} = N,    (13)

where the summation is over all the investors in the market, RII investors as well as EMB investors.

Agent-Based Simulation
The market dynamics begin with a set of initial conditions which consist of an initial stock price P_0, an initial dividend D_0, the wealth and number of shares held by each investor at time t = 0, and an initial "history" of stock returns. As will become evident, the general results do not depend on the initial conditions. At the first period (t = 1), interest is paid on the bond, and the time 1 dividend D̃_1 = D_0(1 + z̃) is realized and paid out. Then investors submit their demand orders, N^i_h(P_h), and the market clearing price P_1 is determined. After the clearing price is set, the new wealth and number of shares held by each investor are calculated. This completes one time period. This process is repeated over and over, as the market dynamics develop.
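Since aggregate demand is monotonically decreasing in the hypothetical price, the clearing condition of Eq. 13 can be solved by simple bisection, and one period of the dynamics then reads as a short loop. The investor interface assumed below (a shares_demanded(price) method returning x_h·W_h/P_h, and a settle hook), as well as the reuse of the next_dividend helper sketched earlier, are hypothetical scaffolding rather than part of any published model code:

    def clearing_price(investors, n_shares, lo=1e-6, hi=1e6, tol=1e-10):
        """Find P_t such that total demand equals the supply N (Eq. 13) by bisection."""
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            demand = sum(inv.shares_demanded(mid) for inv in investors)
            if demand > n_shares:
                lo = mid                      # demand too high -> the price must rise
            else:
                hi = mid
            if hi - lo < tol * mid:
                break
        return 0.5 * (lo + hi)

    def one_period(investors, n_shares, d_prev):
        """One time step: realize the dividend, clear the market, update all holdings."""
        d_t = next_dividend(d_prev)                  # dividend realization, as sketched earlier
        p_t = clearing_price(investors, n_shares)
        for inv in investors:
            inv.settle(p_t, d_t)                     # assumed hook: pay interest/dividend, update wealth and shares
        return p_t, d_t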
We would like to stress that even the simplified benchmark model, with only RII investors, is impossible to solve analytically. The reason for this is that the optimal investment proportion, x_h(P_h), cannot be calculated analytically. This problem is very general and it is encountered with almost any choice of utility function and
distribution of returns. One important exception is the case of a negative exponential utility function and normally distributed returns. Indeed, many models make these two assumptions for the sake of tractability. The problem with the assumption of negative exponential utility is that it implies Constant Absolute Risk Aversion (CARA), which is very unrealistic, as it implies that investors choose to invest the same dollar amount in a risky prospect independent of their wealth. This is not only in sharp contradiction to the empirical evidence but also excludes the investigation of the two-way interaction between wealth and price dynamics, which is crucial to the understanding of the market.

Thus, one contribution of the agent-based simulation approach is that it allows investigation of models with realistic assumptions regarding investors' preferences. However, the main contribution of this method is that it permits us to investigate models which are much more complex (and realistic) than the benchmark model, in which all investors are RII. With the agent-based simulation approach, one can study models incorporating the empirically and experimentally documented investors' behavior and the heterogeneity of investors.

Results of the LLS Model
We begin by describing the benchmark case where all investors are rational and identical. Then we introduce to the market EMB investors and investigate their effects on the market dynamics.

Benchmark Case: Fully Rational and Identical Agents
In this benchmark model all investors are RII: rational, informed, and identical. Thus, it is not surprising that the benchmark model generates market dynamics which are typical of homogeneous rational agent models.

No Volume All investors in the model are identical; they therefore always agree on the optimal proportion to invest in the stock. As a consequence, all the investors always achieve the same return on their portfolio. This means that at any time period, the ratio between the wealth of any two investors is equal to the ratio of their initial wealths, i.e.,

\frac{W^i_t}{W^j_t} = \frac{W^i_0}{W^j_0}.    (14)

As the wealth of investors is always in the same proportion, and as they always invest the same fraction of their wealth in the stock, the number of shares held by different investors is also always in the same proportion:

\frac{N^i_t}{N^j_t} = \frac{x_t W^i_t / P_t}{x_t W^j_t / P_t} = \frac{W^i_t}{W^j_t} = \frac{W^i_0}{W^j_0}.    (15)

Since the total supply of shares is constant, this implies that each investor always holds the same number of shares and there is no trading volume (the number of shares held may vary from one investor to another as a consequence of different initial endowments).

Log Prices Follow a Random Walk In the benchmark model all investors believe that next period's price will converge to the fundamental value given by the discounted dividend model (Eq. 3). Therefore, the actual stock price is always close to the fundamental value. The fluctuations in the stock price are driven by fluctuations in the fundamental value, which in turn are driven by the fluctuating dividend realizations. As the dividend fluctuations are (by assumption) uncorrelated over time, one would expect that the price fluctuations will also be uncorrelated. To verify this intuitive result, we examine the return autocorrelations in simulations of the benchmark model.

Let us turn to the simulation of the model. We first describe the parameters and initial conditions used in the simulation and then report the results. We simulate the benchmark model with the following parameters:

• Number of investors = 1,000.
• Risk-aversion parameter α = 1.5. This value roughly conforms with the estimate of the risk-
aversion parameter found empirically and experimentally.
• Number of shares = 10,000.
• We take the time period to be a quarter, and accordingly we choose:
• Riskless interest rate r_f = 0.01.
• Required rate of return on stock k = 0.04.
• Maximal one-period dividend decrease z_1 = -0.07.
• Maximal one-period dividend growth z_2 = 0.10.
• z̃ is uniformly distributed between these values. Thus, the average dividend growth rate is g = (z_1 + z_2)/2 = 0.015.

Initial conditions: Each investor is endowed at time t = 0 with a total wealth of $1,000, which is composed of 10 shares worth an initial price of $50 per share and $500 in cash. The initial quarterly dividend is set at $0.5 (for an annual dividend yield of about 4%). As will soon become evident, the dynamics are not sensitive to the particular choice of initial conditions.

Figure 1 shows the price dynamics in a typical simulation with these parameters (simulations with the same parameters differ one from the other because of the different random dividend realizations). Notice that the vertical axis in this figure is logarithmic. Thus, the roughly constant slope implies an approximately exponential price growth or an approximately constant average return.

Agent-Based Computational Economics, Fig. 1 Price dynamics in the benchmark model

The prices in this simulation seem to fluctuate randomly around the trend. However, Fig. 1 shows only one simulation. In order to have a more rigorous analysis, we perform many independent simulations and employ statistical tools. Namely, for each simulation we calculate the autocorrelation of returns. We perform a univariate regression of the return in time t on the return in time t − j:

R_t = a_j + b_j R_{t-j} + e,

where R_t is the return in period t and j is the lag. The autocorrelation of returns for lag j is defined as

\rho_j = \frac{\mathrm{cov}(R_t, R_{t-j})}{\sigma^2(R)},

and it is estimated by \hat{b}_j. We calculate the autocorrelation for different lags, j = 1, ..., 40. Figure 2 shows the average autocorrelation as a function of the lag, calculated over 100 independent simulations. It is evident from the figure that the returns are uncorrelated in the benchmark model, conforming with the random-walk hypothesis.
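The lag-j autocorrelation statistic described here is simply the slope of the univariate regression of R_t on R_{t-j}; with numpy it is a two-liner (the 40-lag loop in the comment mirrors the computation behind Fig. 2):

    import numpy as np

    def lag_autocorrelation(returns, j):
        """Slope b_j of R_t = a_j + b_j * R_{t-j} + e, i.e., the estimated lag-j autocorrelation."""
        y, x = np.asarray(returns[j:], dtype=float), np.asarray(returns[:-j], dtype=float)
        b_j, _a_j = np.polyfit(x, y, 1)       # polyfit returns the highest-degree coefficient first
        return b_j

    # average over lags 1..40 and over many simulated return series, as in Fig. 2:
    # rhos = [np.mean([lag_autocorrelation(sim, j) for sim in simulations]) for j in range(1, 41)]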
No Excess Volatility Since the RII investors believe that the stock price will converge to the fundamental value next period, in the benchmark model, prices are always close to the fundamental value given by the discounted dividend stream. Thus, we do not expect prices to be more volatile
than the value of the discounted dividend stream.

Agent-Based Computational Economics, Fig. 2 Return autocorrelation in benchmark model

For a formal test of excess volatility, we follow the technique in Shiller (1981). For each time period we calculate the actual price, P_t, and the fundamental value of the discounted dividend stream, p^f_t, as in Eq. 3. Since prices follow an upward trend, in order to have a meaningful measure of the volatility, we must detrend these price series. Following Shiller, we run the regression:

\ln P_t = b\,t + c + e_t,    (16)

in order to find the average exponential price growth rate (where b and c are constants). Then, we define the detrended price as p_t = P_t / e^{\hat{b} t}. Similarly, we define the detrended value of the discounted dividend stream p^f_t and compare σ(p_t) with σ(p^f_t). For 100- to 1,000-period simulations, we find an average σ(p_t) of 22.4 and an average σ(p^f_t) of 22.9. As expected, the actual price and the fundamental value have almost the same volatility.
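The Shiller-style comparison amounts to fitting the log-linear trend of Eq. 16 to each series and comparing detrended standard deviations; a minimal numpy version:

    import numpy as np

    def detrended(series):
        """Fit ln P_t = b*t + c (Eq. 16) and return the detrended series p_t = P_t / exp(b*t)."""
        series = np.asarray(series, dtype=float)
        t = np.arange(len(series))
        b, _c = np.polyfit(t, np.log(series), 1)
        return series / np.exp(b * t)

    def volatility_ratio(prices, fundamental_values):
        """sigma(p_t) / sigma(p_t^f): values well above 1 indicate excess volatility."""
        return np.std(detrended(prices)) / np.std(detrended(fundamental_values))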
To summarize the results obtained for the benchmark model, we find that when all investors are assumed to be rational, informed, and identical, we obtain results which are typical of rational-representative-agent models: no volume, no return autocorrelations, and no excess volatility. We next turn to examine the effect of introducing into the market EMB investors, which model empirically and experimentally documented elements of investors' behavior.

The Introduction of a Small Minority of EMB Investors
In this section we will show that the introduction of a small minority of heterogeneous EMB investors generates many of the empirically observed market "anomalies" which are absent in the benchmark model and indeed, in most other rational-representative-agent models. We take this as strong evidence that the "nonrational" elements of investor behavior which are documented in experimental studies and the heterogeneity of investors, both of which are incorporated in the LLS model, are crucial to understanding the dynamics of the market.

In presenting the results of the LLS model with EMB investors, we take an incremental approach. We begin by describing the results of a model with a small subpopulation of homogeneous EMB believers. This model produces the abovementioned market "anomalies"; however, it produces unrealistic cyclic market dynamics. Thus, this model is presented both for analyzing the source of the "anomalies" in a simplified setting and as a reference point with which to compare the dynamics of the model with a heterogeneous EMB believer population.

We investigate the effects of investors' heterogeneity by first analyzing the case in which there are two types of EMBs. The two types differ in the method they use to estimate the ex-ante return distribution. Namely, the first type looks at the set of the last m_1 ex-post returns, whereas the second type looks at the set of the last m_2 ex-post returns. It turns out that the dynamics in
this case are much more complicated than a simple "average" between the case where all EMB investors have m_1 and the case where all EMB investors have m_2. Rather, there is a complex nonlinear interaction between the two EMB subpopulations. This implies that the heterogeneity of investors is a very important element determining the market dynamics, an element which is completely absent in representative-agent models. Finally, we present the case where there is an entire spectrum of EMB investors differing in the number of ex-post observations they take into account when estimating the ex-ante distribution. This general case generates very realistic-looking market dynamics with all of the abovementioned market anomalies.

Homogeneous Subpopulation of EMBs
When a very small subpopulation of EMB investors is introduced to the benchmark LLS model, the market dynamics change dramatically. Figure 3 depicts a typical price path in a simulation of a market with 95% RII investors and 5% EMB investors. The EMB investors have m = 10 (i.e., they estimate the ex-ante return distribution by observing the set of the last 10 ex-post returns). σ, the standard deviation of the random noise affecting the EMBs' decision-making, is taken as 0.2. All investors, RII and EMB alike, have the same risk-aversion parameter α = 1.5 (as before). In the first 150 trading periods, the price dynamics look very similar to the typical dynamics of the benchmark model. However, after the first 150 or so periods, the price dynamics change. From this point onwards the market is characterized by periodic booms and crashes. Of course, Fig. 3 describes only one simulation. However, as will become evident shortly, different simulations with the same parameters may differ in detail, but the pattern is general: at some stage (not necessarily after 150 periods), the EMB investors induce cyclic price behavior. It is quite astonishing that such a small minority of only 5% of the investors can have such a dramatic impact on the market.

Agent-Based Computational Economics, Fig. 3 Five percent of investors are efficient market believers, 95% rational informed investors

In order to understand the periodic booms and crashes, let us focus on the behavior of the EMB investors. After every trade, the EMB investors revise their estimation of the ex-ante return distribution, because the set of ex-post returns they employ to estimate the ex-ante distribution changes. Namely, investors add the latest return generated by the stock to this set and delete the oldest return from this set. As a result of this update in the estimation of the ex-ante distribution, the optimal investment proportion x* changes, and EMB investors revise their portfolios at next period's trade. During the first 150 or so periods, the informed investors control the dynamics and the returns fluctuate randomly (as in the benchmark model). As a consequence, the investment proportion of the EMB investors also fluctuates irregularly. Thus, during the first 150 periods, the EMB
investors do not affect the dynamics much. However, at point a, the dynamics change qualitatively (see Fig. 3). At this point, a relatively high dividend is realized, and as a consequence, a relatively high return is generated. This high return leads the EMB investors to increase their investment proportion in the stock at the next trading period. This increased demand of the EMB investors is large enough to affect next period's price, and thus a second high return is generated. Now the EMB investors look at a set of ex-post returns with two high returns, and they increase their investment proportion even further. Thus, a positive feedback loop is created. Notice that as the price goes up, the informed investors realize that the stock is overvalued relative to the fundamental value P^f and they decrease their holdings in the stock. However, this effect does not stop the price increase and break the feedback loop because the EMB investors continue to buy shares aggressively. The positive feedback loop pushes the stock price further and further up to point b, at which the EMBs are invested 100% in the stock. At point b, the positive feedback loop "runs out of gas." However, the stock price remains at the high level because the EMB investors remain fully invested in the stock (the set of past m = 10 returns includes at this stage the very high returns generated during the "boom" – segment a–b in Fig. 3).

When the price is at the high level (segment b–c), the dividend yield is low, and as a consequence, the returns are generally low. As time goes by and we move from point b toward point c, the set of m = 10 last returns gets filled with low returns. Despite this fact, the extremely high returns generated in the boom are also still in this set, and they are high enough to keep the EMB investors fully invested. However, 10 periods after the boom, these extremely high returns are pushed out of the set of relevant ex-post returns. When this occurs, at point c, the EMB investors face a set of low returns, and they cut their investment proportion in the stock sharply. This causes a dramatic crash (segment c–d). Once the stock price goes back down to the "fundamental" value, the informed investors come back into the picture. They buy back the stock and stop the crash.

The EMB investors stay away from the stock as long as the ex-post return set includes the terrible return of the crash. At this stage the informed investors regain control of the dynamics and the stock price remains close to its fundamental value. Ten periods after the crash, the extremely negative return of the crash is excluded from the ex-post return set, and the EMB investors start increasing their investment proportion in the stock (point e). This drives the stock price up, and a new boom-crash cycle is initiated. This cycle repeats itself over and over almost periodically.

Figure 3 depicts the price dynamics of a single simulation. One may therefore wonder how general the results discussed above are. Figure 4 shows two more simulations with the same parameters but different dividend realizations. It is evident from this figure that although the simulations vary in detail (because of the different dividend realizations), the overall price pattern with periodic boom-crash cycles is robust.

Although these dynamics are very unrealistic in terms of the periodicity, and therefore the predictability of the price, they do shed light on the mechanism generating many of the empirically observed market phenomena. In the next section, when we relax the assumption that the EMB population is homogeneous with respect to m, the price is no longer cyclic or predictable, yet the mechanisms generating the market phenomena are the same as in this homogeneous EMB population case. The homogeneous EMB population case generates the following market phenomena.

Heavy Trading Volume As explained above, shares change hands continuously between the RII investors and the EMB investors. When a "boom" starts, the EMB investors observe higher ex-post returns and become more optimistic, while the RII investors view the stock as becoming overpriced and become more pessimistic. Thus, at this stage the EMBs buy most of the shares from the RIIs. When the stock crashes, the opposite is true: the EMBs are very pessimistic, but the RII investors buy the stock once it falls back to the fundamental value. Thus, there is substantial trading volume in this market. The average trading volume in a typical
simulation is about 1,000 shares per period, which are 10% of the total outstanding shares.

Agent-Based Computational Economics, Fig. 4 Two more simulations – same parameters as Fig. 3, different dividend realizations

Autocorrelation of Returns The cyclic behavior of the price yields a very definite return autocorrelation pattern. The autocorrelation pattern is depicted graphically in Fig. 5. The autocorrelation pattern is directly linked to the length of the price cycle, which in turn is determined by m. Since the moving window of ex-post returns used to estimate the ex-ante distribution is m = 10 periods long, the price cycles are typically a little longer than 20 periods long: a cycle consists of the positive feedback loop (segment a–b in Fig. 3), which is about 2–3 periods long, the upper plateau (segment b–c in Fig. 3), which is about 10 periods long, the crash, which occurs during one or two periods, and the lower plateau (segment d–e in Fig. 3), which is again about 10 periods long, for a total of about 23–25 periods. Thus, we expect positive autocorrelation for lags of about 23–25 periods, because this is the lag between one point and the corresponding point in the next (or previous) cycle. We also expect negative autocorrelation for lags of about 10–12 periods, because this is the lag between a boom and the following (or previous) crash and vice versa. This is precisely the pattern we observe in Fig. 5.

Excess Volatility The EMB investors induce large deviations of the price from the fundamental value. Thus, price fluctuations are caused not only by dividend fluctuations (as the standard theory suggests) but also by the endogenous market dynamics driven by the EMB investors. This "extra" source of fluctuations causes the price to be more volatile than the fundamental value P^f. Indeed, for 100- to 1,000-period independent simulations with 5% EMB investors, we find an average σ(p_t) of 46.4 and an average σ(p^f_t) of 30.6; that is, we have excess volatility of about 50%.

As a first step in analyzing the effects of heterogeneity of the EMB population, in the next section we examine the case of two types of EMB investors. We later analyze a model in which there is a full spectrum of EMB investors.

Two Types of EMBs
One justification for using a representative agent in economic modeling is that although investors are heterogeneous in reality, one can model their collective behavior with one representative or "average" investor. In this section we show that this is generally not true. Many aspects of the dynamics result from the nonlinear interaction between different investor types. To illustrate this point, in this section we analyze a very simple
case in which there are only two types of EMB investors: one with m = 5 and the other with m = 15. Each of these two types consists of 2% of the investor population, and the remaining 96% are informed investors. The representative agent logic may tempt us to think that the resulting market dynamics would be similar to that of one "average" investor, i.e., an investor with m = 10. Figure 6 shows that this is clearly not the case. Rather than seeing periodic cycles of about 23–25 periods (which correspond to the average m of 10, as in Fig. 3), we see an irregular pattern. As before, the dynamics are first dictated by the informed investors. Then, at point a, the EMB investors with m = 15 induce cycles which are about 30 periods long. At point b there is a transition to shorter cycles induced by the m = 5 population, and at point c there is another transition back to longer cycles. What is going on?

Agent-Based Computational Economics, Fig. 5 Return autocorrelation, 5% efficient market believers, m = 10

These complex dynamics result from the nonlinear interaction between the different subpopulations. The transitions from one price pattern to another can be partly understood by looking at the wealth of each subpopulation. Figure 7 shows the proportion of the total wealth held by each of the two EMB populations (the remaining proportion is held by the informed investors). As seen in Fig. 7, the cycles which start at point a are dictated by the m = 15 rather than the m = 5 population, because at this stage the m = 15 population controls more of the wealth than the m = 5 population. However, after 3 cycles (at point b), the picture is reversed. At this point the m = 5 population is more powerful than the m = 15 population, and there is a transition to shorter boom-crash cycles. At point c the wealth of the two subpopulations is again almost equal, and there is another transition to longer cycles. Thus, the complex price dynamics can be partly understood from the wealth dynamics. But how are the wealth dynamics determined? Why does the m = 5 population become wealthier at point b, and why does it lose most of this advantage at point c? It is obvious that the wealth dynamics are influenced by the price dynamics; thus, there is a complicated two-way interaction between the two. Although this interaction is generally very complex, some principal ideas about the mutual influence between the wealth and price patterns can be formulated. For example, a population that becomes dominant and dictates the price dynamics typically starts underperforming, because it affects the price with its actions. This means pushing the price up when buying, and therefore buying high, and pushing the price down when selling. However, a more detailed analysis must consider the specific investment strategy employed by each population. For a more comprehensive analysis of the interaction between heterogeneous EMB populations, see Levy et al. (1996).

The two-EMB-population model generates the same market phenomena as did the homogeneous population case: heavy trading volume, return autocorrelations, and excess volatility. Although the price pattern is much less regular
in the two-EMB-population case, there still seems to be a great deal of predictability about the prices. Moreover, the booms and crashes generated by this model are unrealistically dramatic and frequent. In the next section we analyze a model with a continuous spectrum of EMB investors. We show that this fuller heterogeneity of investors leads to very realistic price and volume patterns.

Agent-Based Computational Economics, Fig. 6 Two percent EMB m = 5, 2% EMB m = 15, 96% RII

Agent-Based Computational Economics, Fig. 7 Proportion of the total wealth held by the two EMB populations

Full Spectrum of EMB Investors
Up to this point we have analyzed markets with at most three different subpopulations (one RII population and two EMB populations). The market
The market dynamics we found displayed the empirically observed market anomalies, but they were unrealistic in the magnitude, frequency, and semi-predictability of booms and crashes. In reality, we would expect not only two or three investor types, but rather an entire spectrum of investors. In this section we consider a model with a full spectrum of different EMB investors. It turns out that more is different. When there is an entire range of investors, the price dynamics become realistic: booms and crashes are not periodic or predictable, and they are also less frequent and dramatic. At the same time, we still obtain all of the market anomalies described before.
In this model each investor has a different number of ex-post observations which he/she utilizes to estimate the ex-ante distribution. Namely, investor i looks at the set of the m_i most recent returns on the stock, and we assume that m_i is distributed in the population according to a truncated normal distribution with average m̄ and standard deviation σ_m (as m ≤ 0 is meaningless, the distribution is truncated at m = 0).
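As an illustration of this assumption, the following Python sketch draws investor memory spans from a truncated normal distribution and forms each investor's estimate of the ex-ante distribution from the most recent ex-post returns. The parameter values and function names are ours, chosen only for illustration; they are not part of the LLS model specification.

import numpy as np

rng = np.random.default_rng(0)

def draw_memory_spans(n_investors, mean_m=40, std_m=10):
    # Draw one memory span m_i per EMB investor from a normal distribution
    # truncated at m = 0 (values m <= 0 are meaningless and are redrawn).
    m = rng.normal(mean_m, std_m, size=n_investors)
    while np.any(m <= 0):
        invalid = m <= 0
        m[invalid] = rng.normal(mean_m, std_m, size=invalid.sum())
    return np.round(m).astype(int)

def ex_ante_estimate(return_history, m_i):
    # Investor i treats the m_i most recent ex-post returns as an equally
    # likely sample of the ex-ante return distribution.
    window = return_history[-m_i:]
    return window.mean(), window.std()

memory_spans = draw_memory_spans(1000)              # 1,000 EMB investors
history = rng.normal(1.01, 0.05, size=500)          # placeholder total-return series
mu_0, sigma_0 = ex_ante_estimate(history, memory_spans[0])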
Figure 8 shows the price pattern of a typical simulation of this model. In this simulation 90% of the investors are RII, and the remaining 10% are heterogeneous EMB investors with m̄ = 40 and σ_m = 10. The price pattern seems very realistic with "smoother" and more irregular cycles. Crashes are dramatic, but infrequent and unpredictable.
The heterogeneous EMB population model generates the following empirically observed market phenomena:

Return Autocorrelation: Momentum and Mean Reversion In the heterogeneous EMB population model, trends are generated by the same positive feedback mechanism that generated cycles in the homogeneous case: high (low) returns tend to make the EMB investors more (less) aggressive, this generates more high (low) returns, etc. The difference between the two cases is that in the heterogeneous case, there is a very complicated interaction between all the different investor subpopulations, and as a result there are no distinct regular cycles, but rather, smoother and more irregular trends. There is no single cycle length – the dynamics are a combination of many different cycles. This makes the autocorrelation pattern also smoother and more continuous. The return autocorrelations in the heterogeneous model are shown in Fig. 9. This autocorrelation pattern conforms with the empirical findings.
Agent-Based Computational Economics, Fig. 8 Spectrum of heterogeneous EMB investors (10% EMB investors, 90% RII investors)
In the short run (lags 1–4), the autocorrelation is positive – this is the empirically documented phenomenon known as momentum: in the short run, high returns tend to be followed by more high returns, and low returns tend to be followed by more low returns. In the longer run (lags 5–13), the autocorrelation is negative, which is known as mean reversion. For even longer lags the autocorrelation eventually tends to zero. The short-run momentum, longer-run mean reversion, and eventual diminishing autocorrelation create the general "U shape" which is found in empirical studies (Fama and French 1988; Jegadeesh and Titman 1993; Poterba and Summers 1988) and which is seen in Fig. 9.
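This momentum/mean-reversion pattern can be checked directly on a simulated price series. The sketch below computes return autocorrelations at the lags quoted above (1–4 and 5–13); the diagnostic itself is our own simplification, not part of the original analysis.

import numpy as np

def return_autocorrelations(prices, max_lag=20):
    # Autocorrelation of period returns for lags 1..max_lag.
    r = np.diff(prices) / prices[:-1]
    r = r - r.mean()
    denom = np.sum(r * r)
    return np.array([np.sum(r[lag:] * r[:-lag]) / denom
                     for lag in range(1, max_lag + 1)])

def u_shape_diagnostic(autocorr):
    # Short-run momentum: positive autocorrelation at lags 1-4;
    # longer-run mean reversion: negative autocorrelation at lags 5-13.
    momentum = autocorr[0:4].mean() > 0
    mean_reversion = autocorr[4:13].mean() < 0
    return momentum, mean_reversion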
Excess Volatility The price level is generally determined by the fundamental value of the stock. However, as in the homogeneous EMB population case, the EMB investors occasionally induce temporary departures of the price away from the fundamental value. These temporary departures from the fundamental value make the price more volatile than the fundamental value. Following Shiller's methodology we define the detrended price, p, and fundamental value, pf. Averaging over 100 independent simulations, we find σ(p) = 27.1 together with a smaller σ(pf), corresponding to an excess volatility of 41%.
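A sketch of this excess-volatility measurement follows; the exponential detrending used here is only a schematic stand-in for the detrending procedure of the original study.

import numpy as np

def detrend(series):
    # Remove an exponential growth trend by regressing log(series) on time.
    t = np.arange(len(series))
    slope, intercept = np.polyfit(t, np.log(series), 1)
    return series / np.exp(intercept + slope * t)

def excess_volatility(price, fundamental_value):
    # Percentage by which the detrended price is more volatile than the
    # detrended fundamental value (to be averaged over many simulations).
    sigma_p = np.std(detrend(price))
    sigma_pf = np.std(detrend(fundamental_value))
    return 100.0 * (sigma_p / sigma_pf - 1.0)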
Heavy Volume As investors in our model have different information (the informed investors know the dividend process, while the EMB investors do not) and different ways of interpreting the information (EMB investors with different memory spans have different estimations regarding the ex-ante return distribution), there is a high level of trading volume in this model. The average trading volume in this model is about 1,700 shares per period (17% of the total outstanding shares). As explained below, the volume is positively correlated with contemporaneous and lagged absolute returns.

Volume Is Positively Correlated with Contemporaneous and Lagged Absolute Returns Investors revise their portfolios as a result of changes in their beliefs regarding the future return distribution. The changes in the beliefs can be due to a change in the current price, to a new dividend realization (in the case of the informed investors), or to a new observation of an ex-post return (in the case of the EMB investors). If all investors change their beliefs in the same direction (e.g., if everybody becomes more optimistic), the stock price can change substantially with almost no volume – everybody would like to increase the proportion of the stock in his/her portfolio, this will push the price up, but a very small number of shares will change hands. This scenario would lead to zero or perhaps even negative correlation between the magnitude of
the price change (or return) and the volume. However, the typical scenario in the LLS model is different. Typically, when a positive feedback trend is induced by the EMB investors, the opinions of the informed investors and the EMB investors change in opposite directions. The EMB investors see a trend of rising prices as a positive indication about the ex-ante return distribution, while the informed investors believe that the higher the price level is above the fundamental value, the more overpriced the stock is and the harder it will eventually fall. The exact opposite holds for a trend of falling prices. Thus, price trends are typically interpreted differently by the two investor types and therefore induce heavy trading volume. The more pronounced the trend, the more likely it is to lead to heavy volume and, at the same time, to large price changes which are due to the positive feedback trading on behalf of the EMB investors.
This explains not only the positive correlation between volume and contemporaneous absolute rates of return but also the positive correlation between volume and lagged absolute rates of return. The reason is that the behavior of the EMB investors induces short-term positive return autocorrelation, or momentum (see above), that is, a large absolute return this period is associated not only with high volume but also with a large absolute return next period and therefore with high volume next period. In other words, when there is a substantial price increase (decrease), EMB investors become more (less) aggressive and the opposite happens to the informed traders. As we have seen before, when a positive feedback loop is started, the EMB investors are more dominant in determining the price, and therefore another large price increase (decrease) is expected next period. This large price change is likely to be associated with heavy trading volume as the opinions of the two populations diverge. Furthermore, this large increase (decrease) is expected to make the EMB investors even more optimistic (pessimistic), leading to another large price increase (decrease) and heavy volume next period.
In order to verify this relationship quantitatively, we regress volume on contemporaneous and lagged absolute rates of return for 100 independent simulations. We run the regressions:

V_t = α + β_C |R_t − 1| + ε_t  and  V_t = α + β_L |R_{t−1} − 1| + ε_t,   (17)

where V_t is the volume at time t, R_t is the total return on the stock at time t, and the subscripts C and L stand for contemporaneous and lagged. We find an average value of 870 for β̂_C with an average t-value of 5.0 and an average value of 886 for β̂_L with an average t-value of 5.1.
Discussion of the LLS Results

The LLS model is an agent-based simulation model of the stock market which incorporates some of the fundamental experimental findings regarding the behavior of investors. The main nonstandard assumption of the model is that there is a small minority of investors in the market who are uninformed about the dividend process and who believe in market efficiency. The investment decision of these investors is reduced to the optimal diversification between the stock and the bond.
The LLS model generates many of the empirically documented market phenomena which are hard to explain in the analytical rational-representative-agent framework. These phenomena are:

• Short-term momentum
• Longer-term mean reversion
• Excess volatility
• Heavy trading volume
• Positive correlation between volume and contemporaneous absolute returns
• Positive correlation between volume and lagged absolute returns
• Endogenous market crashes

The fact that so many "puzzles" are explained with a simple model built on a small number of empirically documented behavioral elements leads us to suspect that these behavioral elements are very important in understanding the workings of the market. This is especially true in light of the observations that a very small minority of the nonstandard bounded-rational investors can have a dramatic influence on the market and that these investors are not wiped out by the majority of rational investors.
Summary and Future Directions

Standard economic models typically describe a world of homogeneous rational agents. This approach is the foundation of most of our present-day knowledge in economic theory. With the agent-based simulation approach, we can investigate a much more complex and "messy" world with different agent types, who employ different strategies to try to survive and prosper in a market with structural uncertainty. Agents can learn over time, from their own experience and from their observation about the performance of other agents. They coevolve over time and as they do so, the market dynamics change continuously. This is a worldview closer to biology than it is to the "clean" realm of physical laws which classical economics has aspired to.
The agent-based approach should not and cannot replace the standard analytical economic approach. Rather, these two methodologies support and complement each other: When an analytical model is developed, it should become standard practice to examine the robustness of the model's results with agent-based simulations. Similarly, when results emerge from agent-based simulation, one should try to understand their origin and their generality, not only by running many simulations but also by trying to capture the essence of the results in a simplified analytical setting (if possible).
Although the first steps in economic agent-based simulations were made decades ago, economics has been slow and cautious to adopt this new methodology. Only in recent years has this field begun to bloom. It is my belief and hope that the agent-based approach will prove as fruitful in economics as it has been in so many other branches of science.

Bibliography

Primary Literature
Admati A, Pfleiderer P (1988) A theory of intraday patterns: volume and price variability. Rev Financ Stud 1:3–40
Arthur WB (1994) Inductive reasoning and bounded rationality (The El Farol problem). Am Econ Rev 84:406–411
Arthur WB, Holland JH, Lebaron B, Palmer RG, Tayler P (1997) Asset pricing under endogenous expectations in an artificial stock market. In: Arthur WB, Durlauf S, Lane D (eds) The economy as an evolving complex system II. Addison-Wesley, Redwood City
Brock WA, Hommes CA (1998) Heterogeneous beliefs and routes to chaos in a simple asset pricing model. J Econ Dyn Control 22:1235–1274
Egenter E, Lux T, Stauffer D (1999) Finite size effects in Monte Carlo simulations of two stock market models. Phys A 268:250–256
Epstein JM, Axtell RL (1996) Complex adaptive systems. In: Growing artificial societies: social science from the bottom up. MIT Press, Washington, DC
Fama E, French K (1988) Permanent and temporary components of stock prices. J Polit Econ 96:246–273
Friend I, Blume ME (1975) The demand for risky assets. Am Econ Rev 65:900–922
Gordon J, Paradis GE, Rorke CH (1972) Experimental evidence on alternative portfolio decision rules. Am Econ Rev 62(1):107–118
Grossman S, Stiglitz J (1980) On the impossibility of informationally efficient markets. Am Econ Rev 70:393–408
Hellthaler T (1995) The influence of investor number on a microscopic market. Int J Mod Phys C 6:845–852
Hommes CH (2002) Modeling the stylized facts in finance through simple nonlinear adaptive systems. Proc Natl Acad Sci U S A 99:7221–7228
Jegadeesh N, Titman S (1993) Returns to buying winners and selling losers: implications for stock market efficiency. J Finance 48:65–91
Karpoff J (1987) The relationship between price changes and trading volume: a survey. J Finance Quant Anal 22:109–126
Kim GW, Markowitz HM (1989) Investment rules, margin, and market volatility. J Portf Manage 16:45–52
Kirman AP (1992) Whom or what does the representative agent represent? J Econ Perspect 6:117–136
Kohl R (1997) The influence of the number of different stocks on the Levy, Levy Solomon model. Int J Mod Phys C 8:1309–1316
Kroll Y, Levy H, Rapoport A (1988) Experimental tests of the separation theorem and the capital asset pricing model. Am Econ Rev 78:500–519
LeBaron B (2000) Agent-based computational finance: suggested readings and early research. J Econ Dyn Control 24:679–702
Levy H (1994) Absolute and relative risk aversion: an experimental study. J Risk Uncertain 8:289–307
Levy H, Lim KC (1998) The economic significance of the cross-sectional autoregressive model: further analysis. Rev Quant Finance Account 11:37–51
Levy M, Levy H (1996) The danger of assuming homogeneous expectations. Finance Analyst J 52:65–70
Levy M, Levy H, Solomon S (1994) A microscopic model of the stock market: cycles, booms, and crashes. Econom Lett 45:103–111
Levy M, Levy H, Solomon S (2000) Microscopic simulation of financial markets. Academic, San Diego
Levy M, Persky N, Solomon S (1996) The complex dynamics of a simple stock market model. Int J High Speed Comput 8:93–113
Lux T (1995) Herd behaviour, bubbles and crashes. Econ J 105:881
Lux T (1998) The socio-economic dynamics of speculative bubbles: interacting agents, chaos, and the fat tails of returns distributions. J Econ Behav Organ 33:143–165
Lux T, Marchesi M (1999) Volatility clustering in financial markets: a micro-simulation of interacting agents. Nature 397:498
Orcutt GH, Caldwell SB, Wertheimer R (1976) Policy exploration through microanalytic simulation. The Urban Institute, Washington, DC
Palmer RG, Arthur WB, Holland JH, LeBaron B, Tayler P (1994) Artificial economic life: a simple model of a stock market. Phys D 75:264–274
Poterba JM, Summers LH (1988) Mean reversion in stock returns: evidence and implications. J Finance Econ 22:27–59
Samanidou E, Zschischang E, Stauffer D, Lux T (2007) Agent-based models of financial markets. Rep Prog Phys 70:409–450
Samuelson PA (1989) The judgement of economic science on rational portfolio management: timing and long horizon effects. J Portf Manage 16:4–12
Samuelson PA (1994) The long term case for equities and how it can be oversold. J Portf Manag 21:15–24
Sargent T (1993) Bounded rationality and macroeconomics. Oxford University Press, Oxford
Schelling TC (1978) Micro motives and macro behavior. Norton, New York
Shiller RJ (1981) Do stock prices move too much to be justified by subsequent changes in dividends? Am Econ Rev 71:421–436
Stauffer D, de Oliveira PMC, Bernardes AT (1999) Monte Carlo simulation of volatility correlation in microscopic market model. Int J Theor Appl Finance 2:83–94
Tesfatsion L (2001) Special issue on agent-based computational economics. J Econ Dyn Control 25:281–293
Tesfatsion L (2002) Agent-based computational economics: growing economies from the bottom up. Artif Life 8:55–82
Thaler R (ed) (1993) Advances in behavioral finance. Russel Sage Foundation, New York
Thaler R (1994) Quasi rational economics. Russel Sage Foundation, New York
Tversky A, Kahneman D (1981) The framing of decisions and the psychology of choice. Science 211:453–480
Tversky A, Kahneman D (1986) Rational choice and the framing of decision. J Bus 59(4):251–278
Tversky A, Kahneman D (1992) Advances in prospect theory: cumulative representation of uncertainty. J Risk Uncertain 5:297–323

Books and Reviews
Anderson PW, Arrow J, Pines D (eds) (1988) The economy as an evolving complex system. Addison-Wesley, Redwood City
Axelrod R (1997) The complexity of cooperation: agent-based models of conflict and cooperation. Princeton University Press, Princeton
Moss de Oliveira S, de Oliveira H, Stauffer D (1999) Evolution, money, war and computers. BG Teubner, Stuttgart/Leipzig
Solomon S (1995) The microscopic representation of complex macroscopic phenomena. In: Stauffer D (ed) Annu Rev Comput Phys II. World Scientific, Singapore
Cellular Automaton Modeling of Tumor Invasion

Haralambos Hatzikirou1,2, Georg Breier3 and Andreas Deutsch1
1 Center for Information Services and High Performance Computing, Technische Universität Dresden, Dresden, Germany
2 Helmholtz Centre for Infection Research, Department Systems Immunology, Braunschweig, Germany
3 Division of Medical Biology, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany

Article Outline

Glossary
Definition of the Subject
Introduction
Cellular Automata
Models of Tumor Invasion
Invasive Tumor Morphology
Effects of Directed Cell Motion
Spatial Structure of Invasive Tumors
Tumor Cell Migration and the Influence of the Extracellular Matrix
The Role of Cell-Cell and Cell-ECM Adhesion
Cellular Mechanisms of Glioma Cell Migration
Effects of Fiber Tracts on Glioma Invasion
Effect of Heterogeneous Environments on Tumor Cell Migration
Metabolism and Acidosis
Emergence of Tumor Invasion
Influence of Metabolic Changes
The Game of Invasion
Discussion
Bibliography

Glossary

Cadherins Important class of transmembrane proteins. They play a significant role in cell-cell adhesion, ensuring that cells within tissues are bound together.
Chemotaxis Motion response to chemical concentration gradients of a diffusive chemical substance.
Extracellular matrix (ECM) Components that are extracellular and composed of secreted fibrous proteins (e.g., collagen) and gel-like polysaccharides (e.g., glycosaminoglycans) binding cells and tissues together.
Fiber tracts Bundle of nerve fibers having a common origin, termination, and function within the spinal cord and brain.
Haptotaxis Directed motion of cells along adhesion gradients of fixed substrates in the ECM, such as integrins.
Slime trail motion Cells secrete a nondiffusive substance; concentration gradients of the substance allow the cells to migrate toward already explored paths.
Somatic evolution Darwinian-type evolution that occurs on soma (as opposed to germ) cells and characterizes cancer progression (Bodmer 1997).

Definition of the Subject

Cancer cells acquire characteristic traits in a stepwise manner during carcinogenesis. Some of these traits are autonomous growth, induction of angiogenesis, invasion, and metastasis. In this entry, the focus is on one of the late stages of tumor progression: tumor invasion. Tumor invasion has been recognized as a complex system, since its behavior emerges from the combined effect of tumor cell-cell and cell-microenvironment interactions. Cellular automata (CA) provide simple models of self-organizing complex systems in which collective behavior can emerge out of an ensemble of many interacting "simple" components. Cellular automata have also been used to gain a deeper insight in tumor invasion dynamics. In this entry, we briefly
© Springer Science+Business Media, LLC, part of Springer Nature 2020 851
M. Sotomayor et al. (eds.), Complex Social and Behavioral Systems,
https://doi.org/10.1007/978-1-0716-0368-0_60
Originally published in
R. A. Meyers (ed.), Encyclopedia of Complexity and Systems Science, © Springer Science+Business Media LLC 2019
https://doi.org/10.1007/978-3-642-27737-5_60-6
Introduction

(Marchant et al. 2000; Perumpanani et al. 1996, 1999; Sherratt and Nowak 1992; Sherratt and Chaplain 2001). Computational investigations of the invasiveness of glioma tumors illustrate that the ratio of tumor growth and spatial anisotropy in cell motility can quantify the degree of tumor invasiveness (Swanson et al. 2002; Jbabdi et al. 2005). While these models are able to capture the tumor structure at the tissue level, they fail to describe the tumor at the cellular and the subcellular levels. Meanwhile, multi-scale approaches attempt to describe and predict invasive tumor morphologies, growth, and phenotypical heterogeneity (Anderson et al. 2006; Frieboes et al. 2007, see also Alfonso et al. 2017).
Cellular automata (CA), and more generally cell-based models, provide an alternative modeling approach, where a microscale investigation is allowed through a stochastic description of the dynamics at the cellular level (Deutsch and Dormann 2018). In particular, CA define an appropriate modeling framework for tumor invasion since they allow for the following:

• CA rules can mimic the processes at the cellular level. This fact allows for the modeling of an abundance of experimental data that refer to cellular and subcellular processes related to tumor invasion.
• The discrete nature of CA can be exploited for investigations of the boundary layer of a tumor (de Franciscis et al. 2011). Bru et al. (2003) have analyzed the fractal properties of tumor surfaces (calculated by means of fractal scaling analysis) which can be compared with corresponding CA simulations to gain a better understanding of the tumor phenomenon. In addition, the discrete structure of CA facilitates the implementation of complicated environments without any of the computational problems characterizing the simulation of continuous models.
• Motion of tumor cells through heterogeneous media (e.g., ECM) involves phenomena at various spatial and temporal scales (Lesne 2007). These cannot be captured in a purely macroscopic modeling approach. Alternatively, discrete microscopic models, such as CA, can incorporate different spatiotemporal scales, and they are well suited for simulating such phenomena.
• CA are paradigms of parallelizable algorithms. This fact makes them computationally efficient.

In the following section, we provide a definition of CA. In section "Models of Tumor Invasion," we review the existing CA models for central processes of tumor invasion. Finally, in the discussion, we critically discuss the use of CA in tumor invasion modeling, and we identify future research questions related to tumor invasion.

Cellular Automata

The notion of a cellular automaton originated in the works of John von Neumann (1903–1957) and Stanislaw Ulam (1909–1984). Cellular automata may be viewed as simple models of self-organizing complex systems in which collective behavior can emerge out of an ensemble of many interacting "simple" components. In complex systems, even if the basic and local interactions are perfectly known, it is possible that the global behavior obeys new laws that cannot be obviously extrapolated from the individual properties, as if the whole is more than the sum of the parts. This property makes cellular automata a very interesting approach to model complex systems in physics, chemistry, and biology (examples are introduced in Deutsch and Dormann 2018; Chopard et al. 2002). CA can be defined as a 4-tuple (ℒ, S, N, F), where:

• ℒ is a finite or infinite regular lattice of nodes (discrete space)
• S is a finite set of states (discrete states); each cell i ∈ ℒ is assigned a state s ∈ S
• N is a finite set of neighbors
• F is a deterministic or probabilistic map

F : S^|N| → S, {s_i}_{i ∈ N} ↦ s,

which assigns a new state to a node depending on the state of all its neighbors indicated by N (local rule).
The evolution of CA is defined by applying the function F synchronously to all nodes of the lattice ℒ (homogeneity in space and time).
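This definition translates almost directly into code. The following Python sketch implements a small two-state CA with a von Neumann neighborhood and a probabilistic majority rule applied synchronously; the rule is only a placeholder and not one of the tumor models discussed below.

import numpy as np

rng = np.random.default_rng(1)

class CellularAutomaton:
    # A CA as the 4-tuple (lattice L, state set S, neighborhood N, map F).
    def __init__(self, size=50, n_states=2):
        self.n_states = n_states
        self.grid = rng.integers(0, n_states, size=(size, size))   # lattice with states from S
        self.neighborhood = [(-1, 0), (1, 0), (0, -1), (0, 1)]     # N: von Neumann neighborhood

    def local_rule(self, neighbor_states, p=0.9):
        # F: adopt the most frequent neighbor state with probability p,
        # otherwise pick a random state (probabilistic local rule).
        if rng.random() < p:
            return np.bincount(neighbor_states).argmax()
        return rng.integers(0, self.n_states)

    def step(self):
        # Apply F synchronously to all nodes (periodic boundary conditions).
        n = self.grid.shape[0]
        new_grid = np.empty_like(self.grid)
        for i in range(n):
            for j in range(n):
                nbrs = np.array([self.grid[(i + di) % n, (j + dj) % n]
                                 for di, dj in self.neighborhood])
                new_grid[i, j] = self.local_rule(nbrs)
        self.grid = new_grid

ca = CellularAutomaton()
for _ in range(10):
    ca.step()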
The above features can be extended, giving rise to several variants of the classical CA notion (Moreira and Deutsch 2002). Some of these are:

Asynchronous CA: In such CA, the restriction of simultaneous update of all the nodes is revoked, allowing for asynchronous update.
Nonhomogeneous CA: This variation allows the transition rules to depend on node position. Agent-based models are "relatives" of CA that lost the homogeneity property, i.e., each individual particle may have its own set of rules.
Coupled-map lattices: In this case the constraint of discrete state space is withdrawn, i.e., the state variables are assumed to be continuous. An important type of coupled map lattices is the so-called Lattice Boltzmann model (Succi 2001).
Structurally dynamic CA: In these systems, the underlying lattice is no longer a passive static object but becomes a dynamic component. Therefore, the lattice structure evolves depending on the values of the node's state variables.

An important class of cellular automaton models is lattice-gas cellular automata (LGCA). This CA model can describe discrete individuals interacting stochastically and moving in space. LGCA models were introduced to simulate aspects of fluid dynamics (Frisch et al. 1986) but have also been used successfully to investigate collective cell migration, biological pattern formation, and the growth, invasion, and progression of tumors (Böttger et al. 2012, 2015; Bussemaker et al. 1997; Chopard et al. 2010; de Franciscis 2011; Deutsch 1995, 2000; Dormann and Deutsch 2002; Dormann et al. 2001; Hatzikirou et al. 2015; Mente et al. 2012; Syga et al. 2019; Tektonidis et al. 2011; Mente et al. 2010; Buder et al. 2015, 2019; Dirkse et al. 2019; Alfonso et al. 2016, 2017; Reher et al. 2017; Talkenberger et al. 2017). LGCA models are cell-based and computationally efficient and allow to integrate statistical and biophysical models for different levels of biological knowledge (Deutsch and Lawniczak 1999; Hatzikirou et al. 2010; Mente et al. 2012; Nava-Sedeño et al. 2017a, b, 2020a, b).
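A lattice-gas cellular automaton separates each time step into a stochastic interaction (collision) step and a deterministic propagation step along velocity channels, with at most one cell per channel. The one-dimensional sketch below only illustrates this generic structure; it is not one of the cited tumor LGCA models.

import numpy as np

rng = np.random.default_rng(2)

def lgca_step(occupation):
    # occupation[x, c] in {0, 1}: node x, velocity channel c
    # (c = 0 moves left, c = 1 moves right); exclusion principle.
    n_nodes, n_channels = occupation.shape

    # Interaction (collision): randomly redistribute the cells at each node
    # over its velocity channels, conserving the number of cells.
    post_collision = np.zeros_like(occupation)
    for x in range(n_nodes):
        n_cells = occupation[x].sum()
        chosen = rng.permutation(n_channels)[:n_cells]
        post_collision[x, chosen] = 1

    # Propagation: every cell moves to the neighboring node along its channel.
    propagated = np.zeros_like(occupation)
    propagated[:, 0] = np.roll(post_collision[:, 0], -1)   # left-movers
    propagated[:, 1] = np.roll(post_collision[:, 1], +1)   # right-movers
    return propagated

state = (rng.random((100, 2)) < 0.3).astype(int)
for _ in range(50):
    state = lgca_step(state)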
Models of Tumor Invasion

This section reviews the existing cellular automata models of tumor invasion. Categorizing these models is a nontrivial task. Moreover, existing CA models describe tumor invasion at more than one scale (subcellular, cellular, and tissue). In this entry, we distinguish models that analyze (i) the invasive morphology, (ii) tumor cell migration and the influence of the ECM, (iii) metabolism and acidosis, and (iv) the emergence of tumor invasion.

Invasive Tumor Morphology

The tumor morphology arising from the spatial pattern formation of the tumor cell population has been recognized as a very important aspect of tumor growth. Several researchers have attempted to reveal the mechanisms of spatial pattern formation of invasive tumors. Here, we present the most representative CA models for the invasive tumor morphology.

Effects of Directed Cell Motion

Sander and Deisboeck (2002) developed a CA model to investigate the branching morphology of invasive brain tumors. In the model tumor cell motion is influenced by two key processes: (i) chemotaxis and (ii) "slime trail following." A typical example of a slime trail following mechanism is found in the motion of certain myxobacteria (Wolgemuth et al. 2002).
The authors show that the branching morphology of tumors can be explained as a result of chemotaxis and "slime trail following." In particular, simulations reproduce the branching pattern formation observed in in vitro cultures of glioma
cells. However, the assumption of slime trail following has not been proven biologically as yet.

Tumor Cell Migration and the Influence of the Extracellular Matrix
Cellular Automaton Modeling of Tumor Invasion, Fig. 2 Left: Microscopy image of a multicellular tumor spheroid, exhibiting an extensive branching system that rapidly expands into the surrounding extracellular matrix gel. These branches consist of multiple invasive cells. (Reprinted from Habib et al. 2003 with permission.) Right: Simulation of Anderson's model (Anderson et al. 2006) reproducing the experimentally observed morphology of invasive tumors
The authors show that adhesive dynamics can explain the "fingering" patterns observed in their simulations. Moreover, the authors demonstrate that the width of the invasion zone depends less on cell-cell adhesion and more on cell-ECM adhesion facilitated by haptotaxis and proteolysis.
Wurzel et al. (2005) model glioma tumor invasion with a lattice-gas cellular automaton (LGCA) (Deutsch and Dormann 2018). The authors address the question of how fiber tracts found in the brain's white matter influence the spatiotemporal evolution and the invading front morphology of glioma tumors. Cells are assumed to move, proliferate, and undergo apoptosis according to corresponding stochastic processes. Fiber tracts are represented as a local gradient field that enhances cell motion in a specific direction. The authors develop and analyze different scenarios of fiber tract influence. A gradient field may increase the speed of the invading tumor front. For high field intensities, the model predicts the formation of cancer islets at distances away from the main tumor bulk. The simulated invasion patterns qualitatively resemble clinical observations.
In the course of cancer progression, tumor cells undergo several phenotypic changes in terms of motility, metabolism, and proliferative rates. In particular, it is important to analyze the effect of the anaerobic metabolism of tumor cells and the acidification of the environment (as a side product of glycolysis) on tumor invasion (Fig. 4).
Patel et al. (2001) proposed a model of tumor growth to examine the roles of native tissue vascularity and anaerobic metabolism on the growth and invasion efficacy of tumors.
Cellular Automaton Modeling of Tumor Invasion, Fig. 3 The effect of the brain's fiber tracts on glioma growth. (a) A simulation is shown without taking into account the influence of fiber tracts. (b) The fiber tracts in the brain strongly drive the evolution of the tumor growth. (c, d) Figures display a close-up of the tumor area of the top (a, b) simulations. (Reprinted from Hatzikirou and Deutsch 2008)
The model assumes a vascularized host tissue. Anaerobic metabolism involves the consumption of glucose and the production of H+ ions, leading to the acidification of the local tissue. The vascular network allows for the absorption of H+ ions. Cells are assumed to be proliferative and non-motile. The pH level, i.e., the H+ concentration, and the glucose concentration determine the survival and death of the cells.
Simulations of the model show that (i) high tumor H+ ion production favors tumor invasion by the acidification of the neighboring host tissue and (ii) there is an optimal density of microvessels that maximizes tumor growth and invasion, by minimizing the acidification effects on tumor cell proliferation (absorption of H+ ions) and maximizing the negative effect of H+ ions on the neighboring tissue.
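The following single-site sketch caricatures the kind of rule just described: tumor cells consume glucose and excrete H+, vessels absorb H+, and survival depends on the local pH and glucose levels. All rates and thresholds are invented for illustration and are not the parameters of Patel et al. (2001).

def update_site(cell_type, glucose, h_ion):
    # cell_type: 'tumor', 'normal', 'vessel' or 'empty'.
    ph = 7.4 - h_ion                         # toy mapping from excess H+ to pH
    if cell_type == 'tumor':
        glucose = max(0.0, glucose - 0.05)   # anaerobic glucose consumption
        h_ion += 0.02                        # H+ production (acidification)
        alive = ph > 6.0 and glucose > 0.1
    elif cell_type == 'normal':
        alive = ph > 6.8                     # host tissue dies first in acid
    elif cell_type == 'vessel':
        h_ion = max(0.0, h_ion - 0.05)       # vasculature absorbs H+
        alive = True
    else:
        alive = True
    if not alive:
        cell_type = 'empty'                  # freed space can later be invaded
    return cell_type, glucose, h_ion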
Emergence of Tumor Invasion

Several models have been proposed that concentrate on the evolutionary dynamics of tumors
(Fig. 5). The main goal of these models is to understand under which environmental conditions particular phenotypes appear. Here, we review those models that focus on the mechanisms that allow the emergence of invasive behavior.

Cellular Automaton Modeling of Tumor Invasion, Fig. 4 Typically, tumors exhibit abnormal levels of glucose metabolism. Positron emission tomography (PET) techniques localize the regions of abnormal glycolytic activity and identify the tumor locus

Influence of Metabolic Changes

Smallbone et al. (2007) developed an evolutionary CA model to investigate the cell-microenvironment interactions that mediate somatic evolution of cancer cells. In particular, the authors investigate the sequence of tumor phenotypes that ultimately leads to invasive behavior. The model considers three phenotypes: (i) the hyperplastic phenotype that allows growth away from the basement membrane, (ii) the glycolytic phenotype that allows anaerobic metabolism (the "fuel" is glucose), and (iii) the acid-resistant phenotype that enables the cell to survive in low pH. Cells are allowed to proliferate, die, or adapt (change their phenotype). No cell motion is explicitly considered.
The model predicts three phases of somatic evolution: (i) Initially, cell survival and proliferation are dependent on the oxygen concentration. (ii) When the oxygen becomes scarce, the glycolytic phenotype confers a significant proliferative advantage. (iii) The side products of glycolysis, e.g., lactic acid, lower the microenvironmental pH and promote the selection of acid-resistant phenotypes. The latter cell type is able to invade the neighboring tissue since it takes advantage of
Cellular Automaton Modeling of Tumor Invasion, Fig. 5 The evolving microenvironment of breast cancer. The multiple stages of breast carcinogenesis are shown progressing from left to right (Normal Epithelium, Intraepithelial Neoplasia, Carcinoma in situ, Invasive Carcinoma, Metastatic Disease), along with histological representations of these stages. As indicated, the preinvasive stages occur in an avascular environment, where cancer and stroma are separated by the basement membrane (hypoxia, acidosis, and metabolic compartments), whereas cancer cells have direct access to vasculature following invasion, with cancer and stroma contiguous and angiogenic vasculature. (Reprinted from Gillies and Gatenby 2007)
the death of host cells, due to acidification, and proliferates using the available free space.
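This three-phenotype somatic-evolution scenario can be caricatured by a fitness function together with a small adaptation probability, as in the Python sketch below. Every number is a placeholder chosen only to mirror the three verbal phases; none are parameters of Smallbone et al. (2007).

import numpy as np

rng = np.random.default_rng(3)

PHENOTYPES = ('hyperplastic', 'glycolytic', 'acid-resistant')

def proliferation_probability(phenotype, oxygen, ph):
    # Phase (i): growth limited by oxygen; phase (ii): glycolytic cells
    # grow without oxygen; phase (iii): only acid-resistant cells tolerate
    # the low pH produced by glycolysis.
    if phenotype == 'hyperplastic':
        return oxygen if ph > 6.8 else 0.0
    if phenotype == 'glycolytic':
        return 0.6 if ph > 6.8 else 0.1
    return 0.6                                   # acid-resistant phenotype

def cell_step(phenotype, oxygen, ph, adaptation_rate=0.01):
    # A cell may adapt (switch phenotype) and then attempts to proliferate.
    if rng.random() < adaptation_rate:
        phenotype = rng.choice(PHENOTYPES)
    divides = rng.random() < proliferation_probability(phenotype, oxygen, ph)
    return phenotype, divides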
The Game of Invasion

Basanta et al. (2008) have developed a game theory-inspired CA that addresses the question of how invasive behavior emerges during tumor progression (see also Basanta et al. 2008; Hummert et al. 2014). The authors study the circumstances under which mutations that confer increased motility to cells can spread through a tumor composed of rapidly proliferating cells. The model assumes the existence of only two phenotypes: "proliferative" (high division rate and no motility) and "migratory" (low division rate and high motility). Mutations are allowed for by the random change of phenotypes. Nutrients are assumed to be uniformly distributed over the lattice.
Simulations show that low-nutrient conditions confer a reproductive advantage to motile cells over the proliferative ones. The model suggests novel ideas for therapeutic strategies, e.g., by increasing the oxygen supply around the tumor to favor the reproduction of proliferative cells over the migrating ones. This is not necessarily a therapy since there are benign tumors that are life-threatening even if they do not become invasive. Despite that, in most cases a growing but non-aggressive tumor will have a much better prognosis than a smaller but invasive one.
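A minimal sketch of this proliferative/migratory trade-off follows; the division and migration probabilities are illustrative stand-ins, not the payoff structure of the game-theoretical model.

import numpy as np

rng = np.random.default_rng(4)

def act(phenotype, nutrient, free_neighbors):
    # 'proliferative': high division rate, no motility;
    # 'migratory': low division rate, high motility.
    # Division requires free space; migration lets a cell reach free space,
    # which pays off when nutrients are scarce where the cell currently sits.
    if phenotype == 'proliferative':
        divides = free_neighbors > 0 and rng.random() < 0.8 * nutrient
        moves = False
    else:
        divides = free_neighbors > 0 and rng.random() < 0.3 * nutrient
        moves = rng.random() < 0.7
    # A rare mutation switches the phenotype of the daughter cell.
    if rng.random() < 0.01:
        daughter = 'migratory' if phenotype == 'proliferative' else 'proliferative'
    else:
        daughter = phenotype
    return divides, moves, daughter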
Discussion

In this entry, we have focused on one of the most important aspects of cancer progression: tumor invasion. The main processes involved in tumor invasion are related to tumor cell migration and cell-ECM interactions, especially ECM degradation/remodeling and tumor cell proliferation. These processes are evolving at different scales, e.g., cell-ECM adhesion is the response of tumor cells to ECM integrins (molecular level) leading to a haptotactical cell motion (cellular level) and influencing the tumor morphology (macroscopic level). Therefore, in order to understand tumor invasion dynamics, it is important to use mathematical tools that allow for modeling subcellular or cellular processes and to analyze the emergent macroscopic behavior. Individual-based models, especially CA, are well suited for this task. Moreover, some types of CA models, such as lattice-gas cellular automata (Deutsch and Dormann 2018; Hatzikirou and Deutsch 2008), facilitate analytical investigations allowing for deeper insight into the modeled phenomena.
In this entry, we reviewed the existing CA models of tumor invasion. The presented models explore central aspects of tumor invasion. Some of the models are in good agreement with biomedical observations for in vitro and in vivo tumors. In the following, we list the most interesting biological insights that can be gained from the reviewed models:

• The significance of hypoxia in the process of tumor progression: Activation of glycolysis and acidification of the host tissue facilitate tumor invasion. Low-nutrient conditions, such as hypoxia, may trigger invasive behaviors.
• Cell-cell adhesion: It is evident that intercellular adhesion has a great impact in the early stages of tumor growth. However, in tumor invasion the role of cell-cell adhesion is minor, since mainly the cell-ECM interactions appear to dictate the tumor cell behavior.
• Cell-ECM adhesion: This is an important process for tumor invasion. In particular, the heterogeneous structure of the ECM strongly influences the spatial morphology of invasive tumors.

Mathematical modeling offers potentially significant insight into tumor invasion. Several crucial questions have not been adequately addressed so far by modeling efforts.

Branching morphology: Several mechanisms have been proposed that lead to branching patterns, e.g., diffusion-limited aggregation, the interplay of cell-cell and cell-ECM adhesion, as well as chemotaxis or slime trail following motion. However, biologists and modelers have not yet identified a (continued)
Deutsch A (1995) Towards analyzing complex swarming tumour development: a review. Math Models Method
patterns in biological systems with the help of lattice- Appl Sci 15(11):1779–1794
gas cellular automata. J Biol Syst 3:947–955 Hatzikirou H, Brusch L, Deutsch A (2010) From cellular
Deutsch A (2000) A new mechanism of aggregation in a automaton rules to a macroscopic mean- field descrip-
lattice-gas cellular automaton model. Math Comput tion. Acta Phys Pol B Proc Suppl 3:399–416
Model 31:35–40 Hatzikirou H, Basanta B, Simon M, Schaller C, Deutsch
Deutsch A, Dormann S (2018) Cellular automaton model- A (2012) “Go or grow”: the key to the emergence of
ing of biological pattern formation. Birkhauser, Boston invasion in tumor progression? Math Med Biol
Deutsch A, Lawniczak AT (1999) Probabilistic lattice 29(1):49–65
models of collective motion and aggregation: from Hatzikirou H, Böttger K, Deutsch A (2015) Model-based
individual to collective dynamics. Math Biosci comparison of cell density-dependent cell migration
156:255–269 strategies. Math Model Nat Phenom 10:94–107
Dirkse A, Golebiewska A, Buder T, Nazarov PV, Muller A, Hummert S, Bohl K, Basanta D, Deutsch A, Werner S,
Poovathingal S, Brons NHC, Leite S, Sauvageot N, Theißen G, Schroeter A, Schuster S (2014) Evolution-
Sarkisjan D, Seyfrid M, Fritah S, Stieber D, Michelucci ary game theory: cells as players Mol. BioSyst., 10,
A, Hertel F, Herold-Mende C, Azuaje F, Skupin A, 3044–3065
Bjerkvig R, Deutsch A, Voss-Böhme A, Niclou SP Jbabdi S, Mandonnet E, Duffau H, Capelle L, Swanson K,
(2019) Stem cell-associated heterogeneity in Glioblas- Pelegrini-Issac M, Guillevin R, Benali H (2005) Simu-
toma results from intrinsic tumor plasticity shaped by the lation of anisotropic growth of low-grade gliomas using
microenvironment Nature Communications, 10(1):1787 diffusion tensor imaging. Magn Reson Med 54:616–624
Dormann S, Deutsch A (2002) Modeling of self-organized Lesne A (2007) Discrete vs continuous controversy in
avascular tumor growth with a hybrid cellular automa- physics. Math Struct Comput Sci 17:185–223
ton. Silico Bio 2:393–406 Marchant BP, Norbury J, Perumpanani AJ (2000) Traveling
Dormann S, Deutsch A, Lawniczak AT (2001) Fourier shock waves arising in a model of malignant invasion.
analysis of Turing-like pattern formation in cellular SIAM J Appl Math 60(2):263–276
automaton models. Futur Gener Comput Syst Mente C, Prade I, Brusch L, Breier G, Deutsch A (2010)
17:901–909. https://doi.org/10.1016/S0167-739X(00) Parameter estimation with a novel gradient- based opti-
00068-6 mization method for biological lattice-gas cellular
Fedotov S, Iomin A (2007) Migration and proliferation automaton models. J Math Bio 63:173–200
dichotomy in tumor-cell invasion. Phys Rev Lett Mente C, Prade I, Brusch L, Breier G, Deutsch A (2012)
98:118101–118104 A lattice-gas cellular automaton model for in vitro
Frieboes H, Lowengrub J, Wise S, Zheng X, Macklin P, sprouting angiogenesis. Acta Phys Pol B 5:99–115
Bearer E, Cristini V (2007) Computer simulation of Moreira J, Deutsch A (2002) Cellular automaton models of
glioma growth and morphology. NeuroImage tumour development: a critical review. Adv Compl
37(1):59–70 Syst 5:1–21
Friedl P (2004) Prespecification and plasticity: shifting Nava-Sedeño JM, Hatzikirou H, Klages R, Deutsch
mechanisms of cell migration. Curr Opin Cell Biol A (2017a) Cellular automaton models for time- corre-
16(1):14–23 lated random walks: derivation and analysis. Sci Rep
Frisch U, Hasslacher B, Pomeau Y (1986) Lattice-gas 7:16952
automata for the Navier-Stokes equation. Phys Rev Nava-Sedeño JM, Hatzikirou H, Peruani F, Deutsch
Lett 56:1505–1508 A (2017b) Extracting cellular automaton rules from
Gillies RJ, Gatenby RA (2007) Hypoxia and adaptive physical Langevin equation models for single and col-
landscapes in the evolution of carcinogenesis. Cancer lective cell migration. J Math Biol 75:1075–1100
Metastasis Rev 26:311–317 Nava-Sedeño JM, Voss-Böhme A, Hatzikirou H, Deutsch
Graner F, Glazier J (1992) Simulation of biological cell A, Peruani F (2020) Modeling collective cell motion:
sorting using a two-dimensional extended Potts model. are on- and off-lattice models equivalent? Roy. Soc.
Phys Rev Lett 69:2013–2016 Open Sc
Habib S, Molina-Paris C, Deisboeck TS (2003) Complex Nava-Sedeno JM, Hatzikirou H, Voss-Böhme A, Brusch L,
dynamics of tumors: modeling an emerging brain Deutsch A, Peruani F (2020) Vectorial active matter on
tumor system with coupled reaction-diffusion equa- the lattice: emergence of polar condensates and nematic
tions. Phys A 327:501–524 bands in an active zero-range process hal-02460291
Hanahan D, Weinberg R (2000) The hallmarks of cancer. Nowell PC (1976) The clonal evolution of tumor cell
Cell 100:57–70 populations. Science 194(4260):23–28
Hanahan D, Weinberg R (2011) Hallmarks of cancer. The Patel A, Gawlinski E, Lemieux S, Gatenby R (2001) Cel-
next generation. Cell 144:646–674 lular automaton model of early tumor growth and inva-
Hatzikirou H, Deutsch A (2008) Cellular automata as sion: the effects of native tissue vascularity and
microscopic models of cell migration in heterogeneous increased anaerobic tumor metabolism. J Theor Biol
environments. Curr Top Dev Biol 81:401–434 213:315–331
Hatzikirou H, Deutsch A, Schaller C, Simon M, Swanson Perumpanani AJ, Sherratt JA, Norbury J, Byrne HM
K (2005) Mathematical modelling of glioblastoma (1996) Biological inferences from a mathematical
model of malignant invasion. Invasion Metastasis Succi S (2001) The lattice Boltzmann equation: for fluid
16:209–221 dynamics and beyond, series numerical mathematics
Perumpanani AJ, Sherratt JA, Norbury J, Byrne HM and scientific computation. Oxford University Press,
(1999) A two parameter family of travelling waves Oxford
with a singular barrier arising from the modelling of Swanson KR, Alvord EC, Murray J (2002) Quantifying
extracellular matrix mediated cellular invasion. Phys efficacy of chemotherapy of brain tumors (gliomas)
D 126:145–159 with homogeneous and heterogeneous drug delivery.
Preziozi L (ed) (2003) Cancer modelling and simulation. Acta Biotheor 50:223–237
Chapman & Hall/CRC Press, Boca Raton Syga S, Nava-Sedeño JM, Brusch L, Deutsch A (2019)
Reher D, Klink B, Deutsch A, Voss-Böhme A (2017) Cell A lattice-gas cellular automaton model for discrete
adhesion heterogeneity reinforces tumour cell dissem- excitable media, chapter 15. In: Müller S, Tsuji K
ination: novel insights from a mathematical model (eds) Spirals and vortices. Springer, Cham,
Biology Direct, 12(1):18 pp 253–264, Springer
Sander LM, Deisboeck TS (2002) Growth patterns of Talkenberger K, Cavalcanti-Adam EA, Voss-Böhme A,
microscopic brain tumours. Phys Rev E 66:051901 Deutsch A (2017) Amoeboid-mesenchymal migration
Sanga S, Frieboes H, Zheng X, Gatenby R, Bearer E, Cristini plasticity promotes invasion only in complex heteroge-
V (2007) Predictive oncology: multidisciplinary, multi- neous microenvironments Scientific Reports, 7:9237
scale in-silico modeling linking phenotype, morphology Tektonidis M, Tektonidis HH, Simon M, Schaller C,
and growth. NeuroImage 37(1):120–134 Deutsch A (2011) Identification of intrinsic in vitro
Sherratt JA, Chaplain MAJ (2001) A new mathematical cellular mechanisms for glioma invasion. J Theor Bio
model for avascular tumour growth. J Math Biol 287:131–147
43:291–312 Turner S, Sherratt JA (2002) Intercellular adhesion and
Sherratt JA, Nowak MA (1992) Oncogenes, anti- cancer invasion: a discrete simulation using the
oncogenes and the immune response to cancer: a math- extended Potts model. J Theor Biol 216:85–100
ematical model. Proc R Soc Lond B 248:261–271 Wolgemuth CW, Hoiczyk E, Kaiser D, Oster GF (2002)
Smallbone K, Gatenby R, Gillies R, Maini P, Gavaghan D How myxobacteria glide. Curr Biol 12(5):369–377
(2007) Metabolic changes during carcinogenesis: Wurzel M, Schaller C, Simon M, Deutsch A (2005) Cancer
potential impact on invasiveness. J Theor Biol cell invasion of normal brain tissue: guided by pre-
244:703–713 pattern? J Theor Med 6(1):21–31
Attributes Attributes are a C# feature for includ-
Agent-Based Modeling and ing metadata in compiled code.
Computer Languages Bytecode Bytecode is compiled Java binary code.
C# C# (Archer 2001) is an object-oriented pro-
Michael J. North1 and Charles M. Macal2 gramming language that was developed and is
1
Argonne National Laboratory, Global Security maintained by Microsoft. C# is one of many
Sciences Division, Argonne, IL, USA languages that can be used to generate Micro-
2
Center for Complex Adaptive Agent Systems soft.NET Framework code. This code is run
Simulation (CAS2), Decision and Information using a “virtual machine” that potentially gives
Sciences Division, Argonne National Laboratory, it a consistent execution environment on dif-
Argonne, IL, USA ferent computer platforms.
C++ C++ is a widely used object-oriented pro-
gramming language that was created by Bjarne
Article Outline Stroustrup (Stroustrup 2008) at AT&T. C++ is
widely used for both its object-oriented structure
Glossary and its ability to be easily compiled into native
Definition: Agent-Based Modeling and Computer machine code.
Languages Class A class is the object-oriented inheritable
Agent-Based Modeling binding of procedures and data that provides
Types of Computer Languages the basis for creating objects.
Requirements of Computer Languages for Agent- Common Intermediate Language Common
Based Modeling Intermediate Language (CIL) is compiled
Example Computer Languages Useful for Agent- binary code for the Microsoft.NET Frame-
Based Modeling work. CIL was originally called Microsoft
Future Directions Intermediate Language (MSIL).
Bibliography Computational Algebra Systems Compu-
tational Algebra Systems (CAS) are computa-
tional mathematics systems that calculate
Keywords using symbolic expressions.
Computational Mathematics Systems Compu-
Agent-based mode · Agent-based simulation ·
tational Mathematics Systems (CMS) are soft-
Computer language · Complex adaptive
ware programs that allow users to apply
systems modeling
powerful mathematical algorithms to solve
problems through a convenient and interactive
Glossary user interface. CMS typically supply a wide
range of built-in functions and algorithms.
Agent An agent is a self-directed component in Computer Language A computer language is a
an agent-based model. method of specifying directives for computers.
Agent-Based Model An agent-based model is a Computer programming languages, or more
simulation made up of a set of agents and an simply programming languages, are an impor-
agent interaction environment. tant category of computer languages.
Annotations Annotations are a Java feature for Computer Programming Language Please see
including metadata in compiled code. the entry for “Programming Language.”
Aspects Aspects are a way to implement dis- Declarative Language According to Watson
persed but recurrent tasks in one location. (1989) a “declarative language (or non-
© Springer Science Business Media New York (outside the USA) 2020 865
M. Sotomayor et al. (eds.), Complex Social and Behavioral Systems,
https://doi.org/10.1007/978-1-0716-0368-0_8
Originally published in
R. A. Meyers (ed.), Encyclopedia of Complexity and Systems Science, © Springer Science Business Media New York
(outside the USA) 2014 https://doi.org/10.1007/978-3-642-27737-5_8-5
Module According to Stevens et al. (1974), “the Record A record is an independently address-
term module is used to refer to a set of one or able collection of data items.
more contiguous program statements having a Reflection Reflection, combined with dynamic
name by which other parts of the system can method invocation, is a Java and C# approach
invoke it and preferably having its own distinct to higher-order programming.
set of variable names.” ReLogo An object-oriented Logo implementa-
NetLogo NetLogo (Wilensky 1999) is an agent- tion in Repast Simphony.
based modeling and simulation platform that Repast The Recursive Porous Agent Simulation
uses a domain-specific language to define Toolkit (Repast) is a free and open source
models. NetLogo models are built using a met- family of agent-based modeling and simulation
aphor of turtles as agents and patches as envi- platforms (ROAD 2013). Information on
ronmental components (Wilensky 1999). Repast and free downloads can be found at
NetLogo is Java based. NetLogo is free for http://repast.sourceforge.net/
use in education and research. More informa- Repast Simphony Repast Simphony is the
tion on NetLogo and downloads can be found member of the Repast Suite of free and open
at http://ccl.northwestern.edu/netlogo/ source agent-based modeling and simulation
Non-procedural Language Please see the entry software (North et al. 2013). The Java-based
for Declarative Language. Repast Simphony system includes advanced
Object An object is the instantiation of a class to features for specifying, executing, and analyz-
produce executable instances. ing agent-based simulations.
Objective-C Objective-C is an object-oriented Runtime Type Identification Runtime Type
language that extends the C language. Identification (RTTI) is part of C++’s approach
Object-Oriented Language Object-oriented lan- to higher-order programming. Function
guages are structured languages that have special pointers are another component of this
features for binding data with procedures, inher- approach.
itance, encapsulation, and polymorphism. Care- Structured Language Structured languages are
ful abstraction that avoids unnecessary details is languages that divide programs into separate
an important design principle associated with the modules, each of which has one controlled
use of object-oriented languages. entry point, a limited number of exit points,
Observer The observer is a NetLogo agent that and no internal jumps (Dijkstra 1968).
has a view of an entire model. There is exactly Swarm Swarm (Swarm Development Group
one observer in every NetLogo model. 2013) is a free and open source agent-based
ODD Protocol Describes models using a three- modeling and simulation library maintained by
part natural language approach: overview, con- the Swarm Development Group. The core
cepts, and details (Grimm et al. 2006). Swarm system uses Objective-C. A Java-
Patch A patch is a NetLogo agent with a fixed based “Java Swarm” wrapper for the
location on a master grid. Objective-C core is also available. Information
Polymorphism Polymorphism is the ability of on Swarm and free downloads can be found at
an object-oriented class to respond to multiple http://www.swarm.org/
related messages, often method calls with the Templates Templates are a C++ feature for gen-
same name but different parameters. eralizing classes.
Procedural Language According to Watson Turtle A turtle is a mobile NetLogo agent.
(1989) “procedural languages. . .are those in Unified Modeling Language The Unified
which the action of the program is defined by a Modeling Language (UML) is a predomi-
series of operations defined by the programmer.” nantly visual approach to specifying the design
Programming Language A programming lan- of software (Object Management Group 2001,
guage is a computer language that allows any Object Management Group 2013) that consists
computable activity to be expressed. of a variety of diagram types.
environment and in its interactions with other agents, at least over a limited range of situations that are of interest.
• An agent is situated, living in an environment with which it interacts along with other agents. Agents have the ability to recognize and distinguish the traits of other agents. Agents also have protocols for interaction with other agents, such as for communication, and the capability to respond to the environment.
• An agent may be goal directed, having targets to achieve with respect to its behaviors. This allows an agent to compare the outcome of its behavior to its goals. An agent’s goals need not be comprehensive or well defined. For example, an agent does not necessarily have formally stated objectives it is trying to maximize.
• An agent might have the ability to learn and adapt its behaviors based on its experiences. An agent might have rules that modify its behavior over time. Generally, learning and adaptation at the agent level requires some form of memory to be built into the agent’s behaviors.

Often, in an agent-based model, the population of agents varies over time, as agents are born and die. Another form of adaptation can occur at the agent population level. Agents that are fit are better able to sustain themselves and possibly reproduce as time in the simulation progresses, while agents that have characteristics less suited to their continued survival are excluded from the population.

Another basic assumption of agent-based modeling is that agents have access only to local information. Agents obtain information about the rest of the world only through their interactions with the limited number of agents around them at any one time, and from their interactions with a local patch of the environment in which they are situated.

These aspects of how agent-based modeling treats agents highlight the fact that the full range of agent diversity can be incorporated into an agent-based model. Agents are diverse and heterogeneous as well as dynamic in their attributes and behavioral rules. There is no need to make agents homogeneous through aggregating agents into groups or by identifying the “average” agent as representative of the entire population. Behavioral rules vary in their sophistication, how much information is considered in the agent decisions (i.e., cognitive “load”), the agent’s internal models of the external world including the possible reactions or behaviors of other agents, and the extent of memory of past events the agent retains and uses in its decisions. Agents can also vary by the resources they have managed to accumulate during the simulation, which may be due to some advantage that results from specific attributes. The only limit on the number of agents in an agent-based model is imposed by the computational resources required to run the model.

As a point of clarification, agent-based modeling is also known by other names. ABS (agent-based systems), IBM (individual-based modeling), and MAS (multi-agent systems) are widely used acronyms, but “ABM” will be used throughout this discussion. The term “agent” has connotations other than how it is used in ABM. For example, ABM agents are different from the typical agents found in mobile agent systems. “Mobile agents” are lightweight software proxies that roam the World Wide Web and perform various functions programmed by their owners such as gathering information from Web sites. To this extent, mobile agents are autonomous and share this characteristic with agents in ABM.

Types of Computer Languages

A “computer language” is a method of specifying directives for computers. “Computer programming languages,” or “programming languages,” are an important category of computer languages. A programming language is a computer language that allows any computable activity to be expressed. This entry focuses on computer programming languages rather than the more general computer languages, since virtually all agent-based modeling systems require the power of programming languages. This entry sometimes uses the simpler term “computer languages” when referring to computer programming languages. According to Watson (1989),
Programming languages are used to describe algorithms, that is, sequences of steps that lead to the solution of problems... A programming language can be considered to be a ‘notation’ that can be used to specify algorithms with precision.

Watson (1989) goes on to say that “programming languages can be roughly divided into four groups: imperative languages, functional languages, logic programming languages, and others.” Watson (1989) states that in imperative languages “there is a fundamental underlying dependence on the assignment operation and on variables implemented as computer memory locations, whose contents can be read and altered.” However, “in functional languages (sometimes called applicative languages) the fundamental operation is function application” (Watson 1989). Watson cites LISP as an example. Watson (1989) continues by noting that “in a logic programming language, the programmer needs only to supply the problem specification in some formal form, as it is the responsibility of the language system to infer a method of solution.” Watson cites Prolog as an example.

A useful feature of most functional languages, many logic programming languages, and some imperative languages is higher-order programming. According to Reynolds (1998),

In analogy with mathematical logic, we will say that a programming language is higher-order if procedures or labels can occur as data, i.e., if these entities can be used as arguments to procedures, as results of functions, or as values of assignable variables. A language that is not higher-order will be called first-order.
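Higher-order programming in Reynolds’ sense is now directly expressible in mainstream object-oriented languages as well. The short Java sketch below (purely illustrative names, not taken from any particular toolkit) treats agent behaviors as data: they are stored in a list, passed as arguments, and returned as the results of other functions.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.function.Consumer;
    import java.util.function.Function;

    public class HigherOrderDemo {
        public static void main(String[] args) {
            // Behaviors are values: they can be stored in collections and passed around.
            List<Consumer<String>> behaviors = new ArrayList<>();
            behaviors.add(name -> System.out.println(name + " moves"));
            behaviors.add(name -> System.out.println(name + " trades"));

            // A function that builds and returns a new behavior (a higher-order result).
            Function<Integer, Consumer<String>> repeat =
                times -> name -> {
                    for (int i = 0; i < times; i++) {
                        System.out.println(name + " acts (" + i + ")");
                    }
                };
            behaviors.add(repeat.apply(2));

            // The stored behaviors are applied later, like any other data.
            for (Consumer<String> behavior : behaviors) {
                behavior.accept("agent-1");
            }
        }
    }

A first-order language would require each of these behaviors to be a fixed, named procedure rather than a value that the model can create, store, and manipulate at run time.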
Watson (1989) offers that “another way of grouping programming languages is to classify them as procedural or declarative languages.” Elaborating, Watson (1989) states that

Procedural languages... are those in which the action of the program is defined by a series of operations defined by the programmer. To solve a problem, the programmer has to specify a series of steps (or statements) which are executed in sequence.

On the other hand, Watson (1989) notes that

Programming in a declarative language (or non-procedural language) involves the specification of a set of rules defining the solution to the problem; it is then up to the computer to determine how to reach a solution consistent with the given rules... The language Prolog falls into this category, although it retains some procedural aspects. Another widespread non-procedural system is the spreadsheet program.

Imperative and functional languages are usually procedural, while logic programming languages are generally declarative. This distinction is important since it implies that most imperative and functional languages require users to define how each operation is to be completed, while logic programming languages only require users to define what is to be achieved. However, when faced with multiple possible solutions with different execution speeds and memory requirements, imperative and functional languages offer the potential for users to explicitly choose more efficient implementations over less efficient ones. Logic programming languages generally need to infer which solution is best from the problem description and may or may not choose the most efficient implementation. Naturally, this potential strength of imperative and functional languages may also be cast as a weakness. With imperative and functional languages, users need to correctly choose a good implementation among any competing candidates that may be available.

Similarly to Watson (1989), Van Roy and Haridi (2004) define several common computational models, namely, those that are object oriented, those that are logic based, and those that are functional. Object-oriented languages are procedural languages that bind procedures (i.e., “encapsulated methods”) to their corresponding data (i.e., “fields”) in nested hierarchies (i.e., “inheritance” graphs) such that the resulting “classes” can be instantiated to produce executable instances (i.e., “objects”) that respond to multiple related messages (i.e., “polymorphism”). Logic-based languages correspond to Watson’s (1989) logic programming languages. Similarly, Van Roy and Haridi’s (2004) functional languages correspond to those of Watson (1989).

Two additional types of languages can be added to Van Roy and Haridi’s (2004) list of three. These are unstructured and structured languages (Dijkstra 1968). Both unstructured and structured languages are procedural languages.
Unstructured languages are languages that rely on step-by-step solutions such that the solutions can contain arbitrary jumps between steps (Dijkstra 1968). BASIC, COBOL, Fortran, and C are examples of unstructured languages. The arbitrary jumps are often implemented using “goto” statements. Unstructured languages were famously criticized by Edsger Dijkstra in his classic paper “Go To Statement Considered Harmful” (1968). This and related criticism led to the introduction of structured languages.

Structured languages are languages that divide programs into separate modules, each of which has one controlled entry point, a limited number of exit points, and no internal jumps (Dijkstra 1968). Following Stevens et al. (1974) “the term module is used to refer to a set of one or more contiguous program statements having a name by which other parts of the system can invoke it and preferably having its own distinct set of variable names.” Structured language modules, often called procedures, are generally intended to be small. As such, large numbers of them are usually required to solve complex problems. Standard Pascal is an example of a structured, but not object-oriented, language. As stated earlier, C is technically an unstructured language (i.e., it allows jumps within procedures and “long jumps” between procedures), but it is used so often in a structured way that many people think of it as a structured language.

The quality of modularization in structured language code is often considered to be a function of coupling and cohesion (Stevens et al. 1974). Coupling is the tie between modules such that the proper functioning of one module depends on the functioning of another module. Cohesion refers to the ties within a module such that proper functioning of one line of code in a module depends on the functioning of another line of code in the same module. The goal for modules is maximizing cohesion while minimizing coupling.

Object-oriented languages are a subset of structured languages. Object-oriented methods and classes are structured programming modules that have special features for binding data, inheritance, and polymorphism. The previously introduced concepts of coupling and cohesion apply to classes, objects, methods, and fields the same way that they apply to generic structured language modules. Objective-C, C++, C#, and Java are all examples of object-oriented languages. As with C, the languages Objective-C, C++, and C# offer goto statements but they have object-oriented features and are generally used in a structured way. Java is an interesting case in that the word “goto” is reserved as a keyword in the language specification, but it is not intended to be implemented.

It is possible to develop agent-based models using any of the programming languages discussed above, namely, unstructured languages, structured languages, object-oriented languages, logic-based languages, and functional languages. Specific examples are provided later in this entry. However, certain features of programming languages are particularly well suited for supporting the requirements of agent-based modeling and simulation.

Requirements of Computer Languages for Agent-Based Modeling

The requirements of computer languages for agent-based modeling and simulation include the following:

• There is a need to create well-defined modules that correspond to agents. These modules should bind together agent state data and agent behaviors into integrated independently addressable constructs. Ideally these modules will be flexible enough to change structure over time and to optionally allow fuzzy boundaries to implement models that go beyond methodological individualism (Heath 2005).
• There is a need to create well-defined containers that correspond to agent environments. Ideally these containers will be recursively nestable or will otherwise support sophisticated definitions of containment.
• There is a need to create well-defined spatial relationships within agent environments. These relationships should include notions of abstract space (e.g., lattices), physical space (e.g., maps), and connectedness (e.g., networks).
• There is a need to easily set up model configurations such as the number of agents, the relationships between agents, the environmental details, and the results to be collected.
• There is a need to conveniently collect and analyze model results.

Each of the kinds of programming languages, namely, unstructured languages, structured languages, object-oriented languages, logic-based languages, and functional languages, can address these requirements.

Unstructured languages generally support procedure definitions which can be used to implement agent behaviors. They also sometimes support the collection of diverse data into independently addressable constructs in the form of data structures often called “records.” However, they generally lack support for binding procedures to individual data items or records of data items. This lack of support for creating integrated constructs also typically limits the language-level support for agent containers. Native support for implementing spatial environments is similarly limited by the inability to directly bind procedures to data.

As discussed in the previous section, unstructured languages offer statements to implement execution jumps. The use of jumps within and between procedures tends to reduce module cohesion and increase module coupling compared to structured code. The result is reduced code maintainability and extensibility compared to structured solutions. This is a substantial disadvantage of unstructured languages.

In contrast, many have argued that, at least theoretically, unstructured languages can achieve the highest execution speed and lowest memory usage of the language options since nearly everything is left to the application programmers. In practice, programmers implementing agent-based models in unstructured languages usually need to write their own tools to form agents by correlating data with the corresponding procedures. Ironically, these tools are often similar in design, implementation, and performance to some of the structured and object-oriented features discussed later.

Unstructured languages generally do not provide special support for application data configuration, program output collection, or program results analysis. As such, these tasks usually need to be manually implemented by model developers.

In terms of agent-based modeling, structured languages are similar to unstructured languages in that they do not provide tools to directly integrate data and procedures into independently addressable constructs. Therefore, structured language support for agents, agent environments, and agent spatial relationships is similar to that provided by unstructured languages. However, the lack of jump statements in structured languages tends to increase program maintainability and extensibility compared to unstructured languages. This generally gives structured languages a substantial advantage over unstructured languages for implementing agent-based models.

Object-oriented languages build on the maintainability and extensibility advantages of structured languages by adding the ability to bind data to procedures. This binding in the form of classes provides a natural way to implement agents. In fact, object-oriented languages have their roots in Ole-Johan Dahl and Kristen Nygaard’s Simula simulation language (Dahl and Nygaard 1966, 2001; Van Roy and Haridi 2004)! According to Dahl and Nygaard (1966),

SIMULA (SIMULation LAnguage) is a language designed to facilitate formal description of the layout and rules of operation of systems with discrete events (changes of state). The language is a true extension of ALGOL 60 (Backus et al. 1963), i.e., it contains ALGOL 60 as a subset. As a programming language, apart from simulation, SIMULA has extensive list processing facilities and introduces an extended co-routine concept in a high-level language.

Dahl and Nygaard go on to state the importance of specific languages for simulation (1966) as follows:

Simulation is now a widely used tool for analysis of a variety of phenomena: nerve networks, communication systems, traffic flow, production systems, administrative systems, social systems, etc. Because of the necessary list processing, complex data structures and program sequencing demands, simulation programs are comparatively difficult to write in machine language or in ALGOL or FORTRAN. This alone calls for the introduction of simulation languages.
However, still more important is the need for a set of basic concepts in terms of which it is possible to approach, understand and describe all the apparently very different phenomena listed above. A simulation language should be built around such a set of basic concepts and allow a formal description which may generate a computer program. The language should point out similarities and differences between systems and force the research worker to consider all relevant aspects of the systems. System descriptions should be easy to read and print and hence useful for communication.

Again, according to Dahl and Nygaard (2001),

SIMULA I (1962–1965) and Simula 67 (1967) are the two first object-oriented languages. Simula 67 introduced most of the key concepts of object-oriented programming: both objects and classes, subclasses (usually referred to as inheritance) and virtual procedures, combined with safe referencing and mechanisms for bringing into a program collections of program structures described under a common class heading (prefixed blocks).
The Simula languages were developed at the Norwegian Computing Center, Oslo, Norway by Ole-Johan Dahl and Kristen Nygaard. Nygaard’s work in Operational Research in the 1950s and early 1960s created the need for precise tools for the description and simulation of complex man–machine systems. In 1961 the idea emerged for developing a language that both could be used for system description (for people) and for system prescription (as a computer program through a compiler). Such a language had to contain an algorithmic language, and Dahl’s knowledge of compilers became essential... When the inheritance mechanism was invented in 1967, Simula 67 was developed as a general programming language that also could be specialised for many domains, including system simulation.

Generally, object-oriented classes are used to define agent templates, and instantiated objects are used to implement specific agents. Agent environment templates and spatial relationship patterns are also typically implemented using classes. Recursive environment nesting and abstract spaces, physical spaces, and connectedness can all be represented in relatively straightforward ways. Instantiated objects are used to implement specific agent environments and spatial relationships in individual models. Within these models, model configurations are also commonly implemented as objects instantiated from one or more classes. However, as with unstructured and structured languages, object-oriented languages generally do not provide special support for application data configuration, collection of outputs, or analysis of results. As such, these tasks usually need to be manually implemented by model developers. Regardless of this, the ability to bind data and procedures provides such a straightforward method for implementing agents that most agent-based models are written using object-oriented languages.
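As a concrete illustration of this binding of state and behavior, the following minimal Java sketch (hypothetical names; real toolkits add scheduling, visualization, and data collection on top of a class like this) defines an agent template as a class whose instances carry their own data and rules. The loosely typed attribute map at the end is one way a statically typed language can approximate run-time structural flexibility.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // A minimal agent template: state (fields) and behavior (methods) bound together.
    public class SimpleAgent {
        private final String id;
        private double wealth;
        private final List<SimpleAgent> neighbors = new ArrayList<>();
        // Loosely typed attributes let an agent gain or lose properties while the
        // model runs, approximating what dynamic languages provide natively.
        private final Map<String, Object> attributes = new HashMap<>();

        public SimpleAgent(String id, double initialWealth) {
            this.id = id;
            this.wealth = initialWealth;
        }

        public void addNeighbor(SimpleAgent other) { neighbors.add(other); }

        public void setAttribute(String key, Object value) { attributes.put(key, value); }

        // One behavioral rule: interact only with locally known neighbors.
        public void step() {
            for (SimpleAgent neighbor : neighbors) {
                double transfer = Math.min(1.0, wealth);
                wealth -= transfer;
                neighbor.wealth += transfer;
            }
        }

        public String getId() { return id; }
        public double getWealth() { return wealth; }
    }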
It should be noted that traditional object-oriented languages do not provide a means to modify class and object structures once a program begins to execute. Newer “dynamic” object-oriented languages such as Groovy (Koenig et al. 2007) offer this capability. This potentially allows agents to gain and lose attributes and methods during the execution of a model based on the flow of events in a simulation. This in turn offers the possibility of implementing modules with fuzzy boundaries that are flexible enough to change structure over time.

As discussed in the previous section, logic-based languages offer an alternative to the progression formed by unstructured, structured, and object-oriented languages. Logic-based languages can provide a form of direct support for binding data (e.g., asserted propositions) with actions (e.g., logical predicates), sometimes including the use of higher-order programming. In principle, each agent can be implemented as a complex predicate with multiple nested sub-terms. The sub-terms, which may contain unresolved variables, can then be activated and resolved as needed during model execution. Agent templates which are analogous to object-oriented classes can be implemented using the same approach but with a larger number of unresolved variables. Agent environments and the resulting relationships between agents can be formed in a similar way. Since each of these constructs can be modified at any time, the resulting system can change structure over time and may even allow fuzzy boundaries. In practice this approach is rarely, if ever, used. As with the previously discussed approaches, logic-based languages usually do not provide special support for application data configuration, output collection, or results analysis so these usually need to be manually developed.
Functional languages offer yet another alternative to the previously discussed languages. Like logic-based and object-oriented languages, functional languages often provide a form of direct support for binding data with behaviors. This support often leverages the fact that most functional languages support higher-order programming. As a result, the data is usually in the form of nested lists of values and functions, while the behaviors themselves are implemented in the form of functions. Agent templates (i.e., “classes”), agent environments, and agent relationships can be implemented similarly. Each of the lists can be dynamically changed during a simulation run so the model structure can evolve and can potentially have fuzzy boundaries. Unlike the other languages discussed so far, a major class of functional languages, namely, those designed for computational mathematics, usually include sophisticated support for program output collection and results analysis. An example is Mathematica (Wolfram 2013). If the application data is configured in mathematically regular ways, then these systems may also provide support for application data setup.
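One way to read this description in a mainstream language is to treat an agent as a nested structure of values together with a list of behavior functions. The Java sketch below is only an approximation of the functional style (illustrative names; a genuine functional language expresses this far more directly), but it shows behaviors held in a list that can grow or shrink while the model runs.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.function.UnaryOperator;

    public class FunctionalStyleAgent {
        public static void main(String[] args) {
            // Agent "data" as a nested structure of named values.
            Map<String, Double> state = new HashMap<>();
            state.put("energy", 10.0);

            // Behaviors as functions over the state; the list itself is mutable data.
            List<UnaryOperator<Map<String, Double>>> behaviors = new ArrayList<>();
            behaviors.add(s -> { s.put("energy", s.get("energy") - 1.0); return s; }); // metabolize
            behaviors.add(s -> { s.put("energy", s.get("energy") + 0.5); return s; }); // graze

            // One simulated tick: apply each behavior in turn.
            for (UnaryOperator<Map<String, Double>> behavior : behaviors) {
                state = behavior.apply(state);
            }
            System.out.println("energy = " + state.get("energy"));
        }
    }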
Example Computer Languages Useful for Agent-Based Modeling

Design Languages
Design languages provide a way to describe models at a more abstract level than typical programming languages. Some design languages ultimately offer the opportunity to compile to executable code. Other design languages act as intermediate stages between initial conceptualization and compilable implementation. In either case, the resulting design documents can be used to describe the model once they are complete.

Design Patterns
Patterns have offered a powerful yet simple way to conceptualize and communicate ideas in many disciplines since Christopher Alexander introduced them in the late 1970s (Alexander et al. 1977; Alexander 1979). Design patterns form a “common vocabulary” describing tried-and-true solutions for commonly faced software design problems (Coplien 2001). Software design patterns were popularized by Gamma et al. (1995). They have subsequently been shown to be of substantial value in improving software quality and development efficiency. Several authors, such as North and Macal (2011), have suggested that there is great potential for patterns to improve the practice of agent-based modeling as well. North and Macal (2013) discussed product and process patterns. Product patterns are a vocabulary for designing or implementing models. Process patterns are methods for designing, implementing, or using models.

According to Alexander (1979), “each pattern is a three-part rule, which expresses a relation between a certain context, a problem, and a solution.” The first part of a pattern characterizes the situation in which the problem occurs. The second part defines the problem to be solved. The third part describes a resolution to the outstanding issue as well as its positive and negative consequences. Every pattern has both fixed and variable elements. The fixed elements define the pattern. The variable elements allow the pattern to be adapted for each situation. Each pattern identifies a set of decisions to make in the development of a system. Sets of patterns that have been adapted for the situation are then used as a vocabulary to describe solutions to problems. North and Macal (2013) introduce a catalog of patterns specifically for agent-based modeling. An example from North and Macal (2013) is shown in Table 1.

ODD Protocol
Grimm et al.’s (2006) ODD protocol describes models using a three-part approach: overview, concepts, and details. The model overview includes a statement of the model’s intent, a description of the main variables, and a discussion of the agent activities. The design concepts include a discussion of the foundations of the model. The details include the initial setup configuration, input value definitions, and descriptions of any embedded models. The resulting natural language document cannot be translated directly into executable code. However, it provides a basis for describing the design of models for publications, user documentation, and model development programmers.
Agent-Based Modeling and Computer Languages, Table 1 The scheduler scramble product design pattern
Name: Scheduler scramble
Problem: How can multiple agents act during the same scheduler pattern clock tick without biasing the model results or giving a long-term advantage or disadvantage to any one agent?
Context: Two or more agents from the agent-based model pattern may attempt to simultaneously execute behaviors during the same clock tick
Forces: Activating a behavior before other agents can be either an artificial advantage or disadvantage for the agent that goes first. Agent rules should not have to include coordination functions
Solution: The competing behaviors at each clock tick are scheduled in a random order. This is a simulation pattern
Resulting context: A sequential behavioral activation order that is unbiased and fair to every agent over the long term is produced
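The solution row of Table 1 can be sketched in a few lines of plain Java (hypothetical names; dedicated toolkits such as Repast and NetLogo supply their own randomized schedulers): the activation order of the competing agents is reshuffled at every clock tick so that no agent keeps a first-mover advantage.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Random;

    public class SchedulerScrambleDemo {
        interface Steppable { void step(long tick); }

        public static void main(String[] args) {
            Random random = new Random(42); // fixed seed for repeatable runs
            List<Steppable> agents = new ArrayList<>();
            for (int i = 0; i < 5; i++) {
                final int id = i;
                agents.add(tick -> System.out.println("tick " + tick + ": agent " + id + " acts"));
            }

            for (long tick = 1; tick <= 3; tick++) {
                // Scheduler scramble: re-randomize the activation order each tick.
                Collections.shuffle(agents, random);
                for (Steppable agent : agents) {
                    agent.step(tick);
                }
            }
        }
    }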
packages. Spreadsheets are usually programmed using a “macro language.” As discussed further in North and Macal (2007), any modern spreadsheet program can be used to do basic agent-based modeling. The most common convention is to associate each row of a primary spreadsheet worksheet with an agent and use consecutive columns to store agent properties. Secondary worksheets are then used to represent the agent environment and to provide temporary storage for intermediate calculations. A simple loop is usually used to scan down the list of agents and to allow each one to execute in turn. The beginning and end of the scanning loop are generally used for special setup activities before and special cleanup activities after each round. An example agent spreadsheet from North and Macal (2007) is shown in Fig. 1.
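Spreadsheet macros are normally written in the spreadsheet’s own macro language rather than Java, but the row-scanning convention just described has the general shape sketched below (a hedged illustration with made-up values; each array row stands in for one agent’s row of properties).

    public class SpreadsheetStyleLoop {
        public static void main(String[] args) {
            // Each "row" holds one agent's properties, as consecutive columns would.
            double[][] rows = {
                {1.0, 5.0},  // agent 0: property A, property B
                {2.0, 3.0},  // agent 1
                {0.5, 7.0}   // agent 2
            };

            for (int round = 0; round < 10; round++) {
                // Setup work before scanning the agent list.
                double total = 0.0;

                // Scan down the list of agents and let each one execute in turn.
                for (double[] agent : rows) {
                    agent[0] += 0.1 * agent[1]; // a simple illustrative update rule
                    total += agent[0];
                }

                // Cleanup and reporting after each round.
                System.out.println("round " + round + ": total = " + total);
            }
        }
    }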
Agent spreadsheets have both strengths and weaknesses compared to the other ABM tools. Agent spreadsheets tend to be easy to build but they also tend to have limited capabilities. This balance makes spreadsheets ideal for agent-based model exploration, scoping, and prototyping. Simple agent models can be implemented on the desktop using environments outside of spreadsheets as well (Fig. 2).

Science and Engineering Languages
Science and engineering languages embodied in commercial products such as Mathematica, MATLAB, Maple, and others can be used as a basis for developing agent-based models. Such systems usually have a large user base, are readily available on desktops, and are widely integrated into academic training programs. They can be used as rapid prototype development tools or as components of large-scale modeling systems. Science and engineering languages have been applied to agent-based modeling. Their advantages include a fully integrated development environment, their interpreted (as opposed to compiled) nature providing immediate feedback to users during the development process, and a
Agent-Based Modeling and Computer Languages, Fig. 1 An example agent spreadsheet (North and Macal 2007)
Agent-Based Modeling and Computer Languages, Fig. 2 An example agent spreadsheet code (North and Macal
2007)
packaged user interface. Integrated tools provide support for data import and graphical display. Macal (2004) describes the use of Mathematica and MATLAB in agent-based simulation, and Macal and Howe (2005) detail investigations into linking Mathematica and MATLAB to the Repast ABM toolkit to make use of Repast’s simulation scheduling algorithms. In the following sections, we focus on MATLAB and Mathematica as representative examples of science and engineering languages.

MATLAB and Mathematica are both examples of Computational Mathematics Systems (CMS). CMS allow users to apply powerful mathematical algorithms to solve problems through a convenient and interactive user interface. CMS typically supply a wide range of built-in functions and algorithms. MATLAB, Mathematica, and Maple are examples of commercially available CMS whose origins go back to the late 1980s. CMS are structured in two main parts: (1) the user interface that allows dynamic user interaction and (2) the underlying computational engine, or kernel, that performs the computations according to the user’s instructions. Unlike conventional programming languages, CMS are interpreted instead of compiled, so there is immediate feedback to the user, but some performance penalty is paid. The underlying computational engine is written in the C programming language for these systems, but C coding is unseen by the user. The
most recent releases of CMS are fully integrated systems, combining capabilities for data input and export, graphical display, and the capability to link to external programs written in conventional languages such as C or Java using inter-process communication protocols. The powerful features of CMS, their convenience of use, the need to learn only a limited number of instructions on the part of the user, and the immediate feedback provided to users are features of CMS that make them good candidates for developing agent-based simulations.

A further distinction can be made among CMS. A subset of CMS are what is called Computational Algebra Systems (CAS). CAS are computational mathematics systems that calculate using symbolic expressions. CAS owe their origins to the LISP programming language, which was the earliest functional programming language (McCarthy 1960). Macsyma (www.scientek.com/macsyma) and Scheme (Springer and Freeman 1989) (www.swiss.ai.mit.edu/projects/scheme) are often mentioned as important implementations leading to present-day CAS. Typical uses of CAS are equation solving, symbolic integration and differentiation, exact calculations in linear algebra, simplification of mathematical expressions, and variable precision arithmetic. Computational mathematics systems consist of numeric processing systems or symbolic processing systems, or possibly a combination of both. Especially when algebraic and numeric capabilities are combined into a multi-paradigm programming environment, new modeling possibilities open up for developing sophisticated agent-based simulations with minimal coding.

Mathematica
Mathematica is a commercially available numeric processing system with enormous integrated numerical processing capability (http://www.wolfram.com). Beyond numeric processing, Mathematica is a fully functional programming language. Unlike MATLAB, Mathematica is a symbolic processing system that uses term replacement as its primary operation. Symbolic processing means that variables can be used before they have values assigned to them; in contrast, a numeric processing language requires that every variable have a value assigned to it before it is used in the program. In this respect, although Mathematica and MATLAB may appear similar and share many capabilities, Mathematica is fundamentally much different than MATLAB, with a much different style of programming and ultimately with a different set of capabilities applicable to agent-based modeling.

Mathematica’s symbolic processing capabilities allow one to program in multiple programming styles, either as alternatives or in combination, such as functional programming, logic programming, procedural programming, and even object-oriented programming styles. Like MATLAB, Mathematica is also an interpreted language, with the kernel of Mathematica running in the background in C. In terms of data types, everything is an expression in Mathematica. An expression is a data type with a head and a list of arguments in which even the head of the expression is part of the expression’s arguments.

The Mathematica user interface consists of what is referred to as a notebook (Fig. 3). A Mathematica notebook is a fully integratable development environment and a complete publication environment. The Mathematica application programming interface (API) allows programs written in C, Fortran, or Java to interact with Mathematica. The API has facilities for dynamically calling routines from Mathematica as well as calling Mathematica as a computational engine. Figure 3 shows the Mathematica desktop notebook environment. A Mathematica notebook is displayed in its own window. Within a notebook, each item is contained in a cell. The notebook cell structure has underlying coding that is accessible to the user.

In Mathematica, a network representation consists of combining lists of lists, or more generally expressions of expressions, to various depths. For example, in Mathematica, an agent can be represented explicitly as an expression that includes a head named agent, a sequence of agent attributes, and a list of the agent’s neighbors. Agent data and methods are linked together by the use of what are called “up values.”

Example references for agent-based simulation using Mathematica include Gaylord and Davis (1999), Gaylord and Nishidate (1994), and
Agent-Based Modeling and Computer Languages, Fig. 3 Example Mathematica cellular automata model
Gaylord and Wellin (1995). Gaylord and D’Andria (1998) describe applications in social agent-based modeling.

MATLAB
The MATrix LABoratory (MATLAB) is a numeric processing system with enormous integrated numerical processing capability (http://www.mathworks.com). It uses a scripting-language approach to programming. MATLAB is a high-level matrix/array language with control flow, functions, data structures, input/output, and object-oriented programming features. The user interface consists of the MATLAB desktop, which is a fully integrated and mature development environment. MATLAB has an application programming interface (API). The MATLAB API allows programs written in C, Fortran, or Java to interact with MATLAB. There are facilities for calling routines from MATLAB (dynamic linking) as well as routines for calling MATLAB as a computational engine, as well as for reading and writing specialized MATLAB files.

Figure 4 shows the MATLAB desktop environment illustrating the Game of Life, which is a
Agent-Based Modeling and Computer Languages, Fig. 4 Example MATLAB cellular automata model
required become very simple. Pressing the “Start” button automatically seeds this universe with several small random communities and initiates a series of cell updates. After a short period of simulation, the initial random distribution of live (i.e., highlighted) cells develops into sets of sustainable patterns that endure for generations.

Several agent-based models using MATLAB have been published in addition to the Game of Life. These include a model of political institutions in modern Italy (Bhavnani 2003), a model of pair interactions and attitudes (Pearson and Boudarel 2001), a bargaining model to simulate negotiations between water users (Thoyer et al. 2001), and a model of sentiment and social mitosis based on Heider’s Balance Theory (Guetzkow et al. 1972; Wang and Thorngate 2003). The latter model uses Euler, a MATLAB-like language. Thorngate argues for the use of MATLAB as an important tool to teach simulation programming techniques (Thorngate 2000).

Dedicated Agent-Based Modeling Languages
Dedicated agent-based modeling languages are DSLs that are designed to specifically support agent-based modeling. Several such languages currently exist. These languages are functionally differentiated by the underlying assumptions their designers made about the structures of agent-based models. The designers of some of these languages assume quite a lot about the situations being modeled and use this information to provide users with pre-completed or template components. The designers of other languages make comparatively fewer assumptions and encourage users to implement a wider range of models. However, more work is often needed to build models in these systems. This entry will discuss two selected examples, namely, NetLogo and Repast Simphony flowcharts.

NetLogo
NetLogo is an education-focused ABM environment (Wilensky 1999). The NetLogo language uses a modified version of the Logo programming language (Harvey 1997). NetLogo itself is Java based and is free for use in education and research. More information on NetLogo and downloads can be found at http://ccl.northwestern.edu/netlogo/.

NetLogo is designed to provide a basic computational laboratory for teaching complex adaptive systems concepts. NetLogo was originally developed to support teaching, but it can be used to develop a wider range of applications. NetLogo provides a graphical environment to create programs that control graphic “turtles” that reside in a world of “patches” that is monitored by an “observer.” NetLogo’s DSL is limited to its turtle and patch paradigm. However, NetLogo models can be extended using Java to provide for more general programming capabilities. An example NetLogo model of an ant colony (Wilensky 1999) (center) feeding on three food sources (upper left corner, lower left corner, and middle right) is shown in Fig. 5. Example code (Wilensky 1999) from this model is shown in Fig. 6.

Repast Simphony Flowcharts
The Recursive Porous Agent Simulation Toolkit (Repast) is a free and open source suite of agent-based modeling and simulation libraries (ROAD 2013). The Repast Suite is a family of advanced, free, and open source agent-based modeling and simulation software that have collectively been under continuous development for over 10 years. Repast Simphony is a richly interactive and easy to learn Java-based modeling environment that is designed for use on workstations and small computing clusters. Repast for high-performance computing (HPC) is a lean and expert-focused C++-based modeling library that is designed for use on large computing clusters and supercomputers. Repast Simphony and Repast HPC share a common architecture. Information on the Repast Suite and free downloads can be found at http://repast.sourceforge.net/.

Repast Simphony (North et al. 2013) includes advanced features for specifying, executing, and analyzing agent-based simulations. Repast Simphony offers several methods for specifying agents and agent environments including visual specification, specification with the dynamic object-oriented Groovy language (Koenig et al. 2007), and specification with Java. In principle, Repast Simphony’s visual DSL can be used for
Agent-Based Modeling and Computer Languages, Fig. 5 Example NetLogo ant colony model (Wilensky 1999)
any kind of programming, but models beyond a certain level of complexity are better implemented in Groovy or Java. As discussed later, Groovy and Java are general-purpose languages. All of Repast Simphony’s languages can be fluidly combined in a single model. An example Repast Simphony zombie model is shown in Fig. 7 (North et al. 2013). In all cases, the user has a choice of a visually rich point-and-click interface or a “headless” batch interface to execute models (Fig. 8).

General Languages
Unlike DSLs, general languages are designed to take on any programming challenge. However, in order to meet this challenge, they are usually more complex than DSLs. This tends to make them more difficult to learn and use. Lahtinen et al. (2005) document some of the challenges users face in learning general-purpose programming languages. Despite these issues, general-purpose programming languages are essential for allowing users to access the full capabilities of modern computers. Naturally, there are a huge number of general-purpose programming languages. This entry considers these options from two perspectives. First, general language toolkits are discussed. These toolkits provide libraries of functions to be used in a general-purpose host language. Second, the use of three raw general-purpose languages, namely, Java, C#, and C++, is discussed.

General Language Toolkits
As previously stated, general language toolkits are libraries that are intended to be used in a general-
Agent-Based Modeling and Computer Languages, Fig. 6 Example NetLogo code from the ant colony model
(Wilensky 1999)
purpose host language. These toolkits usually provide model developers with software for functions such as simulation time scheduling, results visualization, results logging, and model execution as well as domain-specific tools (North et al. 2006). Users of raw general-purpose languages have to write all of the needed features themselves by hand.

A wide range of general language toolkits currently exist. This entry will discuss two selected examples, namely, Swarm and the Groovy and Java interfaces for Repast Simphony.

Swarm
Swarm (Minar et al. 1996; Swarm Development Group 2013) is a free and open source agent-based modeling library. Swarm seeks to create a shared simulation platform for agent modeling and to facilitate the development of a wide range of models. Users build simulations by incorporating Swarm library components into their own programs. Information on Swarm and free downloads can be found at http://www.swarm.org/. From Marcus Daniels (Daniels 1999),

Swarm is a set of libraries that facilitate implementation of agent-based models. Swarm’s inspiration comes from the field of Artificial Life. Artificial Life is an approach to studying biological systems that attempts to infer mechanism from biological phenomena, using the elaboration, refinement, and generalization of these mechanisms to identify unifying dynamical properties of biological systems... To help fill this need, Chris Langton initiated the Swarm project in 1994 at the Santa Fe Institute. The first version was available by 1996, and since then it has evolved to serve not only researchers in biology, but also anthropology, computer science, defense, ecology, economics, geography, industry, and political science.

The Swarm simulation system has two fundamental components. The core component runs general-purpose simulation code written in Objective-C, Tcl/Tk, and Java. This component handles most of the behind-the-scenes details. The external wrapper components run user-specific
Agent-Based Modeling and Computer Languages, Fig. 7 Example Repast Simphony zombie model (North et al.
2013)
Agent-Based Modeling and Computer Languages, Fig. 8 Example Repast Simphony visual behavior from a
zombie model (North et al. 2013)
Agent-Based Modeling and Computer Languages, Fig. 9 Example Swarm supply chain model (Swarm Development Group 2013)
Agent-Based Modeling and Computer Languages, Fig. 10 Example Repast Simphony Groovy code from a zombie
model (North et al. 2013)
Agent-Based Modeling and Computer Languages, Fig. 11 Example Repast Simphony Java code from a zombie model (North et al. 2013)
implement dispersed but recurrent tasks make it a good choice for agent-based model development.

C#
C# (Archer 2001) is an object-oriented programming language that was developed and is maintained by Microsoft. C# is one of many languages that can be used to generate Microsoft.NET Framework code or Common Intermediate Language (CIL). Like Java bytecode, CIL is run using a “virtual machine” that potentially gives it a consistent execution environment on different computer platforms. A growing number of tools are emerging to support C# development. C# and the Microsoft.NET Framework more generally are in principle cross-platform, but in practice they are mainly executed under Microsoft Windows.

The Microsoft.NET Framework provides for the compilation into CIL of many different languages such as C#, Managed C++, and Managed Visual Basic to name just a few. Once these languages are compiled to CIL, the resulting modules are fully interoperable. This allows users to conveniently develop integrated software using a mixture of different languages. Like Java, C# supports reflection and dynamic method invocation for higher-order programming. C#’s object orientation, multilingual integration, generics, attributes for including metadata in compiled code, aspects, reflection, and dynamic method invocation make it well suited for agent-based model development, particularly on the Microsoft Windows platform.

C++
C++ is a widely used object-oriented programming language that was created by Bjarne Stroustrup (Stroustrup 2008) at AT&T. C++ is widely noted for both its object-oriented structure and its ability to be easily compiled into native machine code. C++ gives users substantial access to the underlying computer but also requires substantial programming skills.

Most C++ compilers are actually more properly considered C/C++ compilers since they can compile non-object-oriented C code as well as object-oriented C++ code. This allows sophisticated users the opportunity to highly optimize selected areas of model code. However, this also opens the possibility of introducing difficult-to-resolve errors and hard-to-maintain code. It is also more difficult to port C++ code from one computer architecture to another than it is for virtual machine-based languages such as Java. C++ can use a combination of Runtime Type Identification (RTTI) and function pointers to implement higher-order programming. Similar to the Java approach, C++ RTTI can be used for runtime class structure examination, while function pointers can be used to call newly referenced methods at runtime. C++’s object orientation, RTTI, function pointers, and low-level machine access make it a reasonable choice for the development of extremely large or complicated agent-based models.
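The reflective, higher-order style referred to above for Java and C# (and approximated in C++ with RTTI and function pointers) can be illustrated with a short Java sketch. The agent class and method names here are hypothetical; the reflection calls themselves (Class.forName, getMethod, invoke) are part of the standard Java library.

    import java.lang.reflect.Method;

    public class ReflectionDemo {
        public static class WolfAgent {
            public void hunt() { System.out.println("wolf hunts"); }
        }

        public static void main(String[] args) throws Exception {
            // Examine a class structure at run time and invoke a method chosen by name;
            // the Method object is data that can be stored, passed, or scheduled.
            Class<?> agentClass = Class.forName("ReflectionDemo$WolfAgent");
            Object agent = agentClass.getDeclaredConstructor().newInstance();
            Method behavior = agentClass.getMethod("hunt");
            behavior.invoke(agent);
        }
    }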
Future Directions

Future developments in computer languages could have enormous implications for the development of agent-based modeling. Some of the challenges of agent-based modeling for the future include (1) scaling up models to handle large numbers of agents running on distributed heterogeneous processors across the grid, (2) handling the large amounts of data generated by agent models and making sense out of it, and (3) developing user-friendly interfaces and modular components in a collaborative environment that can be used by domain experts with little or no knowledge of standard computer coding techniques. Visual and natural language development environments that can be used by nonprogrammers are continuing to advance but remain to be proven at reducing the programming burden. There are a variety of next steps for the development of computer languages for agent-based modeling including the further development of DSLs, increasing visual modeling capabilities, and the development of languages and language features that better support pattern-based development. DSLs are likely to become increasingly available as agent-based modeling grows into a wider range of domains. More agent-based modeling systems
are developing visual interfaces for specifying model structures and agent behaviors. Many of these visual environments are themselves DSLs. The continued success of agent-based modeling will likely yield an increasing number of design patterns. Supporting and even automating implementations of these patterns may form a natural source for new language features. Many of these new features are likely to be implemented within DSLs.

Bibliography

Alexander C (1979) The timeless way of building. Oxford University Press, Oxford
Alexander C, Ishikawa S, Silverstein M (1977) A pattern language. Oxford University Press, Oxford
Archer T (2001) Inside C#. Microsoft Press, Redmond
Backus J, Bauer F, Green J, Katz C, McCarthy J, Naur P, Perlis A, Rutishauser H, Samuelson K, Vauquois B, Wegstein J, van Wijngaarden A, Woodger M (1963) Revised report on the algorithmic language ALGOL 60. In: Naur P (ed) Communications of the association for computing machinery (ACM), vol 6. ACM, New York, pp 1–17
Bhavnani R (2003) Adaptive agents, political institutions and civic traditions in modern Italy. JASSS 6(4). Available at http://jasss.soc.surrey.ac.uk/6/4/1.html
Bonabeau E (2001) Agent-based modeling: methods and techniques for simulating human systems. Proc Natl Acad Sci 99(3):7280–7287
Casti J (1997) Would-be worlds: how simulation is changing the world of science. Wiley, New York
Coplien J (2001) Software patterns home page. Available at http://hillside.net/patterns/
Dahl O-J, Nygaard K (1966) SIMULA – an ALGOL-based simulation language. Commun ACM 9:671–678
Dahl O-J, Nygaard K (2001) How object-oriented programming started. Available at http://heim.ifi.uio.no/~kristen/FORSKNINGSDOK_MAPPE/F_OO_start.html
Daniels M (1999) Integrating simulation technologies with swarm. In: Proceedings of the agent 1999 workshop on agent simulation: applications, models, and tools. Argonne National Laboratory, Argonne
Dijkstra E (1968) Go to statement considered harmful. Commun ACM 11(3):147–148
Eclipse (2013) Eclipse home page. Available at http://www.eclipse.org/
Foxwell H (1999) Java 2 software development kit. Linux J
Gamma E, Helm R, Johnson R, Vlissides J (1995) Design patterns: elements of reusable object-oriented software. Addison-Wesley, Wokingham
Gaylord R, D’Andria L (1998) Simulating society: a Mathematica toolkit for modeling socioeconomic behavior. Springer/TELOS, New York
Gaylord R, Davis J (1999) Modeling nonspatial social interactions. Math Educ Res 8(2):1–4
Gaylord R, Nishidate K (1994) Modeling nature: cellular automata simulations with Mathematica. Springer, New York
Gaylord R, Wellin P (1995) Computer simulations with Mathematica: explorations in complex physical and biological systems. Springer/TELOS, New York
Grimm V et al (2006) A standard protocol for describing individual-based and agent-based models. Ecol Model 198(1–2):115–126
Guetzkow H, Kotler P, Schultz R (eds) (1972) Simulation in social and administrative science. Prentice Hall, Englewood Cliffs
Harvey B (1997) Computer science logo style. MIT Press, Boston
Heath J (2005) Methodological individualism. In: Zalta E (ed) Stanford encyclopedia of philosophy. Stanford University, Stanford. Available at http://plato.standford.edu/
Jennings N (2000) On agent-based software engineering. Artif Intel 117:277–296
Koenig D, Glover A, King P, Laforge G, Skeet J (2007) Groovy in action. Manning Publications, Greenwich
Lahtinen E, Ala-Mutka K, Jarvinen H-M (2005) A study of the difficulties of novice programmers. In: Proceedings of the 10th annual SIGCSE conference on innovation and technology in computer science education. ACM, Caparica
Macal C (2004) Agent-based modeling and social simulation with mathematica and MATLAB. In: Macal C, Sallach D, North M (eds) Proceedings of the agent 2004 conference on social dynamics: interaction, reflexivity and emergence. Argonne National Laboratory, Argonne
Macal C, Howe T (2005) Linking repast to computational mathematics systems: Mathematica and MATLAB. In: Macal C, Sallach D, North M (eds) Proceedings of the agent 2005 conference on generative social processes, models, and mechanisms. Argonne National Laboratory, Argonne
Macal C, North M (2007) Agent-based modeling and simulation: desktop ABMS. In: Henderson SG, Biller B, Hsieh M-H, Shortle J, Tew JD, Barton RR (eds) Proceedings of the 2007 winter simulation conference. IEEE/ACM, Washington, DC
McCarthy J (1960) Recursive functions of symbolic expressions and their computation by machine I. J ACM 3:184–195
Minar N, Burkhart R, Langton C, Askenazi M (1996) The swarm simulation system: a toolkit for building multi-agent simulations. Available at http://alumni.media.mit.edu/~nelson/research/swarm/
North M, Macal C (2007) Managing business complexity: discovering strategic solutions with agent-based modeling and simulation. Oxford University Press, New York
North M, Macal C (2011) Product design patterns for agent-based modeling. In: Jain S, Creasey R, Himmelspach J (eds) Proceedings of the 2011 winter simulation conference. IEEE/ACM, Phoenix
North M, Macal C (2013) Product and process patterns for agent-based modeling and simulation. J Simulat 8:25–36
North M, Collier N, Vos R (2006) Experiences creating three implementations of the repast agent modeling toolkit. In: ACM transactions on modeling and computer simulation, vol 16, Issue 1. ACM, New York, pp 1–25
North M, Collier N, Ozik J, Tatara E, Altaweel M, Macal M, Bragen M, Sydelko P (2013) Complex adaptive systems modeling with repast simphony. In: Complex adaptive systems modeling. Springer, Heidelberg
Object Management Group (2001) OMG unified modeling language specification version 1.5. Object Management Group, Needham
Object Management Group (2013) Object management group UML home page. Object Management Group, Needham
Pearson D, Boudarel M-R (2001) Pair interactions: real and perceived attitudes. JASSS 4(4). Available at http://www.soc.surrey.ac.uk/JASSS/4/4/4.html
Reynolds J (1998) Definitional interpreters for higher-order programming. In: Higher-order and symbolic computation. Kluwer, Dordrecht, pp 363–397
ROAD (2013) Repast home page. Available at http://repast.sourceforge.net/
Sedgewick R (1988) Algorithms, 2nd edn. Addison-Wesley, Reading, p 657
Springer G, Freeman D (1989) Scheme and the art of programming. McGraw-Hill, New York
Stevens W, Meyers G, Constantine L (1974) Structured design. IBM Syst J 13(2):115
Stroustrup B (2008) Bjarne Stroustrup’s FAQ. Available at http://www.research.att.com/~bs/bs_faq.html#invention
Swarm Development Group (2013) Swarm Development Group home page. Available at http://www.swarm.org/
Thorngate W (2000) Teaching social simulation with MATLAB. JASSS 3(1). Available at http://www.soc.surrey.ac.uk/JASSS/3/1/forum/1.html
Thoyer S, Morardet S, Rio P, Simon L, Goodhue R, Rausser G (2001) A bargaining model to simulate negotiations between water users. JASSS 4(2). Available at http://www.soc.surrey.ac.uk/JASSS/4/2/6.html
Van Roy P, Haridi S (2004) Concepts, techniques, and models of computer programming. MIT Press, Cambridge
Wang Z, Thorngate W (2003) Sentiment and social mitosis: implications of Heider’s balance theory. JASSS 6(3). Available at http://jasss.soc.surrey.ac.uk/6/3/2.html
Watson D (1989) High-level languages and their compilers. Addison-Wesley, Wokingham
Wilensky U (1999) NetLogo. http://ccl.northwestern.edu/netlogo/. Center for connected learning and computer-based modeling, Northwestern University, Evanston
Wolfram Research (2013) Mathematica home page. Available at http://www.wolfram.com/
Computer Graphics and Games, Agent-Based Modeling in

Brian Mac Namee
School of Computing, Dublin Institute of Technology, Dublin, Ireland

Article Outline

Glossary
Definition of the Subject
Introduction
Agent-Based Modelling in Computer Graphics
Agent-Based Modelling in CGI for Movies
Agent-Based Modelling in Games
Future Directions
Bibliography

Glossary

Computer generated imagery (CGI) The use of computer generated images for special effects purposes in film production.
Intelligent agent A hardware or (more usually) software-based computer system that enjoys the properties autonomy, social ability, reactivity and pro-activeness.
Non-player character (NPC) A computer controlled character in a computer game – as opposed to a player controlled character.
Virtual character A computer generated character that populates a virtual world.
Virtual world A computer generated world in which places, objects and people are represented as graphical (typically three dimensional) models.

Definition of the Subject

As the graphics technology used to create virtual worlds has improved in recent years, more and more importance has been placed on the behavior of virtual characters in applications such as games, movies and simulations set in these virtual worlds. The behavior of these virtual characters should be believable in order to create the illusion that virtual worlds are populated with living characters. This has led to the application of agent-based modeling to the control of virtual characters. There are a number of advantages of using agent-based modeling techniques which include the fact that they remove the requirement for hand controlling all agents in a virtual environment, and allow agents in games to respond to unexpected actions by players or users.

Introduction

Advances in computer graphics technology in recent years have allowed the creation of realistic and believable virtual worlds. However, as such virtual worlds have been developed for applications spanning games, education and movies it has become apparent that in order to achieve real believability, virtual worlds must be populated with life-like virtual characters. This is where the application of agent-based modeling has found a niche in the areas of computer graphics and, in a huge way, computer games. Agent-based modeling is a perfect solution to the problem of controlling the behaviors of the virtual characters that populate a virtual world. In fact, because virtual characters are embodied and autonomous these applications require an even stronger notion of agency than many other areas in which agent-based modeling is employed.

Before proceeding any further, and because there are so many competing alternatives, it is worth explicitly stating the definition of an intelligent agent that will inform the remainder of this article. Taken from Wooldridge and Jennings (1995) an intelligent agent is defined as “... a hardware or (more usually) software-based computer system that enjoys the following properties:
• autonomy: agents operate without the direct intervention of humans or others, and have some kind of control over their actions and internal state;
• social ability: agents interact with other agents (and possibly humans) via some kind of agent-communication language;
• reactivity: agents perceive their environment, (which may be the physical world, a user via a graphical user interface, a collection of other agents, the INTERNET, or perhaps all of these combined), and respond in a timely fashion to changes that occur in it;
• pro-activeness: agents do not simply act in response to their environment, they are able to exhibit goal-directed behavior by taking the initiative."

The article will go on to describe how the movie industry has used agent-based modeling techniques in order to generate CGI scenes containing large numbers of computer generated extras. Computer games developers have also been using agent-based modeling techniques effectively for some time now for the control of non-player characters (NPCs) in games. There is a particularly fine match between the requirements of computer games and agent-based modeling due to the high levels of interactivity required.

Finally, the article will conclude with some suggestions for the future directions in which agent-based modeling technology in computer graphics and games is expected to move.

Agent-Based Modelling in Computer Graphics
One of the earliest, and still one of the seminal, examples of agent-based modeling in computer graphics is Craig Reynolds' Boids system (Reynolds 1987), which simulated the flocking behavior of groups of creatures such as schools of fish or flocks of birds. The system was first presented at the prestigious SIGGRAPH conference (www.siggraph.org) in 1987 and was accompanied by the short movie "Stanley and Stella in: Breaking the Ice". Taking influence from the area of artificial life (or aLife) (Thalmann and Thalmann 1994), Reynolds postulated that the individual members of a flock would not be capable of complex reasoning, and so flocking behavior must emerge from simple decisions made by individual flock members. This notion of emergent behavior is one of the key characteristics of aLife systems.

In the original Boids system, each virtual agent (represented as a simple particle and known as a boid) used just three rules to control its movement. These were separation, alignment and cohesion and are illustrated in Fig. 1. Based on just these three simple rules extremely realistic flocking behaviors emerged. This freed animators from the laborious task of hand-scripting the behavior of each creature within the flock and perfectly demonstrates the advantage offered by agent-based modeling techniques for this kind of application.
Computer Graphics and Games, Agent-Based Modeling in, Fig. 1 The three rules used by Reynolds’ original
Boids system to simulate flocking behaviors
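To make the mechanics of these three rules concrete, the following short Python sketch shows one common way of combining separation, alignment and cohesion into a single steering update. It is an illustrative reconstruction of the general technique only, not Reynolds' actual implementation; boids are assumed to be dictionaries holding a two dimensional position and velocity encoded as complex numbers.

def steer(boid, neighbours, w_sep=1.5, w_ali=1.0, w_coh=1.0, max_speed=2.0):
    # With no flockmates in range a boid simply keeps its current velocity.
    if not neighbours:
        return boid["vel"]
    n = len(neighbours)
    sep = sum(boid["pos"] - m["pos"] for m in neighbours)        # separation: steer away from close flockmates
    ali = sum(m["vel"] for m in neighbours) / n - boid["vel"]    # alignment: match the average heading
    coh = sum(m["pos"] for m in neighbours) / n - boid["pos"]    # cohesion: steer towards the centre of mass
    vel = boid["vel"] + w_sep * sep + w_ali * ali + w_coh * coh
    if abs(vel) > max_speed:                                     # clamp the boid's speed
        vel = vel / abs(vel) * max_speed
    return vel

def update(flock, radius=5.0, dt=1.0):
    # Each boid reacts only to neighbours within its perception radius.
    for b in flock:
        near = [m for m in flock if m is not b and abs(m["pos"] - b["pos"]) < radius]
        b["new_vel"] = steer(b, near)
    for b in flock:
        b["vel"] = b.pop("new_vel")
        b["pos"] += b["vel"] * dt

Flocking emerges from repeatedly applying update to a list of boids such as [{"pos": 0 + 0j, "vel": 1 + 0j}, ...]; no individual boid holds any global plan.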
The system created by Tu and Terzopoulos took a more complex approach in that they created complex models of biological fish. Their models took into account fish physiology, with a complex model of fish muscular structure, along with a perceptual model of fish vision. Using these they created sophisticated simulations in which properties such as schooling and predator avoidance were displayed. The advantage of this approach was that it was possible to create unique, unscripted, realistic simulations without the intervention of human animators. Terzopoulos has since gone on to apply similar techniques to the control of virtual humans (Shao and Terzopoulos 2005).

Moving from animals to crowds of virtual humans, the Virtual Reality Lab at the Ecole Polytechnique Fédérale de Lausanne in Switzerland (vrlab.epfl.ch) led by Daniel Thalmann has been at the forefront of this work for many years. The group currently has a highly evolved system, ViCrowd, for the animation of virtual crowds (Musse and Thalmann 2001) which they model as a hierarchy which moves from individuals to groups to crowds. This hierarchy is used to avoid some of the complications which arise from trying to model large crowds in real time – one of the key goals of ViCrowd.

Each of the levels in the ViCrowd hierarchy can be modeled as an agent and this is done based on beliefs, desires and intentions. The beliefs of an agent represent the information that the agent possesses about the world, including information about places, objects and other agents. An agent's desires represent the motivations of the agent regarding objectives it would like to achieve. Finally, the intentions of an agent represent the actions that an agent has chosen to pursue. The belief-desire-intention (BDI) model of agency was proposed by Rao and Georgeff (1991) and has been used in many other application areas of agent-based modeling.

ViCrowd has been used in ambitious applications including the simulation of a virtual city comprised of, amongst other things, a train station, a park and a theater (Farenc et al. 2000). In all of these environments the system was capable of driving the believable behaviors of large groups of characters in real-time.

It should be apparent to readers from the examples given thus far that the use of agent-based modeling techniques to control virtual characters gives rise to a range of unique requirements when compared to the use of agent-based modeling in other application areas. The key to understanding these is to realize that the goal in designing agents for the control of virtual characters is typically not to design the most efficient or effective agent, but rather to design the most interesting or believable character. Outside of very practical applications such as evacuation simulations, when creating virtual characters, designers are concerned with maintaining what Disney, experts in this field, refer to as the illusion of life (Johnston and Thomas 1995).

This refers to the fact that the user of a system must believe that virtual characters are living, breathing creatures with goals, beliefs, desires, and, essentially, lives of their own. Thus, it is not so important for a virtual human to always choose the most efficient or cost effective option available to it, but rather to always choose reasonable actions and respond realistically to the success or failure of these actions. With this in mind, and following a similar discussion given in Isbister and Doyle (2002), some of the foremost researchers in virtual character research have the following to say about the requirements of agents as virtual characters.

Loyall writes (Loyall 1997) that "Believable agents are personality-rich autonomous agents with the powerful properties of characters from the arts." Coming from a dramatic background it is not surprising that Loyall's requirements reflect this. Agents should have strong personality and be capable of showing emotion and engaging in meaningful social relationships.

According to Blumberg (1996), ". . . an autonomous animated creature is an animated object capable of goal-directed and time-varying behavior". The work of Blumberg and his group is very much concerned with virtual creatures, rather than humans in particular, and his requirements reflect this. Creatures must appear to make choices which improve their situation and display sophisticated and individualistic movements.

Hayes-Roth and Doyle focus on the differences between "animate characters" and traditional agents (Hayes-Roth and Doyle 1998). With this in mind they indicate that agents' behaviors must be "variable rather than reliable", "idiosyncratic instead of predictable", "appropriate rather than correct", "effective instead of complete", "interesting rather than efficient", and "distinctively individual as opposed to optimal".

Perlin and Goldberg (1996) concern themselves with building believable characters "that respond to users and to each other in real-time, with consistent personalities, properly changing moods and without mechanical repetition, while always maintaining an author's goals and intentions".

Finally, in characterizing believable agents, Bates (1992a) is quite forgiving requiring "only that they not be clearly stupid or unreal". Such broad, shallow agents must "exhibit some signs of internal goals, reactivity, emotion, natural language ability, and knowledge of agents . . . as well as of the . . . micro-world".

Considering these definitions, Isbister and Doyle (2002) identify the fact that the consistent themes which run through all of the requirements given above match the general goals of agency – virtual humans must display autonomy, reactivity, goal driven behavior and social ability – and again support the use of agent-based modeling to drive the behavior of virtual characters.

The Spectrum of Agents
The differences between the systems mentioned in the previous discussion are captured particularly well on the spectrum of agents presented by Aylett and Luck (2000). This positions agent systems on a spectrum based on their capabilities, and serves as a useful tool in differentiating between the various systems available. One end of this spectrum focuses on physical agents which are mainly concerned with simulation of believable physical behavior (including sophisticated physiological models of muscle and skeleton systems), and of sensory systems. Interesting work at this end of the spectrum includes Terzopoulos' highly
realistic simulation of fish (Terzopoulos et al. 1994) and his virtual stuntman project (Faloutsos et al. 2001) which creates virtual actors capable of realistically synthesizing a broad repertoire of lifelike motor skills.

Cognitive agents inhabit the other end of the agent spectrum and are mainly concerned with issues such as reasoning, decision making, planning and learning. Systems at this end of the spectrum include Funge's cognitive modeling approach (Funge 1999) which uses the situation calculus to control the behavior of virtual characters, and Nareyek's work on planning agents for simulation (Nareyek 2001), both of which will be described later in this article.

While the systems mentioned so far sit comfortably at either end of the agent spectrum, many of the most effective inhabit the middle ground. Amongst these are c4 (Burke et al. 2002), used to great effect to simulate a virtual sheep dog with the ability to learn new behaviors, Improv (Perlin and Goldberg 1996) which augments sophisticated physical human animation with scripted behaviors and the ViCrowd system (Musse and Thalmann 2001) which sits on top of a realistic virtual human animation system and uses planning to control agents' behavior.

Virtual Fidelity
The fact that so many different agent-based modeling systems for the control of virtual humans exist gives rise to the question: why? The answer to this lies in the notion of virtual fidelity, as described by Badler (Badler et al. 1999). Virtual fidelity refers to the fact that virtual reality systems need only remain true to actual reality in so much as this is required by, and improves, the system.

In Määta (2002) the point is illustrated extremely effectively. The article explains that when game designers are architecting the environments in which games are set, the scale to which these environments are created is not kept true to reality. Rather, to ease players' movement in these worlds, areas are designed to a much larger scale, compared to character sizes, than in the real world. However, game players do not notice this digression from reality, and in fact have a negative response to environments that are designed to be more true to life, finding them cramped. This is a perfect example of how, although designers stay true to reality for many aspects of environment design, the particular blend of virtual fidelity required by an application can dictate that certain real world restrictions can be ignored in virtual worlds.

With regard to virtual characters, virtual fidelity dictates that the set of capabilities which these characters should display is determined by the application which they are to inhabit. So, the requirements of an agent-based modeling system for CGI in movies would be very different to those of an agent-based modeling system for controlling the behaviors of game characters.

Agent-Based Modelling in CGI for Movies

With the success of agent-based modeling techniques in graphics firmly established there was something of a search for application areas to which they could be applied. Fortunately, the success of agent-based modeling techniques in computer graphics was paralleled with an increase in the use of CGI in the movie industry, which offered the perfect opportunity. In many cases CGI techniques were being used to replace traditional methods for creating expensive, or difficult to film, scenes. In particular, scenes involving large numbers of people or animals were deemed no longer financially viable when set in the real world. Creating these scenes using CGI involved painstaking hand animation of each character within a scene, which again was not financially viable.

The solution that agent-based modeling offers is to make each character within a scene an intelligent agent that drives its own behavior. In this way, as long as the initial situation is set up correctly, scenes will play out without the intervention of animators. The facts that animating for movies does not need to be performed in real-time, and is in no way interactive (there are no human users involved in the scene), make the use
of agent-based modeling a particularly fine match for this application area.

Craig Reynolds' Boids system (Reynolds 1987), which simulates the flocking behaviors exhibited in nature by schools of fish or flocks of birds and was discussed previously, is one of the seminal examples of agent-based modeling techniques being used in movie CGI. Reynolds' approach was first used for CGI in the 1992 film "Batman Returns" (Burton 1992) to simulate colonies of bats. Reynolds' technologies have been used in "The Lion King" (Allers and Minkoff 1994) and "From Dusk 'Till Dawn" (Rodriguez 1996) amongst other films. Reynolds' approach was so successful, in fact, that he was awarded an Academy Award for his work in 1998.

Similar techniques to those utilized in the Boids system have been used in many other films to animate such diverse characters as ants, people and stampeding wildebeest. Two productions which were released in the same year, "Antz" (Darnell and Johnson 1998) by Dreamworks and "A Bug's Life" (Lasseter and Stanton 1998) by Pixar, took great steps in using CGI effects to animate large crowds. For "Antz", systems were developed which allowed animators to easily create scenes containing large numbers of virtual characters, modeling each as an intelligent agent capable of obstacle avoidance, flocking and other behaviors. Similarly, the creators of "A Bug's Life" created tools which allowed animators to easily combine pre-defined motions (known as alibis) to create behaviors which could easily be applied to individual agents in scenes composed of hundreds of virtual characters.

However, the largest jump in the use of agent-based modeling in movie CGI was made in the recent Lord of the Rings trilogy (Jackson 2001, 2002, 2003). In these films the bar was raised markedly in terms of the sophistication of the virtual characters displayed and the sheer number of characters populating each scene. To achieve the special effects shots required by the makers of these films, the Massive software system was developed by Massive Software (www.massivesoftware.com). This system (Aitken et al. 2004; Koeppel 2002) uses agent-based modeling techniques, again inspired by aLife, to create virtual extras that control their own behaviors. This system was put to particularly good use in the large scale battle sequences that feature in all three of the Lord of the Rings films. Some of the sequences in the final film of the trilogy, the Return of the King, contain over 200,000 digital characters.

In order to create a large battle scene using the Massive software, each virtual extra is represented as an intelligent agent, making its own decisions about which actions it will perform based on its perceptions of the world around it. Agent control is achieved through the use of fuzzy logic based controllers in which the state of an agent's brain is represented as a series of motivations, and knowledge it has about the world – such as the state of the terrain it finds itself on, what kinds of other agents are around it and what these other agents are doing. This knowledge about the world is perceived through simple simulated visual, auditory and tactile senses. Based on the information they perceive, agents decide on a best course of action. Designing the brains of these agents is made easier than it might seem at first by the fact that agents are developed for short sequences, and so a small range of possible tasks. So for example, separate agent models would be used for a fighting scene and a celebration scene.
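The following toy Python sketch illustrates the general flavor of such fuzzy logic based decision making. It is not Massive's actual controller; the percept names, the rules and the use of minimum as the fuzzy conjunction are illustrative assumptions only.

def fuzzy_and(*values):
    # A common fuzzy-logic conjunction: the truth of "A and B" is the minimum of the two.
    return min(values)

def decide(percepts):
    # Percepts are crisp values in [0, 1] derived from the agent's simulated senses.
    enemy_near   = percepts["enemy_near"]      # 1.0 = an enemy is very close
    own_strength = percepts["own_strength"]    # 1.0 = fresh, 0.0 = exhausted
    allies_close = percepts["allies_close"]    # how much friendly support is nearby

    desirability = {
        "attack":  fuzzy_and(enemy_near, own_strength),
        "defend":  fuzzy_and(enemy_near, 1.0 - own_strength, allies_close),
        "flee":    fuzzy_and(enemy_near, 1.0 - own_strength, 1.0 - allies_close),
        "advance": fuzzy_and(1.0 - enemy_near, own_strength),
    }
    # The most desirable action wins; varying the rules per archetype gives each extra its own character.
    return max(desirability, key=desirability.get)

# decide({"enemy_near": 0.9, "own_strength": 0.2, "allies_close": 0.1}) returns "flee"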
In order to create a large crowd scene using Massive, animators initially set up an environment, populating it with an appropriate cast of virtual characters, where the brains of each character are slight variations (based on physical and personality attributes) of a small number of archetypes. The scene will then play itself out with each character making its own decisions. Therefore there is no need for any hand animation of virtual characters. However, directors can view the created scenes and, by tweaking the parameters of the brains of the virtual characters, have a scene play out in the exact way that they require.

Since being used to such impressive effect in the Lord of the Rings trilogy (the developers of the Massive system were awarded an Academy Award for their work), the Massive software system has been used in numerous other films such as "I, Robot" (Proyas 2004), "The Chronicles of Narnia: The Lion, the Witch and the Wardrobe"
(Adamson 2005) and "Ratatouille" (Bird and Pinkava 2007) along with numerous television commercials and music videos.

While the achievements of using agent-based modeling for movie CGI are extremely impressive, it is worth noting that none of these systems run in real-time. Rather, scenes are rendered by banks of high powered computers, a process that can take hours for relatively simple scenes. For example, the famous Prologue battle sequence in the "Lord of the Rings: The Fellowship of the Ring" took a week to render. When agent-based modeling is applied to the real-time world of computer games, things are very different.

Agent-Based Modelling in Games

Even more so than in movies, agent-based modeling techniques have been used to drive the behaviors of virtual characters in computer games. As games have become graphically more realistic (and in recent years they have become extremely so) game-players have come to expect that games are set in hugely realistic and believable virtual worlds. This is particularly evident in the widespread use of realistic physics modeling which is now commonplace in games (Sánchez-Crespo 2006). In games that make strong use of physics modeling, objects in the game world topple over when pushed, float realistically when dropped in water and generally respond as one would expect them to. Players expect the same to be true of the virtual characters that populate virtual game worlds. This can be best achieved by modeling virtual characters as embodied virtual agents.

However, there are a number of constraints which have a major influence on the use of agent-based modeling techniques in games.

The first of these constraints stems from the fact that modern games are so highly interactive. Players expect to be able to interact with all of the characters they encounter within a game world. These interactions can be as simple as having something to shoot at or having someone to race against; or involve much more sophisticated interactions in which a player is expected to converse with a virtual character to find out specific information or to cooperate with a virtual character in order to accomplish some task that is key to the plot of a game. Interactivity raises a massive challenge for practitioners as there is very little restriction in terms of what the player might do. Virtual characters should respond in a believable way at all times regardless of how bizarre and unexpected the actions of the player might be.

The second challenge comes from the fact that the vast majority of video games should run in real time. This means that the computational complexity must be kept to a reasonable level as there are only a finite number of processor cycles available for AI processing. This problem is magnified by the fact that an enormous amount of CPU power is usually dedicated to graphics processing. When compared to the techniques that can be used for controlling virtual characters in films, some of the techniques used in games are rudimentary due to this real-time constraint.

Finally, modern games resemble films in the fact that creators go to great lengths to include intricate storylines and control the building of tension in much the way that film script writers do. This means that games are tested heavily in order to ensure that the game proceeds smoothly and that the level of difficulty is finely tuned so as to always hold the interest of a player. In fact, this testing of games has become something of a science in itself (Thompson 2007). Using autonomous agents gives game characters the ability to do things that are unexpected by the game designers and so upset their well laid plans. This can often be a barrier to the use of sophisticated techniques such as learning. Unfortunately there is also a barrier to the discussion of agent-based modeling techniques used in commercial games. Because of the very competitive nature of the games industry, game development houses often consider the details of how their games work as valuable trade secrets to be kept well guarded. This can make it difficult to uncover the details of how particularly interesting features of a game are implemented. While this situation is improving – more commercial game developers are speaking at games conferences about how their games are developed and the release of game systems development kits for the development of game modifications (or mods) allows researchers to plumb
the depths of game code – it is still often impossible to find out the implementation details of very new games.

Game Genres
Before discussing the use of agent-based modeling in games any further, it is worth making a short clarification on the kinds of computer games that this article refers to. When discussing modern computer games, or video games, this article does not refer to computer implementations of traditional games such as chess, backgammon or card games such as solitaire. Although these games are of considerable research interest (chess in particular has been the subject of extremely successful research (Feng-Hsiung 2002)) they are typically not approached using agent-based modeling techniques. Typically, artificial intelligence approaches to games such as these rely largely on sophisticated searching techniques which allow the computer player to search through a multitude of possible future situations dictated by the moves it will make and the moves it expects its opponent to make in response. Based on this search, and some clever heuristics that indicate what constitutes a good game position for the computer player, the best sequence of moves can be chosen. This searching technique relies on the fact that there are usually a relatively small number of moves that a player can make at any one time in a game. However, the fact that the ancient Chinese game of Go-Moku has not, to date, been mastered by computer players (van der Werf et al. 2002) illustrates the restrictions of such techniques.
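As a point of contrast with the agent-based techniques discussed in the rest of this article, the search-plus-heuristic scheme just described can be sketched as a generic depth-limited minimax routine in Python. The move generator, move application function and heuristic are assumed to be supplied by the caller for whichever board game is being played.

def minimax(state, depth, maximizing, moves, apply_move, heuristic):
    # Search the tree of future positions to a fixed depth and score the leaves heuristically.
    legal = moves(state)
    if depth == 0 or not legal:
        return heuristic(state), None
    best_value = float("-inf") if maximizing else float("inf")
    best_move = None
    for move in legal:
        value, _ = minimax(apply_move(state, move), depth - 1, not maximizing,
                           moves, apply_move, heuristic)
        if (maximizing and value > best_value) or (not maximizing and value < best_value):
            best_value, best_move = value, move
    return best_value, best_move

The branching factor of games such as Go-Moku quickly makes this kind of exhaustive look-ahead impractical, which is exactly the restriction noted above.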
The common thread linking together the kinds of games that this article focuses on is that they all contain computer controlled virtual characters that possess a strong notion of agency. Efforts are often made to separate the many different kinds of modern video games that are the focus of this article into a small set of descriptive genres. Unfortunately, much like in music, film and literature, no categorization can hope to perfectly capture the nuances of all of the available titles. However, a brief mention of some of the more important game genres is worth while (a more detailed description of game genres, and artificial intelligence requirements of each, is given in Laird and van Lent 2000).

The most popular game genre is without doubt the action game in which the player must defeat waves of demented foes, typically (for increasingly bizarre motivations) bent upon global destruction. Illustrative examples of the genre include Half-Life 2 (www.half-life2.com) and the Halo series (www.halo3.com). A screenshot of the upcoming action game Rogue Warrior (www.bethsoft.com) is shown in Fig. 2.

Computer Graphics and Games, Agent-Based Modeling in, Fig. 2 A screenshot of the upcoming action game Rogue Warrior from Bethesda Softworks. (Image courtesy of Bethesda Softworks)

Strategy games allow players to control large armies in battle with other people, or computer
controlled opponents. Players do not have direct control over their armies, but rather issue orders which are carried out by agent-based artificial soldiers. Well regarded examples of the genre include the Age of Empires (www.ageofempires.com) and Command & Conquer (www.commandandconquer.com) series.

Role playing games (such as the Elder Scrolls (www.elderscrolls.com) series) place game players in expansive virtual worlds across which they must embark on fantastical quests which typically involve a mixture of solving puzzles, fighting opponents and interacting with non-player characters in order to gain information. Figure 3 shows a screenshot of the aforementioned role-playing game The Elder Scrolls IV: Oblivion.

Computer Graphics and Games, Agent-Based Modeling in, Fig. 3 A screenshot from Bethesda Softworks' role playing game The Elder Scrolls IV: Oblivion. (Image courtesy of Bethesda Softworks)

Almost every sport imaginable has at this stage been turned into a computer based sports game. The challenges in developing these games are creating computer controlled opponents and team mates that play the games at a level suitable to the human player. Interesting examples include FIFA Soccer 08 (www.fifa08.ea.com) and Forza Motorsport 2 (www.forzamotorsport.net).

Finally, many people expected that the rise of massively multi-player online games (MMOGs), in which hundreds of human players can play together in an online world, would sound the death knell for the use of virtual non-player characters in games. Examples of MMOGs include World of Warcraft (www.worldofwarcraft.com) and Battlefield 2142 (www.battlefield.ea.com). However, this has not turned out to be the case as there are still large numbers of single player games being produced and even MMOGs need computer controlled characters for roles that players do not wish to play.

Of course there are many games that simply do not fit into any of these categorizations, but that are still relevant for a discussion of the use of agent-based techniques – for example The Sims (www.thesims.ea.com) and the Microsoft Flight Simulator series (www.microsoft.com/games/flightsimulatorx). However the categorization still serves to introduce those unfamiliar with the subject to the kinds of games up for discussion.

Implementing Agent-Based Modelling Techniques in Games
One of the earliest examples of using agent-based modeling techniques in video games was its application to path planning. The ability of non-player
characters (NPCs) to manoeuvre around a game world is one of the most basic competencies required in games. While in very early games it was sufficient to have NPCs move along pre-scripted paths, this soon became unacceptable. Games programmers soon began to turn to AI techniques which might be applied to solve some of the problems that were arising. The A* path planning algorithm (Stout 1996) was the first example of such a technique to find wide-spread use in the games industry. Using the A* algorithm NPCs can be given the ability to find their own way around an environment. This was put to particularly fine effect early on in real-time strategy games where the units controlled by players are semi-autonomous and are given orders rather than directly controlled. In order to use the A* algorithm a game world must be divided into a series of cells each of which is given a rating in terms of the effort that must be expended to cross it. The A* algorithm then performs a search across these cells in order to find the shortest path that will take a game agent from a start position to a goal.

Since becoming widely understood amongst the game development community many interesting additions have been made to the basic A* algorithm. It was not long before three dimensional versions of the algorithm became commonly used (Smith 2002). The basic notion of storing the energy required to cross a cell within a game world has also been extended to augment cells with a wide range of other useful information (such as the level of danger in crossing a cell) that can be used in the search process (Reed and Geisler 2003).
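A minimal version of this grid-based search is sketched below in Python. The grid is assumed to hold the effort needed to enter each cell (values of at least 1 so that the Manhattan-distance heuristic never overestimates); danger-aware variants simply fold extra terms into that per-cell cost.

import heapq

def a_star(grid, start, goal):
    # grid[y][x] is the cost of entering cell (x, y); start and goal are (x, y) tuples.
    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])   # admissible distance estimate
    frontier = [(h(start), 0, start, [start])]                   # (estimate, cost so far, cell, path)
    visited = set()
    while frontier:
        _, cost, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return path
        if cell in visited:
            continue
        visited.add(cell)
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nxt[1] < len(grid) and 0 <= nxt[0] < len(grid[0]) and nxt not in visited:
                new_cost = cost + grid[nxt[1]][nxt[0]]
                heapq.heappush(frontier, (new_cost + h(nxt), new_cost, nxt, path + [nxt]))
    return None                                                  # the goal cannot be reached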
The next advance in the kind of techniques being used to achieve agent-based modeling in games was the finite state machine (FSM) (Houlette and Fu 2003). An FSM is a simple system in which a finite number of states are connected in a directed graph by transitions between these states. When used for the control of NPCs, the nodes of an FSM indicate the possible actions within a game world that an agent can perform. Transitions indicate how changes in the state of the game world or the character's own attributes (such as health, tiredness etc) can move the agent from one state to another.

Figure 4 shows a sample FSM for the control of an NPC in a typical action game. In this example the behaviors of the character are determined by just four states – CHASE, ATTACK, FLEE and EXPLORE. Each of these states provides an action that the agent should take. For example, when in the EXPLORE state the character should wander randomly around the world, or while in the FLEE state the character should determine a direction to move in that will take it away from its current enemy and move in that direction. The links between the states show how the behaviors of the character should move between the various available states. So, for example, if while in the ATTACK state the agent's health measure becomes low, they will move to the FLEE state and run away from their enemy.

Computer Graphics and Games, Agent-Based Modeling in, Fig. 4 A simple finite state machine for a soldier NPC in an action game

FSMs are widely used because they are so simple, well understood and extremely efficient both in terms of processing cycles required and memory usage. There have also been a number of highly successful augmentations to the basic state machine model to make them more effective, such as the introduction of layers of parallel state machines (Alexander 2003), the use of fuzzy logic in finite state machines (Dybsand 2001) and the implementation of cooperative group behaviors through state machines (Snavely 2002).

The action game Halo 2 is recognized as having a particularly good implementation of state machines.
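A state machine of the kind shown in Fig. 4 reduces to very little code, which is a large part of its appeal. The Python sketch below is an illustrative rendering of those four states; the particular transition tests (for example the health threshold that triggers fleeing) are assumptions made for the example.

TRANSITIONS = {
    "EXPLORE": [(lambda p: p["enemy_visible"], "CHASE")],
    "CHASE":   [(lambda p: p["enemy_in_range"], "ATTACK"),
                (lambda p: not p["enemy_visible"], "EXPLORE")],
    "ATTACK":  [(lambda p: p["health"] < 0.25, "FLEE"),
                (lambda p: not p["enemy_in_range"], "CHASE")],
    "FLEE":    [(lambda p: not p["enemy_visible"], "EXPLORE")],
}

def next_state(state, percepts):
    # Fire the first transition whose condition holds; otherwise stay in the current state.
    for condition, target in TRANSITIONS[state]:
        if condition(percepts):
            return target
    return state

# next_state("ATTACK", {"health": 0.1, "enemy_in_range": True, "enemy_visible": True}) returns "FLEE"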
ALife techniques have also been applied extensively in the control of game NPCs, as much as a philosophy as any particular techniques. The outstanding example of this is The Sims (thesims.ea.com), a surprise hit of 2000 which has gone on to become the best selling PC game of all time. Created by games guru Will Wright, The Sims puts the player in control of the lives of a virtual family in their virtual home. Inspired by aLife, the characters in the game have a set of motivations, such as hunger, fatigue and boredom, and seek out items within the game world that can satisfy these desires. Virtual characters also develop sophisticated social relationships with each other based on common interest, attraction and the amount of time spent together. The original system in The Sims has gone on to be improved in the sequel The Sims 2 and a series of expansion packs.
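The core of this kind of motivation-driven behavior can be sketched in a few lines of Python. The need names, the decay rate and the way objects advertise how much relief they offer are assumptions made for the purpose of illustration rather than a description of the game's actual implementation.

NEEDS = ("hunger", "fatigue", "boredom")

def appeal(needs, obj):
    # An object is appealing in proportion to how much it relieves the agent's most pressing needs.
    return sum(needs[n] * obj["relief"].get(n, 0.0) for n in NEEDS)

def tick(needs, objects, growth=0.05):
    for n in NEEDS:                                 # needs slowly grow over time
        needs[n] = min(1.0, needs[n] + growth)
    target = max(objects, key=lambda o: appeal(needs, o))
    for n, amount in target["relief"].items():      # interacting with the chosen object satisfies needs
        needs[n] = max(0.0, needs[n] - amount)
    return target["name"]

# fridge = {"name": "fridge", "relief": {"hunger": 0.8}}
# bed    = {"name": "bed",    "relief": {"fatigue": 0.9}}
# tick({"hunger": 0.9, "fatigue": 0.3, "boredom": 0.2}, [fridge, bed]) returns "fridge"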
Some of the more interesting work in developing techniques for the control of game characters (particularly in action games) has been focused on developing interesting sensing and memory models for game characters. Players expect when playing action games that computer controlled opponents should suffer from the same problems that players do when perceiving the world. So, for example, computer controlled characters should not be able to see through walls or from one floor to the next. Similarly, though, players expect computer controlled characters to be capable of perceiving events that occur in a world and so NPCs should respond appropriately to sound events or on seeing the player.

One particularly fine example of a sensing model was in the game Thief: The Dark Project where players are required to sneak around an environment without alerting guards to their presence (Leonard 2003). The developers produced a relatively sophisticated sensing model that was used by non-player characters which modeled visual effects such as not being able to see the player if they were in shadows, and moving some way towards modeling acoustics so that non-player characters could respond reasonably to sound events.
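An illustrative sensing check in this spirit is sketched below in Python; the falloff functions and thresholds are invented for the example and are much simpler than the model shipped in the actual game.

import math

def can_see(npc_pos, player_pos, light_level, view_distance=20.0, threshold=0.5):
    # Visibility drops with distance and with the light level at the player's position,
    # so a player standing in shadow (light_level near 0) is effectively invisible.
    d = math.dist(npc_pos, player_pos)
    if d > view_distance:
        return False
    return light_level * (1.0 - d / view_distance) > threshold

def can_hear(npc_pos, sound_pos, loudness, falloff=0.1):
    # A sound event is heard if its loudness survives attenuation over the distance to the NPC.
    return loudness - falloff * math.dist(npc_pos, sound_pos) > 0.0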
2004's Fable (fable.lionhead.com) took the idea of adding memory to a game to new heights. In this adventure game the player took on the role of a hero from boyhood to manhood. However, every action the player took had an impact on the way in which the game world's population would react to him or her as they would remember every action the next time they met the player. This notion of long-term consequences added an extra layer of believability to the game-playing experience.

Serious Games & Academia
It will probably have become apparent to most readers of the previous section that much of the work done in implementing agent-based techniques for the control of NPCs in commercial games is relatively simplistic when compared to the application of these techniques in other areas of more academic focus, such as robotics (Muller 1996). The reasons for this have been discussed already and briefly relate to the lack of available processing resources and the requirements of commercial quality control. However, a large amount of very interesting work is taking place in the application of agent-based technologies in academic research, and in particular the field of serious games. This section will begin by introducing the area of serious games and then go on to discuss interesting academic projects looking at agent-based technologies in games.

The term serious games (Michael and Chen 2005) refers to games designed to do more than just entertain. Rather, serious games, while having many features in common with conventional games, have ulterior motives such as teaching, training, and marketing. Although games have been used for ends apart from entertainment, in particular education, for a long time, the modern serious games movement is set apart from these by the level of sophistication of the games it creates. The current generation of serious games is comparable with main-stream games in terms of the quality of production and sophistication of their design. Serious games offer particularly interesting opportunities for the use of agent-based modeling techniques due to the facts that they often do not have to live up to the rigorous testing of commercial games, can have the requirement of specialized hardware rather than being restricted to commercial games hardware and often, by the nature of their application
domains, require more in-depth interactions between players and NPCs.

The modern serious games movement can be said to have begun with the release of America's Army (www.americasarmy.com) in 2002 (Nieborg 2004). Inspired by the realism of commercial games such as the Rainbow 6 series (www.rainbow6.com), the United States military developed America's Army and released it free of charge in order to give potential recruits a flavor of army life. The game was hugely successful and is still being used today as both a recruitment tool and as an internal army training tool.

Spurred on by the success of America's Army the serious games movement began to grow, particularly within academia. A number of conferences sprung up and notably the Serious Games Summit became a part of the influential Game Developer's Conference (www.gdconf.com) in 2004.

Some other notable offerings in the serious games field include Food Force (www.food-force.com) (DeMaria 2005), a game developed by the United Nations World Food Programme in order to promote awareness of the issues surrounding emergency food aid; Hazmat Hotzone (Carless 2005), a game developed by the Entertainment Technology Centre at Carnegie Mellon University to train fire-fighters to deal with chemical and hazardous materials emergencies; Yourself!Fitness (www.yourselffitness.com) (Michael and Chen 2005) an interactive virtual personal trainer developed for modern games consoles; and Serious Gordon (www.seriousgames.ie) (Mac Namee et al. 2006) a game developed to aid in teaching food safety in kitchens. A screen shot of Serious Gordon is shown in Fig. 5.

Computer Graphics and Games, Agent-Based Modeling in, Fig. 5 A screenshot of Serious Gordon, a serious game developed to aid in the teaching of food safety in kitchens

Over the past decade, interest in academic research that is directly focused on artificial intelligence, and in particular agent-based modelling techniques and their application to games (as opposed to the general virtual character/computer graphics work discussed previously) has grown dramatically. One of the first major academic research projects into the area of Game-AI was led by John Laird at the University of Michigan, in the United States. The SOAR architecture was developed in the early nineteen eighties in an attempt to "develop and apply a unified theory of human and artificial intelligence" (Rosenbloom et al. 1993). SOAR is essentially a rule based
inference system which takes the current state of a problem and matches this to production rules which lead to actions.
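The match-then-act cycle at the heart of such a rule based system can be illustrated with the following minimal Python sketch. It captures the general idea only; the rules shown are invented for the example and the sketch omits everything that makes SOAR itself distinctive, such as operator preferences, subgoaling and chunking.

RULES = [
    {"if": {"enemy_visible": True,  "has_ammo": True},  "then": "open_fire"},
    {"if": {"enemy_visible": True,  "has_ammo": False}, "then": "retreat"},
    {"if": {"enemy_visible": False, "objective_known": True}, "then": "move_to_objective"},
]

def matching_actions(state):
    # Return the action proposed by every production whose conditions hold in the current state.
    return [rule["then"] for rule in RULES
            if all(state.get(key) == value for key, value in rule["if"].items())]

# matching_actions({"enemy_visible": True, "has_ammo": True}) returns ["open_fire"]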
After initial applications into the kind of simple puzzle worlds which characterized early AI research (Laird et al. 1984), the SOAR architecture was applied to the task of controlling computer generated forces (Jones et al. 1999). This work led to an obvious transfer to the new research area of game-AI (Laird 2000).

Initially the work of Laird's group focused on applying the SOAR architecture to the task of controlling NPC opponents in the action game Quake (www.idsoftware.com) (Laird 2000). This proved quite successful, leading to opponents which could successfully play against human players, and even begin to plan based on anticipation of what the player was about to do. More recently Laird's group have focused on the development of a game which requires more involved interactions between the player and the NPCs. Named Haunt 2, this game casts the player in the role of a ghost that must attempt to influence the actions of a group of computer controlled characters inhabiting the ghost's haunted house (Magerko et al. 2004). The main issue that arises with the use of the SOAR architecture is that it is enormously resource hungry, with the NPC controllers running on a separate machine to the actual game.

At Trinity College in Dublin in Ireland, the author of this article worked on an intelligent agent architecture, the Proactive Persistent Agent (PPA) architecture, for the control of background characters (or support characters) in character-centric games (games that focus on character interactions rather than action, e.g. role-playing games) (Mac Namee and Cunningham 2003; Mac Namee et al. 2003). The key contributions of this work were that it made possible the creation of NPCs that were capable of behaving believably in a wide range of situations and allowed for the creation of game environments which it appeared had an existence beyond their interactions with players. Agent behaviors in this work were based on models of personality, emotion, relationships to other characters and behavioral models that changed according to the current role of an agent. This system was used to develop a stand alone game and as part of a simulation of areas within Trinity College. A screenshot of this second application is shown in Fig. 6.
Computer Graphics and Games, Agent-Based Modeling in, Fig. 6 Screenshots of the PPA system simulating parts
of a college
At Northwestern University in Chicago the Interactive Entertainment group has also applied approaches from more traditional research areas to the problems facing game-AI. Ian Horswill has led a team that are attempting to use architectures traditionally associated with robotics for the control of NPCs. Horswill and Zubek (1999) consider how perfectly matched the behavior based architectures often used in robotics are with the requirements of NPC control architectures. The group have demonstrated some of their ideas in a test-bed environment built on top of the game Half-Life (Khoo and Zubek 2002). The group also looks at issues around character interaction (Zubek and Horswill 2005) and the many psychological issues associated with creating virtual characters, asking how we can create virtual game agents that display all of the foibles that make us relate to characters in human stories (Horswill 2007).

Within the same research group a team led by Ken Forbus have extended research previously undertaken in conjunction with the military (Forbus et al. 1991) and applied it to the problem of terrain analysis in computer strategy games (Forbus et al. 2001). Their goal is to create strategic opponents which are capable of performing sophisticated reasoning about the terrain in a game world and using this knowledge to identify complex features such as ambush points. This kind of high level reasoning would allow AI opponents to play a much more realistic game, and even surprise human players from time to time, something that is sorely missing from current strategy games.

As well as this work which has spring-boarded from existing applications, a number of projects began expressly to tackle problems in game-AI. Two which particularly stand out are the Excalibur Project, led by Alexander Nareyek (2001), and work by John Funge (1999). Both of these projects have attempted to apply sophisticated planning techniques to the control of game characters.

Nareyek uses constraint based planning to allow game agents to reason about their world. By using techniques such as local search Nareyek has attempted to allow these sophisticated agents to perform resource intensive planning within the constraints of a typical computer game environment. Following on from this work, the term anytime agent was coined to describe the process by which agents actively refine original plans based on changing world conditions. Nareyek (2007) describes the directions in which he intends to take this work in the future.

Funge uses the situational calculus to allow agents to reason about their world. Similarly to Nareyek he has addressed the problems of a dynamic, ever changing world, plan refining and incomplete information. Funge's work uses an extension to the situational calculus which allows the expression of uncertainty. Since completing this work Funge has gone on to be one of the founders of AiLive (www.ailive.net), a middleware company specializing in AI for games.

While the approaches of both of these projects have shown promise within the constrained environments to which they have been applied during research (and work continues on them), it remains to be seen whether such techniques can be successfully applied to a commercial game environment and all of the resource constraints that such an environment entails.

One of the most interesting recent examples of agent-based work in the field of serious games is that undertaken by Barry Silverman and his group at the University of Pennsylvania in the United States (Silverman et al. 2006a, b). Silverman models the protagonists in military simulations for use in training programmes and has taken a very interesting approach in that his agent models are based on established cognitive science and behavioral science research. While Silverman admits that many of the models described in the cognitive science and behavioral science literature are not well quantified enough to be directly implemented, he has adapted a number of well respected models for his purposes. Silverman's work is an excellent example of the capabilities that can be explored in a serious games setting rather than a commercial game setting, and as such merits an in depth discussion. A high-level schematic diagram of Silverman's approach is shown in Fig. 7 and shows the agent architecture used by Silverman's system, PMFserv.
Computer Graphics and Games, Agent-Based Modeling in, Fig. 7 A schematic diagram of the main components of
the PMFserv system. (With kind permission of Barry Silverman)
The first important component of the PMFserv system is the biology module which controls biological needs using a metaphor based on the flow of water through a system. Biological concepts such as hunger and fatigue are simulated using a series of reservoirs, tanks and valves which model the way in which resources are consumed by the system. This biological model is used in part to model stress, which has an important impact on the way in which agents make decisions. To model the way in which agent performance changes under pressure Silverman uses performance moderator functions (PMFs). An example of one of the earliest PMFs used is the Yerkes–Dodson "inverted-u" curve (Yerkes and Dodson 1908) which illustrates that as mental arousal is increased performance initially improves, peaks and then trails off again. In PMFserv a range of PMFs are used to model the way in which behavior should change depending on stress levels and biological conditions.
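A performance moderator function with this inverted-U shape is easy to write down; the quadratic form and the parameter values in the Python sketch below are illustrative assumptions rather than the curve used in PMFserv itself.

def inverted_u(arousal, optimum=0.5, width=0.5):
    # Performance rises with arousal up to an optimum and then falls away again.
    # Both the arousal argument and the returned multiplier lie in [0, 1].
    return max(0.0, 1.0 - ((arousal - optimum) / width) ** 2)

def moderated_skill(base_skill, arousal):
    return base_skill * inverted_u(arousal)

# moderated_skill(0.8, 0.5) returns 0.8 at the optimum, while moderated_skill(0.8, 0.9)
# returns roughly 0.29 and moderated_skill(0.8, 0.0) collapses to 0.0.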
The second important module of PMFserv attempts to model how personality, culture and emotion affect the behavior of an agent. In keeping with the rest of their system PMFserv uses models inspired by cognitive science to model emotions. In this case the well known OCC model (Ortony et al. 1988), which has been used in agent-based applications before (Bates 1992b), is used. The OCC model provides for 11 pairs of opposite emotions such as pride and shame, and hope and fear. The emotional state of an agent with regard to past, current and future actions heavily influences the decisions that the agent makes.

The second portion of the Personality, Culture, Emotion module uses a value tree in order to capture the values of an agent. These values are divided into a Preference Tree which captures long term desired states for the world, a Standards Tree which relates to the actions that an agent believes it can or cannot follow in order to achieve these desired states and a Goal Tree which captures short term goals.

PMFserv also models the relationships between agents (Social Model, Relations, Trust in Fig. 7). The relationship of one agent to another is modeled in terms of three axes. The first is the degree to which the other agent is thought of as a human rather than an inanimate object – locals tend to view American soldiers as objects rather than people. The second axis is the cognitive grouping (ally, foe etc) to which the other agent belongs and whether this is also a group to which the first agent has an affinity. Finally, the valence, or strength, of the relationship is stored. Relationships continually change based on actions that occur within the game world. Like the other modules of the system this model is also based on psychological research (Ortony et al. 1988).
Computer Graphics and Games, Agent-Based Modeling in, Fig. 8 A screenshot of the PMFserv system being used to simulate the Black Hawk Down scenario. (With kind permission of Barry Silverman)
The final important module of the PMFserv architecture is the Cognitive module which is used to decide on particular actions that agents will undertake. This module uses inputs from all of the other modules to make these decisions and so the behavior of PMFserv agents is driven by their stress levels, relationships to other agents and objects within the game world, personality, culture and emotions. The details of the PMFserv cognitive process are beyond the scope of this article, so it will suffice to say that action selection is based on a calculation of the utility of a particular action to an agent, with this calculation modified by the factors listed above.
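To give a flavor of what such a moderated utility calculation might look like, the Python sketch below scales a base utility by factors derived from stress, emotion and relationships. The specific formula, field names and weightings are illustrative assumptions and should not be read as PMFserv's actual equations.

def action_utility(action, agent):
    base = action["base_utility"]
    stress_factor   = 1.0 - 0.5 * agent["stress"]                              # high stress degrades judgement
    emotion_factor  = 1.0 + agent["emotions"].get(action["name"], 0.0)         # how the agent feels about this act
    relation_factor = 1.0 + agent["relationships"].get(action["target"], 0.0)  # -1 (foe) .. +1 (ally)
    return base * stress_factor * emotion_factor * relation_factor

def choose_action(agent, actions):
    # The agent commits to whichever candidate action currently has the highest moderated utility.
    return max(actions, key=lambda a: action_utility(a, agent))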
The most highly developed example using the PMFserv model is a simulation of the 1993 event in Mogadishu, Somalia in which a United States military Black Hawk helicopter crashed, as made famous by the book and film "Black Hawk Down" (Bowden 2000). In this example, which was developed as a military training aid as part of a larger project looking at agent implementations within such systems (Toth et al. 2003; van Lent et al. 2004), the player took on the role of a US army ranger on a mission to secure the helicopter wreck in a modification (or "mod") of the game Unreal Tournament (www.unreal.com). A screenshot of this simulation is shown in Fig. 8.

The PMFserv system was used to control the behaviors of characters within the game world such as Somali militia, and Somali civilians. These characters were imbued with physical attributes, a value system and relationships with other characters and objects within the game environment. The sophistication of PMFserv was apparent in many of the behaviors of the simulation's NPCs. One particularly good example was the fact that Somali women would offer themselves as human shields for militia fighters. This behavior was never directly programmed into the
agents' make-up, but rather emerged as a result of their values and assessment of their situation. PMFserv remains one of the most sophisticated current agent implementations and shows the possibilities when the shackles of commercial game constraints are thrown off.

Future Directions

There is no doubt that with the increase in the amount of work being focused on the use of agent-based modeling in computer graphics and games there will be major developments in the near future. This final section will attempt to predict what some of these might be.

The main development that might be expected in all of the areas that have been discussed in this article is an increase in the depth of simulation. The primary driver of this increase in depth will be the development of more sophisticated agent models which can be used to drive ever more sophisticated agent behavior. The PMFserv system described earlier is one example of the kinds of deeper systems that are currently being developed. In general computer graphics applications this will allow for the creation of more interesting simulations including previously prohibitive features such as automatic realistic facial expressions and other physical expressions of agents' internal states. This would be particularly useful in CGI for movies in which, although agent based modeling techniques are commonly used for crowd scenes and background characters, main characters are still animated almost entirely by hand.

In the area of computer games it can be expected that many of the techniques being used in movie CGI will filter over to real-time game applications as the processing power of game hardware increases – this is a pattern that has been evident for the past number of years. In terms of depth that might be added to the control of game characters, one feature that has mainly been conspicuous by its absence in modern games is genuine learning by game agents. 2000's Black & White and its sequel Black & White 2 (www.lionhead.com) featured some learning by one of the game's main characters that the player could teach in a reinforcement manner (Evans 2002). While this was particularly successful in the game, such techniques have not been more widely applied. One interesting academic project in this area is the NERO project (www.nerogame.org) which allows a player to train an evolving army of soldiers and have them battle the armies of other players (Stanley et al. 2006). It is expected that these kinds of capabilities will become more and more common in commercial games.

One new feature of the field of virtual character control in games is the emergence of specialized middleware. Middleware has had a massive impact in other areas of game development including character modeling (for example Maya available from www.autodesk.com) and physics modeling (for example Havok available from www.havok.com). AI focused middleware for games is now becoming more common with notable offerings including AI-Implant (www.ai-implant.com) and Kynogon (www.kynogon.com) which perform path finding and state machine based control of characters. It is expected that more sophisticated techniques will over time find their way into such software.

To conclude, the great hope for the future is that more and more sophisticated agent-based modeling techniques from other application areas and other branches of AI will find their way into the control of virtual characters.

Bibliography

Primary Literature
Adamson A (Director) (2005) The chronicles of Narnia: the Lion, the Witch and the Wardrobe. Motion Picture. http://adisney.go.com/disneypictures/narnia/lb_main.html
Aitken M, Butler G, Lemmon D, Saindon E, Peters D, Williams G (2004) The Lord of the Rings: the visual effects that brought middle earth to the screen. In: International conference on computer graphics and interactive techniques (SIGGRAPH), course notes
Alexander T (2003) Parallel-state machines for believable characters. In: Massively multiplayer game development. Charles River Media
Allers R, Minkoff R (Directors) (1994) The Lion King. Motion picture. http://disney.go.com/disneyvideos/animatedfilms/lionking/
Aylett R, Luck M (2000) Applying artificial intelligence to virtual reality: intelligent virtual environments. Appl Artif Intell 14(1):3–32
Badler N, Bindiganavale R, Bourne J, Allbeck J, Shi J, Palmer M (1999) Real time virtual humans. In: Proceedings of the international conference on digital media futures
Bates J (1992a) The nature of characters in interactive worlds and the Oz project. Technical report CMU-CS-92–200. School of Computer Science, Carnegie Mellon University
Bates J (1992b) Virtual reality, art, and entertainment. Presence J Teleoper Virtual Environ 1(1):133–138
Berger L (2002) Scripting: overview and code generation. In: Rabin S (ed) AI game programming wisdom. Charles River Media, Hingham
Bird B, Pinkava J (Directors) (2007) Ratatouille. Motion picture. http://disney.go.com/disneyvideos/animatedfilms/ratatouille/
Blumberg B (1996) Old tricks, new dogs: ethology and interactive creatures. Ph.D. Thesis, Media Lab, Massachusetts Institute of Technology
Bowden M (2000) Black Hawk Down. Corgi Adult
Burke R, Isla D, Downie M, Ivanov Y, Blumberg B (2002) Creature smarts: the art and architecture of a virtual brain. In: Proceedings of game-on 2002: the 3rd international conference on intelligent games and simulation, pp 89–93
Burton T (Director) (1992) Batman returns. Motion picture. http://www.warnervideo.com/batmanmoviesondvd/
Carless S (2005) Postcard from SGS 2005: Hazmat: hotzone – first-person first responder gaming. Retrieved Oct 2007, from Gamasutra: www.gamasutra.com/features/20051102/carless_01b.shtml
Christian M (2002) A simple inference engine for a rule based architecture. In: Rabin S (ed) AI game programming wisdom. Charles River Media, Hingham
Darnell E, Johnson T (Directors) (1998) Antz. Motion picture. http://www.dreamworksanimation.com/
DeMaria R (2005) Postcard from the serious games summit: how the United Nations fights hunger with food force. Retrieved Oct 2007, from Gamasutra: www.gamasutra.com/features/20051104/demaria_01.shtml
Dybsand E (2001) A generic fuzzy state machine in C++. In: Rabin S (ed) Game programming gems 2. Charles River Media, Hingham
Evans R (2002) Varieties of learning. In: Rabin S (ed) AI game programming wisdom. Charles River Media, Hingham
Faloutsos P, van de Panne M, Terzopoulos D (2001) The virtual stuntman: dynamic characters with a repertoire of
Forbus K, Nielsen P, Faltings B (1991) Qualitative spatial reasoning: the CLOCK project. Artif Intell 51:1–3
Forbus K, Mahoney J, Dill K (2001) How qualitative spatial reasoning can improve strategy game AIs. In: Proceedings of the AAAI spring symposium on AI and interactive entertainment
Funge J (1999) AI for games and animation: a cognitive modeling approach. A.K. Peters, Natick
Hayes-Roth B, Doyle P (1998) Animate characters. Auton Agents Multi-Agent Syst 1(2):195–230
Horswill I (2007) Psychopathology, narrative, and cognitive architecture (or: why NPCs should be just as screwed up as we are). In: Proceedings of AAAI fall symposium on intelligent narrative technologies
Horswill I, Zubek R (1999) Robot architectures for believable game agents. In: Proceedings of the 1999 AAAI spring symposium on artificial intelligence and computer games
Houlette R, Fu D (2003) The ultimate guide to FSMs in games. In: Rabin S (ed) AI game programming wisdom 2. Charles River Media, Hingham
IGDA (2003) Working group on rule-based systems report. International Games Development Association
Isbister K, Doyle P (2002) Design and evaluation of embodied conversational agents: a proposed taxonomy. In: Proceedings of the AA-MAS02 workshop on embodied conversational agents: lets specify and compare them! Bologna
Jackson P (Director) (2001) The lord of the rings: the fellowship of the ring. Motion picture. http://www.lordoftherings.net/
Jackson P (Director) (2002) The lord of the rings: the two towers. Motion picture. http://www.lordoftherings.net/
Jackson P (Director) (2003) The lord of the rings: the return of the king. Motion picture. http://www.lordoftherings.net/
Johnston O, Thomas F (1995) The illusion of life: Disney animation. Disney Editions, New York
Jones R, Laird J, Neilsen P, Coulter K, Kenny P, Koss F (1999) Automated intelligent pilots for combat flight simulation. AI Mag 20(1):27–42
Khoo A, Zubek R (2002) Applying inexpensive AI techniques to computer games. IEE Intell Syst Spec Issue Interact Entertain 17(4):48–53
Koeppel D (2002) Massive attack. http://www.popsci.com/popsci/science/d726359b9fa84010vgnvcm1000004eecbccdrcrd.html. Accessed Oct 2007
Laird J (2000) An exploration into computer games and computer generated forces. In: The 8th conference on computer generated forces and behavior representation
Laird J, van Lent M (2000) Human-level AI's killer appli-
autonomous motor skills. Comput Graph cation: interactive computer games. In: Proceedings of
25(6):933–953 the 17th national conference on artificial intelligence
Farenc N, Musse S, Schweiss E, Kallmann M, Aune O, Laird J, Rosenbloom P, Newell A (1984) Towards
Boulic R et al (2000) A paradigm for controlling virtual Chunking as a general learning mechanism. In: The
humans in urban environment simulations. Appl Artif 1984 national conference on artificial intelligence
Intell J Special Issue Intell Virtual Environ 14(1):69–91 (AAAI), pp 188–192
Feng-Hsiung H (2002) Behind deep blue: building the Laramée F (2002) A rule based architecture using
computer that defeated the world chess champion. Dempster-Schafer theory. In: Rabin S (ed) AI game
Princeton University Press, Princeton programming wisdom. Charles River Media, Hingham
910 Computer Graphics and Games, Agent-Based Modeling in
Lasseter J, Stanton A (Directors) (1998) A Bug’s life; In: Rabin S (ed) AI game programming wisdom
Motion picture. http://www.pixar.com/featurefilms/abl/ 2. Charles River Media, Hingham
Leonard T (2003) Building an AI sensory system: exam- Reynolds C (1987) Flocks, herds and schools: a distributed
ining the design of thief: the dark project. In: Proceed- behavioral model. Comput Graph 21(4):25–34
ings of the 2003 game developers’ conference, San Jose Rodriguez R (Director) (1996) From Dusk ’Till Dawn.
Loyall B (1997) Believable agents: building interactive Motion picture
personalities. Ph.D. thesis, Carnegie Melon University Rosenbloom P, Laird J, Newell A (1993) The SOAR
Määta A (2002) Realistic level design for Max Payne. In: papers: readings on integrated intelligence. MIT Press
Proceedings of the 2002 game developer’s conference, Sánchez-Crespo D (2006) GDC: physical gameplay in
GDC 2002 Half-Life 2. Retrieved Oct 2007, from gamasutra.
Mac Namee B, Cunningham P (2003) Creating socially com: http://www. gamasutra.com/features/20060329/
interactive non player characters: the m-SIC system. Int sanchez_01.shtml
J Intell Games Simul 2(1) Shao W, Terzopoulos D (2005) Autonomous pedestrians.
Mac Namee B, Dobbyn S, Cunningham P, O’Sullivan In: Proceedings of SIGGRAPH/EG symposium on
C (2003) Simulating virtual humans across diverse computer animation, SCA’05, pp 19–28
situations. In: Proceedings of intelligent virtual Silverman BG, Bharathy G, O’Brien K, Cornwell J (2006a)
agents’03, pp 159–163 Human behavior models for agents in simulators and
Mac Namee B, Rooney P, Lindstrom P, Ritchie A, games: part II: Gamebot engineering with PMFserv.
Boylan F, Burke G (2006) Serious gordon: using seri- Presence Teleoper Virtual Worlds 15(2):163–185
ous games to teach food safety in the kitchen. In: The Silverman BG, Johns M, Cornwell J, O’Brien K (2006b)
9th international conference on computer games: AI, Human behavior models for agents in simulators and
animation, mobile, educational & serious games games: part I: enabling science with PMFserv. Presence
CGAMES06, Dublin Teleoper Virtual Environ 15(2):139–162
Magerko B, Laird JE, Assanie M, Kerfoot A, Stokes Smith P (2002) Polygon soup for the programmer’s soul:
D (2004) AI characters and directors for interactive 3D path finding. In: Proceedings of the game devel-
computer games. In: The 2004 innovative applications oper’s conference 2002, GDC2002
of artificial intelligence conference. AAAI Press, San Snavely P (2002) Agent cooperation in FSMs for baseball.
Jose In: Rabin S (ed) AI game programming wisdom.
Michael D, Chen S (2005) Serious games: games that Charles River Media, Hingham
educate, train, and inform. Course Technology PTR Stanley KO, Bryant BD, Karpov I, Miikkulainen R (2006)
Muller J (1996) The design of intelligent agents: a layered RealTime evolution of neural networks in the NERO
approach. Springer, Berlin video game. In: Proceedings of the twenty-first national
Musse RS, Thalmann D (2001) A behavioral model for real conference on artificial intelligence, AAAI-2006.
time simulation of virtual human crowds. IEEE Trans AAAI Press, pp 1671–1674
Vis Comput Graph 7(2):152–164 Stout B (1996) Smart moves: intelligent path-finding.
Nareyek A (2001) Constraint based agents. Springer, Game Dev Mag Oct
Berlin Takahashi TS (1992) Behavior simulation by network
Nareyek A (2007) Game AI is dead. Long live game AI! model. Mem Kougakuin Univ 73, pp 213–220
IEEE Intell Syst 22(1):9–11 Terzopoulos D, Tu X, Grzeszczuk R (1994) Artificial fishes
Nieborg D (2004) America’s army: more than a game. with autonomous locomotion, perception, behavior and
Bridging the gap; Transforming knowledge into action learning, in a physical world. In: Proceedings of the
through gaming and simulation. In: Proceedings of the artificial life IV workshop. MIT Press
35th conference of the international simulation and Thalmann MN, Thalmann D (1994) Artificial life and
gaming association (ISAGA), Munich virtual reality. Wiley, Chichester
Ortony A, Clore GL, Collins A (1988) The cognitive Thompson C (2007) Halo 3: how Microsoft labs invented a
structure of emotions. Cambridge University Press, new science of play. Retrieved Oct 2007, from wired.
Cambridge com: http://www.wired.com/gaming/virtualworlds/
Perlin K, Goldberg A (1996) Improv: a system for scripting magazine/15-09/ff_halo
interactive actors in virtual worlds. In: Proceedings of Toth J, Graham N, van Lent M (2003) Leveraging gaming
the ACM computer graphics annual conference, in DOD modelling and simulation: integrating perfor-
pp 205–216 mance and behavior moderator functions into a general
Proyas A (Director) (2004) I, Robot. Motion picture. http:// cognitive architecture of playing and non-playing char-
www. irobotmovie.com acters. In: Twelfth conference on behavior representa-
Rao AS, Georgeff MP (1991) Modeling rational agents tion in modeling and simulation (BRIMS, formerly
within a BDI-architecture. In: Proceedings of knowl- CGF), Scotsdale
edge representation and reasoning (KR&R-91). Mor- Valdes R (2004) In the mind of the enemy: the artificial
gan Kaufmann, pp 473–484 intelligence of Halo 2. Retrieved Oct 2007, from How-
Reed C, Geisler B (2003) Jumping, climbing, and tactical StuffWorks.com: http://entertainment.howstuffworks.
reasoning: how to get more out of a navigation system. com/ halo2-ai.htm
Computer Graphics and Games, Agent-Based Modeling in 911
van der Werf E, Uiterwijk J, van den Herik J (2002) Books and Reviews
Programming a computer to play and solve DeLoura M (ed) (2000) Game programming gems. Charles
Ponnuki-go. In: Proceedings of game-on 2002: the River Media, Hingham
3rd international conference on intelligent games and DeLoura M (ed) (2001) Game programming gems
simulation, pp 173–177 2. Charles River Media, Hingham
van Lent M, McAlinden R, Brobst P (2004) Enhancing the Dickheiser M (ed) (2006) Game programming gems
behavioral fidelity of synthetic entities with human 6. Charles River Media, Hingham
behavior models. In: Thirteenth conference on behavior Kirmse A (ed) (2004) Game programming gems 4. Charles
representation in modeling and simulation (BRIMS) River Media, Hingham
Woodcock S (2000) AI roundtable moderator’s report. In: Pallister K (ed) (2005) Game programming gems 5. Charles
Proceedings of the game developer’s conference 2000 River Media, Hingham
(GDC2000) Rabin S (ed) (2002) Game AI wisdom. Charles River
Wooldridge M, Jennings N (1995) Intelligent agents: the- Media, Boston
ory and practice. Know Eng Rev 10(2):115–152 Rabin S (ed) (2003) Game AI wisdom 2. Charles River
Yerkes RW, Dodson JD (1908) The relation of strength of Media, Boston
stimulus to rapidity of habit formation. J Comp Neurol Rabin S (ed) (2006) Game AI wisdom 3. Charles River
Psychol 18:459–482 Media, Boston
Zubek R, Horswill I (2005) Hierarchical parallel Markov Russell S, Norvig P (2002) Artificial intelligence: a modern
models of interaction. In: Proceedings of the artificial approach. Prentice Hall
intelligence and interactive digital entertainment con- Treglia D (ed) (2002) Game programming gems 3. Charles
ference, AIIDE 2005 River Media, Hingham
Agent-Based Modeling, Large-Scale Simulations

Hazel R. Parry
Central Science Laboratory, York, UK

Article Outline
Glossary
Definition of the Subject
Introduction
Large Scale Agent Based Models: Guidelines for Development
Parallel Computing
Example
Future Directions
Bibliography

Glossary

Agent A popular definition of an agent, particularly in AI research, is that of Wooldridge (Wooldridge 1999, pp. 29): "an agent is a computer system that is situated in some environment, and that is capable of autonomous action in this environment in order to meet its design objectives". In particular, it is the autonomy, flexibility, inter-agent communication, reactivity and proactiveness of the agents that distinguishes the paradigm and gives power to agent-based models and multi-agent simulation (Heppenstall 2004; Jennings 2000). Multi-agent systems (MAS) comprise numerous agents, which are given rules by which they act and interact with one another to achieve a set of goals.

Block Mapping A method of partitioning an array of elements between nodes of a distributed system, where the array elements are partitioned as evenly as possible into blocks of consecutive elements and assigned to processors. The size of the blocks approximates to the number of array elements divided by the number of processors.

Complexity Complexity and complex systems pertain to ideas of randomness and irregularity in a system, where individual-scale interactions may result in either very complex or surprisingly simple patterns of behavior at the larger scale. Complex agent-based systems are therefore usually made up of agents interacting in a non-linear fashion. The agents are capable of generating emergent behavioral patterns, of deciding between rules and of relying upon data across a variety of scales. The concept allows for studies of interaction between hierarchical levels rather than fixed levels of analysis.

Cyclic Mapping A method of partitioning an array of elements between nodes of a distributed system, where the array elements are partitioned by cycling through each node and assigning individual elements of the array to each node in turn.

Grid Computer 'Grids' are comprised of a large number of disparate computers (often desktop PCs) that are treated as a virtual cluster when linked to one another via a distributed communication infrastructure (such as the internet or an intranet). Grids facilitate sharing of computing, application, data and storage resources. Grid computing crosses geographic and institutional boundaries, lacks central control, and is dynamic as nodes are added or removed in an uncoordinated manner. BOINC computing is a form of distributed computing where idle time on CPUs may be used to process information (http://boinc.berkeley.edu/).

Ising-type Model Ising-type models have been primarily used in the physical sciences. They simulate behavior in which individual elements (e.g., atoms, animals, social behavior, etc.) modify their behavior so as to conform to the behavior of other individuals in their vicinity. Conway's Game of Life is an Ising-type model, where cells are in one of two states: dead or alive. In biology, the technique is used to model neural networks and flocking birds, for example.
Message Passing (MP) Message passing (MP) is the principal way by which parallel clusters of machines are programmed. It is a widely-used, powerful and general method of enabling distribution and creating efficient programs (Pacheco 1997). Key advantages of using MP architectures are an ability to scale to many processors, flexibility, 'future-proofing' of programs and portability (Openshaw and Turton 2000).

Message Passing Interface (MPI) A computing standard that is used for programming parallel systems. It is implemented as a library of code that may be used to enable message passing in a parallel computing system. Such libraries have largely been developed in C and Fortran, but are also used with other languages such as Java (MPIJava, http://www.hpjava.org). It enables developers of parallel software to write parallel programs that are both portable and efficient.

Multiple Instruction Multiple Data (MIMD) Parallelization where different algorithms are applied to different data items on different processors.

Parallel Computer Architecture A parallel computer architecture consists of a number of identical units that contain CPUs (Central Processing Units) which function as ordinary serial computers. These units, called nodes, are connected to one another (Fig. 1). They may transfer information and data between one another (e.g. via MPI) and simultaneously perform calculations on different data.

Single Instruction Multiple Data (SIMD) SIMD techniques exploit data level parallelism: when a large mass of data of a uniform type needs the same instruction performed on it. An example is a vector or array processor. An application that may take advantage of SIMD is one where the same value is being added (or subtracted) to a large number of data points.

Vector Computer/Vector Processor Vector computers contain a CPU designed to run mathematical operations on multiple data elements simultaneously (rather than sequentially). This form of processing is essentially a SIMD approach. The Cray Y-MP and the Convex C3880 are two examples of vector processors used for supercomputing in the 1980s and 1990s. Today, most recent commodity CPU designs include some vector processing instructions.

Definition of the Subject

'Large scale' simulations in the context of agent-based modelling are not only simulations that are large in terms of the size of the simulation (number of agents simulated), but they are also complex. Complexity is inherent in agent-based models, as they are usually composed of dynamic, heterogeneous, interacting agents. Large scale agent-based models have also been referred to as 'Massively Multi-agent Systems (MMAS)' (Ishida et al. 2005). MMAS is defined as 'beyond resource limitation': "the number of agents exceeds local computer resources, or the situations are too complex to design/program given human cognitive resource limits" (Ishida et al. 2005, Preface). Therefore, for agent-based modelling 'large scale' is not simply a size problem, it is …

Agent-Based Modeling, Large-Scale Simulations, Fig. 1 A network with interconnected separate memory and processors. (After Pacheco 1997, pp. 19)
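As a concrete illustration of the message-passing style described in the glossary (and of the node layout sketched in Fig. 1), the sketch below shows worker nodes sending their local agent counts to a control node, which sums them. This is an illustrative sketch only, not code from the chapter: the class name and message tag are arbitrary, and the call signatures follow the mpiJava 1.2 binding (http://www.hpjava.org) cited above, so they may differ in other Java MPI bindings.

```java
import mpi.MPI;

// Hedged sketch: worker nodes (rank != 0) send a local agent count to the
// control node (rank 0), which accumulates a global total. Signatures assume
// the mpiJava 1.2 binding; class name and tag are hypothetical.
public class AgentCountGather {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();   // this node's identifier
        int size = MPI.COMM_WORLD.Size();   // total number of nodes
        int tag = 99;                        // arbitrary message tag

        int[] local = new int[] { 1000 + rank };   // stand-in for agents held on this node
        if (rank != 0) {
            MPI.COMM_WORLD.Send(local, 0, 1, MPI.INT, 0, tag);
        } else {
            int total = local[0];
            int[] buf = new int[1];
            for (int src = 1; src < size; src++) {
                MPI.COMM_WORLD.Recv(buf, 0, 1, MPI.INT, src, tag);
                total += buf[0];
            }
            System.out.println("Agents across " + size + " nodes: " + total);
        }
        MPI.Finalize();
    }
}
```

A production model would more likely use a collective reduction rather than point-to-point sends, but the point-to-point form mirrors the control-node pattern used later in the chapter's example.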
Agent-Based Modeling, Large-Scale Simulations, Table 1 Potential solutions to implement when faced with a large number of agents to model

Solution: Reduce the number of agents. Pro: No reprogramming of model. Con: Unrealistic population; alters model behavior in order for model to run.
Solution: Revert to a population-based modelling approach. Pro: Could potentially handle any number of individuals. Con: Lose insights from agent approach; unsuitable for research questions; construction of entirely new model (non-agent-based).
Solution: Invest in an extremely powerful computer. Pro: No reprogramming of model. Con: High cost.
Solution: Run the model on a vector computer. Pro: Potentially more efficient as more calculations may be performed in a given time. Con: This approach only works more efficiently with SIMD, probably unsuitable for agent-based models.
Solution: Super-individuals (Scheffer et al. 1995). Pro: Relatively simple solution, keeping model formulation the same. Con: Reprogramming of model; inappropriate in a spatial context (Parry 2006; Parry and Evans In press).
Solution: Invest in a powerful computer network and reprogram the model in parallel. Pro: Makes available high levels of memory and processing power. Con: High cost; advanced computing skills required for restructuring of model.
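The super-individual option in Table 1 amounts to replacing many identical agents with a single object that carries a count. The sketch below illustrates that grouping; the class and field names are hypothetical and chosen only to echo the age/sex grouping discussed in the text, not taken from Scheffer et al. (1995).

```java
import java.util.*;

// Hedged sketch of the 'super-individual' idea: individuals sharing the same
// characteristics (here age and sex) collapse into one object holding a count.
public class SuperIndividualDemo {

    static final class SuperIndividual {
        final int age;
        final char sex;
        long count;   // how many real individuals this object represents
        SuperIndividual(int age, char sex) { this.age = age; this.sex = sex; }
    }

    /** Collapse (age, sex) individuals into super-individuals keyed on shared state. */
    static Collection<SuperIndividual> aggregate(List<int[]> individuals) {
        Map<Long, SuperIndividual> groups = new HashMap<>();
        for (int[] ind : individuals) {
            int age = ind[0];
            char sex = (char) ind[1];
            long key = ((long) age << 16) | sex;   // simple composite key
            groups.computeIfAbsent(key, k -> new SuperIndividual(age, sex)).count++;
        }
        return groups.values();
    }

    public static void main(String[] args) {
        List<int[]> population = Arrays.asList(
                new int[] {1, 'F'}, new int[] {1, 'F'}, new int[] {2, 'M'});
        for (SuperIndividual s : aggregate(population)) {
            System.out.println("age=" + s.age + " sex=" + s.sex + " count=" + s.count);
        }
    }
}
```

As the surrounding text notes, this only helps when agents genuinely share state; once spatial position or other continuous attributes differ between individuals, the grouping breaks down (Parry 2006; Parry and Evans In press).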
… types) would mean that running a simulation on a vector computer may make little difference to the simulation performance. This is because an agent model typically has few elements that could take advantage of SIMD: rarely will the same value be added (or subtracted) to a large number of data points. Vector processors are less successful when a program does not have a regular structure, and they don't scale to arbitrarily large problems (the upper limit on the speed of a vector program will be some multiple of the speed of the CPU (Pacheco 1997)).

Another relatively simple option is to implement an aggregation of the individual agents into 'super-agents', such as the 'super-individual' approach of Scheffer et al. (1995). The basic concept of this approach is shown in Fig. 2. These 'super-agents' are formed from individual agents that share the same characteristics, such as age and sex. However, it may not be possible to group agents in a simulation in this way and, importantly, this method has been proven ineffective in a spatial context (Parry 2006; Parry and Evans In press).

Agent-Based Modeling, Large-Scale Simulations, Fig. 2 'Super-agents': grouping of individuals into single objects that represent the collective

The most challenging solution, reprogramming the model in parallel, is popular due to the shortcomings of the other approaches outlined above. A parallel solution may also have some monetary cost and require advanced computing skills to implement, but it can potentially greatly increase the scale of the agent simulation. Extremely large scale object-oriented simulations that simulate individual particles on massively parallel computer systems have been successfully developed in the physical sciences of fluid dynamics, meteorology and materials science. In the early 1990s, work in the field of molecular-dynamics (MD) simulations proved parallel platforms to be highly successful in enabling large-scale MD simulation of up to 131 million particles (Lomdahl et al. 1993). Today the same code has been tested and used to simulate up to 320 billion atoms on the BlueGene/L architecture containing 131,072 IBM PowerPC440 processors (Kadau et al. 2006). These simulations include calculations based upon the short-range interaction between the individual atoms, and thus in some ways approximate to agent simulations, although in other ways molecular-dynamics simulations lack the complexity of most agent-based models (see Table 2).

There are significant decisions to be made when considering the application of a computing solution such as parallel programming to solve the problem of large numbers of agents. In addition to the issue of reprogramming the model to run on a parallel computer architecture, it is also necessary to consider the additional complexity of agents (as opposed to atoms), so that programming models and tools facilitate the deployment, management and control of agents in the distributed simulation (Gasser et al. 2005). For example, distributed execution resources and timelines must be managed, full encapsulation of agents must be enforced, and tight control over message-based
Agent-Based Modeling, Large-Scale Simulations, Table 2 Key elements of a 'bottom-up' simulation that may affect the way in which it may scale. Agent simulations tend to be complex (to the right of the table), though may have some elements that are less complex, such as local or fixed interactions

Element: Spatial structure. Least complex: Aspatial, or a lattice of cells (1D, 2D or 3D+). Most complex: Continuous space.
Element: Internal state. Least complex: Simple representation (boolean true or false). Most complex: Complex representation (many states from an enumerable set) or fuzzy variable values.
Element: Agent heterogeneity. Least complex: No. Most complex: Yes.
Element: Interactions. Least complex: Local and fixed (within a neighborhood). Most complex: Multiple different ranges and stochastic.
Element: Synchrony of model updates. Least complex: Synchronous update. Most complex: Not synchronous; asynchrony due to state-transition rules.
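To make Table 2 concrete, the sketch below shows how those elements typically surface in code: continuous coordinates, a many-valued internal state, per-agent parameters and a stochastic interaction range all sit at the "most complex" end of the table. The class and its fields are hypothetical illustrations, not taken from any model discussed in this chapter.

```java
import java.util.Random;

// Hedged sketch of the model elements classified in Table 2.
public class ComplexAgent {
    enum Behaviour { FORAGING, MIGRATING, REPRODUCING }   // many-valued internal state

    double x, y;               // continuous spatial structure
    Behaviour state;           // internal state
    final double fecundity;    // heterogeneity: differs between agents
    final Random rng = new Random();

    ComplexAgent(double x, double y, double fecundity) {
        this.x = x;
        this.y = y;
        this.fecundity = fecundity;
        this.state = Behaviour.FORAGING;
    }

    /** Stochastic, variable interaction range (Table 2: "multiple different ranges and stochastic"). */
    double interactionRange() {
        return 1.0 + rng.nextDouble() * 4.0;
    }

    public static void main(String[] args) {
        ComplexAgent a = new ComplexAgent(10.5, 3.2, 0.8);
        System.out.println(a.state + " agent at (" + a.x + ", " + a.y
                + ") interacts within " + a.interactionRange());
    }
}
```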
multi-agent interactions is necessary (Gasser et al. 2005). Agent models can vary in complexity, but most tend to be complex, especially in the key model elements of spatial structure and agent heterogeneity. Table 2 gives an indication of the relative complexity of model elements found in models that focus on individual interactions (which encompasses both multi-agent models and less complex, 'Ising'-type models).

The following sections detail guidelines for the development of a large scale agent-based model, highlighting in particular the challenges faced in writing large scale, high performance agent-based modelling (ABM) simulations and giving a suggested development protocol. Following this, an example is given of the parallelization of a simple agent-based model, showing some of the advantages but also the pitfalls of this most popular solution. Key challenges, including difficulties that may arise in the analysis of agent-based models at a large scale, are highlighted. Alternative solutions are then discussed and some conclusions are drawn on the way in which large scale agent-based simulation may develop in coming years.

Large Scale Agent Based Models: Guidelines for Development

Key Considerations
There is no such thing as a standard agent-based model, or even a coherent methodology for agent simulation development (although recent literature in a number of fields sets out some design protocols, e.g. Gilbert (2007) and Grimm et al. (2006)). Thus, there can be no standard method to develop a large scale agent-based model. However, there are certain things to consider when planning to scale up a model. Some key questions to ask about the model are as follows:

1. What program design do you already have and what is the limitation of this design?
(a) What is the memory footprint for any existing implementation?
(b) What are your current run times?
2. What are your scaling requirements?
(a) How much do you need to scale now?
(b) How far do you need to scale eventually?
(c) How soon do you need to do it?
3. How simple is your model and how is it structured?
4. What are your agent complexities?
5. What are your output requirements?

The first question is to identify the limitations in the program design that you are using and to focus on the primary 'bottlenecks' in the model. These limitations will either be due to memory or speed (or perhaps both). Therefore it will be necessary to identify the memory footprint for your existing model, and analyze run times, identifying where the most time is taken or memory used by the simulation. It is primarily processor power that controls the speed of the simulation. Runtime will
also increase massively once Random Access Memory (RAM) is used up, as most operating systems will resort to virtual memory (i.e. hard drive space), and thus a slower mechanism with mechanical parts rather than solid-state technology engages. At this stage, it may be that simple adjustments to the code will improve the scalability of the model. However, if the code is efficient, other solutions will then need to be sought.

The second question is how much scaling is actually necessary for the model. It may be that a simple or interim solution (e.g. upgrading computer hardware) is acceptable whilst only moderate scaling is required, but longer term requirements should also be considered – a hardware upgrade may be a quick fix, but if the model may eventually be used for much larger simulations it is necessary to plan for the largest scaling that will potentially be required.

The third question, relating to model simplicity and structure, is key to deciding a methodology that can be used to scale a model up. A number of factors will affect whether a model will be easy to distribute in parallel, for example. These include whether the model iterates at each time step or is event driven, whether it is aspatial or spatial, and the level/type of agent interaction (both with one another and with the environment). More detail on the implications of these factors is given in section "Parallel Computing".

Agent complexity, in addition to model structure, may limit the options available for scaling up a model. For example, a possible scaling solution may be to group individual agents together as 'super-individuals' (Scheffer et al. 1995). However, if agents are too complex it may not be possible to determine a simple grouping system (such as by age), as agent behavior may be influenced heavily by numerous other state variables.

Output requirements are also important to consider. These may already be limiting the model, in terms of memory for data storage. Even if they are not currently limiting the model in this way, once the model is scaled up output data storage needs may be an issue, for example, if the histories of individual agents need to be stored. In addition, the way that output data is handled by the model may be altered if the model structure is altered (e.g. if agents are grouped together, output will be at an aggregate level). Thus, an important consideration is to ensure that output data is comparable to the original model and that it is feasible to output once the model structure is altered.

A Protocol
In relation to the key considerations highlighted above, a simple protocol for developing a large scale agent-based simulation can be defined as follows:

1. Optimize existing code.
2. Clearly identify scaling requirements (both for now and in the future).
3. Consider simple solutions first (e.g. a hardware upgrade).
4. Consider more challenging solutions.
5. Evaluate the suitability of the chosen scaling solution on a simplified version of the model before implementing it on the full model.

The main scaling solution to implement (e.g. from Table 1) is defined by the requirements of the model. Implementation of more challenging solutions should be done in stages, where perhaps a simplified version of the model is implemented on a larger scale. Agent simulation development should originate with a local, flexible 'prototype' and then, as the model development progresses and stabilizes, larger scale implementations can be experimented with (Gasser et al. 2005). This is necessary for a parallel implementation of a model, for example, as a simplified model enables an assessment of whether it is likely to provide the desired improvements in model efficiency. This is particularly the case for improvements in model speed, as this depends on improved processing performance that is not easily calculated in advance.

Parallel Computing

Increasing the capacity of an individual computer in terms of memory and processing power has limited ability to support large scale agent simulations, particularly due to the time the machine
would take to run the model using a single processor. However, by using multiple processors and a mix of distributed and shared memory working simultaneously, the scale of the problem for each individual computer is much reduced. Subsequently, simulations can run in a fraction of the time that would be taken to perform the same complex, memory intensive operations. This is the essence of parallel computing. 'Parallel computing' encompasses a wide range of computer architectures, from an HPC (high performance computing) Linux box, to dedicated multi-processor/multi-core systems (such as a Beowulf cluster), super clusters, local computer clusters or Grids and public computing facilities (e.g. Grid computers, such as the White Rose Grid, UK, http://www.wrgrid.org.uk/). The common factor is that these systems consist of a number of interconnected 'nodes' (processing units) that may perform simultaneous calculations on different data. These calculations may be the same or different, depending on whether a 'Single Instruction Multiple Data' (SIMD) or 'Multiple Instruction Multiple Data' (MIMD) approach is implemented.

In terms of MAS, parallel computing has been used to develop large scale agent simulations in a number of disciplines. These range from ecology, e.g. Abbott et al. (1997), Immanuel et al. (2005), Lorek and Sonnenschein (1995), and Wang et al. (2004, 2005b, 2006a, 2006b), and biology, e.g. Castiglione et al. (1997) and Da-Jun et al. (2004), to social science, e.g. Takeuchi (2005), and computer science, e.g. Popov et al. (2003), including artificial intelligence and robotics, e.g. Bokma et al. (1994) and Bouzid et al. (2001).

Several key challenges arise when implementing an agent model in parallel, which may affect the increase in performance achieved. These include load balancing between nodes, synchronizing events to ensure causality, monitoring of the distributed simulation state, managing communication between nodes and dynamic resource allocation (Timm and Pawlaszczyk 2005). Good load balancing and inter-node communication with event synchronisation are central to the development of an efficient parallel simulation, and are further discussed below.

Load Balancing
In order to ensure the most efficient use of memory and processing resources in a parallel computing system, the data load must be balanced between processors and the work load equally distributed. If this is not the case then one computer may be idle as others are working, resulting in time delays and inefficient use of the system's capacity. There are a number of ways in which data can be 'mapped' to different nodes, and the most appropriate depends on the model structure. Further details and examples are given in Pacheco (1997), including 'block mapping' and 'cyclic mapping'. An example of 'block mapping' load balancing is given below, in section "Example".

In many simulations the computational demands on the nodes may alter over time, as the intensity of the agents' or environment's processing requirements varies on each node over time. In this case, dynamic load balancing techniques can be adopted to further improve the parallel model performance. For example, Jang (2006) and Jang and Agha (2005) use a form of dynamic load balancing with object migration they term "Adaptive Actor Architecture". Each agent platform monitors the workload of its computer node and the communication patterns of agents executing on it, in order to redistribute agents according to their communication localities as agent platforms become overloaded. However, this approach does introduce additional processing overheads, so it is only worth implementing for large scale agent simulations where some agents communicate with one another more intensely than other agents (communication locality is important) or communication patterns are continuously changing, so that static agent allocation is not efficient.

Communication Between Nodes
It is important to minimize inter-node communication when constructing a parallel agent simulation, as this may slow the simulation down significantly if the programmer is not careful (Takahashi and Mizuta 2006; Takeuchi 2005). The structure of the model itself largely determines the way in which data should be split and information transferred between nodes to
maximize efficiency. Agent simulations generally by definition act spatially within an environment. Thus, an important first consideration is whether to split the agents or the environment between nodes. The decision as to whether to split the agents between processors, or elements of the environment such as grid cells, largely depends upon the complexity of the environment, the mobility of the agents, and the number of interactions between the agents. If the environment is relatively simple (thus information on the whole environment may be stored on all nodes), it is probably most efficient to distribute the agents. This is particularly the case if the agents are highly mobile, as a key problem when dividing the environment between processors is the transfer of agents or information between processors. However, if there are complex, spatially defined interactions between agents, splitting agents between nodes may be problematic, as agents may be interacting with other agents that are spatially local in the context of the whole simulation but are residing on different processors. Conversely, therefore, if the agents are not very mobile but have complex, local interactions and/or the agents reside in a complex environment, it is probably best to split the environment between nodes (Logan and Theodoropolous 2001). Further efficiency may be achieved by clustering the agents which communicate heavily with each other (Takahashi and Mizuta 2006).

In models where there is high mobility and high interaction it is often possible, especially for ecological models, to find a statistical commonality that can be used as a replacement for more detailed interaction. For example, as will be shown in our example, if the number of local agent interactions is the only important aspect of the interactions, a density map of the agents, transferred to a central node, aggregated and redistributed, might allow agents to be divided between nodes without the issue of having to do detailed inter-agent communication between nodes a large number of times.

The way in which the simulation iterates may influence the approach taken when parallelizing the model. The model may update synchronously at a given time step or asynchronously (usually because the system is event-driven). In addition, agents may update asynchronously but the nodes may be synchronized at each time step or key model stage. Asynchronous updating may be a problem if there is communication between nodes, as some nodes may have to wait for others to finish processes before communication takes place and further processing is possible, resulting in blocking (see below). Communication between nodes then becomes highly complex (Wang et al. 2005a). It is important that messages communicated between agents are received in the correct order; however, a common problem in distributed simulations is ensuring that this is so, as other factors, such as latency in message transmission across the network, may affect communication (Wang et al. 2005a).

A number of time management mechanisms exist that may be implemented to manage message passing in order to ensure effective node-to-node communication, e.g. Fujimoto (1998).

Blocking and Deadlocking
Deadlock occurs when two or more processes are waiting for communication from one of the other processes. When programming a parallel simulation it is important to avoid deadlock to ensure the simulation completes. The simplest example is when two processors are programmed to receive from the other processor before that processor has sent. This may be simply resolved by changing the order in which tasks are executed, or by using 'non-blocking' message passing. Where blocking is used, processing on a node waits until a message is transmitted. However, when 'non-blocking' is used, processing continues even if the message hasn't been transmitted yet. The use of a non-blocking MPI may reduce computing times, and work can be performed while communication is in progress.

Example

To demonstrate some of the benefits and pitfalls of parallel programming for a large scale agent-based model, a simple example is given here. This summarizes a simplified agent-based model
of aphid population dynamics in agricultural landscapes of the UK, which was parallelized to cope with millions of agents, as described in detail in Parry (2006), Parry and Evans (In press) and Parry et al. (2006).

A key problem with the original, non-parallel aphid simulation was that it was hindered by memory requirements, which were far larger than could be accommodated at any individual processing element. This is a common computing problem (Chalmers and Tidmus 1996). The data storage required for each aphid object in a landscape scale simulation quickly exceeded the storage capacity of a PC with up to 2097 MB of RAM. The combined or 'virtual shared' memory of several computers was used to cope with the amount of data needed, using a Single Instruction Multiple Data (SIMD) approach.

Message-passing techniques were used to transfer information between processors, to distribute the agents in the simulation across a Beowulf cluster (a 30-node distributed memory parallel computer). A Message-passing Interface (MPI) for Java was used, MPIJava (http://www.hpjava.org), which wraps around the open-source native MPI 'LAM' (http://www.lam-mpi.org/). Further details on the methods used to incorporate the MPI into the model are given in Parry (2006) and Parry et al. (2006).

Effective parallelization minimizes the passing of information between nodes, as it is processor intensive. In the example model, only the environment object and information on the number of agents to create on each node are passed from a single control node to each of the other nodes in the cluster, and only density information is returned to the control node for redistribution and display. The control node manages the progress of the model, acts as a central communication point for the model and handles any code that may not be distributed to all nodes (such as libraries from an agent toolkit or a GUI). Structuring a model without a control node is possible, or the control node may also be used to process data, depending on the requirements of the simulation. Transfer of density values, rather than agents, significantly reduced the computational overheads for message passing between the nodes. The model was simple enough that specific inter-agent communication between nodes was not necessary.

Even distribution of data between nodes was achieved by splitting immigrant agents evenly across the system, with each node containing information on the environment and local densities passed from the control node. The number of immigrants to be added to each node was calculated by a form of 'block mapping' (Pacheco 1997, p. 35), which partitioned the number of immigrants into blocks which were then assigned to each node. So, if there were three nodes (n = 3) and thirteen immigrants (i = 13), the immigrants mapped to each node would be as follows:

i0, i1, i2, i3 → n1
i4, i5, i6, i7 → n2
i8, i9, i10, i11, i12 → n3

As thirteen does not divide evenly by three, the thirteenth agent is added to the final node.

Benefits
Simulation runtime and memory availability were greatly improved by implementing the simple aphid model in parallel across a large number of nodes. The greatest improvement in simulation runtime and memory availability was seen when the simulation was run across the maximum number of nodes (25) (Figs. 3 and 4). The largest improvement in speed given by the parallel model in comparison to the non-parallel model is when more than 500,000 agents are run across 25 nodes, although the parallel model is slower by comparison for lower numbers. This means that additional processing power is required in the parallel simulation compared to the original model, such that only when very large numbers of agents are run does it become more efficient.

Agent-Based Modeling, Large-Scale Simulations, Fig. 3 Plot of the mean maximum memory used (per node) against number of agents for the model: comparison between simulations using 2, 5 and 25 nodes and the non-parallel model (single processor)

Agent-Based Modeling, Large-Scale Simulations, Fig. 4 Plot of the percentage speed up (per node) from the non-parallel model against number of agents for the model: comparison between simulations using 2, 5 and 25 nodes and the non-parallel model (single processor)

Pitfalls
Although there are clear benefits of distributing the example simulation across a large number of nodes, the results highlight that the parallel approach is not always more efficient than the original single processor implementation of a model. In the example, the two node simulation used more memory on the worker node than the non-parallel model when the simulation had 100,000 agents or above. This is due to additional memory requirements introduced by message passing and extra calculations required in the parallel implementation (which are less significant when more nodes are used, as these requirements remain relatively constant).

The results also highlight that adding more processors does not necessarily increase the model speed. The example model shows that for simulations run on two nodes (one control node, one worker node) the simulation takes longer to run in parallel compared to the non-parallel model. Message passing time delay and the modified structure of the code are responsible. The greatest improvement in speed is when more than 500,000 agents are run across 25 nodes; however, when lower numbers of nodes are used the relationship between the number of nodes and speed is complex: for 100,000 agents five nodes are faster than the non-parallel model, but for 500,000 the non-parallel model is faster. Overall, these results suggest that when memory is sufficient on a single processor, it is unlikely ever to be efficient to parallelize the code, as when the number of individuals was low the parallel simulation took longer and was less efficient than the non-parallel model run on a single node.

This demonstrates that in order to effectively parallelize an agent model, the balance between the advantage of increasing the memory availability and the cost of communication between nodes must be assessed. By following an iterative development process as suggested in section …
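The block mapping used in the example above can be sketched in a few lines. This is an illustrative reconstruction of the partitioning rule as described (base blocks of i/n agents, with the remainder added to the final node), not code from the original aphid model.

```java
// Hedged sketch of block mapping as described in the example: 13 immigrants
// over 3 nodes gives blocks of 4, 4 and 5 (remainder goes to the final node).
public class BlockMapping {

    static int[] blockSizes(int agents, int nodes) {
        int[] sizes = new int[nodes];
        int block = agents / nodes;              // base block size
        for (int node = 0; node < nodes; node++) {
            sizes[node] = block;
        }
        sizes[nodes - 1] += agents % nodes;      // remainder added to the final node
        return sizes;
    }

    public static void main(String[] args) {
        int[] sizes = blockSizes(13, 3);
        for (int node = 0; node < sizes.length; node++) {
            System.out.println("node " + (node + 1) + " receives " + sizes[node] + " immigrants");
        }
        // prints: node 1 receives 4, node 2 receives 4, node 3 receives 5
    }
}
```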
… upon MPI, it maps agent-based models onto parallel computers, where agents are written based upon their graph topography to minimize communication overhead. Another example is HLA_GRID_Repast (Zhang et al. 2005), 'a system for executing large scale distributed simulations of agent-based systems over the Grid', for users of the popular Repast agent toolkit. HLA_GRID_Repast is a middleware layer which enables the execution of a federation of multiple interacting instances of Repast models across a grid network with a High Level Architecture (HLA). This is a 'centralized coordination approach' to distributing an agent simulation across a network (Timm and Pawlaszczyk 2005). Examples of algorithms designed to enable dynamic distribution of agent simulations are given in Scheutz and Schermerhorn (2006).

Although parallel computing is often the most effective way of handling large scale agent-based simulations, there are still some significant obstacles to the use of parallel computing for MAS. As shown with the simple example given here, this may not always be the most effective solution, depending upon the increase in scale needed and the model complexity. Other possible methods were suggested in section "Introduction", but these may also be unsuitable. Another option could be to deconstruct the model and simplify only certain elements of the model using either parallel computing or one of the other solutions suggested in section "Introduction". Such a 'hybrid' approach is demonstrated by Zhang and Lui (2005), who combine equation-based approaches and multi-agent simulation with a cellular automaton to simulate the complex interactions in the process of human immune response to HIV. The result is a model where equations are used to represent within-site processes of HIV infection, and agent-based simulation is used to represent the diffusion of the virus between sites. It is therefore important to consider primarily the various ways in which the model may be altered, hybridized or simplified yet still address the core research questions, before investing money in hardware or investing time in the development of complex computational solutions.

Making the transition from a serial application to a parallel version is a process that requires a fair degree of formalism and program restructuring, so it is not to be entered into lightly without exploring the other options and the needs of the simulation first.

Overall, it is clear that disparate work is being done in a number of disciplines to facilitate large scale agent-based simulation, and knowledge is developing rapidly. Some of this work is innovative and highly advanced, yet inaccessible to researchers in other disciplines who may be unaware of key developments outside of their field. This chapter synthesizes and evaluates large scale agent simulation to date, providing a reference for a wide range of agent simulation developers.

Acknowledgments Many thanks to Andrew Evans (Multi-Agent Systems and Simulation Research Group, University of Leeds, UK) and Phil Northing (Central Science Laboratory, UK) for their advice on this chapter.

Bibliography

Primary Literature
Abbott CA, Berry MW, Comiskey EJ, Gross LJ, Luh H-K (1997) Parallel individual-based modeling of everglades deer ecology. IEEE Comput Sci Eng 4(4):60–78
Anderson J (2000) A generic distributed simulation system for intelligent agent design and evaluation. In: Proceedings of the AI, simulation & planning in high autonomy systems, Arizona
Bokma A, Slade A, Kerridge S, Johnson K (1994) Engineering large-scale agent-based systems with consensus. Robot Comput-Integr Manuf 11(2):81–91
Bouzid M, Chevrier V, Vialle S, Charpillet F (2001) Parallel simulation of a stochastic agent/environment interaction model. Integr Comput-Aided Eng 8(3):189–203
Castiglione F, Bernaschi M, Succi S (1997) Simulating the immune response on a distributed parallel computer. Int J Mod Phys C 8(3):527–545
Chalmers A, Tidmus J (1996) Practical parallel processing: an introduction to problem solving in parallel. International Thomson Computer Press, London
Da-Jun T, Tang F, Lee TA, Sarda D, Krishnan A, Goryachev A (2004) Parallel computing platform for the agent-based modeling of multicellular biological systems. In: Parallel and distributed computing: applications and technologies. Lecture notes in computer science, vol 3320, pp 5–8
Fujimoto RM (1998) Time management in the high level architecture. Simulation 71:388–400
Gasser L, Kakugawa K (2002) MACE3J: fast flexible distributed simulation of large, large-grain multi-agent systems. In: Proceedings of AAMAS
Gasser L, Kakugawa K, Chee B, Esteva M (2005) Smooth scaling ahead: progressive MAS simulation from single PCs to Grids. In: Davidsson P, Logan B, Takadama K (eds) Multi-agent and multi-agent-based simulation. Joint workshop MABS 2004, New York, 19 July 2004. Springer, Berlin
Gilbert N (2007) Agent-based models. Quantitative applications in the social sciences. SAGE, London
Grimm V, Railsback SF (2005) Individual-based modeling and ecology. Princeton series in theoretical and computational biology. Princeton University Press, Princeton, 480 pp
Grimm V, Berger U, Bastiansen F, Eliassen S, Ginot V, Giske J, Goss-Custard J, Grand T, Heinz S, Huse G, Huth A, Jepsen JU, Jorgensen C, Mooij WM, Muller B, Pe'er G, Piou C, Railsback SF, Robbins AM, Robbins MM, Rossmanith E, Ruger N, Strand E, Souissi S, Stillman RA, Vabo R, Visser U, DeAngelis DL (2006) A standard protocol for describing individual-based and agent-based models. Ecol Model 198(1–2):115–126
Guessoum Z, Briot J-P, Faci N (2005) Towards fault-tolerant massively multiagent system. In: Ishida T, Gasser L, Nakashima H (eds) Massively multi-agent systems I: first international workshop MMAS 2004, Kyoto, Dec 2004. Springer, Berlin
Heppenstall AJ (2004) Application of hybrid intelligent agents to modelling a dynamic, locally interacting retail market. PhD thesis, University of Leeds
Horling B, Lesser V (2005) Quantitative organizational models for large-scale agent systems. In: Ishida T, Gasser L, Nakashima H (eds) Massively multi-agent systems I: first international workshop MMAS 2004, Kyoto, Japan, Dec 2004. Springer, Berlin
Immanuel A, Berry MW, Gross LJ, Palmer M, Wang D (2005) A parallel implementation of ALFISH: simulating hydrological compartmentalization effects on fish dynamics in the Florida Everglades. Simul Model Pract Theory 13:55–76
Ishida T, Gasser L, Nakashima H (eds) (2005) Massively multi-agent systems I. First international workshop, MMAS 2004, Kyoto. Springer, Berlin
Jang MW (2006) Agent framework services to reduce agent communication overhead in large-scale agent-based simulations. Simul Model Pract Theory 14(6):679–694
Jang MW, Agha G (2005) Adaptive agent allocation for massively multi-agent applications. In: Ishida T, Gasser L, Nakashima H (eds) Massively multi-agent systems I: first international workshop MMAS 2004, Kyoto, Dec 2004. Springer, Berlin
Jennings NR (2000) On agent-based software engineering. Artif Intell 117:277–296
Kadau K, Germann TC, Lomdahl PS (2006) Molecular dynamics comes of age: 320 billion atom simulation on BlueGene/L. Int J Mod Phys C 17(12):1755
Lees M, Logan B, Theodoropoulos G (2002) Simulating agent-based systems with HLA: the case of SIM_AGENT. In: Proceedings of the 2002 European simulation interoperability workshop, pp 285–293
Lees M, Logan B, Oguara T, Theodoropoulos G (2003) Simulating agent-based systems with HLA: the case of SIM_AGENT – Part II. In: Proceedings of the 2003 European simulation interoperability workshop
Logan B, Theodoropolous G (2001) The distributed simulation of multi-agent systems. Proc IEEE 89(2):174–185
Lomdahl PS, Beazley DM, Tamayo P, Gronbechjensen N (1993) Multimillion particle molecular-dynamics on the CM-5. Int J Mod Phys C Phys Comput 4(6):1075–1084
Lorek H, Sonnenschein M (1995) Using parallel computers to simulate individual-oriented models in ecology: a case study. In: Proceedings: ESM'95 European simulation multiconference, Prague, June 1995
Luke S, Cioffi-Revilla C, Panait L, Sullivan K (2004) MASON: a new multi-agent simulation toolkit. In: Proceedings of the 2004 SwarmFest workshop
Openshaw S, Turton I (2000) High performance computing and the art of parallel programming: an introduction for geographers, social scientists, engineers. Routledge, London
Pacheco PS (1997) Parallel programming with MPI. Morgan Kauffman Publishers, San Francisco
Parry HR (2006) Effects of land management upon species population dynamics: a spatially explicit, individual-based model. PhD thesis, University of Leeds
Parry HR, Evans AJ (In press) A comparative analysis of parallel processing and super-individual methods for improving the computational performance of a large individual-based model. Ecol Model
Parry HR, Evans AJ, Heppenstall AJ (2006) Millions of agents: parallel simulations with the Repast agent-based toolkit. In: Trappl R (ed) Cybernetics and systems 2006, Proceedings of the 18th European meeting on cybernetics and systems research
Popov K, Vlassov V, Rafea M, Holmgren F, Brand P, Haridi S (2003) Parallel agent-based simulation on a cluster of workstations. In: EURO-PAR 2003 parallel processing, vol 2790, pp 470–480
Scheffer M, Baveco JM, DeAngelis DL, Rose KA, van Nes EH (1995) Super-individuals: a simple solution for modelling large populations on an individual basis. Ecol Model 80:161–170
Scheutz M, Schermerhorn P (2006) Adaptive algorithms for dynamic distribution and parallel execution of agent-based models. J Parallel Distrib Comput 66(8):1037–1051
Takahashi T, Mizuta H (2006) Efficient agent-based simulation framework for multi-node supercomputers. In: Perrone LF, Wieland FP, Liu J, Lawson BG, Nicol DM, Fujimoto RM (eds) Proceedings of the 2006 winter simulation conference
Takeuchi I (2005) A massively multi-agent simulation system for disaster mitigation. In: Ishida T, Gasser L, Nakashima H (eds) Massively multi-agent systems I:
Timm IJ, Pawlaszczyk D (2005) Large scale multiagent simulation on the grid. In: Veit D, Schnizler B, Eymann T (eds) Proceedings of the workshop on agent-based grid economics (AGE 2005) at the IEEE international symposium on cluster computing and the grid (CCGRID). Cardiff University, Cardiff
Wang D, Gross L, Carr E, Berry M (2004) Design and implementation of a parallel fish model for South Florida. In: Proceedings of the 37th annual Hawaii international conference on system sciences (HICSS’04)
Wang F, Turner SJ, Wang L (2005a) Agent communication in distributed simulations. In: Davidsson P, Logan B, Takadama K (eds) Multi-agent and multi-agent-based simulation. Joint workshop MABS 2004, New York, 19 July 2004. Springer, Berlin
Wang D, Carr E, Gross LJ, Berry MW (2005b) Toward ecosystem modeling on computing grids. Comput Sci Eng 7:44–52
Wang D, Berry MW, Carr EA, Gross LJ (2006a) A parallel fish landscape model for ecosystem modeling. Simulation 82(7):451–465
Wang D, Berry MW, Gross LJ (2006b) On parallelization of a spatially-explicit structured ecological model for integrated ecosystem simulation. Int J High Perform Comput Appl 20(4):571–581
Wooldridge M (1999) Intelligent agents. In: Weiss G (ed) Multiagent systems: a modern approach to distributed artificial intelligence. MIT Press, Cambridge, pp 27–78
Zhang S, Lui J (2005) A massively multi-agent system for discovering HIV-immune interaction dynamics. In: Ishida T, Gasser L, Nakashima H (eds) Massively multi-agent systems I: first international workshop MMAS 2004, Kyoto, Dec 2004. Springer, Berlin
Zhang Y, Theodoropoulos G, Minson R, Turner SJ, Cai W, Xie Y, Logan B (2005) Grid-aware large scale distributed simulation of agent-based systems. In: 2005 European simulation interoperability workshop (EuroSIW 2005), 05E-SIW-047, Toulouse

Books and Reviews

Agent libraries and toolkits with distributed or parallel features:
Distributed GenSim: Supports distributed parallel execution (Anderson 2000).
Ecolab: http://ecolab.sourceforge.net/. EcoLab models may also use the Graphcode library to implement a distributed network of agents over an MPI-based computer cluster.
Graphcode system. http://parallel.hpc.unsw.edu.au/rks/graphcode/.
MACE3J: http://www.isrl.uiuc.edu/amag/mace/. An experimental agent platform supporting deployment of agent simulations across a variety of system architectures (Gasser and Kakugawa 2002; Gasser et al. 2005).
MASON. http://cs.gmu.edu/~eclab/projects/mason/. MASON was ‘not intended to include parallelization of a single simulation across multiple networked processors’ (Luke et al. 2004). However, it does provide two kinds of simple parallelization (a toolkit-agnostic sketch of the first kind follows this entry):
1. Any given step in the simulation can be broken into parallel sub-steps, each performed simultaneously.
2. A simulation step can run asynchronously in the background, independent of the simulation.
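The first kind of parallelization can be illustrated without relying on any MASON-specific classes. The listing below is a minimal, toolkit-agnostic Java sketch (the names ParallelStepSketch and Agent are illustrative assumptions, not MASON API): one global step is divided into sub-steps over disjoint slices of the agent list, executed on a thread pool, with the scheduler blocking until every sub-step completes.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// Toolkit-agnostic sketch: one simulation step is split into N sub-steps,
// each updating a disjoint slice of the agent population in parallel.
public class ParallelStepSketch {
    interface Agent { void step(); }

    private final List<Agent> agents;
    private final ExecutorService pool;
    private final int slices;

    ParallelStepSketch(List<Agent> agents, int slices) {
        this.agents = agents;
        this.slices = slices;
        this.pool = Executors.newFixedThreadPool(slices);
    }

    // Runs one global step as 'slices' concurrent sub-steps and blocks until all finish.
    void step() throws InterruptedException {
        List<Callable<Void>> subSteps = new ArrayList<>();
        int chunk = (agents.size() + slices - 1) / slices;
        for (int s = 0; s < slices; s++) {
            final int from = s * chunk;
            final int to = Math.min(agents.size(), from + chunk);
            subSteps.add(() -> {
                for (int i = from; i < to; i++) agents.get(i).step();
                return null;
            });
        }
        pool.invokeAll(subSteps);  // barrier: the clock advances only when every sub-step is done
    }

    void shutdown() { pool.shutdown(); }
}

The second kind of parallelization corresponds to submitting a single long-running task to such a pool without waiting on it, so that it proceeds in the background independently of the stepped schedule.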
Repast. http://repast.sourceforge.net and HLA_GRID_Repast (Zhang et al. 2005). The Repast toolkit has in-built capabilities for performing batch simulation runs.
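Batch execution is, at its core, a headless parameter sweep. The plain-Java sketch below shows only that general pattern; RunnableModel and its method names are hypothetical stand-ins rather than Repast classes, since Repast drives its own batch runs through parameter files and its scheduler.

import java.util.List;
import java.util.Map;

// Generic batch-run driver: iterate over parameter sets, run the model
// to completion for each, and record a summary statistic.
public class BatchDriver {
    interface RunnableModel {
        void setup(Map<String, Object> params);
        void run(int steps);
        double summaryStatistic();
    }

    static void sweep(RunnableModel model, List<Map<String, Object>> paramSets, int steps) {
        for (Map<String, Object> params : paramSets) {
            model.setup(params);   // reset state for this parameter combination
            model.run(steps);      // run headless, without GUI scheduling
            System.out.println(params + " -> " + model.summaryStatistic());
        }
    }
}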
http://www.cs.bham.ac.uk/research/projects/poplog/packages/simagent.html. Two developments support distributed versions of SimAgent:
1. The use of HLA to distribute SimAgent (Lees et al. 2002, 2003)
2. The SWAGES package: http://www.nd.edu/~airolab/software/index.html. This allows SimAgent to be distributed over different computers and interfaced with other packages.
Message Passing Interfaces (MPI) (a usage sketch follows these links):
Background and tutorials. http://www-unix.mcs.anl.gov/mpi/
MPICH2. http://www-unix.mcs.anl.gov/mpi/mpich/
MPI forum. http://www.mpi-forum.org/
MPIJava. http://www.hpjava.org
OpenMP. http://www.openmp.org
OpenMPI. http://www.open-mpi.org/
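As background for how these MPI libraries are typically used in large-scale agent-based simulation, the sketch below shows a common owner-computes decomposition with the mpiJava binding listed above: each rank owns and steps a contiguous block of agents. The calls used (MPI.Init, MPI.COMM_WORLD.Rank()/Size(), MPI.Finalize()) follow the mpiJava 1.2 style, but the exact signatures should be treated as assumptions since they differ between Java MPI bindings.

import mpi.MPI;  // mpiJava binding (see http://www.hpjava.org); API assumed, check your version

// Owner-computes sketch: each MPI rank steps only its own block of agents.
// Exchange of boundary agents or messages between ranks is omitted here.
public class MpiPartitionSketch {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();
        int size = MPI.COMM_WORLD.Size();

        int totalAgents = 1_000_000;               // illustrative population size
        int chunk = (totalAgents + size - 1) / size;
        int first = rank * chunk;
        int last = Math.min(totalAgents, first + chunk);

        double[] state = new double[last - first]; // this rank's slice of agent state
        for (int step = 0; step < 100; step++) {
            for (int i = 0; i < state.length; i++) {
                state[i] += 0.1;                   // stand-in for a real agent update rule
            }
            // In a real model, ranks would exchange boundary information here (e.g. Send/Recv).
        }

        System.out.println("rank " + rank + " of " + size
                + " stepped agents " + first + ".." + (last - 1));
        MPI.Finalize();
    }
}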
Parallel computing and distributed agent simulation websites:
Further references and websites. http://www.cs.rit.edu/~ncs/parallel.html
Introduction to parallel programming. http://www.mhpcc.edu/training/workshop/parallel_intro/MAIN.html
http://www.agents.cs.nott.ac.uk/research/simulation/simulators/ (implementations of HLA_GRID_Repast and distributed SimAgent).
Globus Grid computing resources. http://www.globus.org/
Beowulf computer clusters. http://www.beowulf.org/
Index
Farsighted stable set, 9, 28–37
  applications, 30
  characteristic function form games and coalitional sovereignty, 35–37
  coalition formation games, 39–41
  cooperative games, 35
  in cooperative games, 35
  discrete public goods, provision of, 32–33
  duopoly market games, 33–34
  enforceability relation, 30
  house barter games, 43–45
  largest consistent set and largest farsighted conservative stable set, 29–30
  marriage games and roommate games, 42–43
  network formation games, 37–39
  n-person prisoner's dilemma, 32
  prisoner's dilemma, 31–32
Feasible assignment, 371
Feasible network, 614
Feasible set network, 611, 617
Fiber tracts, 851, 857
Fictitious play, 487
Field, 866
Finite dynamical system, 683, 689
  agent-based simulation as, 692–693
  category of sequential dynamical systems, 699–700
  definition, 694
  mathematical results on, 696–698
  sequential update systems, 698
  as theoretical and computational tools, 693–694
Finite field, 691
Finite state machine (FSM), 726, 900
First-generation models of computation, 706
First-price auction, 335, 336, 338–341, 343
First-price auction equilibrium, 340
First-price sealed-bid auction, 338–340
Flow payoff, 185
Folk theorems, 243–244
F-optimal stable matching, 372
Formal model, 318–320
Formation games, 626
Friedman-Moulin rule, 451
Frustration, 781
Functional language, 866
Function pointers, 866

G
Gale-Shapley algorithm, 376–377
Game genres, 898–899
Game horizon, 61
Game of life, 662, 726, 791
Games with incomplete information, see Bayesian games
Game theory, 3–6, 49, 421, 639
  definition, 639
  dynamic games, 640
General voting games, 14
Generative social science, 726
Generics, 866
Genetic algorithms (GA) model, 726, 735
Genetic programming (GP), 726, 735
Genotype, 726
Gibbard-Satterthwaite theorem, 515–516, 529, 535–536
Glioma cell migration
  cellular mechanisms of, 857
  fiber tracts on, 857
Global climate change
  issues, 73–75
  models, 75–76
  results, 76
Global pareto optimum (GPO), 76–77
Glycolytic phenotype, 859
Goto statement, 866
Graph, lattice, tree, 767
Grid, 913
Grid computing, 923
Groovy language, 885

H
Hamilton Jacobi Isaacs equation, 309, 310
Handicap principle, 251
Haptotaxis, 851, 857
Harsanyi game revisited, 132–134
Harsanyi's model, 121–122
  and hierarchies of beliefs, 122–124
Headless, 866
Heavy trading volume, 841–842
Hedonic games, 40–41
Heterogeneous directed network, 616
Heterogeneous markets, 654–656
Heterogeneous networks, 609
Heuristic apparition of bimartingale, 174
Hidden stochastic games, 180
Higher-order programming, 866
Homogeneous directed network, 616
Homogeneous markets, 652–654
Homogeneous networks, 609
Hopfield networks, 688–689
Hotelling model, 536
House Barter games, 25–28, 43–45
House exchange, 407
Hybrid two-sided matching model, 372
Hypercycles, 726, 733
Hyperplastic phenotype, 859

I
Illegal production, 272–273
ImmSim, 687
Imperative language, 866
Imperfect monitoring, 185, 241–243
Implementable social choice rule, 349
Implementation theory
  credibility, 362
  definition, 349
  dynamic implementation, 364
  environment, 351
  equilibrium, 352–353
  game-theoretical concerns, 361
  history, 350–351
  mechanisms, 352
  monotonicity, 357
  multiple, 363
  renegotiation, 362
  revelation principle, 354
  reverted preferences, 362
  social objectives, 352
  sociological factors/bounded rationality, 363–364
Impossibility theorems, 511, 515
Impulsive differential games, 311
Impulsive games, 311
Imputation, 9
Incomplete information game, 185, 251
Indirect domination, 9
Indirect interaction models, 676–677
Individual-based/interaction-based simulation, 684
Individual-based modeling (IBM), 661, 726
Inference approach, 344
Information, 305
Information economics, 287
Information flow, 705
Information processes, 732
Inheritance, 866
Inspection games, 4, 269
Inspector leadership, 269
Intelligence, 791
Intelligent agent, 891
Interaction, 667, 705
  computation as, 706
Interaction-based computing in physics, 768
Interdependent valuations, 330
Interdependent values, 337, 342
Internal stability, 10
Inter-node communication, 919
Intuitive criterion, 251
Isaacs condition, 308, 312
Isaacs equations, 308
Isaacs verification theorem, 309
Ising model, 779
Ising-type model, 913

J
Jackson-van den Nouweland enforceability, 38
Jackson-van den Nouweland rules, 620
Jackson-Wolinsky enforceability, 38
Jackson-Wolinsky network formation games, 633
Jackson-Wolinsky rules, 611, 619
Java, 866, 885
Jointly controlled lotteries, 174

K
Kalai-Lehrer learning, 491
Kidney exchange, 413
Kim and Markowitz portfolio insurers model, 827–828
Ky Fan-Sion minimax theorems, 218–219

L
Langton's ant, 726
Langton's loop, 726, 731
Large scale agent based models, 917
Large scale simulations
  benefits, 921
  definition, 914
  pitfalls, 921–922
Largest consistent set (LCS), 30
Lattice Boltzmann equation, 784
Lattice-gas cellular automata (LGCA), 784, 855, 857
Lattice property, 372
Learning
  definition, 485
  sophisticated, 492–493
  stochastic, 495–496
Learning classifier system (LCS), 726, 735
Lenz-Ising model, 780
Liberalism, 511, 514, 515, 520
Lindenmeyer system (L-system), 726, 732
Linear bounded automaton (LBA), 695
Linear logic, 705
Linear systems, 772
LLS model, 830–835
Load balancing, 919
Logic and geometry of interaction, 708
Logic programming language, 866
Long-lived player, 185
Lux and Marchesi model, 829–830
Lux model, 829–830
Lyapunov exponent, 783

M
Machine learning, 726
Macro language, 866
Macrovariables, 778
Majority relation, strict, 543
Majority voting, 529
Manipulable mechanism, 372
Many particle systems, 770
Marginal cost pricing, 441
Market anomalies, 825
Market clearance, 836–837
Market design
Market games, 463
  equivalence, 469–472
Markov approximation, 774–775
Markov chain games, 178
Markov chains, 774
Markov-perfect equilibrium, 77–78
Marriage games, 20–25, 42–43
Massively multi-agent systems (MMAS), 664, 914
Matching μ, 372
Matching mechanism, 372
Matching theory
  applications, 411–417
  one-sided matching, 406–411
  two-sided matching, 401–406
N
Nash equilibrium, 107, 139, 160, 185, 251, 269, 433
Nash reversion, 185
Natural asynchrony, 791
Neologism-proof equilibrium, 251
NetLogo, 867, 881
Network formation games, 37–38
Networks, 778
  abstract games, 624–626
  Bala-Goyal rules, 611
  definition, 610
  directed, 616
  dominance relations, 612
  farsightedly consistent, 632
  feasible, 614
  feasible set, 611, 617

O
Object, 867
Objective-C, 867
Object-oriented languages, 867, 872
Observer, 867
ODD protocol, 867, 874
Off-the-equilibrium-path play, 656
Ontogenesis, 747
Optimization algorithms, 791
Ordinary differential equations (ODEs), 685
Out-of-equilibrium phenomena, 785

P
Pairwise majority voting rule, 532
Parallel computer architecture, 914
Parallel computing, 918–919