You are deciding whether or not to contribute your resources to a shared computer cluster. The amount you contribute, along with the amount that others contribute, will affect the resources you can consume. What should you do?
Game theory provides a formal framework with which to reason about situations of strategic interdependence. In introducing game theory, we start with preferences and utility theory, and then define the normal form representation of a simultaneous-move game. We introduce important solution concepts, including Nash equilibrium and dominant-strategy equilibrium, and also consider the class of potential games and congestion games.
Many of the examples that we adopt in introducing game theory are quite simple, but
the techniques can be applied much more generally.
For example, game theory is a useful tool for the design and analysis of:
• Reputation systems (will buyers provide negative feedback, or worry about retaliatory negative feedback from the seller?),
• Internet security (will firms adopt new standards or reason that there will be no benefit unless others follow?), and
• Meeting scheduling systems (will users game the outcomes by submitting false information about preferences and constraints?)
For now, think about game theory as providing a mathematical way of reasoning about settings where each participant is self-interested and takes actions to obtain an outcome that is the best possible for himself, given how others are acting.
2.1 Introduction
To fix ideas, let’s consider the often-told story of the Prisoner’s Dilemma. This is a typical, simple example that nevertheless reveals some of the interesting aspects of reasoning about situations of strategic interdependence.
Example 2.1 (Prisoner’s Dilemma). Two people are arrested and accused of a crime. Each can cooperate C and not admit to the crime, or defect D and admit. If both cooperate, then they receive a minor charge and stay in prison for 2 years. If one person defects while the other cooperates, the defector is released (0 years in prison) while the other serves a 5-year sentence. If both defect, then they both go to prison but with early parole and serve a 4-year sentence. Figure 2.1 shows the payoff table for this game. Player one is the row player and player two the column player. In each entry, the first number represents the “payoff” to row and the second number is the “payoff” to column.
2 Game Theory I: Simultaneous-Move Games
                 Player 2
                  C        D
Player 1   C   -2, -2   -5, 0
           D    0, -5   -4, -4

Figure 2.1: The Prisoner’s Dilemma Game. Each year in prison results in a payoff of -1.
In this particular game, the payoff corresponds to the number of years spent in prison: each year results in a payoff of -1. More generally, we can think of payoff tables as encoding the preference of participants for different outcomes, with higher payoffs indicating more preferred outcomes.
What should a player do in this game? A moment’s reflection suggests an obvious answer: play D! If column plays C then row’s best response is D (0 years in prison instead of 2). If column plays D then row’s best response is also D (4 years in prison instead of 5). In particular, D is a dominant action: it is the best action whatever the action of the other player. The dilemma is that when both players defect, the outcome is worse for both players (4 years in prison each) than if they both cooperated (2 years each).
The Prisoner’s Dilemma illustrates how game theory provides a way to model a situation of strategic interdependence. We refer to the participants in a game as agents or players. Crucially, each agent is free to make its own decision about how to act. In the real world, an agent could be a person deciding whether to leave feedback on eBay or Amazon, a firm deciding whether to install a new security protocol, or an automated bidding system such as those that caused cyclic behavior in early sponsored search auctions.
It will often make sense to model agents as selfish, for example minimizing payments in
an auction or participating on a social network in order to become an influencer, and turn
this influence into personal profit. However, selfish preferences are not essential to game
theory and the payo↵s to an individual player can also represent social (or other-regarding)
preferences. For example, game theory can model the behavior of a user providing feedback
about a hotel on a reputation platform, where the user’s motivation might be to help other
users or help the owners of the hotel.
Example 2.2. Suppose a student is trying to decide whether he prefers (o1) a larger apartment with plenty of light in the suburbs, (o2) a small, modern studio, centrally located, or (o3) a shared, older house, close to campus, with people that he knows reasonably well. These outcomes differ along many different dimensions.
We insist that agents have a complete preference order. For every outcome pair, o1 , o2 , we
18 Copyright © 2013 D.C. Parkes & S. Seuken. Do not distribute without permission.
2.3 Simultaneous-Move Games
must have at least o1 ⪰ o2, or o2 ⪰ o1, and both if the agent is indifferent. This precludes a partial order; e.g., with o1 ⪰ o3 and o2 ⪰ o3 but no preference defined between o1 and o2. We further require transitivity, which ensures that if o1 ⪰ o2 and o2 ⪰ o3 then o1 ⪰ o3. This precludes cyclic preferences, such as o1 ⪰ o2, o2 ⪰ o3 and o3 ≻ o1.
But a preference order need not provide, by itself, enough information to explain how a rational agent should behave. Returning to the example, suppose that o3 ≻ o2 ≻ o1, and that there are two available actions:
• action 1, which leads to the shared house (o3 ) with probability 0.7 and otherwise the
large apartment (o1 ).
• action 2, which leads to the studio (o2 ) with probability 1.
Given the uncertainty of the outcome of action 1, the best decision is not clear just from
the preference order, because it depends on the intensity of preference for o3 over o2 , and
for o2 over o1 .
In addressing this, utility theory associates a utility u(o) with each outcome o ∈ O, and insists that the utility is consistent with an agent’s preference order. Consistency requires u(o1) ≥ u(o2) if and only if o1 ⪰ o2, for two outcomes o1, o2, and u(o1) = u(o2) if and only if o1 ∼ o2. A utility function u : O → ℝ (where ℝ denotes the set of real numbers) assigns a utility to every outcome. Given this, the decision of a rational agent is the one that maximizes the expected utility. If outcome oj ∈ O occurs with probability pj ≥ 0, and there are k possible outcomes, then the expected utility is ∑_{j=1}^{k} pj u(oj).
In the example, suppose that the student’s utility is u(o1) = 700, u(o2) = 800, and u(o3) = 1000, which is consistent with preference order o3 ≻ o2 ≻ o1. The utility for an outcome represents some combination of the cost of rent and the intrinsic happiness from the living situation. Based on this, action 1 has expected utility 0.3 u(o1) + 0.7 u(o3) = 910, compared to expected utility u(o2) = 800 for action 2, and the rational decision is action 1.
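The calculation in the example is easy to check mechanically. The following is a minimal sketch (the helper name is our own, not from the text):

```python
# Illustrative check of Example 2.2.

def expected_utility(probs, utils):
    """Expected utility: the sum of p_j * u(o_j) over the possible outcomes."""
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(p * u for p, u in zip(probs, utils))

u = {"o1": 700, "o2": 800, "o3": 1000}

# Action 1: shared house (o3) with probability 0.7, else large apartment (o1).
action1 = expected_utility([0.3, 0.7], [u["o1"], u["o3"]])
# Action 2: the studio (o2) with probability 1.
action2 = expected_utility([1.0], [u["o2"]])

print(action1, action2)  # 910.0 800.0: action 1 is the rational choice
```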
There is no unique utility function to ‘explain’ an agent’s preferences, but rather a family of possible functions. If u(o) is a utility function, then u′(o) = a · u(o) + b, for constants a ∈ ℝ>0 and b ∈ ℝ, is another utility function that provides the same preference order on outcomes, and the same preference order on distributions on outcomes; notation ℝ>0 is the set of strictly positive real numbers. We say that utility function u′ is a positive affine transformation of the original utility function.
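This invariance can be checked numerically. The snippet below is our own illustration, with arbitrary constants a and b, using the utilities and actions of Example 2.2:

```python
# Illustration: a positive affine transformation u'(o) = a*u(o) + b preserves
# the ranking of actions by expected utility (a, b chosen arbitrarily).

def expected(probs, utils):
    return sum(p * u for p, u in zip(probs, utils))

u = [700, 800, 1000]                    # utilities for o1, o2, o3
a, b = 2.5, -300                        # any a > 0 and any b
u_prime = [a * x + b for x in u]

action1 = [0.3, 0.0, 0.7]               # lottery over (o1, o2, o3) for action 1
action2 = [0.0, 1.0, 0.0]               # action 2 yields o2 for sure

# The comparison comes out the same under u and under u':
assert (expected(action1, u) > expected(action2, u)) == \
       (expected(action1, u_prime) > expected(action2, u_prime))
print("ranking preserved")
```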
In addition, it is worth emphasizing that it is not necessary that agents know their utility
functions for utility theory to be a predictive theory. Rather, it is sufficient that agents act
as if they are maximizing some utility function.
action game.
The action sets in the Prisoner’s Dilemma are A1 = A2 = {C, D}, there are two agents N = {1, 2}, and the utility functions u1(a1, a2) and u2(a1, a2) are as defined in Figure 2.1. For example, u1(C, C) = -2, u1(C, D) = -5, u1(D, C) = 0, u1(D, D) = -4. The joint action set A = A1 × A2 = {C, D} × {C, D} and examples of action profiles are a = (C, C) and a = (D, C).
A finite game has a finite number of players, each with a finite set of actions. However,
in general, the action sets can be infinite (e.g., the action might represent the fraction of a
resource demanded by a player) and we might even want to model games with an infinite
number of players, capturing limiting behavior as the number of players becomes very large.
By assumption, every agent knows the available actions and all utility functions, knows that every agent knows this, knows that every agent knows that every agent knows this, and so on, ad nauseam. Formally, the actions and utilities are said to be common knowledge.
On the other hand, an action is selected by each agent without knowledge about the
actions selected by other agents. This is what makes this a simultaneous-move game. It
is not important that the actions are taken exactly at the same time, only that one agent
does not know the action selected by another agent when selecting its own action.
Figure 2.2: The normal form representation for a game with 3 agents and 2 actions per
agent.
The number of actions available to agents can also grow exponentially in the size of a natural description of the game. This occurs, for example, in a network flow game such as that used to illustrate Braess’ Paradox in Chapter 1. In that setting, there is an action for every possible path from start to end.
Certainly we can imagine more succinct representations of games, and we will see examples
in the model of congestion games in Section 2.7 and also in graphical games in Chapter 5.
In particular, an action profile is Pareto optimal if no agent can be made better off without making some other agent worse off. For example, outcomes (C, C), (D, C) and (C, D) are all Pareto optimal in the Prisoner’s Dilemma. Pareto optimality cannot be used to make predictions regarding how agents will behave. Rather, Pareto optimality provides a minimal criterion for whether or not an outcome is good from the perspective of social welfare.
Pareto optimality also extends to distributions on action profiles. A distribution on
action profiles is Pareto optimal if there is no other distribution that provides one agent
with strictly greater expected utility without giving another agent strictly less.
Example 2.3. A distribution on action profiles in the Prisoner’s Dilemma where agents
play (C, C) with probability 0.5 and (D, C) with probability 0.5 is Pareto optimal. To prove
this, we must show that any other distribution provides strictly less expected utility to at
least one player. The current expected utility is -1 to player 1 and -3.5 to player 2. First, any increase in probability on (C, C) is worse for player 1, and any increase in probability on (D, C) is worse for player 2. Second, any shift in probability from (C, C) or (D, C) to one or both of (C, D) or (D, D) is worse for player 1, since player 1’s payoffs are -2 and 0 for (C, C) and (D, C) but -5 and -4 for (C, D) and (D, D).
We will soon return to randomized play in games, but for now we will continue assuming
that each agent just picks a particular action.
We adopt the convention that equilibrium action profiles are denoted with a superscript ∗. In words, an action profile a∗ is a dominant-strategy equilibrium if every agent i maximizes its utility with its action a∗i whatever the other agents do. For example, the action profile all-defect is a dominant-strategy equilibrium in the Prisoner’s Dilemma because each player prefers action D over C whatever the action of the other player.
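Dominance is easy to check computationally. As an illustration (the helper and the payoff encoding are our own), the following confirms that D is a dominant action for the row player:

```python
# Illustrative dominance check, using the row player's payoffs from
# the Prisoner's Dilemma of Figure 2.1.

U1 = {("C", "C"): -2, ("C", "D"): -5, ("D", "C"): 0, ("D", "D"): -4}

def is_dominant(action, actions, u_row):
    """True if `action` is a best response to every opponent action."""
    return all(u_row[(action, a2)] >= max(u_row[(a1, a2)] for a1 in actions)
               for a2 in actions)

print(is_dominant("D", ["C", "D"], U1))  # True
print(is_dominant("C", ["C", "D"], U1))  # False
```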
Example 2.4. Figure 2.3 depicts a two-player, three-action game. Row plays {U, M, D} (up, middle, down) and column {L, M, R} (left, middle, right). A moment’s inspection reveals there is no dominant-strategy equilibrium. But there is another way to understand how rational players should act. Action M is dominated by action R for the column player: whatever row does, R is a better response for column than M. Based on this, action M can be eliminated from consideration, and now row can reason that action U dominates actions M and D as long as column only selects L or R. Finally, if row will play U then column’s best action is L and we identify (U, L) as the predicted outcome. But outcome (U, L) is not
2.5 Nash Equilibrium
                 Player 2
                  L       M       R
Player 1   U    4, 3    5, 1    6, 2
           M    2, 1    8, 4    3, 6
           D    3, 0    9, 6    2, 8

Figure 2.3: A game without a dominant-strategy equilibrium but solvable by iterated elimination of strictly-dominated actions.
a dominant-strategy equilibrium. Certainly, if column plays M then row would not want to
play U .
The procedure of iterated elimination of strictly-dominated actions is illustrated in Algorithm 2.1. This procedure is the first algorithmic tool we introduce for the strategic analysis of games. There are other variations, including iterated elimination of weakly-dominated actions. The procedure, its properties, and properties of variations are developed in Exercise 2.1.
Unfortunately, iterated elimination will not always terminate with a single action profile.
Indeed, the same exercise asks for an example where it does not eliminate even a single
action. For this reason, this iterated elimination procedure does not provide a way to
predict the behavior in general simultaneous-move games.
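We do not reproduce Algorithm 2.1 here, but the following Python sketch (our own rendering, checking only domination by pure actions) captures the idea and recovers (U, L) on the game of Figure 2.3:

```python
def iterated_elimination(A1, A2, u1, u2):
    """Repeatedly remove actions strictly dominated by another pure action;
    return the surviving action sets for the row and column players."""
    A1, A2 = list(A1), list(A2)
    changed = True
    while changed:
        changed = False
        for a in list(A1):  # row actions dominated by some other row action?
            if any(all(u1[(b, a2)] > u1[(a, a2)] for a2 in A2)
                   for b in A1 if b != a):
                A1.remove(a); changed = True
        for a in list(A2):  # column actions dominated likewise?
            if any(all(u2[(a1, b)] > u2[(a1, a)] for a1 in A1)
                   for b in A2 if b != a):
                A2.remove(a); changed = True
    return A1, A2

# Payoffs of Figure 2.3 (row player 1, column player 2):
u1 = {("U","L"): 4, ("U","M"): 5, ("U","R"): 6,
      ("M","L"): 2, ("M","M"): 8, ("M","R"): 3,
      ("D","L"): 3, ("D","M"): 9, ("D","R"): 2}
u2 = {("U","L"): 3, ("U","M"): 1, ("U","R"): 2,
      ("M","L"): 1, ("M","M"): 4, ("M","R"): 6,
      ("D","L"): 0, ("D","M"): 6, ("D","R"): 8}
print(iterated_elimination(["U", "M", "D"], ["L", "M", "R"], u1, u2))
# (['U'], ['L'])
```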
For this, we need the concept of a Nash equilibrium. For now we focus on pure strategies,
in which agents act without randomizing over actions.
Definition 2.5 (Pure strategy Nash equilibrium). Action profile a∗ = (a∗1, . . . , a∗n) is a pure-strategy Nash equilibrium (PSNE) of simultaneous-move game (N, A, u) if, for all agents i and all actions ai ∈ Ai, ui(a∗i, a∗−i) ≥ ui(ai, a∗−i), where a∗−i denotes the actions of all agents other than i in profile a∗.
In words, an action profile is a Nash equilibrium if every agent maximizes its utility given
that the other agents behave according to the action profile. Every agent is best-responding
to the behavior of every other agent, and no agent has a useful deviation. The crucial
distinction from a dominant-strategy equilibrium is that each agent’s action is only sure to
be a best response when the other agents play the equilibrium.
For example, action profile (U, L) is a Nash equilibrium of the game in Figure 2.3. Each
player is best-responding to the other player. In contrast to a dominant-strategy equilibrium,
for a Nash equilibrium to be a sensible prediction of behavior in a game, agents must believe
that other agents are rational. For example, if column was to play M then row certainly
wouldn’t want to play U . In fact, each player needs to believe that the other players believe
that all players are rational, and so on. Why else should a rational column player play L?
For Nash equilibrium to make sense there must be common knowledge of rationality.
Another difficulty with the concept of Nash equilibrium is that there can be multiple Nash equilibria. We see this in the game of Chicken in Section 2.6.3. In games with multiple Nash equilibria it is often unclear how agents should reason about which equilibrium will be played.
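For small games, pure-strategy Nash equilibria can be found by brute force, checking every action profile for a beneficial unilateral deviation. The sketch below is our own helper, not from the text; it recovers all-defect as the unique equilibrium of the Prisoner’s Dilemma:

```python
import itertools

def pure_nash_equilibria(actions, utils):
    """All profiles where no single agent can gain by deviating.
    actions: list of per-agent action lists; utils[i]: dict profile -> utility."""
    n = len(actions)
    equilibria = []
    for profile in itertools.product(*actions):
        def gains(i):
            return any(utils[i][profile[:i] + (ai,) + profile[i+1:]]
                       > utils[i][profile] for ai in actions[i])
        if not any(gains(i) for i in range(n)):
            equilibria.append(profile)
    return equilibria

# Prisoner's Dilemma payoffs (Figure 2.1):
U1 = {("C","C"): -2, ("C","D"): -5, ("D","C"): 0,  ("D","D"): -4}
U2 = {("C","C"): -2, ("C","D"): 0,  ("D","C"): -5, ("D","D"): -4}
print(pure_nash_equilibria([["C", "D"], ["C", "D"]], [U1, U2]))  # [('D', 'D')]
```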
                 Player 2
                  H        T
Player 1   H    1, -1   -1, 1
           T   -1, 1     1, -1

Figure 2.4: The Matching Pennies game.
The Matching Pennies game illustrates that there may not exist a pure-strategy Nash equilibrium. But what if we allow agents to randomize over actions and adopt a mixed strategy?
Definition 2.6 (Mixed strategy). A mixed strategy si : Ai → [0, 1] for agent i assigns a probability si(ai) ≥ 0 to each action ai ∈ Ai, with the sum ∑_{ai ∈ Ai} si(ai) = 1, so that si is a well-defined probability distribution on actions.
In words, a mixed strategy si assigns a probability si (ai ) to each action ai . For example,
in Matching Pennies, a mixed strategy for agent 1 is s1 (H) = 0.4, s1 (T ) = 0.6, such that
the agent plays H with probability 0.4 and T with probability 0.6. We can represent this
2.6 Mixed-Strategy Nash Equilibrium
strategy through a vector of probabilities, writing s1 = (0.4, 0.6). Mixed strategies include
pure strategies as a special case; e.g., s2 = (1, 0) is a strategy for agent 2 that plays action
H with probability 1.
Given mixed strategies s1 = (0.4, 0.6) for agent 1 and s2 = (1, 0) for agent 2, the expected utility to agent 1 is:

u1(s1, s2) = p(H, H) u1(H, H) + p(H, T) u1(H, T) + p(T, H) u1(T, H) + p(T, T) u1(T, T)
           = (0.4)(1)(1) + (0.4)(0)(-1) + (0.6)(1)(-1) + (0.6)(0)(1) = -0.2,

where p(a1, a2) is the probability that action profile (a1, a2) is played for strategies (s1, s2). The probability of an action profile such as (H, H), where both players 1 and 2 play H, is given by the product p(H, H) = s1(H) s2(H) = (0.4)(1) = 0.4.
Given strategy profile s = (s1, . . . , sn), let

ui(s) = ∑_{(a1,...,an) ∈ A} p(a1, . . . , an) ui(a1, . . . , an),   (2.3)

denote the expected utility to agent i, with probability p(a1, . . . , an) = s1(a1) · s2(a2) · · · sn(an) for each action profile (a1, . . . , an). A Nash equilibrium can now be defined for mixed strategies:

Definition 2.7 (Mixed-strategy Nash equilibrium). Strategy profile s∗ = (s∗1, . . . , s∗n) is a mixed-strategy Nash equilibrium if, for every agent i and every mixed strategy si, ui(s∗i, s∗−i) ≥ ui(si, s∗−i).
In words, every agent i maximizes its expected utility by adopting strategy s∗i, given that the other agents play their mixed strategies s∗−i.
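Equation (2.3) translates directly into code. The sketch below is our own, with a dictionary encoding of strategies; it reproduces the expected utility of -0.2 for Matching Pennies with s1 = (0.4, 0.6) and s2 = (1, 0):

```python
import itertools

def mixed_expected_utility(strategies, u_i):
    """Equation (2.3): sum over action profiles of the product of the
    individual strategy probabilities, times agent i's utility."""
    action_sets = [list(s) for s in strategies]
    total = 0.0
    for profile in itertools.product(*action_sets):
        p = 1.0
        for s, a in zip(strategies, profile):
            p *= s[a]               # p(a1,...,an) = s1(a1) * ... * sn(an)
        total += p * u_i[profile]
    return total

# Matching Pennies, player 1:
u1 = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}
s1 = {"H": 0.4, "T": 0.6}
s2 = {"H": 1.0, "T": 0.0}
print(round(mixed_expected_utility([s1, s2], u1), 10))  # -0.2
```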
The following theorem, due to John Nash in 1951, provides the main theoretical grounding for game theory: every game with a finite number of players and a finite number of actions has at least one Nash equilibrium, possibly in mixed strategies.
The proof is beyond the scope of this book, but references are provided in the chapter
notes. Given this seminal result, we can model agents as best-responding to the play of
other agents and know that this is always possible, in the sense that such a strategy profile
always exists.
Note: An agent’s preferences, even on distributions of outcomes, are invariant to positive affine transformations of utility (see Section 2.2). Because of this, the Nash equilibria of a game are unchanged under these transformations. Multiplying any player’s payoffs by a positive number, and adjusting them up or down by a constant, leaves the equilibria of the game unchanged.
Figure 2.5: Best-response correspondences: (a) Prisoner’s Dilemma. (b) Matching Pennies.
For this, let p denote the probability with which row (player 1) plays H, and q denote the probability with which column (player 2) plays H. For player 1, the best response p ∈ f(q) is a utility-maximizing probability for action H, given that 2 plays H with probability q. For player 2, the best response q ∈ g(p) is a utility-maximizing probability for action H, given that 1 plays H with probability p. For a (mixed) Nash equilibrium, we require probabilities (p∗, q∗) such that,

p∗ ∈ f(q∗),  q∗ ∈ g(p∗),   (2.5)
Example 2.6. Consider Figure 2.5 (a), for the Prisoner’s Dilemma. This plots player 2’s best response g(p) on the y-axis and player 1’s best response f(q) on the x-axis. The lines intersect at (p∗, q∗) = (0, 0), corresponding to (D, D). Indeed, this is the unique Nash equilibrium of the Prisoner’s Dilemma. Since each player has a dominant strategy, the best-response correspondences take on the same value for all strategies of the other player.
Consider Figure 2.5 (b), for the game of Matching Pennies. In this case, we see one intersection at (p∗, q∗) = (0.5, 0.5), corresponding to each player mixing 50:50 over H and
T. This is the unique Nash equilibrium. When player 1 plays p = 0.5 then player 2 is indifferent between H and T and has as a best response q = 0.5. Similarly for player 2. In particular, each player’s mixed strategy makes the other player indifferent over both pure actions, and thus willing to play each action with some probability.
A well known real-world example to motivate mixed strategies comes from penalty kicks
in soccer, where the goal-keeper dives left or right and the kicker simultaneously kicks left or
right. The goal-keeper is like the row player in Matching Pennies and wants to match, the
kicker like the column player. Any fixed action could be anticipated and exploited by the
other player. By randomizing, neither player can exploit knowledge of the strategy adopted
by the other player.
Definition 2.8 (Support). The support of mixed strategy si, supp(si) = {ai ∈ Ai : si(ai) > 0}, is the set of all actions played with strictly positive probability.
Given this, a strategy profile s∗ is a mixed-strategy Nash equilibrium if, and only if, for all agents i, every action in the support of s∗i is a best response to s∗−i; that is, each action played with positive probability yields the same, maximal expected utility against s∗−i.
Example 2.7. Looking for a mixed-strategy Nash equilibrium in Matching Pennies in which both players mix over both actions, we need a probability p for player 1 such that player 2 is indifferent across H and T. This is p = 0.5. Similarly, we need to find a probability q for player 2 such that player 1 is indifferent across H and T. This is q = 0.5. We can conclude that (p∗, q∗) = (0.5, 0.5) is a mixed-strategy Nash equilibrium. Each player is indifferent across its two actions given the strategy of the other player, and thus both players are best-responding.
We will make extensive use of this concept of the support of a strategy and this definition
of a mixed-strategy Nash equilibrium in Chapter 5 when discussing algorithmic approaches
to finding the equilibrium of games.
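For 2×2 games, the indifference condition can be solved in closed form. The helper below is our own sketch; it assumes a fully mixed equilibrium exists (nonzero denominator):

```python
# For a 2x2 game with actions (H, T), solve the opponent's indifference
# condition p*u(H,H) + (1-p)*u(T,H) = p*u(H,T) + (1-p)*u(T,T) for p,
# the probability with which the mixing player plays H.

def indifference_prob(u_opp):
    a, b = u_opp[("H", "H")], u_opp[("T", "H")]
    c, d = u_opp[("H", "T")], u_opp[("T", "T")]
    return (d - b) / (a - b - c + d)

# Player 2's payoffs in Matching Pennies: the mixing probability for
# player 1 that leaves player 2 indifferent is 0.5.
u2 = {("H", "H"): -1, ("H", "T"): 1, ("T", "H"): 1, ("T", "T"): -1}
print(indifference_prob(u2))  # 0.5
```

By the symmetry of the game, the same calculation for player 1’s payoffs gives q = 0.5.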
                 Player 2
                  Y       S
Player 1   Y    0, 0    0, 2
           S    2, 0    4, 4
The existence of multiple equilibria can make it difficult to predict how players will act
in a game. Certainly, when every player has an action that strictly dominates every other
action (as in the Prisoner’s Dilemma), then there is a unique Nash equilibrium. Similarly,
when iterated elimination of strictly-dominated actions yields a single action profile (as in
the example in Figure 2.3), then there is a unique Nash equilibrium. But many games have
multiple equilibria.
Approaches to reconcile this difficulty include identifying an equilibrium that seems more
likely because it Pareto dominates other equilibria, or is more stable to small, random
mistakes by other agents. But these details are beyond the scope of this book.
2.7 Congestion Games
In congestion games, as we will see, a pure-strategy Nash equilibrium always exists.
A congestion game is a simultaneous-move game in which there are resources, each agent selects one or more resources, and the utility of an agent is the negated total congestion cost associated with the resources that it selects. Given resources E, the power set 2^E is the set of all possible subsets; e.g., if E = {1, 2} then 2^E = {∅, {1}, {2}, {1, 2}}.
Definition 2.10 (Congestion game). A congestion game (N, A, E, c) has:
• N = {1, . . . , n} agents, indexed by i
• E = {1, . . . , m} resources, indexed by j
• joint action set A = A1 × . . . × An, where Ai is a set of actions available to agent i and Ai ⊆ 2^E, where 2^E is the power set on the set of resources. Action ai ∈ Ai selects a subset of resources.
• cost function ce(x) ∈ ℝ for resource e, which depends on the number of agents x that select the resource
Let xe be the total number of agents that select resource e given action profile a. The cost to agent i, given action profile a ∈ A, is

ci(a) = ∑_{e ∈ ai} ce(xe),   (2.7)

where the summation is taken over all resources in the subset ai ⊆ E selected by agent i. The utility to agent i for action profile a is just the negated cost: ui(a) = -ci(a).
In words, each player selects some subset of resources, this induces congestion on each
resource, and the total cost experienced by a player is the sum over the cost on the resources
he selects. It is often natural for the cost function ce (xe ) to be non-decreasing and positive,
but neither restriction is necessary.
In regard to succinctness, recall that the normal-form representation is exponential in the
number of agents. In comparison, congestion games have a succinct representation because
the cost (or negated utility) depends only on the number of players who select each resource
and not the particular subset of players.
To see the modeling power of congestion games, let’s consider two illustrative examples.
The first is the network flow problem that illustrated Braess’ Paradox in Chapter 1.
Example 2.9 (Network flow). See Figure 2.7. There are n = 2000 agents, and resources E = {12, 13, 23, 24, 34} corresponding to the edges in the network. Each edge has a cost function, with c12(x) = c34(x) = x/100, c13(x) = c24(x) = 25, and c23(x) = 0. Each agent’s available actions are {{12, 24}, {12, 23, 34}, {13, 34}}, and correspond to the three possible paths. The cost function of agent i is ci(ai, a−i) = ∑_{e ∈ ai} ce(xe), where e ∈ ai enumerates the edges on its selected path and xe is the number of agents that select edge e given action profile a = (ai, a−i). Each agent’s utility is its negated cost. From Chapter 1, we know the unique pure-strategy Nash equilibrium (in fact a dominant-strategy equilibrium) is for every agent to select action {12, 23, 34}, and take the path that includes zero-cost edge 2-3. This has cost ci(a) = c12(x12) + c23(x23) + c34(x34) = 2000/100 + 0 + 2000/100 = 40. In comparison, the social optimal flow has 1000 agents taking path 1-2-4 and 1000 agents taking path 1-3-4, with cost 35 to each agent.
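These costs are simple to compute programmatically. A sketch (our own encoding of the edge cost functions from the example):

```python
from collections import Counter

costs = {
    "12": lambda x: x / 100, "34": lambda x: x / 100,
    "13": lambda x: 25,      "24": lambda x: 25,
    "23": lambda x: 0,
}
paths = [("12", "24"), ("12", "23", "34"), ("13", "34")]

def agent_cost(path, profile):
    """Cost of `path` when `profile` lists every agent's chosen path."""
    load = Counter(e for p in profile for e in p)   # x_e for each edge
    return sum(costs[e](load[e]) for e in path)

n = 2000
all_middle = [paths[1]] * n                       # everyone takes 1-2-3-4
print(agent_cost(paths[1], all_middle))           # 40.0

split = [paths[0]] * (n // 2) + [paths[2]] * (n // 2)   # social optimum
print(agent_cost(paths[0], split))                # 35.0
```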
Figure 2.7: Network flow problem: there are 2000 units to flow from location 1 to 4 over
edges, each of which has an associated cost function.
Figure 2.8: Network connection problem: n agents need to connect location 1 with 2 and
can select edge T or B.
Compared with the succinct congestion game formulation, a normal form representation for this network flow game would require enumerating the payoffs for each player for each of the 3^2000 possible action profiles.
Example 2.10 (Network connection game). See Figure 2.8. Consider a connection game, where each of n agents must choose to connect locations 1 and 2 by edge T or edge B. The agents that select T share cost n and the agents that select B share cost 1 + ε for some 0 < ε < 1. For example, the setting could be multiple firms each choosing a mode of transport that their employees will share to get across a city. The social optimal outcome is that everyone uses connection B with total cost 1 + ε.
Modeling this as a congestion game, the resources are {T, B} and the cost functions are cT(xT) = n/xT and cB(xB) = (1 + ε)/xB, where xT and xB are the number of agents who select T and B respectively. The action set is {T, B} for each agent. One Nash equilibrium is “all B,” because an agent’s cost (1 + ε)/n < n, which is the cost to deviate. Another Nash equilibrium is “all T,” because an agent’s cost n/n = 1 < 1 + ε, which is the cost to deviate. One equilibrium (“all B”) is thus better for every agent than the other.
2.8 Potential Games
In words, a game is a potential game if there is a function (the potential function) such that the difference in potential between any two action profiles, that differ only in the action of a single agent, is exactly the difference in utility to the agent whose action changes between the profiles. Potential functions are not unique: an arbitrary constant can always be added to the potential value of every action profile.
Example 2.11. In Figure 2.9 (a) we provide a potential function for the Prisoner’s Dilemma. To check this, just verify that the differences in potential for all action profiles satisfy the potential property. For example, going from (C, C) to (D, C), the action that changes is that of agent 1, and agent 1’s utility increases by 2, which is exactly the change in potential: Pot(D, C) - Pot(C, C) = 2.
                 Player 2
                  C        D               ( 0  2 )
(a) Player 1 C  -2, -2   -5, 0       Pot = ( 2  3 )
             D   0, -5   -4, -4

                 Player 2
                  H        T               ( 0  2 )
(b) Player 1 H   1, -1   -1, 1       Pot = ( 6  4 )
             T  -1, 1     1, -1

Figure 2.9: (a) The Prisoner’s Dilemma game and a potential function for the game. (b) The Matching Pennies game, and an attempt to construct a potential function (the bottom-left and top-left entries are incorrect for a deviation by player 1).
Figure 2.10: An illustration of the potential function in a potential game and the action
profile with the maximum potential.
See Figure 2.10 for an illustration of the potential function in a potential game, illustrated
here for an arbitrary ordering on action profiles. There is no reason to expect the potential
to vary smoothly as suggested in the figure. What is significant is the existence of an action
profile a⇤ with maximum potential. From the Prisoner’s Dilemma example, we see that
(D, D) has maximum potential, and corresponds to the pure-strategy Nash equilibrium in
the game.
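The potential property can be checked exhaustively for a small game. The following sketch (our own check, using the Prisoner’s Dilemma payoffs and the potential of Figure 2.9 (a)) verifies every unilateral deviation:

```python
import itertools

u = [  # u[i][(a1, a2)] for the Prisoner's Dilemma
    {("C","C"): -2, ("C","D"): -5, ("D","C"): 0,  ("D","D"): -4},
    {("C","C"): -2, ("C","D"): 0,  ("D","C"): -5, ("D","D"): -4},
]
pot = {("C","C"): 0, ("C","D"): 2, ("D","C"): 2, ("D","D"): 3}
actions = ["C", "D"]

def is_potential(u, pot, actions):
    """True if every unilateral deviation changes the deviator's utility
    by exactly the change in potential."""
    for a1, a2 in itertools.product(actions, repeat=2):
        for b1 in actions:   # deviations by player 1
            if u[0][(b1, a2)] - u[0][(a1, a2)] != pot[(b1, a2)] - pot[(a1, a2)]:
                return False
        for b2 in actions:   # deviations by player 2
            if u[1][(a1, b2)] - u[1][(a1, a2)] != pot[(a1, b2)] - pot[(a1, a2)]:
                return False
    return True

print(is_potential(u, pot, actions))  # True
```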
This property holds generally in potential games:
Theorem 2.3. Every potential game has a pure-strategy Nash equilibrium.
Proof. Consider action profile a∗ ∈ arg max_{a ∈ A} Pot(a). By construction, for any agent i and any other action a′i, Pot(a′i, a∗−i) ≤ Pot(a∗) and therefore ui(a′i, a∗−i) - ui(a∗) = Pot(a′i, a∗−i) - Pot(a∗) ≤ 0, and there can be no beneficial unilateral deviation. We conclude that action profile a∗ is a pure-strategy Nash equilibrium.
In particular, every congestion game is a potential game, and thus has a pure-strategy
Nash equilibrium.
Theorem 2.4. Every congestion game is a potential game.
Proof. Given a congestion game, we construct a potential function, and show that the congestion game is a potential game. For this, consider the following potential function:

Pot(a) = -∑_{e ∈ E} ∑_{j=1}^{xe} ce(j),   (2.9)

where xe is the number of agents who select resource e given action profile a, and the inner summation is zero when xe = 0. Fix any actions a−i chosen by all agents except i, and consider the change in potential from action ai to action a′i:

Pot(a′i, a−i) - Pot(ai, a−i) = ∑_{e ∈ E} ∑_{j=1}^{xe} ce(j) - ∑_{e ∈ E} ∑_{j=1}^{x′e} ce(j)   (2.10)

= ∑_{e ∈ E} ( ∑_{j=1}^{xe} ce(j) - ∑_{j=1}^{x′e} ce(j) ) = ∑_{e ∈ ai \ a′i} ce(xe) - ∑_{e ∈ a′i \ ai} ce(xe + 1),   (2.11)

where xe and x′e denote the count on resource e at action profile a and (a′i, a−i) respectively. The third equality follows by recognizing that the sums for a resource e that is in both ai and a′i cancel (since xe = x′e). For resources e ∈ ai \ a′i, selected in ai but not a′i, there is an additional term in the first summation. For resources e ∈ a′i \ ai, selected in a′i but not ai, there is an additional term in the second summation (with x′e = xe + 1). The final expression is exactly ui(a′i, a−i) - ui(ai, a−i), because the increase in utility is the decrease in cost to agent i, which is exactly (2.11).
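The construction in the proof can be verified numerically. The sketch below is our own, using the connection game of Example 2.10 with n = 3 and ε = 0.5; it checks that every unilateral deviation changes the deviator’s utility by exactly the change in the potential:

```python
import itertools
from collections import Counter

n, eps = 3, 0.5
costs = {"T": lambda x: n / x, "B": lambda x: (1 + eps) / x}

def potential(profile):
    """Pot(a) = -sum_e sum_{j=1..x_e} c_e(j) for the chosen profile."""
    load = Counter(profile)
    return -sum(costs[e](j) for e in load for j in range(1, load[e] + 1))

def utility(i, profile):
    load = Counter(profile)
    return -costs[profile[i]](load[profile[i]])

# Check the exact-potential property for every profile and every deviation.
for profile in itertools.product("TB", repeat=n):
    for i in range(n):
        for b in "TB":
            dev = profile[:i] + (b,) + profile[i + 1:]
            du = utility(i, dev) - utility(i, profile)
            dpot = potential(dev) - potential(profile)
            assert abs(du - dpot) < 1e-12
print("exact potential verified")
```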
A sequence of action profiles forms a path if the sequence has the single-deviation property, such that only one agent changes its action at each step. A path a^(0), a^(1), a^(2), . . . is improving if ui(ai^(k+1), a−i^(k)) > ui(ai^(k), a−i^(k)), where agent i’s action changes in step k. Potential games have the finite-improvement property:
Theorem 2.5 (Finite-improvement property). Any improving path on action profiles in a
potential game with a finite number of actions terminates in a finite number of steps with a
pure-strategy Nash equilibrium.
Proof. Consider an improving path on action profiles a^{(0)}, a^{(1)}, a^{(2)}, \ldots. The potential
Pot(a^{(k+1)}) > Pot(a^{(k)}) for all steps k, and thus no action profile is repeated, and the
path must terminate after a finite number of steps because there is a finite number of ac-
tions and thus a finite number of action profiles. Upon termination the action profile is a
pure-strategy Nash equilibrium because no improvement is possible, and thus every agent
is simultaneously maximizing its utility.
An improving path need not reach the action profile with maximum potential. Rather,
it can terminate at a local maximum in the potential landscape; i.e., an action profile where
no deviation by a single agent can increase the potential.
The finite-improvement property suggests a natural better-response dynamic for finding
a Nash equilibrium, in which players repeatedly select improving actions given the actions
of others. One caution, however, is that this dynamic is not guaranteed to find a Nash
equilibrium in a small number of steps; the number of steps may be exponential in the size
of the game.
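The better-response dynamic can be sketched in a few lines. The game instance below (two agents, each selecting one of three resources, with congestion cost c_e(j) = j) and the function names are illustrative assumptions, not from the text:

```python
# Better-response dynamics on a small, hypothetical congestion game:
# each of two agents picks one of three resources, and an agent's cost
# is the number of agents (including itself) on its chosen resource.
resources = [0, 1, 2]

def agent_cost(i, profile):
    # congestion on agent i's single chosen resource
    return sum(1 for a in profile if a == profile[i])

def better_response_dynamics(profile):
    # repeatedly let some agent switch to a strictly better action; by the
    # finite-improvement property this terminates at a pure-strategy Nash
    # equilibrium of the congestion game
    profile = list(profile)
    steps = 0
    improved = True
    while improved:
        improved = False
        for i in range(len(profile)):
            for r in resources:
                trial = profile[:]
                trial[i] = r
                if agent_cost(i, trial) < agent_cost(i, profile):
                    profile, improved = trial, True
                    steps += 1
                    break
    return tuple(profile), steps

print(better_response_dynamics((0, 0)))  # -> ((1, 0), 1)
```

On this small instance the dynamic converges after a single improving step; on larger games the number of steps can be very large, as cautioned above.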
What is required of the payoff matrix for a game to be a potential game? For this, define
the value of a path as the total change in utility, summed over the change incurred by the
agent that changes its action at each step on the path. A cycle is a path that starts and
ends at the same action profile.
Example 2.13. The value of the cycle (C, C), (C, D), (D, D), (D, C), (C, C) in the Pris-
oner's Dilemma is (u_2(C, D) - u_2(C, C)) + (u_1(D, D) - u_1(C, D)) + (u_2(D, C) - u_2(D, D)) +
(u_1(C, C) - u_1(D, C)) = 0.
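This cycle-value computation is easy to replicate mechanically. The payoff numbers below are standard Prisoner's Dilemma values, with u_i equal to the negative of the years in prison (an assumption for illustration; any symmetric payoff matrix yields a zero value on this cycle):

```python
# Value of the cycle (C,C) -> (C,D) -> (D,D) -> (D,C) -> (C,C) in the
# Prisoner's Dilemma. Payoffs are assumed standard: u_i = -(years in prison).
u = {  # (a1, a2) -> (u1, u2)
    ('C', 'C'): (-2, -2),
    ('C', 'D'): (-5, 0),
    ('D', 'C'): (0, -5),
    ('D', 'D'): (-4, -4),
}

cycle = [('C', 'C'), ('C', 'D'), ('D', 'D'), ('D', 'C'), ('C', 'C')]

value = 0
for a, b in zip(cycle, cycle[1:]):
    # exactly one agent changes its action at each step of the path
    i = 0 if a[0] != b[0] else 1
    value += u[b][i] - u[a][i]

print(value)  # -> 0
```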
Certainly, the value of all cycles in a potential game must be zero. We have seen this idea
in Example 2.12. Say that a cycle is a 2-by-2 cycle if it involves 2 agents, each of which
changes its action twice. For example, Figure 2.11 illustrates a 2-by-2 cycle involving agents
1 and 2 and actions a_1, a'_1 and a_2, a'_2. Exercise 2.5 establishes that it is sufficient that all
2-by-2 cycles have zero value for a game to be a potential game.
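This criterion can be tested by brute-force enumeration. In the sketch below, the helper functions are our own, and the two payoff matrices (a standard Prisoner's Dilemma and Matching Pennies) are assumptions for illustration:

```python
from itertools import product

# Enumerate every 2-by-2 cycle of a two-player game and check zero value.
def cycle_value(u, a1, b1, a2, b2):
    # value of the cycle (a1,a2) -> (a1,b2) -> (b1,b2) -> (b1,a2) -> (a1,a2)
    return ((u[(a1, b2)][1] - u[(a1, a2)][1]) +
            (u[(b1, b2)][0] - u[(a1, b2)][0]) +
            (u[(b1, a2)][1] - u[(b1, b2)][1]) +
            (u[(a1, a2)][0] - u[(b1, a2)][0]))

def is_potential_game(u, A1, A2):
    # for two-player games, zero value on all 2-by-2 cycles suffices
    return all(cycle_value(u, a1, b1, a2, b2) == 0
               for a1, b1 in product(A1, repeat=2)
               for a2, b2 in product(A2, repeat=2))

# Standard Prisoner's Dilemma payoffs (assumed): u_i = -(years in prison).
pd = {('C', 'C'): (-2, -2), ('C', 'D'): (-5, 0),
      ('D', 'C'): (0, -5), ('D', 'D'): (-4, -4)}
# Matching Pennies (zero-sum).
mp = {('H', 'H'): (1, -1), ('H', 'T'): (-1, 1),
      ('T', 'H'): (-1, 1), ('T', 'T'): (1, -1)}

print(is_potential_game(pd, 'CD', 'CD'))  # -> True
print(is_potential_game(mp, 'HT', 'HT'))  # -> False
```

Matching Pennies fails the test, which is consistent with the fact that it has no pure-strategy Nash equilibrium and so cannot be a potential game.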
2.9 Notes
For a detailed introduction to game theory, a comprehensive reference is provided by “A
Course in Game Theory” (Osborne and Rubinstein, MIT Press 1994). Gibbons' “Game
Theory for Applied Economists” (Princeton University Press 1992) provides a more accessible
introduction. For an advanced reference, Fudenberg and Tirole's “Game Theory” (MIT
Press, 1991) is recommended. A large number of refinements have been proposed to the
basic equilibrium concept, each of which imposes additional requirements on the outcome
and seeks to identify a particular equilibrium prediction. We will see an example of such a
refinement, in the context of games with sequential moves, in Chapter 3.
Chapters 1 and 17-20 in “Algorithmic Game Theory” (Nisan, Roughgarden, Tardos and
Vazirani, eds, Cambridge University Press 2007) expand on some of the themes related
to representational issues, as well as congestion games and potential games. “Essentials
of Game Theory: A Concise, Multidisciplinary Introduction” (Leyton-Brown and Shoham,
Morgan Claypool 2008) provides an accessible proof of the existence of a mixed-strategy
Nash equilibrium in finite games, and develops utility theory within the von Neumann-
Morgenstern axiomatic framework.
Congestion games were introduced by R. W. Rosenthal “A class of games possessing pure-
strategy Nash equilibria” Int. J. Game Theory 2 (1973), 65-67. Later, D. Monderer and L.
S. Shapley “Potential games” Games and Economic Behavior 14: 124-143, 1996 formalized
the equilibrium properties of congestion games from the viewpoint of potential functions.
In fact, every finite potential game is a congestion game. The development of potential
games in Exercise 2.5 follows Monderer and Shapley. See T. Roughgarden “Computing
Equilibria: A Computational Complexity Perspective” Economic Theory 42 193-236 (2010)
for a discussion of the complexity of finding equilibrium in congestion games. See N. Nisan,
M. Schapira and A. Zohar “Asynchronous Best-Reply Dynamics” Proc. WINE 2008 for an
example of a potential game in which best-response dynamics need not converge to a Nash
equilibrium when players move at the same time and perhaps with delayed information
about earlier moves.
Example 2.10, the network connection game, is introduced in E. Anshelevich, A. Das-
gupta, J. Kleinberg, É. Tardos, T. Wexler, and T. Roughgarden, “The price of stability for
network design with fair cost allocation”, Proceedings of IEEE Symposium on Foundations
of Computer Science, 2004, pp. 295-304. Exercise 2.6 is developed from material in B.
Awerbuch, Y. Azar, and A. Epstein “The price of routing unsplittable flow” in Proc. 37th
ACM Sympos. on Theory of Computing, ACM Press, New York, 2005, pp. 57-66.
The existence of pure-strategy Nash equilibria in weighted congestion games (see Exer-
cise 2.7) is due to D. Fotakis, S. Kontogiannis, and P. Spirakis “Selfish unsplittable flows,”
Proc. 31st ICALP, LNCS 3142, Springer-Verlag, Berlin, 2004, pp. 593-605. The proof
follows the same approach as that of Theorem 2.2, adopting a modified potential function.
The 3-player normal form game in Exercise 2.2 is from CS 224 (Stanford) Homework
#1 (game theory). The scheduling game in Exercise 2.3 was introduced in Y. Azar, K.
Jain and V. Mirrokni “(Almost) Optimal Coordination Mechanisms for Unrelated Machine
Scheduling” Proc. Annual ACM-SIAM Symp. on Discrete Algorithms (2008). The agenda
of designing coordination mechanisms (such as shortest-first precedence orders) for selfish
scheduling was introduced by G. Christodoulou, E. Koutsoupias, and A. Nanavati “Coor-
dination mechanisms”, Proc. 31st International Colloquium on Automata, Languages and
Programming, pages 345-357 (2004).
The auction games in Exercise 2.4 (b) and (c) are based on A. Hassidim, H. Kaplan, Y.
Mansour, and N. Nisan, “Non-price equilibria in markets of discrete goods,” Proc. 12th
ACM Conference on Electronic Commerce (EC), pages 295-296, 2011. The second-price
auction game in Exercise 2.4 (d) is from K. Bhawalkar and T. Roughgarden, “Welfare
Guarantees for Combinatorial Auctions with Item Bidding,” Proc. SODA (2011).
The load balancing game in Exercise 2.7 is from Chapter 20 “Selfish Load Balancing”
by B. Vöcking in “Algorithmic Game Theory” (Nisan, Roughgarden, Tardos and Vazirani,
eds, CUP 2007), which also provides an extensive discussion of this and related problems.
The study of load balancing in Nash equilibrium was introduced in an influential paper by
E. Koutsoupias and C. Papadimitriou, “Worst-case equilibria” in Proc. 16th Sympos. on
Theoretical Aspects of Computer Science, 404-413 (1999). The load balancing game can be
interpreted as a selfish routing game where the underlying network consists of two nodes,
a source and a sink, and there are a set of parallel links from the source to the sink. Each
machine corresponds to a link, and each task to a flow of a different size. The effect of
selfish behavior on social welfare in the worst-case equilibria of games was later coined the
Price of Anarchy by C. H. Papadimitriou in “Algorithms, games, and the Internet,” Proc.
33rd Annual ACM Symposium on Theory of Computing (STOC), pp. 749-753, 2001.
c2.2 Why is there no pure-strategy Nash equilibrium in the Matching Pennies game?
c2.4 Why must all actions in the support of a mixed strategy that is part of a Nash
equilibrium have the same expected utility?
c2.5 What do you see as two fundamental challenges in the application of game theory?
2.10.2 Exercises
2.1 Iterated elimination of dominated actions
(a) Prove that iterated elimination of strictly-dominated actions never removes an
action that is part of any mixed-strategy Nash equilibrium, and that the set of
equilibria in the reduced game is equal to that in the original game.
(b) Give an example of a game where no action can be eliminated by iterated elimi-
nation of strictly-dominated actions.
(c) What is the complexity of iterated elimination of strictly-dominated actions?
(d) Consider a variation of iterated elimination that will remove an action a_i \in R_i
if there is weak dominance, with some a'_i \in R_i such that u_i(a_i, a_{-i}) \le u_i(a'_i, a_{-i})
for all a_{-i} \in R_{-i} and u_i(a_i, a_{-i}) < u_i(a'_i, a_{-i}) for at least one a_{-i} \in R_{-i}.
(i) Construct an example that shows that the order of elimination a↵ects the set
of eliminated actions under this notion of weak dominance.
(ii) Construct an example that shows that a Nash equilibrium can be eliminated.
(iii) Prove that any equilibrium of the game that results from iterated elimination
of weakly dominated strategies will be an equilibrium of the original game.
                         Player 2
                      Stag        Hare
    Player 1  Stag    400, 400    0, 100
              Hare    100, 0      100, 100

Figure 2.12: The payoff matrix for the game of Stag.

                     Player 2                          Player 2
                   L          R                      L           R
    Player 1  T  (5, 5, 5)  (2, 6, 2)     T  (2, 2, 6)    (-1, 3, 3)
              B  (6, 2, 2)  (3, 3, -1)    B  (3, -1, 3)   (0, 0, 0)
                        N                            F

Figure 2.13: A three-player normal form game. Player 3 plays N or F, and players 1 and
2 select T or B and L or R respectively.
(d) In the game of Stag there are two hunters, and they can decide to hunt for a Stag
or a Hare. The Stag is hard to catch and they both need to agree, while the Hare
is less valuable. The payoff matrix is in Figure 2.12. Plot the best-responses and
identify the Nash equilibria of the game.
(e) Consider the 3-player normal form game in Figure 2.13. Each player has two
actions: (T, B) for player 1, (L, R) for player 2 and (N, F ) for player 3. Player
3 gets to select the left or right payoff matrix, player 2 the column and player 1
the row. For example, if they play (T, L, F) the payoffs are 2, 2, 6 to players 1,
2 and 3 respectively. List all of the pure strategy Nash equilibria and list all the
Pareto optimal outcomes of the game.
(d) The make-span is the time that the last task is completed. What is the make-
span in the Nash equilibrium? What is the socially optimal assignment; i.e., the
one that minimizes the make-span?
(e) Provide a precedence ordering for machines 1 and 2 in the scheduling game for
which there is no pure-strategy Nash equilibrium. Explain.
(b) Construct a 2 player, 2 action game that has the finite-improvement property
but is not a potential game. (Hint: it will need to have a pure-strategy Nash
equilibrium in which an agent is indifferent between playing the equilibrium and
deviating.)
(c) Construct a 2 player game with a dominant-strategy equilibrium that is not a
potential game.
(d) Consider a potential game G, and a game G' in which every player's utility is
u_i(a) = Pot(a), equal to the potential function in game G. Do games G and
G' have the same set of Nash equilibria? Why or why not?
(e) Recall that a path on action profiles satisfies the single-deviation property, and
is a cycle if it starts and ends on the same action profile. A cycle may pass
through the same action profile more than once. The value of a path is the total
incremental change in utility along the path. Prove that if all cycles in a game
have zero value then the game is a potential game. [Hint: fix any action profile
z, and as a first step, establish by the zero-value-cycle property that any two
paths from z to an action profile a \neq z have the same value. Second, show that
Pot(a) = I(z \to a), where I(z \to a) is the value of any path from z to a, satisfies
potential property (2.8).]
(f) Prove that if all 2-by-2 cycles (see Figure 2.11) have zero value then all cycles
have zero value. [Hint: Assume for contradiction that there is a cycle \gamma =
(a^{(0)}, a^{(1)}, \ldots, a^{(\ell-1)}, a^{(\ell)}), where a^{(\ell)} = a^{(0)}, of length \ell \ge 5, with value I(\gamma) \neq 0,
and that this nonzero-value cycle is minimal, in that all cycles of length < \ell have
zero value. Assume WLOG that agent 1 moves in step 0, and let j denote another
step in which 1 must move (this is required for it to be a cycle). First, argue
by minimality (or the 2-by-2 assumption if \ell = 5) that j is not step 1 or \ell - 1.
WLOG, suppose agent 2 moves in step j - 1. Now consider cycle \gamma', which
differs from \gamma only in that agent 1 now deviates in step j - 1 and agent 2 in
step j; i.e., steps a^{(j-1)}, a^{(j)}, a^{(j+1)} in \gamma become steps a^{(j-1)}, z^{(j)}, a^{(j+1)} in \gamma',
where z^{(j)} is obtained from a^{(j-1)} by agent 1's deviation. Second, argue by the
zero-value-2-by-2 property that I(\gamma) = I(\gamma'). Third, by considering minimality,
and recognizing I(\gamma') \neq 0, complete the proof.]
2.6 Network routing game
Consider a network routing game where each agent has to route a unit flow on a
directed graph from one node to another. Each edge has a delay that depends on the
total flow on the edge. Each agent wants to minimize cost, which is the total delay
on the edges on its selected route. See Figure 2.14. There are four players, with start
and end nodes as indicated. Each edge is annotated with its cost function, either
c_e(x_e) = 0 or c_e(x_e) = x_e.
(a) Formulate this as a congestion game.
(b) Identify two pure-strategy Nash equilibria of the game. Argue why these are
equilibria.
(c) What is the socially optimal flow, i.e. the flow that minimizes the total cost to
all agents?
Figure 2.14: A network routing game, indicating the origin-destination pairs for each player
and the delay functions on each edge.
(d) Following the approach in Theorem 2.4, formulate the potential function and de-
termine which of the two Nash equilibria corresponds to the maximum potential
in the game.
(d) What is the socially optimal assignment; i.e., the one that minimizes the maxi-
mum completion time across both machines (the “make-span”)?
(e) What do you observe about the minimum and maximum ratio of make-span
in a pure-strategy Nash equilibrium to the socially optimal make-span in this
example?