Distributed Deadlocks

AOS – Abdul Haleem – L1F09MSCS0017

Abstract
Distributed systems, in general, exhibit a high degree of resource and data sharing, a
situation in which deadlocks may happen. Deadlocks arise when members of a group of
processes which hold resources are blocked indefinitely from access to resources held by
other processes within the group.

Deadlocks are a fundamental problem in distributed systems. In distributed systems, a
process may request resources in any order, which may not be known beforehand, and a
process can request a resource while holding others. If the allocation sequence of process
resources is not controlled in such environments, deadlocks can occur.
Deadlocks can be dealt with using any one of the following three strategies:

Deadlock prevention is commonly achieved by either having a process acquire all the
needed resources simultaneously before it begins execution or by pre-empting a process
that holds the needed resource.

In the deadlock avoidance approach in distributed systems, a resource is granted to a process
if the resulting global system state is safe.

Deadlock detection requires an examination of the status of the process–resource
interaction for the presence of a deadlock condition.

To resolve the deadlock, we have to abort a deadlocked process.

System Model
1. A distributed system consists of a set of processors that are connected by a
communication network. The communication delay is finite but unpredictable.
2. A distributed program is composed of a set of n asynchronous processes that
communicate by message passing over the communication network.
3. Each process is running on a different processor.
4. The processors do not share a common global memory and communicate solely
by passing messages over the communication network.
5. There is no physical global clock in the system to which processes have
instantaneous access.
6. The communication medium may deliver messages out of order, messages may be
lost, garbled, or duplicated due to timeout and retransmission, processors may
fail, and communication links may go down.
Comparison of Three Approaches
Here is a simple comparison of the approaches to handle deadlocks in distributed
systems:

Deadlock Prevention
Deadlock prevention is commonly achieved either by having a process acquire all the
needed resources simultaneously before it begins executing or by pre-empting a process
that holds the needed resource. This approach is highly inefficient because it decreases
system concurrency, and it is impractical in distributed systems.

Deadlock Avoidance
In the deadlock avoidance approach to distributed systems, a resource is granted to a process
if the resulting global system state is safe (a global state includes all the processes and
resources of the distributed system).

Although deadlock avoidance strategies are often used in centralized systems, they are
rarely used in distributed systems. This is because checking for safe states is
computationally expensive due to the large number of processes and resources in
distributed systems, which also lack a global clock.

Because of these problems, deadlock avoidance is impractical in distributed systems.

Deadlock detection
Deadlock detection requires an examination of the status of process–resource interactions
for the presence of cyclic wait. Detection seems to be the best approach to handling
deadlocks in distributed systems.

Deadlock Detection Considerations


Before looking into deadlock detection algorithms, we need to understand the following
preliminaries.

Wait-For Graph
In distributed systems, the state of the system can be modeled by a directed graph, called a
wait-for graph (WFG). In a WFG, nodes are processes and there is a directed edge from
node P1 to node P2 if P1 is blocked and is waiting for P2 to release some resource. A
system is deadlocked if and only if there exists a directed cycle or knot in the WFG (a
cycle suffices in the single-resource and AND models; the OR model requires a knot).
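
As an illustration of searching a WFG, the following is a minimal sketch of cycle detection by depth-first search. It assumes the whole WFG is available at one place as a Python dictionary mapping each process to the processes it waits for, which a real distributed system does not have. The function name find_cycle and the process names are hypothetical. A cycle found this way indicates a deadlock in the single-resource and AND models.

```python
def find_cycle(wfg):
    """Return a list of processes forming a cycle in the WFG, or None.

    wfg maps each process to the processes it is waiting for
    (an assumed, centralized representation of the graph).
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = {}
    path = []

    def visit(p):
        color[p] = GREY
        path.append(p)
        for q in wfg.get(p, ()):
            state = color.get(q, WHITE)
            if state == GREY:
                # q is on the current path, so the path from q onward is a cycle
                return path[path.index(q):]
            if state == WHITE:
                cycle = visit(q)
                if cycle:
                    return cycle
        color[p] = BLACK
        path.pop()
        return None

    for p in list(wfg):
        if color.get(p, WHITE) == WHITE:
            cycle = visit(p)
            if cycle:
                return cycle
    return None


deadlocked = find_cycle({"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]})
no_deadlock = find_cycle({"P1": ["P2"], "P2": []})
```

Here deadlocked holds the three processes of the cycle and no_deadlock is None. The distributed algorithms described later avoid collecting the whole graph at one site.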

Steps to Follow
For handling deadlocks, the following steps are required:
1. Maintaining a wait-for graph and searching it for the presence of deadlock. A
cycle may consist of several sites.
2. Breaking existing wait-for dependencies between the processes to resolve the
deadlock. It involves rolling back one or more deadlocked processes and
assigning their resources to blocked processes so that they can resume execution.
Correctness Criteria
A deadlock detection algorithm must satisfy the following two conditions:

1. Progress (no undetected deadlocks): The algorithm must detect all existing
deadlocks in a finite time. Once a deadlock has occurred, the deadlock detection
activity should continuously progress until the deadlock is detected.

2. Safety (no false deadlocks): The algorithm should not report deadlocks that do
not exist (called phantom or false deadlocks).

Resource Request Models


Distributed systems allow many kinds of resource requests. A process might require a
single resource or a combination of resources for its execution. Each deadlock detection
algorithm is designed for a specific resource request model.

The single-resource model


The single-resource model is the simplest resource model in a distributed system, where a
process can have at most one outstanding request for only one unit of a resource. Since
the out-degree of a node in the WFG is at most 1 in the single-resource model, the
presence of a cycle in the WFG indicates that there is a deadlock.

The AND Model


In the AND model, a process can request more than one resource simultaneously. The
process can proceed when it has acquired all the requested resources.

The OR Model
In the OR model, a process can make a request for numerous resources simultaneously.
The process that requires resources for execution can proceed when it has acquired at
least one of those resources.

AND-OR Model
In the AND-OR model, a request may specify any combination of "and" and "or" in the
resource request. For example, in the AND-OR model, a request for multiple resources
can be of the form x and (y or z). The requested resources may exist at different
locations.

P-out-of-Q Model
In the P-out-of-Q model, a process simultaneously requests Q resources and remains
blocked until it is granted any P of those resources. Every request in the AND-OR model can
be expressed in the P-out-of-Q model and vice versa.
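
The request models above differ only in the condition under which a blocked process may proceed. The following sketch states those conditions as predicates over sets of resource names. The function names and the resource names x, y, and z are illustrative assumptions, not part of any algorithm in this paper.

```python
def and_ok(requested, granted):
    """AND model: every requested resource must be granted."""
    return requested <= granted


def or_ok(requested, granted):
    """OR model: at least one requested resource must be granted."""
    return len(requested & granted) >= 1


def p_out_of_q_ok(requested, granted, p):
    """P-out-of-Q model: any p of the requested resources must be granted."""
    return len(requested & granted) >= p


requested = {"x", "y", "z"}   # Q = 3 requested resources
granted = {"x", "z"}          # resources obtained so far

# AND-OR request of the form: x and (y or z)
and_or_ok = "x" in granted and ("y" in granted or "z" in granted)
```

With these definitions, an AND request is the same as a Q-out-of-Q request and an OR request is the same as a 1-out-of-Q request, which is why and_ok(requested, granted) agrees with p_out_of_q_ok(requested, granted, len(requested)).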
Distributed Deadlock Detection Algorithms
Classification
Distributed deadlock detection algorithms can be divided into four classes: path-pushing,
edge-chasing, diffusion computation, and global state detection.

Path-pushing algorithms
In path-pushing algorithms, distributed deadlocks are detected by maintaining an explicit
global WFG. The basic idea is to build a global WFG for each site of the distributed
system. In this class of algorithm, whenever deadlock computation is performed, each
site sends its local WFG to all the neighboring sites. After the local data structure of each
site is updated, this updated WFG is then passed along to other sites, and the procedure is
repeated until one site has a sufficiently complete picture of the global state to announce
deadlock or to establish that no deadlocks are present.

Edge-chasing algorithms
In an edge-chasing algorithm, the presence of a cycle in a distributed graph structure is
verified by propagating special messages called probes along the edges of the graph.
These probe messages are different from the request and reply messages. The formation of a
cycle can be detected by a site if it receives the matching probe sent by it previously.
Whenever a process that is executing receives a probe message, it simply discards this
message and continues. Only blocked processes propagate probe messages along their
outgoing edges.
The main advantage of edge-chasing algorithms is that probes are fixed-size messages
that are normally very short.

Diffusing computation-based algorithms


In diffusion computation-based distributed deadlock detection algorithms, a process that
wants to detect a deadlock sends out query messages along all the outgoing edges in the WFG.
These queries are successively propagated (i.e., diffused) through the edges of the WFG.
Queries are discarded by a running process and are echoed back by blocked processes in
the following way: when a blocked process first receives a query message for a particular
deadlock detection initiation, it does not send a reply message until it has received a reply
message for every query it sent (to its successors in the WFG). For all subsequent queries
for this deadlock detection initiation, it immediately sends back a reply message. The
initiator of the deadlock detection detects a deadlock when it has received a reply for
every query it has sent out.

Global state detection-based algorithms


Global state detection-based deadlock detection algorithms exploit the following facts:
1. A consistent snapshot of a distributed system can be obtained without freezing the
underlying computation.
2. A consistent snapshot may not represent the system state at any moment in time,
but if a stable property holds in the system before the snapshot collection is
initiated, this property will still hold in the snapshot.
Therefore, distributed deadlocks can be detected by taking a snapshot of the system and
examining it for the condition of a deadlock.

Mitchell and Merritt’s algorithm for the single-resource model
Mitchell and Merritt’s algorithm belongs to the class of edge-chasing algorithms where
probes are sent in the opposite direction to the edges of the WFG. When a probe initiated
by a process comes back to it, the process declares deadlock. The algorithm has many
good features, such as:
1. Only one process in a cycle detects the deadlock. This simplifies the deadlock
resolution as this process can abort itself to resolve the deadlock.
2. In this algorithm, a process that is detected in deadlock is aborted spontaneously,
even though under this assumption phantom deadlocks cannot be excluded. It can
be shown, however, that only genuine deadlocks will be detected in the absence
of spontaneous aborts.

Each node of the WFG has two local variables, called labels: a private label, which is
unique to the node at all times, though it is not constant, and a public label, which can be
read by other processes and which may not be unique. Each process is represented as u/v,
where u and v are the public and private labels, respectively. Initially, private and public
labels are equal for each process. A global WFG is maintained and it defines the entire
state of the system. The algorithm is defined by the four state transitions shown in the
figure (Block, Activate, Transmit, and Detect), where z = inc(u, v) and inc(u, v) yields a
unique label greater than both u and v. Labels that are
not shown do not change. Block creates an edge in the WFG. Two messages are needed:
one resource request and one message back to the blocked process to inform it of the
public label of the process it is waiting for. Activate denotes that a process has acquired
the resource from the process it was waiting for. Transmit propagates larger labels in the
opposite direction to the edges by sending a probe message. Whenever a process receives
a probe that is less than its public label, it simply ignores that probe. Detect means that
the probe with the private label of some process has returned to it, indicating a deadlock.
Whenever a process receives a signal, it compares its id with the one associated with the
signal and keeps the larger one in the outgoing signal. A process detects a deadlock when
it receives its own id.
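
The Block, Transmit, and Detect transitions can be sketched in a small single-machine simulation. This is only an illustration under several assumptions: labels are represented as (counter, process id) pairs so that inc always yields a unique, larger label; probe messages are replaced by repeated rounds in which each blocked process reads the public label of the process it waits for; and the names Process, block, step, and detect are hypothetical.

```python
class Process:
    def __init__(self, pid):
        self.pid = pid
        self.public = (0, pid)    # readable by other processes
        self.private = (0, pid)   # unique to this process
        self.waits_for = None     # at most one outgoing WFG edge


def inc(u, v, pid):
    """Assumed label generator: larger than both u and v, unique via pid."""
    return (max(u[0], v[0]) + 1, pid)


def block(p, q):
    """Block: p starts waiting for q and takes a fresh label z for both labels."""
    z = inc(p.public, q.public, p.pid)
    p.public = p.private = z
    p.waits_for = q


def step(processes):
    """One round of Detect and Transmit; returns the detecting pid or None."""
    for p in processes:
        q = p.waits_for
        if q is None:
            continue
        if p.public == p.private == q.public:
            return p.pid              # Detect: own label has come back
        if q.public > p.public:
            p.public = q.public       # Transmit: larger label moves backwards
    return None


def detect(processes):
    # A label needs at most one round per process to travel around a cycle.
    for _ in range(len(processes) + 1):
        pid = step(processes)
        if pid is not None:
            return pid
    return None


p1, p2, p3 = Process(1), Process(2), Process(3)
block(p1, p2)
block(p2, p3)
block(p3, p1)          # closes the cycle P1 -> P2 -> P3 -> P1
detector = detect([p1, p2, p3])
```

In this run only the process that holds the largest label in the cycle, P3, detects the deadlock, which matches the first feature listed above.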

Chandy–Misra–Haas algorithm for the AND model


Chandy–Misra–Haas’s distributed deadlock detection algorithm for the AND model is
based on edge-chasing. The algorithm uses a special message called probe, which is a
triplet (i, j, k), denoting that it belongs to a deadlock detection initiated for process Pi and
it is being sent by the home site of process Pj to the home site of process Pk. A probe
message travels along the edges of the global WFG, and a deadlock is detected
when a probe message returns to the process that initiated it. A process Pj is said to be
dependent on another process Pk if there exists a sequence of processes Pj, Pi1, Pi2, ...,
Pim, Pk such that each process except Pk in the sequence is blocked and each process
except Pj holds a resource for which the previous process in the sequence is waiting.
Process Pj is said to be locally dependent upon process Pk if Pj is dependent upon Pk and
both the processes are on the same site.

Data structures
Each process Pi maintains a boolean array, dependent-i, where dependent-i(j) is true only
if Pi knows that Pj is dependent on it. Initially, dependent-i(j) is false for all i and j.

The algorithm
The following algorithm is executed to determine whether a blocked process is deadlocked.
A probe message is circulated along the edges of the global WFG, and a deadlock is
detected when a probe message returns to its initiating process.

if Pi is locally dependent on itself
    then declare a deadlock
    else for all Pj and Pk such that
        (a) Pi is locally dependent upon Pj, and
        (b) Pj is waiting on Pk, and
        (c) Pj and Pk are on different sites,
        send a probe (i, j, k) to the home site of Pk

On the receipt of a probe (i, j, k), the site takes the following actions:

if
    (d) Pk is blocked, and
    (e) dependent-k(i) is false, and
    (f) Pk has not replied to all requests of Pj,
then begin
    dependent-k(i) = true;
    if k = i
        then declare that Pi is deadlocked
        else for all Pm and Pn such that
            (a’) Pk is locally dependent upon Pm, and
            (b’) Pm is waiting on Pn, and
            (c’) Pm and Pn are on different sites,
            send a probe (i, m, n) to the home site of Pn
end.
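
The probe propagation can be illustrated with a simplified simulation. The sketch below assumes that every process runs on its own site (as in the system model), so every WFG edge crosses sites and the "locally dependent" checks disappear. It also assumes that the WFG does not change while the probes travel, and it replaces the network with an in-memory queue. The function name cmh_and_detect and the dictionary layout are hypothetical.

```python
from collections import deque


def cmh_and_detect(initiator, waits_for):
    """Return True if a probe started for `initiator` comes back to it.

    waits_for maps a process id to the set of process ids it is blocked on
    (AND model); a process with no entry or an empty set is active.
    """
    dependent = {}   # dependent[k] = initiators i with dependent-k(i) true
    probes = deque((initiator, initiator, k)
                   for k in waits_for.get(initiator, ()))
    while probes:
        i, j, k = probes.popleft()           # probe (i, j, k) arrives at Pk
        if not waits_for.get(k):
            continue                         # (d) fails: Pk is active, discard
        seen = dependent.setdefault(k, set())
        if i in seen:
            continue                         # (e) fails: already recorded
        seen.add(i)
        if k == i:
            return True                      # probe returned to its initiator
        for n in waits_for[k]:
            probes.append((i, k, n))         # forward along Pk's outgoing edges
    return False


in_cycle = cmh_and_detect(1, {1: {2}, 2: {3}, 3: {1}})
outside_cycle = cmh_and_detect(1, {1: {2}, 2: {3}, 3: {2}})
```

In the second call P2 and P3 are deadlocked with each other, but the probe never returns to P1, so P1 does not declare a deadlock. This reflects that a process detects only deadlocks it is part of.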

Performance analysis
In the algorithm, one probe message (per deadlock detection initiation) is sent on every
edge of the WFG which connects processes on two sites. Thus, the algorithm exchanges
at most m(n−1)/2 messages to detect a deadlock that involves m processes and spans over
n sites. The size of messages is fixed and is very small (only three integer words). The
delay in detecting a deadlock is O(n).

Chandy–Misra–Haas algorithm for the OR model


Chandy–Misra–Haas’s distributed deadlock detection algorithm for the OR model is
based on the approach of diffusion computation. A blocked process determines if it is
deadlocked by initiating a diffusion computation. Two types of messages are used in a
diffusion computation: query(i, j, k) and reply(i, j, k), denoting that they belong to a
diffusion computation initiated by a process Pi and are being sent from process Pj to
process Pk.

Basic idea
A blocked process initiates deadlock detection by sending query messages to all
processes in its dependent set (i.e., processes from which it is waiting to receive a
message). If an active process receives a query or reply message, it discards it. When a
blocked process Pk receives a query(i, j, k) message, it takes the following actions:
1. If this is the first query message received by Pk for the deadlock detection
initiated by Pi (called the engaging query), then it propagates the query to all the
processes in its dependent set and sets a local variable num-k(i) to the number of
query messages sent.
2. If this is not the engaging query, then Pk returns a reply message to it immediately
provided Pk has been continuously blocked since it received the corresponding
engaging query. Otherwise, it discards the query.

Process Pk maintains a boolean variable wait-k(i) that denotes the fact that it has been
continuously blocked since it received the last engaging query from process Pi. When a
blocked process Pk receives a reply(i, j, k) message, it decrements num-k(i) only if
wait-k(i) holds. A process sends a reply message in response to an engaging query only
after it has received a reply to every query message it has sent out for this engaging
query.
The initiator process detects a deadlock when it has received reply messages to all the
query messages it has sent out.
The algorithm
The algorithm works as shown below. For ease of presentation, we have
assumed that only one diffusion computation is initiated for a process. In practice, several
diffusion computations may be initiated for a process (a diffusion computation is initiated
every time the process gets blocked), but at any time only one diffusion computation is
current for any process. However, messages for outdated diffusion computations may still
be in transit. The current diffusion computation can be distinguished from outdated ones
by using sequence numbers.

Initiate a diffusion computation for a blocked process Pi:
    send query(i, i, j) to all processes Pj in the dependent set DSi of Pi;
    num-i(i) := |DSi|; wait-i(i) := true;

When a blocked process Pk receives a query(i, j, k):
    if this is the engaging query for process Pi then
        send query(i, k, m) to all Pm in its dependent set DSk;
        num-k(i) := |DSk|; wait-k(i) := true
    else if wait-k(i) then send a reply(i, k, j) to Pj.

When a process Pk receives a reply(i, j, k):
    if wait-k(i) then
        num-k(i) := num-k(i) − 1;
        if num-k(i) = 0 then
            if i = k then declare a deadlock
            else send reply(i, k, m) to the process Pm
                which sent the engaging query.
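
The query and reply handling can be sketched as a simulation of a single diffusion computation. The sketch assumes that the dependent sets do not change during the computation, so wait-k(i) is always true for blocked processes and is not stored. Since only one initiator is simulated, the index i is left implicit, and messages are kept in an in-memory queue. The function name cmh_or_detect is hypothetical.

```python
from collections import deque


def cmh_or_detect(initiator, dependent_set):
    """Return True if `initiator` is deadlocked in the OR model.

    dependent_set maps a process id to the set of process ids it waits on;
    a process with no entry or an empty set is active.
    """
    start = dependent_set.get(initiator, set())
    if not start:
        return False                        # an active process is not deadlocked
    num = {initiator: len(start)}           # outstanding replies per process
    engager = {initiator: None}             # sender of each engaging query
    msgs = deque(("query", initiator, j) for j in start)
    while msgs:
        kind, j, k = msgs.popleft()         # message from Pj arrives at Pk
        ds_k = dependent_set.get(k, set())
        if not ds_k:
            continue                        # active processes discard messages
        if kind == "query":
            if k not in num:                # engaging query: diffuse further
                num[k] = len(ds_k)
                engager[k] = j
                for m in ds_k:
                    msgs.append(("query", k, m))
            else:                           # later query: reply immediately
                msgs.append(("reply", k, j))
        else:
            num[k] -= 1
            if num[k] == 0:
                if k == initiator:
                    return True             # all queries were answered
                msgs.append(("reply", k, engager[k]))
    return False


knot = cmh_or_detect(1, {1: {2, 3}, 2: {1}, 3: {1}})
escape = cmh_or_detect(1, {1: {2, 3}, 2: {1}, 3: set()})
```

In the second call P3 is active and can still satisfy P1's request, so its query is discarded, the initiator never collects all replies, and no deadlock is declared.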

Performance analysis
For every deadlock detection, the algorithm exchanges e query messages and e reply
messages, where e = n(n−1) is the number of edges.

Kshemkalyani–Singhal algorithm for the P-out-of-Q model
The Kshemkalyani–Singhal algorithm to detect deadlocks in the P-out-of-Q model (also
called the generalized distributed deadlocks) is based on the global state detection
approach. The Kshemkalyani–Singhal algorithm is a single-phase algorithm, which
consists of a fan-out sweep of messages outwards from an initiator process and a fan-in
sweep of messages inwards to the initiator process. A sweep of a WFG is a traversal of
the WFG in which all messages are sent in the direction of the WFG edges (outward
sweep) or all messages are sent against the direction of the WFG edges (inward sweep).
In the outward sweep, the algorithm records a snapshot of a distributed WFG. In the
inward sweep, the recorded distributed WFG is reduced to determine if the initiator is
deadlocked.
Both the outward and the inward sweeps are executed concurrently in the algorithm.
Complications are introduced because the two sweeps can overlap in time at a process,
i.e., the reduction of the WFG at a process can begin before the WFG at that process has
been completely recorded. The algorithm deals with these complications.

System model
The system has n nodes, and every pair of nodes is connected by a logical channel. An
event in a computation can be an internal event, a message send event, or a message
receive event.
The computation messages can be either REQUEST, REPLY, or CANCEL messages. To
execute a p(i)-out-of-q(i) request, an active node i sends q(i) REQUESTs to q(i) other
nodes and remains blocked until it receives a sufficient number of REPLY messages.
When node i blocks on node j, node j becomes a successor of node i and node i becomes
a predecessor of node j in the WFG. A REPLY message denotes the granting of a request.
A node i unblocks when p(i) out of its q(i) requests have been granted. When a node
unblocks, it sends CANCEL messages to withdraw the remaining q(i) − p(i) requests it had
sent.
Sending and receiving of REQUEST, REPLY, and CANCEL messages are computation
events. The sending and receiving of deadlock detection algorithm messages are
algorithmic or control events.

Description of the algorithm


When a node init blocks on a P-out-of-Q request, it initiates the deadlock detection
algorithm. The algorithm records the part of the WFG that is reachable from init
(henceforth, called the init’s WFG) in a distributed snapshot; the distributed snapshot
includes only those dependency edges and nodes that form init’s WFG.

The distributed WFG is recorded using FLOOD messages in the outward sweep and the
recorded WFG is examined for deadlocks using ECHO messages in the inward sweep. To
detect a deadlock, the initiator init records its local state and sends FLOOD messages
along all of its outward dependencies. When node i receives the first FLOOD message
along an existing inward dependency, it records its local state. If node i is blocked at this
time, it sends out FLOOD messages along all of its outward dependencies to continue the
recording of the WFG in the outward sweep. If node i is active at this time (i.e., it does
not have any outward dependencies and is a leaf node in the WFG), then it initiates
reduction of the WFG by returning an ECHO message along the incoming dependency
even before the states of all incoming dependencies have been recorded in the WFG
snapshot at the leaf node.

ECHO messages perform reduction of the recorded WFG by simulating the granting of
requests in the inward sweep. A node i in the WFG is reduced if it receives ECHOs along
p(i) out of its q(i) outgoing edges, indicating that p(i) of its requests can be granted. An
edge is reduced if an ECHO is received on the edge indicating that the request it
represents can be granted. After a local snapshot has been recorded at node i, any
transition made by i from idle to active state is captured in the process of reduction. The
nodes that can be reduced do not form a deadlock whereas the nodes that cannot be
reduced are deadlocked. The order in which reduction of the nodes and edges of the WFG
is performed does not alter the final result. Node init detects the deadlock if it is not
reduced when the deadlock detection algorithm terminates.
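
The effect of the reduction can be illustrated on an already recorded WFG. The sketch below is a centralized simplification: it assumes the complete snapshot is available in one dictionary, whereas the algorithm performs the reduction with ECHO messages while the snapshot is still being recorded. Each node maps to a pair (p, successors), active nodes have p = 0, and every successor must itself be a key. The function name deadlocked_nodes is hypothetical.

```python
def deadlocked_nodes(requests):
    """Return the nodes that cannot be reduced in a P-out-of-Q WFG.

    requests maps node -> (p, successors): the node needs any p of the
    requests it sent to `successors`. Active nodes use (0, []).
    """
    # Invert the edges so that an ECHO can travel from a node to its predecessors.
    preds = {n: [] for n in requests}
    for n, (_, succs) in requests.items():
        for s in succs:
            preds[s].append(n)

    echoes = {n: 0 for n in requests}            # ECHOs received per node
    reduced = {n for n, (p, _) in requests.items() if p == 0}
    worklist = list(reduced)                     # active nodes start the reduction
    while worklist:
        n = worklist.pop()
        for m in preds[n]:                       # simulate granting m's request to n
            if m in reduced:
                continue
            echoes[m] += 1
            if echoes[m] >= requests[m][0]:      # p(m) requests can be granted
                reduced.add(m)
                worklist.append(m)
    return set(requests) - reduced


# A needs 2 of {B, C, D}; B and C are active, so A and then D are reduced.
free = deadlocked_nodes({"A": (2, ["B", "C", "D"]), "B": (0, []),
                         "C": (0, []), "D": (1, ["A"])})
# A needs both B and C, but C waits on A, so A and C cannot be reduced.
stuck = deadlocked_nodes({"A": (2, ["B", "C"]), "B": (0, []),
                          "C": (1, ["A"])})
```

Because the worklist can be processed in any order without changing the returned set, the sketch also reflects the statement that the order of reduction does not alter the final result.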

In general, WFG reduction can begin at a non-leaf node before recording of the WFG has
been completed at that node; this happens when an ECHO message arrives and begins
reduction at a non-leaf node before all the FLOODs have arrived at it and recorded the
complete local WFG at that node. Thus, the activities of recording and reducing the WFG
snapshot are done concurrently in a single phase. Unlike the algorithm in, no serialization
is imposed between the two activities. Since a reduction is done on an incompletely
recorded WFG at nodes, the local snapshot at each node has to be carefully manipulated
so as to give the effect that WFG reduction is initiated after WFG recording has been
completed.

When multiple nodes block concurrently, they may each initiate the deadlock detection
algorithm concurrently. Each invocation of the deadlock detection algorithm is treated
independently and is identified by the initiator’s identity and initiator’s timestamp when it
blocked. Every node maintains a local snapshot for the latest deadlock detection
algorithm initiated by every other node.

The problem of termination detection


The algorithm requires a termination detection technique so that the initiator can
determine that it will not receive any more ECHO messages. The algorithm uses a
termination detection technique based on weights in conjunction with SHORT messages
to detect the termination of the algorithm. A weight of 1.0 at the initiator node, when the
algorithm is initiated, is distributed among all FLOOD messages sent out by the initiator.
When the first FLOOD is received at a non-leaf node, the weight of the received FLOOD
is distributed among the FLOODs sent out along outward edges at that node to expand
the WFG further. Since any subsequent FLOOD arriving at a non-leaf node does not
expand the WFG further, its weight is returned to the initiator in a SHORT message.
When a FLOOD is received at a leaf node, its weight is piggybacked to the ECHO sent
by the leaf node to reduce the WFG. When an ECHO arriving at a node unblocks the
node, the weight of the ECHO is distributed among the ECHOs that are sent by that node
along the incoming edges in its WFG snapshot. When an ECHO arriving at a node does
not unblock the node, its weight is sent directly to the initiator in a SHORT message.
Note that the following invariant holds in an execution of the algorithm:
the sum of the weights in FLOOD, ECHO, and SHORT messages plus the weight at the
initiator (received in SHORT and ECHO messages) is always 1.0. The algorithm
terminates when the weight at the initiator becomes 1.0, signifying that all WFG
recording and reduction activity has completed.
FLOOD, ECHO, and SHORT messages carry weights for termination detection.
Variable w, a real number in the range [0, 1], denotes the weight in a message.
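
The weight invariant can be checked with a small simulation of the outward sweep. This sketch makes strong simplifying assumptions: weight that reaches a leaf node, or a node that has already been visited, is credited directly to the initiator, which collapses the ECHO and SHORT handling into one step. Weights are held as exact fractions rather than floating-point numbers so that the final comparison with 1 is exact. The function name flood_weights and the node names are hypothetical.

```python
from collections import deque
from fractions import Fraction


def flood_weights(init, successors):
    """Distribute weight 1 from `init` and return the weight that comes back."""
    out = successors.get(init, [])
    if not out:
        return Fraction(1)                 # nothing was sent out
    returned = Fraction(0)                 # weight collected at the initiator
    visited = {init}
    in_transit = deque((s, Fraction(1, len(out))) for s in out)
    while in_transit:
        node, w = in_transit.popleft()
        out = successors.get(node, [])
        if node in visited or not out:
            # Later FLOOD at a visited node, or FLOOD at a leaf:
            # the weight goes back to the initiator.
            returned += w
            visited.add(node)
        else:
            # First FLOOD at a non-leaf node: split the weight among
            # the FLOODs sent along its outward edges.
            visited.add(node)
            for s in out:
                in_transit.append((s, w / len(out)))
    return returned


total = flood_weights("init", {"init": ["a", "b"], "a": ["b", "c"],
                               "b": ["init"], "c": []})
```

When the message queue is empty, total equals 1, which is the condition the initiator uses to conclude that recording and reduction have finished.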
References
1. Ajay D. Kshemkalyani and Mukesh Singhal, Distributed Computing: Principles, Algorithms, and Systems.
2. Jie Wu, Distributed System Design.
