
UNIT –V DBMS
Transactions: Transaction concept – Transaction State – Implementation of Atomicity and Durability
– Concurrent Executions – Serializability – Testing for Serializability. Concurrency Control: Lock-
Based Protocols – Timestamp-Based Protocols. Recovery System: Failure Classification – Storage
Structure – Recovery and Atomicity – Log-Based Recovery – Shadow Paging

TRANSACTIONS

 A collection of operations that forms a single logical unit of work is called a
transaction. A database system must ensure proper execution of transactions despite
failures—either the entire transaction executes, or none of it does.

Transaction Concept:

 A transaction is a unit of program execution that accesses and possibly updates
various data items.
 Usually, a transaction is initiated by a user program written in a high-level
data-manipulation language or programming language, where it is delimited by
statements (or function calls) of the form begin transaction and end transaction.
 To ensure integrity of the data, we require that the database system maintain the
following properties of the transactions:
 Atomicity. Either all operations of the transaction are reflected properly in
the database, or none are.
 Consistency. Execution of a transaction in isolation (that is, with no other
transaction executing concurrently) preserves the consistency of the
database.
 Isolation. Even though multiple transactions may execute concurrently, the
system guarantees that, for every pair of transactions Ti and Tj, it appears
to Ti that either Tj finished execution before Ti started, or Tj started
execution after Ti finished. Thus, each transaction is unaware of other
transactions
executing concurrently in the system.
 Durability. After a transaction completes successfully, the changes it has
made to the database persist, even if there are system failures.

 These properties are often called the ACID properties; the acronym is derived from
the first letter of each of the four properties.

 Transactions access data using two operations:
• read(X), which transfers the data item X from the database to a local buffer
belonging to the transaction that executed the read operation.
• write(X), which transfers the data item X from the local buffer of the
transaction that executed the write back to the database.

 Let Ti be a transaction that transfers $50 from account A to account B. This
transaction can be defined as
Ti: read(A);
    A := A − 50;
    write(A);
    read(B);
    B := B + 50;
    write(B).

 Let us now consider each of the ACID requirements.

• Consistency: The consistency requirement here is that the sum of A and B be
unchanged by the execution of the transaction. If the database is consistent before
an execution of the transaction, the database remains consistent after the
execution of the transaction.

• Atomicity: Suppose that, just before the execution of transaction Ti, the values
of accounts A and B are $1000 and $2000, respectively. Suppose that a failure
happened after the write(A) operation but before the write(B) operation. In this
case, the values of accounts A and B reflected in the database are $950 and $2000.
The system destroyed $50 as a result of this failure. We term such a state an
inconsistent state. If the atomicity property is present, all actions of the
transaction are reflected in the database, or none are. Ensuring atomicity is the
responsibility of the database system itself; specifically, it is handled by a
component called the transaction-management component.

• Durability: The durability property guarantees that, once a transaction completes
successfully, all the updates that it carried out on the database persist, even if there
is a system failure after the transaction completes execution. We can guarantee
durability by ensuring that either
1. The updates carried out by the transaction have been written to disk before
the transaction completes.
2. Information about the updates carried out by the transaction and written to
disk is sufficient to enable the database to reconstruct the update when the
database system is restarted after the failure.
Ensuring durability is the responsibility of a component of the database system
called the recovery-management component.

• Isolation: Even if the consistency and atomicity properties are ensured for each
transaction, if several transactions are executed concurrently, their operations may
interleave in some undesirable way, resulting in an inconsistent state. A way to avoid
the problem of concurrently executing transactions is to execute them serially—that
is, one after the other. Ensuring the isolation property is the responsibility of a
component of the database system called the concurrency-control component.

Transaction State:

A transaction may not always complete its execution successfully. Such a transaction
is termed aborted. Any changes that the aborted transaction made to the database must be
undone. Once the changes caused by an aborted transaction have been undone, we say that
the transaction has been rolled back. A transaction that completes its execution successfully
is said to be committed.
A committed transaction that has performed updates transforms the database into a new
consistent state, which must persist even if there is a system failure. The only way to undo the
effects of a committed transaction is to execute a compensating transaction. A transaction
must be in one of the following states:
• Active, the initial state; the transaction stays in this state while it is
executing.
• Partially committed, after the final statement has been executed.
• Failed, after the discovery that normal execution can no longer
proceed.
• Aborted, after the transaction has been rolled back and the
database has been restored to its state prior to the start of the
transaction.
• Committed, after successful completion.

 A transaction starts in the active state.
 When it finishes its final statement, it enters the partially committed state.
 The database system then writes out enough information to disk that the
transaction's updates can be re-created after a failure.
 If a transaction enters the failed state, it must be rolled back.
 Then, it enters the aborted state. At this point, the system has two options:
• It can restart the transaction.
• It can kill the transaction.

 The state diagram corresponding to a transaction shows these five states and the
transitions between them.


Implementation of Atomicity and Durability:

 The recovery-management component of a database system can support atomicity and
durability by a variety of schemes.
 We first consider a simple, but extremely inefficient, scheme called the shadow copy
scheme.
 This scheme, which is based on making copies of the database, called shadow copies,
assumes that only one transaction is active at a time.
 A pointer called db-pointer is maintained on disk; it points to the current copy of the
database.
 In the shadow-copy scheme, a transaction that wants to update the database first
creates a complete copy of the database.
 All updates are done on the new database copy, leaving the original copy, the shadow
copy, untouched.
 If at any point the transaction has to be aborted, the system merely deletes the new
copy.
 The old copy of the database has not been affected.


Shadow-copy technique for atomicity and durability.


 The transaction is said to have been committed at the point where the updated db-
pointer is written to disk.
 If the transaction fails at any time before db-pointer is updated, the old contents of the
database are not affected.
 Suppose that the system fails at any time before the updated db-pointer is written to
disk.
 Then, when the system restarts, it will read db-pointer and will thus see the original
contents of the database, and none of the effects of the transaction will be visible on
the database.
 Next, suppose that the system fails after db-pointer has been updated on disk.
 Before the pointer is updated, all updated pages of the new copy of the database were
written to disk.
 Thus, the atomicity and durability properties of transactions are ensured by the
shadow-copy implementation of the recovery-management component.
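As a concrete illustration of this scheme, here is a minimal Python sketch of shadow copying
with files on disk; the file names (db-pointer, the ".shadow" suffix) and the JSON
representation are assumptions made for illustration only.

import json, os, shutil

def current_copy():
    # db-pointer names the current committed copy of the database
    with open("db-pointer") as f:
        return f.read().strip()

def run_transaction(update):
    old = current_copy()
    new = old + ".shadow"
    shutil.copyfile(old, new)                 # complete copy; the old (shadow) copy stays untouched
    try:
        with open(new) as f:
            db = json.load(f)
        update(db)                            # all updates go to the new copy
        with open(new, "w") as f:
            json.dump(db, f)
            f.flush(); os.fsync(f.fileno())   # flush the updated pages to disk first
        with open("db-pointer.tmp", "w") as f:
            f.write(new)
            f.flush(); os.fsync(f.fileno())
        os.replace("db-pointer.tmp", "db-pointer")  # commit point: db-pointer now names the new copy
    except Exception:
        if os.path.exists(new):
            os.remove(new)                    # abort: merely delete the new copy
        raise

As in the scheme itself, this sketch assumes only one transaction is active at a time.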

Concurrent Executions:

 Transaction-processing systems usually allow multiple transactions to run
concurrently.
 Allowing multiple transactions to update data concurrently causes several
complications with consistency of the data. There are two good reasons for
allowing concurrency:

• Improved throughput and resource utilization. A transaction consists of
many steps. Some involve I/O activity; others involve CPU activity.
Therefore, I/O activity can be done in parallel with processing at the CPU. All
of this increases the throughput of the system; the processor and disk
utilization also increase.
• Reduced waiting time. There may be a mix of transactions running on a
system, some short and some long. Concurrent execution reduces the
unpredictable delays in running transactions. Moreover, it also reduces the
average response time: the average time for a transaction to be completed
after it has been submitted.

 The motivation for using concurrent execution in a database is essentially the same as
the motivation for using multiprogramming in an operating system.
 Let T1 and T2 be two transactions that transfer funds from one account to another.
Transaction T1 transfers $50 from account A to account B. It is defined as:
T1: read(A);
    A := A − 50;
    write(A);
    read(B);
    B := B + 50;
    write(B).
 Transaction T2 transfers 10 percent of the balance from account A to account B. It is
defined as:
T2: read(A);
    temp := A * 0.1;
    A := A − temp;
    write(A);
    read(B);
    B := B + temp;
    write(B).

 Suppose the current values of accounts A and B are $1000 and $2000, respectively.
 Suppose also that the two transactions are executed one at a time in the order T1
followed by T2. This execution sequence appears as follows:

Schedule 1—a serial schedule in which T1 is followed by T2.

 The final values of accounts A and B, after the execution in this figure takes place, are
$855 and $2145, respectively.


 Similarly, if the transactions are executed one at a time in the order T2 followed by
T1, then the corresponding execution sequence is as follows:

Schedule 2—a serial schedule in which T2 is followed by T1.

 Again, as expected, the sum A + B is preserved, and the final values of accounts A and
B are $850 and $2150, respectively.
 The execution sequences just described are called schedules.
 Suppose that the two transactions are executed concurrently as follows:

Schedule 3—a concurrent schedule equivalent to schedule 1.


 After this execution takes place, we arrive at the same state as the one in which the
transactions are executed serially in the order T1 followed by T2. The sum A + B is
indeed preserved.
 Not all concurrent executions result in a correct state. To illustrate, consider the
schedule as follows:

Schedule 4—a concurrent schedule.


 After the execution of this schedule, we arrive at a state where the final values of
accounts A and B are $950 and $2100, respectively.
 This final state is an inconsistent state.
Serializability

The database system must control concurrent execution of transactions, to ensure that the
database state remains consistent. Since transactions are programs, it is computationally
difficult to determine exactly what operations a transaction performs and how operations of
various transactions interact. For this reason, we shall not interpret the type of operations that
a transaction can perform on a data item.

Instead, we consider only two operations: read and write. We thus assume that, between a
read(Q) instruction and a write(Q) instruction on a data item Q, a transaction may perform an
arbitrary sequence of operations on the copy of Q that is residing in the local buffer of the
transaction. Thus, the only significant operations of a transaction, from a scheduling point of
view, are its read and write instructions.


T1                      T2
read(A)
write(A)
                        read(A)
                        write(A)
read(B)
write(B)
                        read(B)
                        write(B)

Schedule 3—showing only the read and write instructions.

Conflict Serializability

Let us consider a schedule S in which there are two consecutive instructions Ii and Ij, of
transactions Ti and Tj, respectively (i ≠ j). If Ii and Ij refer to different data items, then we
can swap Ii and Ij without affecting the results of any instruction in the schedule.

However, if Ii and Ij refer to the same data item Q, then the order of the two steps
may matter.

Since we are dealing with only read and write instructions, there are four cases that we need
to consider:

1. Ii = read(Q), Ij = read(Q). The order of Ii and Ij does not matter, since the same value of Q
is read by Ti and Tj , regardless of the order.

2. Ii = read(Q), Ij = write(Q). If Ii comes before Ij, then Ti does not read the value of Q that is
written by Tj in instruction Ij. If Ij comes before Ii, then Ti reads the value of Q that is written
by Tj. Thus, the order of Ii and Ij matters.

3. Ii = write(Q), Ij = read(Q). The order of Ii and Ij matters for reasons similar to those of the
previous case.

4. Ii = write(Q), Ij = write(Q). Since both instructions are write operations, the order of these
instructions does not affect either Ti or Tj . However, the value obtained by the next read(Q)


instruction of S is affected, since the result of only the latter of the two write instructions is
preserved in the database. If there is no other write(Q) instruction after Ii and Ij in S, then the
order of Ii and Ij directly affects the final value of Q in the database state that results from
schedule S.

The write(A) instruction of T1 conflicts with the read(A) instruction of T2. However, the
write(A) instruction of T2 does not conflict with the read(B) instruction of T1, because the
two instructions access different data items.

Since the write(A) instruction of T2 in schedule 3 does not conflict with the read(B)
instruction of T1, we can swap these instructions to generate an equivalent schedule.

We continue to swap nonconflicting instructions:

• Swap the read(B) instruction of T1 with the read(A) instruction of T2.

• Swap the write(B) instruction of T1 with the write(A) instruction of T2.

• Swap the write(B) instruction of T1 with the read(A) instruction of T2.

T1                      T2
read(A)
write(A)
                        read(A)
read(B)
                        write(A)
write(B)
                        read(B)
                        write(B)

Schedule 5—schedule 3 after swapping a pair of instructions.

If a schedule S can be transformed into a schedule S′ by a series of swaps of nonconflicting
instructions, we say that S and S′ are conflict equivalent.

We say that a schedule S is conflict serializable if it is conflict equivalent to a serial
schedule. Thus, schedule 3 is conflict serializable, since it is conflict equivalent to the serial
schedule 1.


T1                      T2
read(A)
write(A)
read(B)
write(B)
                        read(A)
                        write(A)
                        read(B)
                        write(B)

Schedule 6—a serial schedule that is equivalent to schedule 3.

T3                      T4
read(Q)
                        write(Q)
write(Q)

Schedule 7.

View Serializability

Consider two schedules S and S′, where the same set of transactions participates in both
schedules. The schedules S and S′ are said to be view equivalent if three conditions are met:

1. For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then
transaction Ti must, in schedule S′, also read the initial value of Q.

2. For each data item Q, if transaction Ti executes read(Q) in schedule S, and if that value was
produced by a write(Q) operation executed by transaction Tj, then the read(Q) operation of
transaction Ti must, in schedule S′, also read the value of Q that was produced by the same
write(Q) operation of transaction Tj.

3. For each data item Q, the transaction (if any) that performs the final write(Q) operation in
schedule S must perform the final write(Q) operation in schedule S′.

T1                      T2
read(A)
A := A − 50
write(A)
                        read(B)
                        B := B − 10
                        write(B)
read(B)
B := B + 50
write(B)
                        read(A)
                        A := A + 10
                        write(A)

Schedule 8.
The concept of view equivalence leads to the concept of view serializability. We say that a
schedule S is view serializable if it is view equivalent to a serial schedule.

T3                      T4                      T6
read(Q)
                        write(Q)
write(Q)
                                                write(Q)

Schedule 9—a view-serializable schedule.

Recoverability

We now address the effect of transaction failures during concurrent execution. If a transaction
Ti fails, for whatever reason, we need to undo the effect of this transaction to ensure the
atomicity property of the transaction.

In a system that allows concurrent execution, it is necessary also to ensure that any transaction
Tj that is dependent on Ti (that is, Tj has read data written by Ti) is also aborted. To achieve
this surety, we need to place restrictions on the type of schedules permitted in the system.

Recoverable Schedules

Most database systems require that all schedules be recoverable. A recoverable schedule is
one where, for each pair of transactions Ti and Tj such that Tj reads a data item previously
written by Ti, the commit operation of Ti appears before the commit operation of Tj.

Cascadeless Schedules

Even if a schedule is recoverable, to recover correctly from the failure of a transaction Ti, we
may have to roll back several transactions. Such situations occur if transactions have read
data written by Ti. As an illustration, consider the partial schedule

T8                      T9
read(A)
write(A)
                        read(A)
read(B)

Schedule 10

T10                     T11                     T12
read(A)
read(B)
write(A)
                        read(A)
                        write(A)
                                                read(A)

Schedule 11

Transaction T10 writes a value of A that is read by transaction T11. Transaction T11 writes a
value of A that is read by transaction T12. Suppose that, at this point, T10 fails. T10 must be
rolled back. Since T11 is dependent on T10, T11 must be rolled back. Since T12 is dependent
on T11, T12 must be rolled back.

This phenomenon, in which a single transaction failure leads to a series of transaction
rollbacks, is called cascading rollback.

Cascading rollback is undesirable, since it leads to the undoing of a significant amount of
work. It is desirable to restrict the schedules to those where cascading rollbacks cannot occur.
Such schedules are called cascadeless schedules.

Formally, a cascadeless schedule is one where, for each pair of transactions Ti and Tj such
that Tj reads a data item previously written by Ti, the commit operation of Ti appears before
the read operation of Tj. It is easy to verify that every cascadeless schedule is also
recoverable.

Implementation of Isolation

There are various concurrency-control schemes that we can use to ensure that, even when
multiple transactions are executed concurrently, only acceptable schedules are generated,
regardless of how the operating-system time-shares resources (such as CPU time) among the
transactions.

As a trivial example of a concurrency-control scheme, consider this scheme: A transaction


acquires a lock on the entire database before it starts and releases the lock after it has
committed. While a transaction holds a lock, no other transaction is allowed to acquire the
lock, and all must therefore wait for the lock to be released.

As a result of the locking policy, only one transaction can execute at a time. Therefore, only
serial schedules are generated. These are trivially serializable, and it is easy to verify that they
are cascadeless as well.

The goal of concurrency-control schemes is to provide a high degree of concurrency,while


ensuring that all schedules that can be generated are conflict or view serializable, and are
cascadeless.

Transaction Definition in SQL


A data-manipulation language must include a construct for specifying the set of actions that
constitute a transaction.

The SQL standard specifies that a transaction begins implicitly. Transactions are ended by
one of these SQL statements:

• Commit work commits the current transaction and begins a new one.

• Rollback work causes the current transaction to abort.

The keyword work is optional in both the statements. If a program terminates without either
of these commands, the updates are either committed or rolled back— which of the two
happens is not specified by the standard and depends on the implementation.

The standard also specifies that the system must ensure both serializability and freedom from
cascading rollback.

The definition of serializability used by the standard is that a schedule must have the same
effect as would some serial schedule. Thus, conflict and view serializability are both
acceptable.

The SQL-92 standard also allows a transaction to specify that it may be executed in a manner
that causes it to become nonserializable with respect to other transactions.

Testing for Serializability

When designing concurrency control schemes, we must show that schedules generated by the
scheme are serializable. To do that, we must first understand how to determine, given a
particular schedule S, whether the schedule is serializable. We now present a simple and
efficient method for determining conflict serializability of a schedule.

Consider a schedule S. We construct a directed graph, called a precedence graph, from S.

This graph consists of a pair G = (V, E), where V is a set of vertices and E is a set of edges.
The set of vertices consists of all the transactions participating in the schedule.

The set of edges consists of all edges Ti →Tj for which one of three conditions holds:

1. Ti executes write(Q) before Tj executes read(Q).

2. Ti executes read(Q) before Tj executes write(Q).

3. Ti executes write(Q) before Tj executes write(Q).


(a) T1 → T2                    (b) T2 → T1

Precedence graph for (a) schedule 1 and (b) schedule 2.

If an edge Ti → Tj exists in the precedence graph, then, in any serial schedule S′ equivalent to
S, Ti must appear before Tj.

The precedence graph for schedule 4 contains the edge T1 → T2, because T1 executes read(A)
before T2 executes write(A). It also contains the edge T2 → T1, because T2 executes read(B)
before T1 executes write(B). If the precedence graph for S has a cycle, then schedule S is not
conflict serializable. If the graph contains no cycles, then the schedule S is conflict serializable.

A serializability order of the transactions can be obtained through topological sorting,


which determines a linear order consistent with the partial order of the precedence graph.
There are, in general, several possible linear orders that can be obtained through a topological
sorting.

Thus, to test for conflict serializability, we need to construct the precedence graph and to
invoke a cycle-detection algorithm.

Cycle-detection algorithms can be found in standard textbooks on algorithms. Cycle-
detection algorithms, such as those based on depth-first search, require on the order of n²
operations, where n is the number of vertices in the graph (that is, the number of
transactions). Thus, we have a practical scheme for determining conflict serializability.

T1 ⇄ T2

Precedence graph for schedule 4.
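As an illustrative sketch of this test, the following Python builds a precedence graph from a
schedule given as (transaction, operation, data item) triples and runs a depth-first search for
cycles; the triple format is an assumption made for illustration, and the read/write sequence
used for schedule 4 follows the textbook version of that schedule.

def precedence_graph(schedule):
    # schedule: list of (txn, op, item) with op in {"read", "write"}
    edges = set()
    for i, (ti, op_i, q_i) in enumerate(schedule):
        for tj, op_j, q_j in schedule[i + 1:]:
            if ti != tj and q_i == q_j and (op_i == "write" or op_j == "write"):
                edges.add((ti, tj))   # Ti -> Tj: Ti conflicts with a later op of Tj
    return edges

def has_cycle(edges):
    # depth-first search; a back edge to a "grey" vertex means a cycle
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}
    def dfs(u):
        color[u] = GREY
        for v in adj.get(u, []):
            c = color.get(v, WHITE)
            if c == GREY or (c == WHITE and dfs(v)):
                return True
        color[u] = BLACK
        return False
    nodes = {u for e in edges for u in e}
    return any(color.get(u, WHITE) == WHITE and dfs(u) for u in nodes)

s4 = [("T1", "read",  "A"),
      ("T2", "read",  "A"), ("T2", "write", "A"), ("T2", "read", "B"),
      ("T1", "write", "A"), ("T1", "read",  "B"), ("T1", "write", "B"),
      ("T2", "write", "B")]
print(has_cycle(precedence_graph(s4)))   # True: schedule 4 is not conflict serializable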


Testing for view serializability is rather complicated. In fact, it has been shown that the
problem of testing for view serializability is itself NP-complete. Thus, almost certainly there
exists no efficient algorithm to test for view serializability.

Concurrency Control

 Lock-Based Protocols
 Graph-Based Protocols
 Timestamp-Based Protocols
 Multiple Granularity
 Multiversion Protocols
 Deadlock Handling
Lock-Based Protocols:

 One way to ensure serializability is to require that data items be accessed
in a mutually exclusive manner: while one transaction is accessing a data
item, no other transaction can modify it.
 A lock is the most common mechanism to implement this requirement.
Lock:

 Mechanism to control concurrent access to a data item.
 Lock requests are made to the concurrency-control manager.
 A transaction can proceed only after its request is granted. Data items can be
locked in two modes:
exclusive mode (X): Data item can be both read as well as written.


An X-lock is requested using the lock-X(A) instruction.

shared mode (S): Data item can only be read. An S-lock is requested
using the lock-S(A) instruction. Locks can be released: U-lock(A).

Locking protocol:

A set of rules followed by all transactions while requesting and releasing locks.
Locking protocols restrict the set of possible schedules. They ensure serializable schedules by
delaying transactions that might violate serializability.

A lock-compatibility matrix tells whether two locks are compatible or not. Any number of
transactions can hold shared locks on a data item. If any transaction holds an exclusive
lock on a data item, no other transaction may hold any lock on that item.

Locking Rules/Protocol:

 A transaction may be granted a lock on an item if the requested lock is


compatible with locks already held on the item by other transactions.
 If a lock cannot be granted, the requesting transaction is made to wait till all
incompatible locks held by other transactions have been released. The lock is
then granted.
        S       X
S       true    false
X       false   false

Lock-compatibility matrix.
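A minimal sketch of this compatibility test in Python, assuming only the two modes
described above:

COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

def can_grant(requested_mode, held_modes):
    # a lock is granted only if it is compatible with every lock
    # currently held on the item by other transactions
    return all(COMPATIBLE[(held, requested_mode)] for held in held_modes)

print(can_grant("S", ["S", "S"]))   # True: any number of shared locks may coexist
print(can_grant("X", ["S"]))        # False: must wait until the S-lock is released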

Pitfalls of Lock-Based Protocols:

 Too early unlocking can lead to non-serializable schedules. Too late unlocking
can lead to deadlocks.
Example

Transaction T1 transfers $50 from account B to account A. Transaction T2 displays the
total amount of money in accounts A and B, that is, the sum of A + B.

Early unlocking can cause incorrect results and non-serializable schedules: if A and B get
updated in between the read of A and the read of B, the displayed sum would be wrong.

e.g., A = $100, B = $200; display A + B shows $250 instead of $300.

T1                              T2
1. X-lock(B)
2. read B
3. B := B − 50
4. write B
5. U-lock(B)
                                6. S-lock(A)
                                7. read A
                                8. U-lock(A)
                                9. S-lock(B)
                                10. read B
                                11. U-lock(B)
                                12. display A + B
13. X-lock(A)
14. read A
15. A := A + 50
16. write A
17. U-lock(A)

Late unlocking causes deadlocks. Neither T1 nor T2 can make progress: executing lock-
S(B) causes T2 to wait for T1 to release its lock on B, and executing lock-X(A) causes T1 to
wait for T2 to release its lock on A. To handle a deadlock,
one of T1 or T2 must be rolled back and its locks released.

T1                              T2
1. X-lock(B)
2. read(B)
3. B := B − 50
4. write(B)
                                5. S-lock(A)
                                6. read(A)
                                7. S-lock(B)
8. X-lock(A)

Two-Phase Locking Protocol:

A locking protocol that ensures conflict-serializable schedules. It works in two

phases:

Phase 1: Growing Phase: the transaction may obtain locks, but may not release locks.

Phase 2: Shrinking Phase: the transaction may release locks, but may not obtain locks.

 When the first lock is released, the transaction moves from phase 1 to phase 2.
 Properties of the Two-Phase Locking Protocol: it ensures serializability. It can be
shown that the transactions can be serialized in the order of their
lock points (i.e., the point where a transaction acquired its final lock).

 It does not ensure freedom from deadlocks. Cascading roll-back is possible.

Modifications of the two-phase locking protocol
Strict two-phase locking

* A transaction must hold all its exclusive locks till it commits/aborts

* Avoids cascading roll-back

Rigorous two-phase locking

All locks are held till commit/abort. Transactions can be serialized in the order in
which they commit.
 The two-phase locking protocol can be refined with lock conversions:
Phase 1: can acquire a lock-S on an item; can acquire a lock-X on an item; can convert a
lock-S to a lock-X (upgrade)

Phase 2: can release a lock-S; can release a lock-X; can convert a lock-X to a lock-S
(downgrade)

* Ensures serializability; but still relies on the programmer to insert the various locking
instructions.

*Strict and rigorous two-phase locking (with lock conversions) are used extensively in
DBMS.
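To make the phase discipline concrete, here is a minimal Python sketch of a transaction
object that enforces the two-phase rule; the LockManager interface (acquire/release) is an
assumed placeholder, not the API of any particular system.

class TwoPhaseTransaction:
    def __init__(self, lock_manager):
        self.lm = lock_manager      # assumed to provide acquire()/release()
        self.shrinking = False      # False = growing phase, True = shrinking phase
        self.held = set()

    def lock(self, item, mode):
        if self.shrinking:
            raise RuntimeError("2PL violation: no lock may be acquired in the shrinking phase")
        self.lm.acquire(self, item, mode)   # may block until the lock is compatible
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True       # the first release moves the transaction to phase 2
        self.lm.release(self, item)
        self.held.discard(item)

    def commit(self):
        # rigorous 2PL is obtained by never calling unlock() before this point
        for item in list(self.held):
            self.unlock(item)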

Automatic Acquisition of Locks:

 A transaction Ti issues the standard read/write instructions without explicit locking
calls (by the programmer).


 The operation read(D) is processed as:
if Ti has a lock on D then
    read(D)
else
    if necessary, wait until no other transaction has a lock-X on D;
    grant Ti a lock-S on D;
    read(D);
end

 The operation write(D) is processed as:
if Ti has a lock-X on D then
    write(D)
else
    if necessary, wait until no other transaction has any lock on D;
    if Ti has a lock-S on D then
        upgrade lock on D to lock-X
    else
        grant Ti a lock-X on D;
    end
    write(D);
end

All locks are released after commit or abort.

Implementation of Locking:

 A lock manager can be implemented as a separate process to which transactions


send lock and unlock requests.
 The lock manager replies to a lock request by sending a lock grant message (or a
message asking the transaction to roll back, in case of a deadlock).
 The requesting transaction waits until its request is answered.
 The lock manager maintains a data structure called a lock table to record granted
locks and pending requests.
Lock table:


 Implemented as in-memory hash table indexed on the data item being locked. Black
rectangles indicate granted locks.
 White rectangles indicate waiting requests. Records also the type of lock
granted/requested.
 Processing of requests:
 New request is added to the end of the queue of requests for the data item, and
granted if it is compatible with all earlier locks.
 Unlock requests result in the request being deleted, and later requests are
checked to see if they can now be granted.
 If a transaction aborts, all waiting or granted requests of the transaction are
deleted. An index on transactions is used to implement this efficiently.
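A toy version of such a lock table in Python might look like this; the queue-of-requests
layout follows the description above, while the class and method names are illustrative.

from collections import defaultdict, deque

def compatible(mode, earlier_modes):
    # only S is compatible with S; X conflicts with every lock
    return all(mode == "S" and m == "S" for m in earlier_modes)

class LockTable:
    def __init__(self):
        # in-memory hash table indexed on the data item being locked
        self.table = defaultdict(deque)

    def request(self, txn, item, mode):
        q = self.table[item]
        # a new request is added to the end of the queue and granted
        # only if it is compatible with all earlier locks on the item
        granted = compatible(mode, [m for (_, m, _) in q])
        q.append([txn, mode, granted])
        return granted          # False: the transaction must wait

    def unlock(self, txn, item):
        # the request is deleted, and later requests are checked
        # to see if they can now be granted
        q = deque(r for r in self.table[item] if r[0] != txn)
        earlier = []
        for r in q:
            if not r[2]:
                r[2] = compatible(r[1], earlier)
            earlier.append(r[1])
        self.table[item] = q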
Graph-Based Protocols:

 Impose a partial order (on the set D = {d1, d2 ,..., dh} of all data items.
 If di dj then any transaction accessing both di and dj must access di before
accessing dj.
 Implies that the set D may now be viewed as a directed acyclic graph, called
adatabase graph. Are an alternative to two-phase locking.Ensure conflict
serializability.

Tree protocol: A simple kind of graph-based protocol which works as follows:

 Only exclusive locks (lock-X) are allowed.
 The first lock by Ti may be on any data item.
 Subsequently, a data item Q can be locked by Ti only if the parent of Q is
currently locked by Ti.
 Data items may be unlocked at any time. A data item that has been locked and
unlocked by Ti cannot subsequently be relocked by Ti.
Example: The following 4 transactions follow the tree protocol on the database graph below.

 T10: lock-X(B); lock-X(E); lock-X(D); unlock(B); unlock(E); lock-


X(G); unlock(D); unlock(G);
 T11: lock-X(D); lock-X(H); unlock(D); unlock(H);
 T12: lock-X(B); lock-X(E); unlock(E); unlock(B);
 T13: lock-X(D); lock-X(H); unlock(D); unlock(H);

 The tree protocol ensures conflict serializability and ensures freedom from deadlock.
However, the abort of a transaction might lead to cascading rollbacks. Unlocking may
occur earlier in the tree-locking protocol than in the two-phase locking protocol, which
means shorter waiting times and an increase in concurrency.
 However, in the tree protocol a transaction may have to lock data items that it does
not access. This means increased locking overhead, additional waiting time, and a
potential decrease in concurrency.
 Schedules not possible under two-phase locking are possible under the tree protocol,
and vice versa.
TIMESTAMP BASED PROTOCOL:

 The locking protocols described so far determine the order between every
pair of conflicting transactions at execution time, by the first lock that both members
of the pair request that involves incompatible modes.
 Another method for determining the serializability order is to select an ordering
among transactions in advance. The most common method for doing so is to use a
timestamp-ordering scheme.
TIMESTAMPS:

 With each transaction Ti in the system, we associate a unique fixed timestamp,
denoted by TS(Ti). This timestamp is assigned by the database system before the
transaction Ti starts execution.
 If a transaction Ti has been assigned timestamp TS(Ti) and a new transaction
Tj enters the system, then TS(Ti) < TS(Tj). There are two simple methods for
implementing this scheme:
1. Use the value of the system clock as the timestamp, i.e., a transaction's
timestamp is equal to the value of the clock when the transaction enters
the system.
2. Use a logical counter that is incremented after a new timestamp
has been assigned; a transaction's timestamp is equal to the value of
the counter when the transaction enters the system.

To implement this scheme we associate with each data item Q two timestamp values.

 W-timestamp(Q) denotes the largest timestamp of any transaction that executed


write(Q) successfully.
 R-timestamp(Q) denotes the largest timestamp of any transaction that executed
read(Q) successfully.

THE TIMESTAMP-ORDERING PROTOCOL:

The timestamp-ordering protocol ensures that any conflicting read and write operations
are executed in timestamp order. This protocol operates as follows:

1. Suppose that transaction Ti issues read(Q).

a. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that was already
overwritten. Hence, the read operation is rejected and Ti is rolled back.

b. If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed, and R-timestamp(Q)
is set to the maximum of R-timestamp(Q) and TS(Ti).

2. Suppose that transaction Ti issues write(Q).

a. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was needed
previously, and the system assumed that the value would never be produced. Hence,
the system rejects the write operation and rolls Ti back.

b. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q.
Hence, the system rejects this write operation and rolls Ti back.

c. Otherwise, the system executes the write operation and sets W-timestamp(Q) to
TS(Ti).
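These rules transcribe almost directly into code; the following Python sketch is purely
illustrative, with the Item record and the Rollback exception being assumed names.

class Rollback(Exception):
    pass

class Item:
    def __init__(self, value=None):
        self.value = value
        self.r_ts = 0   # R-timestamp(Q)
        self.w_ts = 0   # W-timestamp(Q)

def ts_read(ts, q):
    if ts < q.w_ts:                  # rule 1(a): value already overwritten
        raise Rollback("read rejected")
    q.r_ts = max(q.r_ts, ts)         # rule 1(b)
    return q.value

def ts_write(ts, q, value):
    if ts < q.r_ts:                  # rule 2(a): a later transaction already read Q
        raise Rollback("write rejected")
    if ts < q.w_ts:                  # rule 2(b): obsolete write
        raise Rollback("write rejected")
    q.value, q.w_ts = value, ts      # rule 2(c)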

The protocol can generate schedules that are not recoverable. However, it can be
extended to make the schedules recoverable in one of several ways:

 Recoverability and cascadelessness can be ensured by performing all writes
together at the end of the transaction.
 Recoverability and cascadelessness can also be guaranteed by using a limited
form of locking, whereby reads of uncommitted items are postponed until the
transaction that updated the item commits.
 Recoverability alone can be ensured by tracking uncommitted writes and
allowing a transaction Ti to commit only after the commit of any transaction that
wrote a value that Ti read.
Example: The following schedule is possible under the timestamp-ordering protocol.
Since TS(T14) < TS(T15), the schedule must be conflict equivalent to the serial schedule
<T14, T15>.

T14                     T15
read(B)
                        read(B)
                        B := B − 50
                        write(B)
read(A)
                        read(A)
display(A + B)
                        A := A + 50
                        write(A)
                        display(A + B)

Thomas’ Write rule:

 Let us consider schedule 4 and apply the timestamp-ordering protocol. Since
T16 starts before T17, we shall assume that TS(T16) < TS(T17).
 The read(Q) operation of T16 succeeds, as does the write(Q) operation of T17.
When T16 attempts its write(Q) operation, we find that TS(T16) < W-timestamp(Q),
since W-timestamp(Q) = TS(T17).
 Thus the write(Q) by T16 is rejected and transaction T16 must be rolled back.
 Although the rollback of T16 is required by the timestamp-ordering protocol, it
is unnecessary. Since T17 has already written Q, the value that T16 is
attempting to write is one that will never need to be read.
 Any transaction Ti with TS(Ti) < TS(T17) that attempts a read(Q) will be rolled
back, since TS(Ti) < W-timestamp(Q).

The modification to the timestamp-ordering protocol, called Thomas' write rule, is
this. Suppose that transaction Ti issues write(Q).

1. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was previously
needed, and it had been assumed that the value would never be produced. Hence, the
system rejects the write operation and rolls Ti back.

2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of
Q. Hence, the write operation can be ignored.

3. Otherwise, the system executes the write operation and sets W-timestamp(Q) to
TS(Ti). Under Thomas' write rule, the write(Q) operation of T16 would be ignored. The
result is a schedule that is view equivalent to the serial schedule <T16, T17>.
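Continuing the earlier timestamp-ordering sketch, only the write test changes under
Thomas' write rule; again a hedged illustration, not a full implementation.

def ts_write_thomas(ts, q, value):
    if ts < q.r_ts:                  # rule 1: a later transaction already read Q
        raise Rollback("write rejected")
    if ts < q.w_ts:                  # rule 2: obsolete write is simply ignored
        return
    q.value, q.w_ts = value, ts      # rule 3: perform the write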

VALIDATION-BASED PROTOCOLS:

 When a majority of transactions are read-only transactions, the rate of
conflicts among transactions may be low.
 Thus, many of these transactions, if executed without the supervision of a
concurrency-control scheme, would nevertheless leave the system in a consistent
state.
 A concurrency-control scheme imposes the overhead of code execution and possible
delay of transactions. We assume that each transaction Ti executes in two or three
different phases, in order.
1. Read Phase: During this phase, the system executes transaction Ti. It reads the values
of the various data items and stores them in variables local to Ti. It performs all write
operations on temporary local variables, without updates of the actual database.

2. Validation Phase: Transaction Ti performs a validation test to determine whether it
can copy to the database the temporary local variables that hold the results of write
operations without causing a violation of serializability.

3. Write Phase: If transaction Ti succeeds in validation (step 2), then the system applies
the actual updates to the database. Otherwise, the system rolls back Ti.

To perform the validation test, we need to know when the various phases of transaction
Ti took place. We therefore associate three different timestamps with transaction Ti:


1. Start(Ti), the time when Ti started its execution.

2. Validation(Ti), the time when Ti finished its read phase and started
its validation phase.

3. Finish(Ti), the time when Ti finished its write phase.

Thus, the value TS(Ti) = Validation(Ti), and if TS(Tj) < TS(Tk), then any produced
schedule must be equivalent to a serial schedule in which transaction Tj appears before
transaction Tk. The validation test for transaction Tj requires that, for all transactions Ti
with TS(Ti) < TS(Tj), one of the following two conditions must hold:

1. Finish(Ti) < Start(Tj). Since Ti completes its execution before Tj started, the
serializability order is indeed maintained.

2. The set of data items written by Ti does not intersect with the set of data items read
by Tj, and Ti completes its write phase before Tj starts its validation
phase (Finish(Ti) < Validation(Tj)). This condition ensures that the writes of Ti and Tj do
not overlap.

The validation scheme is called the optimistic concurrency-control scheme, since
transactions execute optimistically, assuming they will be able to finish execution and
validate at the end. In contrast, locking and timestamp ordering are pessimistic, in that
they force a wait or a rollback whenever a conflict is detected, even though there is a
chance that the schedule may be conflict serializable.
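A sketch of the validation test in Python; the Txn record with its start/validation/finish
timestamps and read/write sets is an illustrative stand-in for whatever bookkeeping a real
system keeps.

class Txn:
    def __init__(self, start, validation, finish, read_set, write_set):
        self.start, self.validation, self.finish = start, validation, finish
        self.read_set, self.write_set = set(read_set), set(write_set)

def validate(tj, finished):
    # check Tj against every Ti with TS(Ti) < TS(Tj), where TS(T) = Validation(T)
    for ti in finished:
        if ti.validation >= tj.validation:
            continue
        if ti.finish < tj.start:
            continue   # condition 1: Ti finished before Tj started
        if not (ti.write_set & tj.read_set) and ti.finish < tj.validation:
            continue   # condition 2: no write/read intersection, writes do not overlap
        return False   # validation fails: Tj must be rolled back
    return True        # Tj may enter its write phase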

Multiple Granularity

 Instead of locks on individual data items, sometimes it is advantageous to group
several data items and to treat them as one individual synchronization unit (e.g., if a
transaction accesses the entire DB).
 Define a hierarchy of data granularities of different size, where the small
granularities are nested within larger ones.
 Can be represented graphically as a tree When a transaction locks a node in the tree
explicitly, it implicitly locks all the node's descendents in the same mode.
Example: Graphical representation of a hierarchy of granularities. The highest level is the
entire database. The levels below are of type area, file and record, in that order.

Granularity of locking (= level in the tree where locking is done): fine granularity (lower in
the tree) gives high concurrency but high locking overhead; coarse granularity (higher in the
tree) gives low locking overhead but low concurrency.

Multiversion Protocols:


Concurrency control protocols studied thus far ensure serializability by either delaying an
operation or aborting the transaction.

Multiversion schemes keep old versions of data items to increase concurrency. Each
successful write(Q) creates a new version of Q. Timestamps are used to label versions. When
a read(Q) operation is issued, an appropriate version of Q is selected based on the timestamp
of the transaction; reads never have to wait, as an appropriate version is always available.
There are two types of multiversion protocols:

 Multiversion timestamp ordering


 Multiversion two-phase locking
Multiversion Timestamp Ordering:

Each data item Q has a sequence of versions <Q1, Q2, ..., Qm>. Each version Qk contains 3
data fields:

 Content – the value of version Qk.
 W-timestamp(Qk) – timestamp of the transaction that created (wrote) version Qk.
 R-timestamp(Qk) – largest timestamp of any transaction that successfully read version
Qk.
 When a transaction Ti creates a new version Qk of Q, the W-timestamp and R-
timestamp of Qk are initialized to TS(Ti).
 The R-timestamp of Qk is updated whenever a transaction Tj reads Qk and TS(Tj) > R-
timestamp(Qk).
The following multiversion timestamp-ordering protocol ensures serializability.

1. If transaction Ti issues a read(Q), then the value returned is the content of version Qk,
which is the version of Q with the largest write timestamp less than or equal to TS(Ti).

2. If transaction Ti issues a write(Q):

– If TS(Ti) < R-timestamp(Qk), then transaction Ti is rolled back.

– Otherwise, if TS(Ti) = W-timestamp(Qk), the contents of Qk are overwritten.

– Otherwise, a new version of Q is created.
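The two rules map onto the following Python sketch, reusing the Rollback exception from
the timestamp-ordering sketch; the Version record is an illustrative name.

class Version:
    def __init__(self, content, w_ts, r_ts):
        self.content, self.w_ts, self.r_ts = content, w_ts, r_ts

def latest(versions, ts):
    # the version of Q with the largest write timestamp <= TS(Ti)
    return max((v for v in versions if v.w_ts <= ts), key=lambda v: v.w_ts)

def mv_read(ts, versions):
    qk = latest(versions, ts)
    qk.r_ts = max(qk.r_ts, ts)   # reads always succeed and never wait
    return qk.content

def mv_write(ts, versions, value):
    qk = latest(versions, ts)
    if ts < qk.r_ts:
        raise Rollback("a later transaction already read this version")
    if ts == qk.w_ts:
        qk.content = value       # overwrite the version Ti itself created
    else:
        versions.append(Version(value, w_ts=ts, r_ts=ts))   # new version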

Properties of the multiversion timestamp-ordering protocol: reads always succeed and
never have to wait. A transaction reads the most recent version that comes before it in time.
In a typical DBMS, reading is a more frequent operation than writing, hence this advantage
might be significant.

For writes, a transaction is aborted if it is "too late" in doing a write: a write by Ti is
rejected if another transaction Tj that should read Ti's write has already read a version
created by a transaction older than Ti.


Disadvantages

 Reading of a data item also requires the updating of the R-timestamp, resulting in two
disk accesses rather than one.
 The conflicts between transactions are resolved through rollbacks rather than through
waits.

DEADLOCK HANDLING:

 A system is in a deadlock state if there exists a set of transactions such that
every transaction in the set is waiting for another transaction in the set.
 More precisely, there exists a set of waiting transactions {T0, T1, ..., Tn} such that
T0 is waiting for a data item that T1 holds, T1 is waiting for a data item
that T2 holds, ..., Tn−1 is waiting for a data item that Tn holds, and Tn is
waiting for a data item that T0 holds.
 There are two principal methods for dealing with the deadlock problem:
1. Deadlock prevention

2. Deadlock detection and deadlock recovery.

Consider the following two transactions:

T1: write(X)                    T2: write(Y)
    write(Y)                        write(X)

 Schedule with deadlock:

T1                              T2
lock-X on X
write(X)
                                lock-X on Y
                                write(Y)
                                wait for lock-X on X
wait for lock-X on Y

DEADLOCK PREVENTION:

 There are two approaches to deadlock prevention. One approach ensures that no cyclic
waits can occur by ordering the requests for locks, or requiring all locks to be acquired
together.

 The other approach is closer to deadlock recovery, and performs transaction rollback
instead of waiting for a lock whenever the wait could potentially result in a deadlock.

 The first approach requires that each transaction locks all its data items before it begins
execution. There are two main disadvantages to this protocol:

1. It is often hard to predict, before the transaction begins, what data items need to be
locked.

2. Data-item utilization may be very low, since many of the data items may be locked
but unused for a long time.

 The second approach for preventing deadlocks is to use preemption and
transaction rollbacks.

 In preemption, when a transaction T2 requests a lock that transaction T1 holds,
the lock granted to T1 may be preempted by rolling back T1 and granting
the lock to T2.

Two different deadlock-prevention schemes using timestamps have been proposed; a
small sketch of both follows below.

1. The wait-die scheme is a nonpreemptive technique. When transaction Ti
requests a data item currently held by Tj, Ti is allowed to wait only if it has a timestamp
smaller than that of Tj; otherwise, Ti is rolled back.

2. The wound-wait scheme is a preemptive technique. It is a counterpart to the
wait-die scheme. When transaction Ti requests a data item currently held by Tj, Ti is
allowed to wait only if it has a timestamp larger than that of Tj; otherwise, Tj is rolled
back.

Whenever the system rolls back a transaction, it is important to ensure that there
is no starvation, i.e., no transaction gets rolled back repeatedly and is never allowed to
make progress.
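Both schemes reduce to a timestamp comparison; the following Python sketch is
illustrative only (smaller timestamp = older transaction).

def wait_die(ts_requester, ts_holder):
    # nonpreemptive: an older requester may wait; a younger one dies
    return "wait" if ts_requester < ts_holder else "roll back requester"

def wound_wait(ts_requester, ts_holder):
    # preemptive: an older requester wounds (rolls back) the lock holder;
    # a younger requester waits
    return "roll back holder" if ts_requester < ts_holder else "wait"

In both schemes, a rolled-back transaction is restarted with its original timestamp, so it
eventually becomes the oldest transaction in the system and cannot starve.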

TIMEOUT-BASED SCHEMES:

 Another simple approach to deadlock handling is based on lock timeouts. In
this approach, a transaction that has requested a lock waits for at most a specified
amount of time.
 The timeout scheme is particularly easy to implement, and works well if
transactions are short and if long waits are likely to be due to deadlocks.
 Too long a wait results in unnecessary delays once a deadlock has occurred.
 Too short a wait results in transaction rollback even when there is no
deadlock, leading to wasted resources. Starvation is also a possibility with this
scheme. The timeout-based scheme has limited applicability.
DEADLOCK DETECTION AND RECOVERY: When a deadlock is detected, the
system must recover from the deadlock. The most common solution is to roll back one
or more transactions to break the deadlock. Three actions are required:

1. Selection of a victim: Select the transaction(s) to roll back that will incur minimum
cost.

2. Rollback: Determine how far to roll back the transaction. Total rollback: abort the
transaction and then restart it. It is more effective to roll back the transaction only as far as
necessary to break the deadlock.

3. Check starvation: starvation happens if the same transaction is always chosen as victim.
Include the number of rollbacks in the cost factor to avoid starvation.

DEADLOCK DETECTION:

Deadlock can be described precisely in terms of a directed graph called a
wait-for graph. This graph consists of a pair G = (V, E), where V is a set of vertices and E
is a set of edges. The set of vertices consists of all the transactions in the system. Each
element in the set E of edges is an ordered pair Ti → Tj. If Ti → Tj is in E, then there is a
directed edge from transaction Ti to Tj, implying that transaction Ti is waiting for
transaction Tj to release a data item that it needs. To illustrate these concepts, consider the
following wait-for graph:

 Transaction T25 is waiting for transactions T26 and T27.
 Transaction T27 is waiting for transaction T26.
 Transaction T26 is waiting for transaction T28.
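This wait-for graph can be written down as an edge list and checked with the same
has_cycle routine sketched earlier for precedence graphs; the added T28 → T27 edge is a
hypothetical extra request, used only to show a deadlock arising.

wait_for = {("T25", "T26"), ("T25", "T27"),
            ("T27", "T26"), ("T26", "T28")}
print(has_cycle(wait_for))      # False: no cycle, hence no deadlock yet

wait_for.add(("T28", "T27"))    # suppose T28 now requests an item held by T27
print(has_cycle(wait_for))      # True: cycle T26 -> T28 -> T27 -> T26, a deadlock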
RECOVERY FROM DEADLOCK:

When a detection algorithm determines that a deadlock exists, the system must
recover from the deadlock. The most common solution is to roll back one or more
transactions to break the deadlock.

1. Selection of a victim: Given a set of deadlocked transactions, we must determine
which transaction to roll back to break the deadlock. We should roll back those
transactions that will incur the minimum cost.

2. Rollback: Once we have decided that a particular transaction must be rolled back,
we must determine how far this transaction should be rolled back. The simplest solution
is a total rollback. However, it is more effective to roll back the transaction only as far as
necessary to break the deadlock. Such partial rollback requires the system to maintain
additional information about the state of all the running transactions.

3. Starvation: In a system where the selection of victims is based primarily on cost
factors, it may happen that the same transaction is always picked as a victim. As a result,
this transaction never completes its designated task, and thus there is starvation. The most
common solution is to include the number of rollbacks in the cost factor.

INSERT AND DELETE OPERATIONS:

 The read/write model limits transactions to data items already in the database. Some
transactions require not only access to existing data items, but also the ability to create
new data items.

 Others require the ability to delete data items. To examine how such transactions affect
concurrency control, we introduce these additional operations:

 delete(Q) deletes data item Q from the database.

 insert(Q) inserts a new data item Q into the database and assigns Q an initial value.

 An attempt by a transaction Ti to perform a read(Q) operation after Q has been deleted
results in a logical error in Ti.

 Likewise, an attempt by a transaction Ti to perform a read(Q) operation before Q has
been inserted results in a logical error in Ti. It is also a logical error to attempt to delete a
nonexistent data item.

DELETION:

The presence of delete instructions affects concurrency control, and we must decide
when a delete instruction conflicts with another instruction. Let Ii and Ij be instructions
of Ti and Tj, respectively, that appear in schedule S in consecutive order. Let Ii = delete(Q).

 Ij = read(Q): Ii and Ij conflict. If Ii comes before Ij, Tj will have a logical
error. If Ij comes before Ii, Tj can execute the read operation
successfully.
 Ij = write(Q): Ii and Ij conflict. If Ii comes before Ij, Tj will have a
logical error. If Ij comes before Ii, Tj can execute the write operation
successfully.
 Ij = delete(Q): Ii and Ij conflict. If Ii comes before Ij, Tj will have a
logical error. If Ij comes before Ii, Ti will have a logical error.
 Ij = insert(Q): Ii and Ij conflict. Suppose that data item Q did not exist
prior to the execution of Ii and Ij. Then, if Ii comes before Ij, a logical
error results for Ti. If Ij comes before Ii, then no logical error results.

INSERTION:

 An insert(Q) operation conflicts with a delete(Q) operation. Similarly,
insert(Q) conflicts with a read(Q) operation or a write(Q) operation; no read or write can
be performed on a data item before it exists. Since an insert(Q) assigns a value to data
item Q, an insert is treated similarly to a write for concurrency-control purposes:

 Under the two-phase locking protocol, if Ti performs an insert(Q) operation,
Ti is given an exclusive lock on the newly created data item Q.

 Under the timestamp-ordering protocol, if Ti performs an insert(Q) operation, the
values R-timestamp(Q) and W-timestamp(Q) are set to TS(Ti).

THE PHANTOM PHENOMENON:

Consider a transaction T29 that executes the following SQL query on the bank
database:

select sum(balance)
from account
where branch_name = 'Perryridge'

Transaction T29 requires access to all tuples of the account relation pertaining to
the Perryridge branch.

The major disadvantage of locking a data item corresponding to the entire relation is the
low degree of concurrency: two transactions that insert different tuples into a relation
are prevented from executing concurrently. A better solution is the index-locking
technique. The index-locking protocol takes advantage of the availability of indices
on a relation, by turning instances of the phantom phenomenon into conflicts on locks
on index leaf nodes. The protocol operates as follows:

 Every relation must have at least one index.
 A transaction Ti can access tuples of a relation only after first finding them
through one or more of the indices on the relation.
 A transaction Ti that performs a lookup must acquire a shared lock on all the
index leaf nodes that it accesses.
 A transaction Ti may not insert, delete, or update a tuple ti in a relation r without
updating all indices on r.
 The rules of the two-phase locking protocol must be observed.
RECOVERY SYSTEM

Recovery system:

Recovering the database from a failure or crash is called crash recovery.

Failure Classification

There are various types of failure that may occur in a system:
 Transaction failure:
1. Logical errors:
The transaction cannot complete due to some internal error condition.

2. System errors:
The database system must terminate an active transaction due to an error
condition (e.g., deadlock).
 System crash:
A power failure or other hardware or software failure causes the system to crash.
Fail-stop assumption:
Non-volatile storage contents are assumed to not be corrupted by a system crash.
Database systems have numerous integrity checks to prevent corruption of disk data.
 Disk failure:
A head crash or similar disk failure destroys all or part of the disk. Destruction is assumed
to be detectable: disk drives use checksums to detect failures.
Recovery Algorithms


Recovery algorithms are techniques to ensure database consistency and transaction


atomicity and durability despite failures.
Recovery algorithms have two parts
1. Actions taken during normal transaction processing to ensure enough information exists to
recover from failures
2. Actions taken after a failure to recover the database contents to a state that ensures
atomicity, consistency and durability.
Storage Structure :
The various data items in the database may be stored and accessed in a number of
different storage media.
Storage types
The types are :
 Volatile : It does not survive system crashes.eg: main memory, cache memory

 Non volatile storage: It survives system crashes eg: disk, tape, flash memory, non-
volatile (battery backed up) RAM.

 Stable storage: A mythical form of storage that survives all failures approximated by
maintaining multiple copies on distinct nonvolatile media.
Stable Storage Implementation

 Maintain multiple copies of each block on separate disks; copies can be at remote
sites to protect against disasters such as fire or flooding.

 Failure during data transfer can still result in inconsistent copies.
Block transfer can result in:
 Successful completion:
information arrived safely at its destination.
 Partial failure:
destination block has incorrect information.
 Total failure:
destination block was never updated.

 Protecting storage media from failure during data transfer (one solution):

Execute the output operation as follows (assuming two copies of each block):

 Write the information onto the first physical block.

 When the first write successfully completes, write the same information onto the
second physical block.
 The output is completed only after the second write successfully completes.
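A file-based Python sketch of this two-copy output discipline; the file paths and the
function name are illustrative only.

import os

def output_to_stable(data: bytes, copy1: str, copy2: str):
    # write the first physical copy and force it to disk
    with open(copy1, "wb") as f:
        f.write(data)
        f.flush(); os.fsync(f.fileno())
    # only after the first write completes is the same
    # information written onto the second physical copy
    with open(copy2, "wb") as f:
        f.write(data)
        f.flush(); os.fsync(f.fileno())
    # the output is complete only once the second write has finished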
Because of failures during output operations, the two copies of a block may differ.

To recover from such a failure:

First find inconsistent blocks:

1. Expensive solution: Compare the two copies of every disk


block.

2. Better solution: Record in-progress disk writes on non- volatile


storage (Nonvolatile RAM or special area of
disk).
 Use this information during recovery to find blocks that may be inconsistent,
and only compare copies of these.
 Used in hardware RAID systems If either copy of an inconsistent block is
detected to have an error (bad checksum), overwrite it by the other copy. If
both have no error, but are different, overwrite the second block by the first
block.
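The following Python sketch (an illustration added here, not part of the original notes) shows both the two-copy output protocol and the recovery comparison. The names disk1, disk2, in_progress and checksum_ok are hypothetical stand-ins for real disk I/O and checksum hardware.

    # Two disks modelled as dicts of block_id -> data; in_progress stands in
    # for the non-volatile record of disk writes currently in flight.
    disk1, disk2 = {}, {}
    in_progress = set()

    def output_block(block_id, data):
        # Two-copy write: the second copy is written only after the first
        # write successfully completes.
        in_progress.add(block_id)       # recorded on non-volatile storage
        disk1[block_id] = data          # write the first physical block
        disk2[block_id] = data          # then write the second physical block
        in_progress.discard(block_id)   # the output is complete only now

    def checksum_ok(data):
        # Placeholder: a real disk detects bad blocks via checksums.
        return data is not None

    def recover():
        # Compare only the blocks that may be inconsistent (better solution).
        for block_id in list(in_progress):
            c1, c2 = disk1.get(block_id), disk2.get(block_id)
            if not checksum_ok(c1):
                disk1[block_id] = c2    # first copy bad: take the second
            elif not checksum_ok(c2) or c1 != c2:
                disk2[block_id] = c1    # otherwise the first copy wins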
Data Access

 Physical blocks are those blocks residing on the disk.
 Buffer blocks are the blocks residing temporarily in main memory.

Block movements between disk and main memory are initiated through the following two operations:

 input(B) transfers the physical block B to main memory.
 output(B) transfers the buffer block B to the disk, and replaces the appropriate physical block there.

 Each transaction Ti has its private work area in which local copies of all data items accessed and updated by it are kept.

 Ti's local copy of a data item X is called xi.

We assume, for simplicity, that each data item fits in, and is stored inside, a single block.

A transaction transfers data items between system buffer blocks and its private work area using the following operations:

 read(X) assigns the value of data item X to the local variable xi.
 write(X) assigns the value of local variable xi to data item X in the buffer block.
 Both these commands may necessitate the issue of an input(BX) instruction before the assignment, if the block BX in which X resides is not already in memory.

Transactions:

 Perform read(X) while accessing X for the first time; all subsequent accesses are to the local copy.
 After the last access, the transaction executes write(X).
 output(BX) need not immediately follow write(X); the system can perform the output operation when it deems fit.
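The following Python sketch (added for illustration; the dict-based disk, buffer and the block_of mapping are simplifying assumptions) models this data-access scheme.

    # Physical blocks on "disk" and buffer blocks in "main memory".
    disk = {"BA": {"A": 1000}, "BB": {"B": 2000}}
    buffer = {}

    def block_of(item):
        return "B" + item          # assumption: data item X lives in block BX

    def input_block(b):            # input(B): physical block -> main memory
        buffer[b] = dict(disk[b])

    def output_block(b):           # output(B): buffer block -> disk
        disk[b] = dict(buffer[b])

    class Transaction:
        def __init__(self):
            self.local = {}        # private work area holding the copies xi

        def read(self, X):         # read(X): buffer block -> local variable xi
            b = block_of(X)
            if b not in buffer:
                input_block(b)     # issue input(BX) if BX is not in memory
            self.local[X] = buffer[b][X]

        def write(self, X):        # write(X): local variable xi -> buffer block
            b = block_of(X)
            if b not in buffer:
                input_block(b)
            buffer[b][X] = self.local[X]

    # Usage: read A once, work on the local copy, then write it back.
    T = Transaction()
    T.read("A")
    T.local["A"] -= 50
    T.write("A")                   # output("BA") may happen much later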

Recovery and Atomicity

 Modifying the database without ensuring that the transaction will commit may leave the database in an inconsistent state.
 Consider transaction Ti that transfers $50 from account A to account B; the goal is either to perform all database modifications made by Ti or none at all.
 Several output operations may be required for Ti (to output A and B). A failure may occur after one of these modifications has been made but before all of them are made.

To ensure atomicity despite failures, we first output information describing the modifications to stable storage, without modifying the database itself.

We study two approaches:

1. Log-based recovery
2. Shadow paging

We assume (initially) that transactions run serially, that is, one after the other.

Log-Based Recovery

 A log is kept on stable storage.
 The log is a sequence of log records, and maintains a record of update activities on the database.
 When transaction Ti starts, it registers itself by writing a <Ti start> log record.
 Before Ti executes write(X), a log record <Ti, X, V1, V2> is written, where V1 is the value of X before the write, and V2 is the value to be written to X.
 The log record notes that Ti has performed a write on data item X; X had value V1 before the write, and will have value V2 after the write.
 When Ti finishes its last statement, the log record <Ti commit> is written to the log.

We assume for now that log records are written directly to stable storage (that is, they are not buffered).
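As a concrete illustration (a sketch added here, with the log modelled as a Python list standing in for stable storage), a transaction registers its actions like this:

    log = []                                     # stands in for the stable-storage log

    def start(ti):
        log.append(("start", ti))                # <Ti start>

    def log_update(ti, x, old, new):
        log.append(("update", ti, x, old, new))  # <Ti, X, V1, V2>

    def commit(ti):
        log.append(("commit", ti))               # <Ti commit>

    # T0 transfers $50 from A (= 1000) to B (= 2000); each update record is
    # written before the corresponding database write is performed.
    start("T0")
    log_update("T0", "A", 1000, 950)
    log_update("T0", "B", 2000, 2050)
    commit("T0")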

Two approaches using logs:

 Deferred database modification
 Immediate database modification

Deferred Database Modification



The deferred database modification scheme records all modifications to the log, but defers all the writes until after partial commit.

Assume that transactions execute serially.

 A transaction starts by writing a <Ti start> record to the log.
 A write(X) operation results in a log record <Ti, X, V> being written, where V is the new value for X.
 Note: the old value is not needed for this scheme.
 The write is not performed on X at this time, but is deferred. When Ti partially commits, <Ti commit> is written to the log.
 Finally, the log records are read and used to actually execute the previously deferred writes.
 During recovery after a crash, a transaction needs to be redone if and only if both <Ti start> and <Ti commit> are present in the log.
 Redoing a transaction Ti (redo(Ti)) sets the value of all data items updated by the transaction to the new values.
 Crashes can occur while the transaction is executing the original updates, or while recovery action is being taken.

Example: transactions T0 and T1 (T0 executes before T1):

T0: read(A);              T1: read(C);
    A := A - 50;              C := C - 100;
    write(A);                 write(C).
    read(B);
    B := B + 50;
    write(B).

Below we show the log as it appears at three instances of time (assuming A = 1000, B = 2000 and C = 700 initially):

(a) <T0 start>, <T0, A, 950>, <T0, B, 2050>
(b) <T0 start>, <T0, A, 950>, <T0, B, 2050>, <T0 commit>, <T1 start>, <T1, C, 600>
(c) <T0 start>, <T0, A, 950>, <T0, B, 2050>, <T0 commit>, <T1 start>, <T1, C, 600>, <T1 commit>

If the log on stable storage at the time of the crash is as in case:

(a) No redo actions need to be taken.
(b) redo(T0) must be performed, since <T0 commit> is present.
(c) redo(T0) must be performed, followed by redo(T1), since <T0 commit> and <T1 commit> are present.
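A minimal Python sketch of this redo-only recovery (assuming the list-of-tuples log format from the earlier sketch, with deferred-mode update records carrying only the new value):

    def recover_deferred(log, db):
        # Redo Ti iff both <Ti start> and <Ti commit> appear in the log.
        committed = {rec[1] for rec in log if rec[0] == "commit"}
        for rec in log:                          # forward scan of the log
            if rec[0] == "update" and rec[1] in committed:
                _, ti, x, new = rec
                db[x] = new                      # redo: reapply the new value

    # Case (b): T0 committed, T1 never started before the crash.
    log = [("start", "T0"), ("update", "T0", "A", 950),
           ("update", "T0", "B", 2050), ("commit", "T0")]
    db = {"A": 1000, "B": 2000, "C": 700}
    recover_deferred(log, db)                    # db: A = 950, B = 2050, C = 700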

Immediate Database Modification

 The immediate database modification scheme allows database updates of an uncommitted transaction to be made as the writes are issued.
 Since undoing may be needed, update log records must have both the old value and the new value.
 The update log record must be written before the database item is written.
 We assume that the log record is output directly to stable storage. This can be extended to postpone log record output, so long as, prior to the execution of an output(B) operation for a data block B, all log records corresponding to items in B are flushed to stable storage.
 Output of updated blocks can take place at any time before or after transaction commit.
 The order in which blocks are output can be different from the order in which they are written.

Example:

Log                        Write            Output

<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                           A = 950
                           B = 2050
<T0 commit>
<T1 start>
<T1, C, 700, 600>
                           C = 600
                                            BB, BC
<T1 commit>
                                            BA

Note: BX denotes the block containing X.

The recovery procedure has two operations instead of one:

 undo(Ti) restores the value of all data items updated by Ti to their old values, going backwards from the last log record for Ti.
 redo(Ti) sets the value of all data items updated by Ti to the new values, going forward from the first log record for Ti.
 Both operations must be idempotent; that is, even if the operation is executed multiple times, the effect is the same as if it were executed once. This is needed since operations may get re-executed during recovery.

When recovering after failure:

 Transaction Ti needs to be undone if the log contains the record <Ti start> but does not contain the record <Ti commit>.
 Transaction Ti needs to be redone if the log contains both the record <Ti start> and the record <Ti commit>.

Undo operations are performed first, then redo operations.

Consider again the log at three instances of time (assuming A = 1000, B = 2000 and C = 700 initially):

(a) <T0 start>, <T0, A, 1000, 950>, <T0, B, 2000, 2050>
(b) <T0 start>, <T0, A, 1000, 950>, <T0, B, 2000, 2050>, <T0 commit>, <T1 start>, <T1, C, 700, 600>
(c) <T0 start>, <T0, A, 1000, 950>, <T0, B, 2000, 2050>, <T0 commit>, <T1 start>, <T1, C, 700, 600>, <T1 commit>

Recovery actions in each case above are:

(a) undo(T0): B is restored to 2000 and A to 1000.
(b) undo(T1) and redo(T0): C is restored to 700, and then A and B are set to 950 and 2050 respectively.
(c) redo(T0) and redo(T1): A and B are set to 950 and 2050 respectively. Then C is set to 600.
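A minimal Python sketch of this undo-then-redo procedure (same assumed log format as the earlier sketches, with update records carrying both old and new values):

    def recover_immediate(log, db):
        started = {rec[1] for rec in log if rec[0] == "start"}
        committed = {rec[1] for rec in log if rec[0] == "commit"}
        incomplete = started - committed
        # Undo first: scan backwards, restoring old values of unfinished Ti.
        for rec in reversed(log):
            if rec[0] == "update" and rec[1] in incomplete:
                _, ti, x, old, new = rec
                db[x] = old
        # Then redo: scan forwards, reapplying new values of committed Ti.
        for rec in log:
            if rec[0] == "update" and rec[1] in committed:
                _, ti, x, old, new = rec
                db[x] = new

    # Case (b): T0 committed, T1 still active at the time of the crash.
    log = [("start", "T0"), ("update", "T0", "A", 1000, 950),
           ("update", "T0", "B", 2000, 2050), ("commit", "T0"),
           ("start", "T1"), ("update", "T1", "C", 700, 600)]
    db = {"A": 950, "B": 2050, "C": 600}         # disk state after the crash
    recover_immediate(log, db)                   # C restored to 700; A, B redone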

Checkpoints:

When a system failure occurs, we must consult the log to determine those transactions that need to be redone and those that need to be undone. In principle, we need to search the entire log to determine this information. There are two major difficulties with this approach:

1. The search process is time-consuming.
2. Most of the transactions that, according to our algorithm, need to be redone have already written their updates into the database.

To reduce these difficulties, we introduce checkpoints. During execution, the system maintains the log as described above; in addition, it periodically performs checkpoints, which require the following sequence of actions to take place:

1. Output onto stable storage all log records currently residing in main memory.
2. Output to the disk all modified buffer blocks.
3. Output onto stable storage a log record <checkpoint>.

Transactions are not allowed to perform any update actions, such as writing to a buffer block or writing a log record, while a checkpoint is in progress.

Consider a transaction Ti that committed prior to the checkpoint. For such a transaction, the <Ti commit> record appears in the log before the <checkpoint> record. Any database modifications made by Ti must have been written to the database either prior to the checkpoint or as part of the checkpoint itself. Thus, at recovery time, there is no need to perform a redo operation on Ti, and the recovery scan can be confined to the tail of the log.
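As an illustration (same assumed log format as the earlier sketches, and serial execution as in the text, so at most one transaction can be active at the checkpoint), the sketch below finds the point from which the recovery scan must begin: the <Ti start> record of the transaction still active at the checkpoint, if any.

    def recovery_scan_start(log):
        # Find the most recent <checkpoint> record, scanning backwards.
        cp = next((i for i in range(len(log) - 1, -1, -1)
                   if log[i][0] == "checkpoint"), None)
        if cp is None:
            return 0                    # no checkpoint: search the entire log
        committed = {rec[1] for rec in log[:cp] if rec[0] == "commit"}
        # The scan must begin at the start record of the transaction (if any)
        # that was still active when the checkpoint was taken.
        for i in range(cp):
            if log[i][0] == "start" and log[i][1] not in committed:
                return i
        return cp                       # nothing was active at the checkpoint

    log = [("start", "T0"), ("update", "T0", "A", 1000, 950), ("commit", "T0"),
           ("start", "T1"), ("update", "T1", "C", 700, 600), ("checkpoint",),
           ("update", "T1", "B", 2000, 2050)]
    print(recovery_scan_start(log))     # 3: the scan begins at <T1 start>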

Shadow Paging

Shadow paging is an alternative to log-based recovery; this scheme is useful if transactions execute serially.

 It maintains two page tables during the lifetime of a transaction: the current page table and the shadow page table.
 Store the shadow page table in non-volatile storage, so that the state of the database prior to transaction execution may be recovered.
 The shadow page table is never modified during execution. To start with, both page tables are identical. Only the current page table is used for data item accesses during execution of the transaction.

Whenever any page is about to be written for the first time:

 A copy of this page is made onto an unused page.
 The current page table is then made to point to the copy.
 The update is performed on the copy.

To commit a transaction:

1. Flush all modified pages in main memory to disk.
2. Output the current page table to disk.
3. Make the current page table the new shadow page table, as follows:

 Keep a pointer to the shadow page table at a fixed (known) location on disk.
 To make the current page table the new shadow page table, simply update the pointer to point to the current page table on disk.
 Once the pointer to the shadow page table has been written, the transaction is committed.
 No recovery is needed after a crash: new transactions can start right away, using the shadow page table.
 Pages not pointed to from the current/shadow page table should be freed (garbage collected).
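The following Python sketch (an added illustration; the dicts modelling disk pages and the fixed-location pointer are simplifying assumptions) captures the copy-on-write behaviour and the single-pointer commit:

    disk_pages = {0: "old A", 1: "old B"}     # page number -> page contents
    shadow_table = {"A": 0, "B": 1}           # shadow page table (non-volatile)
    current_table = {}                        # current page table
    next_free = 2                             # next unused page number

    def begin():
        global current_table
        current_table = dict(shadow_table)    # both tables start out identical

    def write(item, value):
        # Copy-on-write: the first update to a page goes to a fresh page, and
        # only the current page table is redirected; the shadow table is untouched.
        global next_free
        if current_table[item] == shadow_table[item]:
            disk_pages[next_free] = disk_pages[current_table[item]]
            current_table[item] = next_free
            next_free += 1
        disk_pages[current_table[item]] = value

    def commit():
        # Steps 1-2 (flushing pages and the current table) are implicit in this
        # in-memory model; step 3 is the single atomic pointer switch.
        global shadow_table
        shadow_table = dict(current_table)

    begin()
    write("A", "new A")
    commit()                                  # after the switch, page 0 is garbage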


Advantages of shadow paging over log-based schemes:

 No overhead of writing log records.
 Recovery is trivial.

Disadvantages:

 Copying the entire page table is very expensive. This can be reduced by using a page table structured like a B+-tree: there is no need to copy the entire tree, only the paths in the tree that lead to updated leaf nodes.
 Commit overhead is high even with the above extension: every updated page and the page table must be flushed.
 Data gets fragmented (related pages get separated on disk).
 After every transaction completion, the database pages containing old versions of modified data need to be garbage collected.
 Log-based schemes are easier to extend (for example, to transactions that execute concurrently).
