
Database Management System

MODULE-4: Transaction Management and Concurrency

Compiled by: Prof. Aasha Chavan

Vidyalankar School of
Information Technology
Wadala (E), Mumbai
www.vsit.edu.in
Certificate
This is to certify that the e-book titled “Database Management
Systems” comprises all elementary learning tools for a better understanding of
the relevant concepts. This e-book is comprehensively compiled as per the predefined
eight parameters and guidelines.

Date:30-07-2022

Signature
Ms. Aasha Chavan
Assistant Professor
Department of IT

DISCLAIMER: The information contained in this e-book is compiled and distributed for
educational purposes only. This e-book has been designed to help learners understand
relevant concepts with a more dynamic interface. The compiler of this e-book and
Vidyalankar Institute of Technology give full and due credit to the authors of the contents,
developers and all websites from wherever information has been sourced. We acknowledge
our gratitude towards the websites YouTube, Wikipedia, and Google search engine. No
commercial benefits are being drawn from this project.
Unit IV
• Contents
• ACID properties
• Serializability and concurrency control
• Lock based concurrency control (2PL, Deadlocks)
• Time stamping methods
• Optimistic methods
• Database recovery management.

• Recommended Books
• Database System Concepts, A. Silberschatz, H. Korth, S. Sudarshan, McGraw-Hill
• Database Systems, Rob, Coronel, Cengage Learning, Twelfth Edition
• Programming with PL/SQL for Beginners, H. Dand, R. Patil and T. Sambare
• Introduction to Database Systems, C. J. Date

Unit IV Pre-requisites Linkage

Topic: Transaction Management & Concurrency Control
• Sem I: -
• Sem II: Web Programming, Object Oriented Programming
• Sem III: DBMS
• Sem IV: Software Engineering, Core Java
• Sem V: Enterprise Java, Advanced Web Programming, Project
• Sem VI: Project
Concept Of Transaction Management
• In simple SQL usage, each SQL command (DML, DDL, etc.) is sent to the server and
executed one after another.
• Instead of sending SQL commands to the server one by one, we can combine multiple
logically related operations and send them to the server as a single logical unit.
• Example: Transferring Rs. 100 from one account to another
• Withdraw Rs. 100 from account_1
• Deposit Rs. 100 to account_2
• A collection of multiple operations that forms a single logical unit is called a
transaction.
• A transaction is a sequence of one or more SQL statements that are combined to form
a single logical unit of work.

• Types of operations in transactions


• Read – This operation transfers one data item from the database memory to a
local buffer of the transaction that executed the read operation. Ex: Data
Selection/Retrieval Language
• SELECT * FROM STUDENTS
• Write – This operation transfers one data item from the local buffer of the
transaction that executed the write back to the database. Ex: Data Manipulation
Language (DML)
• UPDATE STUDENT SET NAME= ‘RAVI’ WHERE SID=102

• Information processing in DBMS is divided into individual, indivisible operational


logical units called transactions.
• Transactions are one of the mechanisms for managing changes to the database
• A transaction is a series of small database operations that together form a single large
operation.
• Application programs use transactions to execute sequences of operations when it is
important that all operations are successfully completed.

• During a money transfer between two bank accounts it is unacceptable for the
operation that updates the second account to fail after the first account has been updated.
• This would lead to the transferred money being lost, as it would be withdrawn from one
account but not deposited into the second account.
BEGIN TRANSACTION transfer
    UPDATE accounts SET balance = balance - 100 WHERE account = A
    UPDATE accounts SET balance = balance + 100 WHERE account = B
    IF no errors THEN
        COMMIT TRANSACTION
    ELSE
        ROLLBACK TRANSACTION
    END IF
END TRANSACTION transfer
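For illustration, the same transfer can be sketched in Python using the standard-library sqlite3 module; the accounts table, column names and balances below are assumptions made only for this example, not part of the notes.

import sqlite3

def transfer(conn, from_acct, to_acct, amount):
    """Move `amount` from one account to another as a single transaction."""
    try:
        cur = conn.cursor()
        cur.execute("UPDATE accounts SET balance = balance - ? WHERE account = ?",
                    (amount, from_acct))
        cur.execute("UPDATE accounts SET balance = balance + ? WHERE account = ?",
                    (amount, to_acct))
        conn.commit()      # COMMIT: both updates become permanent together
    except sqlite3.Error:
        conn.rollback()    # ROLLBACK: on any error, undo both updates
        raise

# Usage with an in-memory database and sample data (assumed for the example):
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 1000), ("B", 5000)])
conn.commit()
transfer(conn, "A", "B", 100)
print(conn.execute("SELECT account, balance FROM accounts ORDER BY account").fetchall())
# [('A', 900), ('B', 5100)]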
Transaction Structure and Boundaries
• A transaction consists of all SQL operations executed between the begin transaction
and end transaction commands.
• A transaction is started by issuing a BEGIN TRANSACTION command. Once this
command is executed the DBMS starts monitoring the transaction. All operations
executed after a BEGIN TRANSACTION command are treated as a single large
operation.
• When a transaction is completed it must either be committed by executing a COMMIT
command or rolled back by executing a ROLLBACK command.
• Until a transaction commits or rolls back, the database appears unchanged to
other users of the system. Therefore, the transaction can make as many changes as
it wishes, but none of the updates are reflected in the database until the transaction
completes or fails.
• A transaction that is successful and has encountered no errors is committed; that is, all
changes to the database are made permanent and become visible to other users of
the database.
• A transaction that is unsuccessful and has encountered some type of error is rolled
back; that is, all changes to the database are undone and the database remains
unchanged by the transaction.
• The DBMS guarantees that all the operations in the transaction either complete or fail.
When a transaction fails all its operations are undone and the database is returned to
the state it was in before the transaction started.

Need Of Transaction
• A database management system (DBMS) should be able to survive system failures;
that is, if a system failure occurs, it should be possible to restart the system and the
DBMS without losing any of the information contained in the database.
• A modern DBMS should not lose data because of a system failure.
• A DBMS must manage the data in the database so that, as users change the data, it
does not become corrupt and unavailable.
• Computer systems can fail to work correctly for many reasons, including:
• Software Errors – These are normally mistakes in the logic of the program code,
e.g., when the value of a variable is set to an incorrect value.
• Hardware Errors – These occur when a component of the computer fails, for
example, a disk drive.
• Communication Errors – In large systems, programs may access data stored
at numerous locations on a network. The communication links between sites
on the network can fail.
• Conflicting Programs – When a system executes more than one program at
the same time, the programs may access and change the same data; for
example, many users may have access to one set of data.
• Transaction processing is designed to maintain a database in a consistent state.
• It ensures that interdependent operations carried out on the system are either all
completed successfully or all aborted.
• Transaction processing also provides for concurrent execution and recovery from
possible system failures in a DBMS.
Fundamental Properties (ACID) Of Transaction
• To understand transaction properties, we consider a transaction of transferring 100
rupees from account A to account B as below
• Let Ti be a transaction that transfers 100 from account A to account B. This transaction
can be defined as
• Read balance of account A
• Withdraw 100 rupees from account A and write back result of balance update
• Read balance of account B
• Deposit 100 rupees to account B and write back the result of the balance update

Ti:
    Read(A)
    A := A – 100
    Write(A)
    Read(B)
    B := B + 100
    Write(B)

Atomicity
• Transaction must be treated as single unit of operation
• That means when a sequence of operations is performed in a single transaction they
are treated as a single large operation
• Examples
• Withdrawing money from the account
• Making an airline reservation
• The term atomic means things that cannot be divided into parts
• Similarly, the execution of a transaction should either be complete or nothing should
be executed at all
• No partial transaction executions are allowed
• Example: Money transfer in above example
• Suppose some type of failure occurs after Write(A) but before Write(B). The system
would then lose 100 rupees, because the sum of the original balances (A + B) in
accounts A and B is not preserved. In such a case the database should
automatically restore the original values of the data items.
• Example: Making an airline reservation
• Check the availability of seats on the desired flights. The airline confirms the reservation,
reduces the number of available seats, and charges the credit card of the customer.
• In this case, either all of the above changes are made to the database or none of them,
since a half-done transaction may leave the data in an incomplete state.
• If one part of the transaction fails, the entire transaction fails, and the database state is
left unchanged.

Consistency
• Consistent state is a state in which only valid data will be written to the database
• If for some reason a transaction is executed that violates the database's
consistency rules, the entire transaction will be rolled back and the database will be
restored to a state consistent with those rules.
• On the other hand, if a transaction is executed successfully, it will take the database
from one consistent state to another consistent state.
• Rather than having to handle inconsistencies, the DBMS should ensure that the
database is in a consistent state at the end of each transaction.
• Consistency guarantees that a transaction will never leave the database in a
half-finished state.
• Consistency means that if one part of the transaction fails, all of the pending changes are
rolled back, leaving the database as it was before the transaction was initiated.
• Example: Money transfer in the above example
• Initially the balance in account A is 1000 and in B is 5000, so the sum of the balances in
both accounts is 6000. If, while carrying out the above transaction, some type of failure
occurs after Write(A) but before Write(B), the system loses 100 rupees: the sum of
the balances in both accounts is now 5900 (instead of 6000), which is not a consistent
result and introduces an inconsistency into the database.
• Example: Student Database
• When a student record is deleted, the student's details must also be deleted from all
associated records. A properly configured database would not allow the student record
to be deleted while leaving its associated records stranded.
• All the data involved in the operation must be left in a consistent state upon completion
or rollback of the transaction; database integrity cannot be compromised.

Isolation
• Isolation property ensures that each transaction must remain unaware of other
concurrently executing transaction.
• Isolation property keeps multiple transactions separated from each other until they are
completed.
• Operations occurring in a transaction are invisible to other transactions until the
transaction commits or rolls back.
• For example, when a transaction changes a bank account balance other transactions
cannot see the new balance until the transaction commits.
• Different isolation levels can be set to modify this default behaviour.
• Transaction isolation is generally configurable in a variety of modes. For example, in
one mode a transaction blocks until the other transaction finishes.
• Even though many transactions may execute concurrently in the system, the system
must guarantee that, for every transaction Ti, it appears as if every other transaction
either finished before Ti started or started its execution after Ti finished.
• That means each transaction is unaware of other transactions executing in the system
simultaneously.
• Example: Money transfer in the above example
• The database is temporarily inconsistent while the above transaction is executing,
with the deducted amount written to account A but the increased amount not yet
written to account B. If some other concurrently running transaction reads the
balances of accounts A and B at this intermediate point and computes A + B, it will
observe an inconsistent value (Rs. 5900). If that other transaction then performs
updates on accounts A and B based on the inconsistent value it read, the
database may be left in an inconsistent state even after both transactions have
completed.
• A way to avoid the problem of concurrently executing transactions is to execute one
transaction at a time.

Durability
• The durability property guarantees that the database keeps track of pending changes
in such a way that the server can recover from an abnormal termination.
• Even if the database server is unplugged in the middle of a transaction, it will return
to a consistent state when it is restarted.
• The database handles this by recording changes in a transaction log. Because of
atomicity and consistency, a partially completed transaction will not be reflected in the
database in the event of an abnormal termination. However, when the database is
restarted after such a termination, it examines the transaction log for committed
transactions whose changes had not yet been applied to the database and applies them.
• The results of a transaction that has been successfully committed to the database
remain unchanged even after the database fails.
• Changes made during the transaction are permanent once the transaction commits.
• Example – Once the execution of the transaction completes successfully and the user
has been notified that the transfer of the amount has taken place, no subsequent
system failure can undo that transfer of funds.
• The durability property guarantees that once a transaction completes successfully, all
the changes made by the transaction on the database persist, even if there is a system
failure after the transaction completes execution.

• ACID Properties:

Source - ACID Properties: https://www.youtube.com/watch?v=dZc6CP-x2d0


Working Of Transaction Processing – Transaction State
Diagram:
• If a transaction completes successfully, its changes are saved on the database server;
such a transaction is called a committed transaction.
• A committed transaction transfers the database from one consistent state to another
consistent state, which must persist even if there is a system failure.
• A transaction may not always complete its execution successfully; such a transaction
must be aborted.
• An aborted transaction must not leave in the database any change that it made; such
changes must be rolled back.
• Once a transaction is committed we cannot undo its effects by aborting it.
• A transaction must be in one of the following states

State Diagram of a Transaction

• Active
• This is initial state of transaction execution
• As soon as transaction execution starts it is in active state
• Transaction remains in this state till transaction finishes
• Partially Committed
• As soon as last operation in transaction is executed transaction goes to partially
committed state
• At this point the transaction has completed its execution and is ready to
commit on the database server, but it may still be aborted, since the actual
output may still be in main memory and a hardware failure may
prevent its successful completion
• Failed
• A transaction enters a failed state after the system determines that the
transaction can no longer proceed with its normal execution
• Example In case of hardware or logical errors while execution
• Aborted
• Transaction has been rolled back restoring to prior state
• Failed transaction must be rolled back. Then it enters the aborted state. In this
stage the system has two options
• Restart the transaction – A restarted transaction is a new transaction
which may recover from possible failure
• Kill the transaction – Because of the bad input or because the desired
data were not present in the database an error can occur. In this case
we can kill the transaction to recover from failure
• Committed
• When the last data item is written out, the transaction enters committed state
• This state occurs after successful completion of transaction
• A transaction is said to have terminated if it has either committed or aborted

Transactions Schedules
• A schedule is a sequence of instructions that specifies the order in which the
instructions of transactions are executed.
• A schedule for a set of transactions must consist of all instructions present in those
transactions, and it must preserve the order in which the instructions appear in each
individual transaction.
• A transaction that successfully completes its execution issues a commit as its final
action.
• A transaction that fails to complete its execution successfully issues an abort as its
final action.
• For a transaction T we write
• RT(X) – denotes a read operation performed by transaction T on object X
• WT(X) – denotes a write operation performed by transaction T on object X
• Each transaction must specify its final action as commit or abort.

Serial Transactions / Schedules

• This is a simple model in which transactions are executed in serial order; that is, the
second transaction starts its execution only after the first transaction finishes.
• This approach amounts to "first my transaction, then your transaction"; other
transactions should not see preliminary results.
• Example – Consider below two transactions T1 and T2
• Transaction T1 – Deposits Rs. 100 to both accounts A and B
• Transaction T2 – Doubles the balance of accounts A and B
Concurrent Transactions / Schedules
• Transactions are executed concurrently; that is, the system executes one
transaction for a while, then context switches to the second transaction, and so on.
• Transaction processing can allow multiple transactions to be executed simultaneously
on the database server.
• Allowing multiple transactions to change data in the database concurrently causes
several complications with the consistency of the data in the database.
• It is much simpler to maintain consistency under serial execution than under
concurrent execution of transactions.
• Advantages
• Improved Throughput
• Throughput is defined as the number of transactions executed in a
given amount of time.
• Executing multiple transactions simultaneously may increase
throughput considerably.
• Resource Utilization
• Resource utilization means keeping the processor and the disk busy
doing useful work rather than sitting idle.
• Processor and disk utilization increase as the number of concurrent
transactions increases.
• Reduced Waiting Time
• A mix of short and long transactions may be executing
on a system.
• If transactions run serially, a short transaction may have to wait
for an earlier long transaction to complete, which can lead to
unpredictable delays in running a transaction.
• Example – Consider below two transactions T1 and T2
• Transaction T1 – Deposits Rs. 100 to both accounts A and B
• Transaction T2 – Doubles the balance of accounts A and B

• The above transactions can be executed concurrently, as shown below, by
interleaving the actions of T1 with those of T2.
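As an illustration (not part of the original notes), the following Python sketch simulates the T1/T2 example at the granularity of read-modify-write steps; the starting balances are assumed values. A serial order gives one of two consistent results, while an interleaved order can give a result matching neither.

def make_t1(db):                      # T1: deposit 100 into A, then into B
    def step_a():
        a = db["A"]; db["A"] = a + 100     # read A, add 100, write A
    def step_b():
        b = db["B"]; db["B"] = b + 100     # read B, add 100, write B
    return [step_a, step_b]

def make_t2(db):                      # T2: double A, then double B
    def step_a():
        a = db["A"]; db["A"] = a * 2
    def step_b():
        b = db["B"]; db["B"] = b * 2
    return [step_a, step_b]

def run(order):
    db = {"A": 1000, "B": 2000}       # assumed starting balances
    steps = {"T1": iter(make_t1(db)), "T2": iter(make_t2(db))}
    for txn in order:                 # execute the next step of the named transaction
        next(steps[txn])()
    return db

print(run(["T1", "T1", "T2", "T2"]))  # serial T1 then T2: {'A': 2200, 'B': 4200}
print(run(["T2", "T2", "T1", "T1"]))  # serial T2 then T1: {'A': 2100, 'B': 4100}
print(run(["T1", "T2", "T2", "T1"]))  # interleaved: {'A': 2200, 'B': 4100}, matches neither serial result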
SERIALIZABILITY & CONCURRENCY CONTROL

The database system must control concurrent execution of transactions, to ensure that the
database state remains consistent. Before we examine how the database system can carry
out this task, we must first understand which schedules will ensure consistency, and which
schedules will not.

T1                      T2
read(A)
A := A − 50
                        read(A)
                        temp := A * 0.1
                        A := A − temp
                        write(A)
                        read(B)
write(A)
read(B)
B := B + 50
write(B)
                        B := B + temp
                        write(B)

Fig. 1: Schedule 4, a concurrent schedule.

Since transactions are programs, it is computationally difficult to determine exactly what


operations a transaction performs and how operations of various transactions interact. For
this reason, we shall not interpret the type of operations that a transaction can perform on a
data item. Instead, we consider only two operations: read and write. We thus assume that,
between a read(Q) instruction and a write (Q) instruction on a data item Q, a transaction may
perform an arbitrary sequence of operations on the copy of Q that is residing in the local
buffer of the transaction. Thus, the only significant operations of a transaction, from a
scheduling point of view, are its read and write instructions; we therefore usually show only
read and write instructions in schedules, as we do in Schedule 3 in Fig. 2.
In this section, we discuss different forms of schedule equivalence; they lead to the notions
of conflict serializability and view serializability.

T1                      T2
read(A)
write(A)
                        read(A)
                        write(A)
read(B)
write(B)
                        read(B)
                        write(B)

Fig. 2: Schedule 3, showing only the read and write instructions.
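As a supplementary sketch (not from the notes), conflict serializability of such a schedule can be tested by building a precedence graph and checking it for cycles; the schedule encoding below is an assumption made for this example.

from collections import defaultdict

# Two actions conflict if they belong to different transactions, touch the same
# item, and at least one of them is a write.
def precedence_graph(schedule):
    edges = defaultdict(set)
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and "W" in (ai, aj):
                edges[ti].add(tj)        # Ti's conflicting action precedes Tj's
    return edges

def has_cycle(edges):
    seen, stack = set(), set()
    def visit(node):
        seen.add(node); stack.add(node)
        for nxt in edges[node]:
            if nxt in stack or (nxt not in seen and visit(nxt)):
                return True              # back edge found: the schedule is not serializable
        stack.discard(node)
        return False
    return any(visit(n) for n in list(edges) if n not in seen)

# Schedule 3 from Fig. 2, written as (transaction, action, item) triples:
s3 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
      ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
print(has_cycle(precedence_graph(s3)))   # False -> conflict serializable (equivalent to T1 then T2)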


LOCK BASED CONCURRENCY CONTROL
A DBMS must be able to ensure that only serializable, recoverable schedules are allowed,
and that no actions of committed transactions are lost while undoing aborted transactions. A
DBMS typically uses a locking protocol to achieve this. A locking protocol is a set of rules to
be followed by each transaction (and enforced by the DBMS), in order to ensure that even
though actions of several transactions might be interleaved, the net effect is identical to
executing all transactions in some serial order.
Strict Two-Phase Locking (Strict 2PL)
The most widely used locking protocol, called Strict Two-Phase Locking, or Strict 2PL, has
two rules. The first rule is
(i) If a transaction T wants to read (respectively, modify) an object, it first requests a shared
(respectively, exclusive) lock on the object. Of course, a transaction that has an exclusive
lock can also read the object; an additional shared lock is not required. A transaction that
requests a lock is suspended until the DBMS is able to grant it the requested lock. The
DBMS keeps track of the locks it has granted and ensures that if a transaction holds an
exclusive lock on an object, no other transaction holds a shared or exclusive lock on the
same object.
The second rule in Strict 2PL is:
(ii) All locks held by a transaction are released when the transaction is completed. Requests
to acquire and release locks can be automatically inserted into transactions by the
DBMS; users need not worry about these details. In effect the locking protocol allows
only ‘safe’ interleavings of transactions. If two transactions access completely
independent parts of the database, they will be able to concurrently obtain the locks that
they need and proceed merrily on their ways. On the other hand, if two transactions
access the same object, and one of them wants to modify it, their actions are effectively
ordered serially: all actions of one of these transactions (the one that gets the lock on
the common object first) are completed before (this lock is released and) the other
transaction can proceed. We denote the action of a transaction T requesting a shared
(respectively, exclusive) lock on object O as ST(O) (respectively, XT(O)), and omit the
subscript denoting the transaction when it is clear from the context. As an example,
consider the schedule shown in figure 2. This interleaving could result in a state that
cannot result from any serial execution of the three transactions. For instance, T1 could
change A from 10 to 20, then T2 (which reads the value 20 for A) could change B from
100 to 200, and then T1 would read the value 200 for B. If run serially, either T1 or T2
would execute first, and read the values 10 for A and 100 for B: Clearly, the interleaved
execution is not equivalent to either serial execution. If the Strict 2PL protocol is used,
the above interleaving is disallowed. Let us see why. Assuming that the transactions
proceed at the same relative speed as before, T1 would obtain an exclusive lock on A
first and then read and write A (Figure 3). Then, T2 would request a lock on A. However,
this request cannot be granted until
T1 T2
X(A)
R(A)
W(A)

Fig. 3: Schedule Illustrating Strict 2PL


T1 releases its exclusive lock on A, and the DBMS therefore suspends T2. T1 now proceeds
to obtain an exclusive lock on B, reads and writes B, then finally commits, at which time its
locks are released. T2’s lock request is now granted, and it proceeds. In this example the
locking protocol results in a serial execution of the two transactions. In general, however, the
actions of different transactions could be interleaved. As an example, consider the
interleaving of two transactions shown in Figure 3, which is permitted by the Strict 2PL
protocol.
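The following is a minimal single-process sketch (an assumption-heavy simplification, with no deadlock handling) of how a lock manager can enforce the two Strict 2PL rules: shared or exclusive locks are acquired before reads or writes, and every lock is held until the transaction completes.

import threading
from collections import defaultdict

class LockManager:
    def __init__(self):
        self.mutex = threading.Lock()
        self.cond = threading.Condition(self.mutex)
        self.sharers = defaultdict(set)    # object -> set of txns holding S locks
        self.owner = {}                    # object -> txn holding the X lock

    def lock_shared(self, txn, obj):
        with self.cond:
            # wait while another transaction holds an exclusive lock on obj
            while self.owner.get(obj) not in (None, txn):
                self.cond.wait()
            self.sharers[obj].add(txn)

    def lock_exclusive(self, txn, obj):
        with self.cond:
            # wait while anyone else holds a shared or exclusive lock on obj
            while (self.owner.get(obj) not in (None, txn)
                   or self.sharers[obj] - {txn}):
                self.cond.wait()
            self.owner[obj] = txn

    def release_all(self, txn):            # called only at commit/abort (Strict 2PL)
        with self.cond:
            for obj in list(self.owner):
                if self.owner[obj] == txn:
                    del self.owner[obj]
            for holders in self.sharers.values():
                holders.discard(txn)
            self.cond.notify_all()

A transaction would call lock_shared before each read and lock_exclusive before each write, and call release_all only when it commits or aborts; holding every lock until completion is exactly what makes the protocol strict.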

Source - Serializability: https://www.youtube.com/watch?v=xWV1z5Du8N0

Deadlock
In a multi-process system, deadlock is an unwanted situation that arises in a shared resource
environment, where a process indefinitely waits for a resource that is held by another
process. For example, assume a set of transactions {T0, T1, T2, ..., Tn}. T0 needs a resource
X to complete its task. Resource X is held by T1, and T1 is waiting for a resource Y, which is
held by T2. T2 is waiting for resource Z, which is held by T0. Thus, all the processes wait for
each other to release resources. In this situation, none of the processes can finish their task.
This situation is known as a deadlock. Deadlocks are not healthy for a system. In case a
system is stuck in a deadlock, the transactions involved in the deadlock are either rolled back
or restarted.

Source – Deadlock https://youtu.be/Iz66t1uyYIM


Deadlock Prevention
To prevent any deadlock situation in the system, the DBMS aggressively inspects all the
operations, where transactions are about to execute. The DBMS inspects the operations and
analyzes if they can create a deadlock situation. If it finds that a deadlock situation might
occur, then that transaction is never allowed to be executed.
There are deadlock prevention schemes that use timestamp ordering mechanism of
transactions in order to predetermine a deadlock situation.

Wait-Die Scheme
In this scheme, if a transaction requests to lock a resource (data item), which is already held
with a conflicting lock by another transaction, then one of the two possibilities may occur −
• If TS(Ti) < TS(Tj) − that is, Ti, which is requesting a conflicting lock, is older than Tj −
then Ti is allowed to wait until the data item is available.
• If TS(Ti) > TS(Tj) − that is, Ti is younger than Tj − then Ti dies. Ti is restarted later with
a random delay but with the same timestamp.
This scheme allows the older transaction to wait but kills the younger one.

Wound-Wait Scheme
In this scheme, if a transaction requests to lock a resource (data item), which is already held
with conflicting lock by some another transaction, one of the two possibilities may occur −
• If TS(Ti) < TS(Tj), then Ti forces Tj to be rolled back − that is, Ti wounds Tj. Tj is restarted
later with a random delay but with the same timestamp.
• If TS(Ti) > TS(Tj), then Ti is forced to wait until the resource is available.
This scheme allows the younger transaction to wait; but when an older transaction requests
an item held by a younger one, the older transaction forces the younger one to abort and
release the item. In both the cases, the transaction that enters the system at a later stage is
aborted.
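A small sketch (not from the notes) of the two decision rules; the TS values here are just integers standing in for transaction start timestamps, with smaller meaning older.

def wait_die(requester_ts, holder_ts):
    # Older requester waits; younger requester dies (is rolled back).
    return "WAIT" if requester_ts < holder_ts else "DIE (rollback requester)"

def wound_wait(requester_ts, holder_ts):
    # Older requester wounds (rolls back) the younger holder; younger requester waits.
    return "WOUND (rollback holder)" if requester_ts < holder_ts else "WAIT"

# Ti is older (timestamp 5) than Tj (timestamp 9):
print(wait_die(5, 9))     # WAIT   (older Ti waits for Tj)
print(wait_die(9, 5))     # DIE    (younger Ti is rolled back)
print(wound_wait(5, 9))   # WOUND  (older Ti preempts younger Tj)
print(wound_wait(9, 5))   # WAIT   (younger Ti waits for older Tj)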

Deadlock Detection
A simple way to detect a state of deadlock is with the help of wait-for graph. This graph is
constructed and maintained by the system. One node is created in the wait-for graph for each
transaction that is currently executing. Whenever a transaction Ti is waiting to lock an item X
that is currently locked by a transaction Tj, a directed edge (Ti -> Tj) is created in the wait-for
graph. When Tj releases the lock(s) on the items that Ti was waiting for, the directed edge is
dropped from the wait-for graph. We have a state of deadlock if and only if the wait-for graph
has a cycle. Then each transaction involved in the cycle is said to be deadlocked. To detect
deadlocks, the system needs to maintain the wait for graph, and periodically to invoke an
algorithm that searches for a cycle in the graph. To illustrate these concepts, consider the
following wait-for graph in figure. Here:
Transaction T25 is waiting for transactions T26 and T27.
Transactions T27 is waiting for transaction T26.
Transaction T26 is waiting for transaction T28.
This wait-for graph has no cycle, so there is no deadlock state.
Suppose now that transaction T28 is requesting an item held by T27. Then the edge
T28 -> T27 is added to the wait-for graph, resulting in a new system state as shown in the
figure. This time the graph contains the cycle

T26 -> T28 -> T27 -> T26

which means that transactions T26, T27 and T28 are all deadlocked.
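This wait-for graph test can be sketched in Python as below (the dictionary encoding of the graph is an assumption for the example): build the graph from who-waits-for-whom and search it for a cycle; the transactions on the cycle are the deadlocked ones.

def find_deadlock(wait_for):
    """wait_for maps a transaction to the set of transactions it is waiting for.
    Returns one cycle as a list of transactions, or None if there is no deadlock."""
    def dfs(node, path, on_path):
        for nxt in wait_for.get(node, ()):
            if nxt in on_path:                      # back edge -> cycle found
                return path[path.index(nxt):]
            cycle = dfs(nxt, path + [nxt], on_path | {nxt})
            if cycle:
                return cycle
        return None
    for start in wait_for:
        cycle = dfs(start, [start], {start})
        if cycle:
            return cycle
    return None

# Wait-for graph before T28's request: no cycle, hence no deadlock.
g = {"T25": {"T26", "T27"}, "T27": {"T26"}, "T26": {"T28"}}
print(find_deadlock(g))                  # None

# After T28 requests an item held by T27, the edge T28 -> T27 is added.
g["T28"] = {"T27"}
print(find_deadlock(g))                  # e.g. ['T26', 'T28', 'T27'] - deadlocked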
Invoking the deadlock detection algorithm
The invoking of deadlock detection algorithm depends on two factors:
• How often does a deadlock occur?
• How many transactions will be affected by the deadlock?
If deadlocks occur frequently, then the detection algorithm should be invoked more frequently
than usual. Data items allocated to deadlocked transactions will be unavailable to other
transactions until the deadlock can be broken. In the worst case, we would invoke the
detection algorithm every time a request for allocation could not be granted immediately.

Deadlock Recovery
When a detection algorithm determines that a deadlock exists, the system must recover from
the deadlock. The most common solution is to roll back one or more transactions to break
the deadlock. Choosing which transaction to abort is known as Victim Selection.

Choice of Deadlock victim


In the wait-for graph considered above, transactions T26, T27 and T28 are deadlocked. In
order to remove the deadlock, one of these three transactions must be rolled back. We
should roll back the transaction that will incur the minimum cost. When a deadlock is
detected, the choice of which transaction to abort can be made using the following criteria:
• The transaction that holds the fewest locks
• The transaction that has done the least work
• The transaction that is farthest from completion

Rollback
Once we have decided that a particular transaction must be rolled back, we must determine
how far this transaction should be rolled back. The simplest solution is a total rollback; Abort
the transaction and then restart it. However, it is more effective to roll back the transaction
only as far as necessary to break the deadlock. But this method requires the system to
maintain additional information about the state of all the running transactions.

Problem of Starvation
In a system where the selection of victims is based primarily on cost factors, it may happen
that the same transaction is always picked as a victim. As a result, this transaction never
completes its designated task; this is starvation. We must ensure that a transaction can be
picked as a victim only a (small) finite number of times. The most common solution is to
include the number of rollbacks in the cost factor.

Source - Deadlock: https://www.youtube.com/watch?v=WZtebOyiu0M


Time Stamping Methods
Timestamp-Based Concurrency Control
In lock-based concurrency control, conflicting actions of different transactions are ordered by
the order in which locks are obtained, and the lock protocol extends this ordering on actions
to transactions, thereby ensuring serializability. In optimistic concurrency control, a
timestamp ordering is imposed on transactions, and validation checks that all conflicting
actions occurred in the same order. Timestamps can also be used in another way: each
transaction can be assigned a time stamp at startup, and we can ensure, at execution time,
that if action ai of transaction Ti conflicts with action aj of transaction Tj, ai occurs before aj if
TS(Ti) < TS(Tj). If an action violates this ordering, the transaction is aborted and restarted. To
implement this concurrency control scheme, every database object O is given a read
timestamp RTS(O) and a write timestamp WTS(O). If transaction T wants to read object O,
and TS(T) < WTS(O), the order of this read with respect to the most recent write on O would
violate the timestamp order between this transaction and the writer. Therefore, T is aborted
and restarted with a new, larger timestamp. If TS(T) > WTS(O), T reads O, and RTS(O) is set
to the larger of RTS(O) and TS(T). (Note that there is a physical change, the change to
RTS(O), to be written to disk and to be recorded in the log for recovery purposes, even on
reads. This write operation is a significant overhead.)

Observe that if T is restarted with the same timestamp, it is guaranteed to be aborted again,
due to the same conflict. Contrast this behavior with the use of timestamps in 2PL for
deadlock prevention: there, transactions were restarted with the same timestamp as before
in order to avoid repeated restarts. This shows that the two uses of timestamps are quite
different and should not be confused.

Next, let us consider what happens when transaction T wants to write object O:
i) If TS(T)<RTS(O), the write action conflicts with the most recent read action of O, and T is
therefore aborted and restarted.
ii) If TS(T) <WTS(O), a naive approach would be to abort T because its write action conflicts
with the most recent write of O and is out of timestamp order. It turns out that we can
safely ignore such writes and continue. Ignoring outdated writes is called the Thomas
Write Rule.
iii) Otherwise, T writes O and WTS (O) is set to TS(T).
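A minimal sketch (assuming a simplified, single-threaded setting) of these read and write rules, including the Thomas Write Rule:

class TimestampError(Exception):
    """Raised when a transaction must be aborted and restarted with a new timestamp."""

class DBObject:
    def __init__(self, value):
        self.value = value
        self.rts = 0          # read timestamp  RTS(O)
        self.wts = 0          # write timestamp WTS(O)

def read(txn_ts, obj):
    if txn_ts < obj.wts:                 # would read a "future" write: abort
        raise TimestampError("abort and restart with a larger timestamp")
    obj.rts = max(obj.rts, txn_ts)       # record the most recent reader
    return obj.value

def write(txn_ts, obj, value):
    if txn_ts < obj.rts:                 # conflicts with a more recent read: abort
        raise TimestampError("abort and restart with a larger timestamp")
    if txn_ts < obj.wts:                 # outdated write: Thomas Write Rule, ignore it
        return
    obj.value = value
    obj.wts = txn_ts

x = DBObject(10)
print(read(5, x))     # 10; RTS(X) becomes 5
write(7, x, 70)       # allowed: WTS(X) becomes 7
write(6, x, 60)       # TS 6 < WTS 7 but not < RTS 5: ignored by the Thomas Write Rule
print(x.value)        # 70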

Source – Timestamp Based Control - https://youtu.be/S4DnmYefUmI


Optimistic Methods
Optimistic concurrency control (OCC) is a concurrency control method applied to
transactional systems such as relational database management systems and software
transactional memory. OCC assumes that multiple transactions can frequently complete
without interfering with each other. While running, transactions use data resources without
acquiring locks on those resources. Before committing, each transaction verifies that no other
transaction has modified the data it has read. If the check reveals conflicting modifications,
the committing transaction rolls back and can be restarted. Optimistic concurrency control
was first proposed by H.T. Kung. OCC is generally used in environments with low data
contention. When conflicts are rare, transactions can complete without the expense of
managing locks and without having transactions wait for other transactions' locks to clear,
leading to higher throughput than other concurrency control methods. However, if contention
for data resources is frequent, the cost of repeatedly restarting transactions hurts
performance significantly; it is commonly thought that other concurrency control methods
have better performance under these conditions. However, locking-based
("pessimistic") methods also can deliver poor performance because locking can drastically
limit effective concurrency even when deadlocks are avoided.
More specifically, OCC transactions involve these phases:
• Begin: Record a timestamp marking the transaction's beginning.
• Modify: Read database values, and tentatively write changes.
• Validate: Check whether other transactions have modified data that this transaction
has used (read or written). This includes transactions that completed after this
transaction's start time, and optionally, transactions that are still active at validation
time.
• Commit/Rollback: If there is no conflict, make all changes take effect. If there is a
conflict, resolve it, typically by aborting the transaction, although other resolution
schemes are possible. Care must be taken to avoid a TOCTTOU bug, particularly if
this phase and the previous one are not performed as a single atomic operation.
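These phases can be sketched as below; this toy version validates with per-item version numbers rather than timestamps, and the in-memory Store class is an assumption made only for illustration.

class OCCConflict(Exception):
    pass

class Store:
    def __init__(self):
        self.data = {}           # key -> committed value
        self.versions = {}       # key -> version number, bumped on every commit

    def begin(self):
        return {"reads": {}, "writes": {}}        # per-transaction workspace

    def read(self, txn, key):
        if key in txn["writes"]:
            return txn["writes"][key]
        txn["reads"][key] = self.versions.get(key, 0)   # remember the version read
        return self.data.get(key)

    def write(self, txn, key, value):
        txn["writes"][key] = value                # tentative, private change

    def commit(self, txn):
        # Validate: every item read must still be at the version we saw.
        for key, version in txn["reads"].items():
            if self.versions.get(key, 0) != version:
                raise OCCConflict(f"{key} was modified by another transaction")
        # No conflict: install all tentative writes.
        for key, value in txn["writes"].items():
            self.data[key] = value
            self.versions[key] = self.versions.get(key, 0) + 1

store = Store()
t1, t2 = store.begin(), store.begin()
store.write(t1, "A", 1)          # T1 tentatively writes A
print(store.read(t2, "A"))       # T2 reads A: None (T1's write is still private)
store.commit(t1)                 # T1 validates and installs A = 1
try:
    store.write(t2, "A", 2)
    store.commit(t2)             # T2's validation fails: A changed since it was read
except OCCConflict as e:
    print("rollback:", e)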

DATABASE RECOVERY MANAGEMENT

A computer system, like any other device, is subject to failure from a variety of causes: disk
crash, power outage, software error, a fire in the machine room, even sabotage. In any
failure, information may be lost. Therefore, the database system must take actions in
advance to ensure that the atomicity and durability properties of transactions, introduced in
Chapter 15, are preserved. An integral part of a database system is a recovery scheme that
can restore the database to the consistent state that existed before the failure. The recovery
scheme must also provide high availability; that is, it must minimize the time for which the
database is not usable after a crash.

Failure Classification
There are various types of failure that may occur in a system, each of which needs to be
dealt with in a different manner. The simplest type of failure is one that does not result in the
loss of information in the system. The failures that are more difficult to deal with are those
that result in loss of information. In this chapter, we shall consider only the following types of
failure:
• Transaction failure. There are two types of errors that may cause a transaction to fail:
− Logical error. The transaction can no longer continue with its normal execution because
of some internal condition, such as bad input, data not found, overflow, or resource
limit exceeded.
− System error. The system has entered an undesirable state (for example, deadlock), as
a result of which a transaction cannot continue with its normal execution. The
transaction, however, can be re-executed at a later time.
• System crash. There is a hardware malfunction, or a bug in the database software or the
operating system, that causes the loss of the content of volatile storage and brings
transaction processing to a halt. The content of nonvolatile storage remains intact and is
not corrupted. The assumption that hardware errors and bugs in the software bring the
system to a halt, but do not corrupt the nonvolatile storage contents, is known as the fail-
stop assumption. Well-designed systems have numerous internal checks, at the
hardware and the software level, that bring the system to a halt when there is an error.
Hence, the fail-stop assumption is a reasonable one.
• Disk failure. A disk block loses its content as a result of either a head crash or failure
during a data transfer operation. Copies of the data on other disks, or archival backups
on tertiary media, such as tapes, are used to recover from the failure.

To determine how the system should recover from failures, we need to identify the failure
modes of those devices used for storing data. Next, we must consider how these failure
modes affect the contents of the database. We can then propose algorithms to ensure
database consistency and transaction atomicity despite failures. These algorithms, known as
recovery algorithms, have two parts:
i) Actions taken during normal transaction processing to ensure that enough information
exists to allow recovery from failures.
ii) Actions taken after a failure to recover the database contents to a state that ensures
database consistency, transaction atomicity, and durability.

STORAGE STRUCTURE
The various data items in the database may be stored and accessed in a number of different
storage media. To understand how to ensure the atomicity and durability properties of a
transaction, we must gain a better understanding of these storage media and their access
methods.
Storage Types
We saw that storage media can be distinguished by their relative speed, capacity, and
resilience to failure, and classified as volatile storage or nonvolatile storage. We review these
terms, and introduce another class of storage, called stable storage.
• Volatile storage. Information residing in volatile storage does not usually survive system
crashes. Examples of such storage are main memory and cache memory. Access to
volatile storage is extremely fast, both because of the speed of the memory access itself,
and because it is possible to access any data item in volatile storage directly.
• Nonvolatile storage. Information residing in nonvolatile storage survives system
crashes. Examples of such storage are disk and magnetic tapes. Disks are used for online
storage, whereas tapes are used for archival storage. Both, however, are subject to failure
(for example, head crash), which may result in loss of information. At the current state of
technology, nonvolatile storage is slower than volatile storage by several orders of
magnitude. This is because disk and tape devices are electromechanical, rather than
based entirely on chips, as is volatile storage. In database systems, disks are used for
most nonvolatile storage. Other nonvolatile media are normally used only for backup data.
Flash storage, though nonvolatile, has insufficient capacity for most database systems.
• Stable storage. Information residing in stable storage is never lost (never should be taken
with a grain of salt, since theoretically never cannot be guaranteed — for example, it is
possible, although extremely unlikely, that a black hole may envelop the earth and
permanently destroy all data!). Although stable storage is theoretically impossible to obtain,
it can be closely approximated by techniques that make data loss extremely unlikely.
The distinctions among the various storage types are often less clear in practice than in our
presentation. Certain systems provide battery backup, so that some main memory can
survive system crashes and power failures. Alternative forms of nonvolatile storage, such as
optical media, provide an even higher degree of reliability than do disks.

Stable-Storage Implementation
To implement stable storage, we need to replicate the needed information in several
nonvolatile storage media (usually disk) with independent failure modes, and to update the
information in a controlled manner to ensure that failure during data transfer does not damage
the needed information. Recall that RAID systems guarantee that the failure of a single disk
(even during data transfer) will not result in loss of data. The simplest and fastest form of
RAID is the mirrored disk, which keeps two copies of each block, on separate disks. Other
forms of RAID offer lower costs, but at the expense of lower performance.

RAID systems, however, cannot guard against data loss due to disasters such as fires or
flooding. Many systems store archival backups of tapes off-site to guard against such
disasters. However, since tapes cannot be carried off-site continually, updates since the most
recent time that tapes were carried off-site could be lost in such a disaster. More secure
systems keep a copy of each block of stable storage at a remote site, writing it out over a
computer network, in addition to storing the block on a local disk system. Since the blocks are
output to a remote system as and when they are output to local storage, once an output
operation is complete, the output is not lost, even in the event of a disaster such as a fire or
flood. In the remainder of this section, we discuss how storage media can be protected from
failure during data transfer. Block transfer between memory and disk storage can result in
• Successful completion. The transferred information arrived safely at its destination.
• Partial failure. A failure occurred in the midst of transfer, and the destination block has
incorrect information.
• Total failure. The failure occurred sufficiently early during the transfer that the destination
block remains intact.

We can extend this procedure easily to allow the use of an arbitrarily large number of copies
of each block of stable storage. Although a large number of copies reduces the probability of
a failure to even lower than two copies do, it is usually reasonable to simulate stable storage
with only two copies.
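A rough sketch of the two-copy idea (using two local files to stand in for two independent disks, an assumption made only for illustration): a block is written to the first copy and only then to the second, and on recovery the copies are compared and made to agree again, so an interrupted write counts as either fully done or not done at all.

import os

BLOCK = 512

def write_block(data: bytes, copy1: str, copy2: str):
    data = data.ljust(BLOCK, b"\0")[:BLOCK]
    with open(copy1, "wb") as f:          # 1) write the first copy and force it to disk
        f.write(data)
        f.flush(); os.fsync(f.fileno())
    with open(copy2, "wb") as f:          # 2) only then write the second copy
        f.write(data)
        f.flush(); os.fsync(f.fileno())

def recover_block(copy1: str, copy2: str):
    b1 = open(copy1, "rb").read() if os.path.exists(copy1) else b""
    b2 = open(copy2, "rb").read() if os.path.exists(copy2) else b""
    if b1 == b2 and len(b1) == BLOCK:
        return b1                          # both copies intact and identical
    # Copies differ or one is damaged: copy an intact value over the other so the
    # copies agree again (either the old or the new value is acceptable).
    good = b1 if len(b1) == BLOCK else b2
    with open(copy1, "wb") as f:
        f.write(good)
    with open(copy2, "wb") as f:
        f.write(good)
    return good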

Data Access
The database system resides permanently on nonvolatile storage (usually disks) and is
partitioned into fixed-length storage units called blocks. Blocks are the units of data transfer
to and from disk and may contain several data items. We shall assume that no data item
spans two or more blocks. This assumption is realistic for most data-processing applications,
such as our banking example.
Fig. 4 : Block storage operations.

Transactions input information from the disk to main memory, and then output the information
back onto the disk. The input and output operations are done in block units. The blocks
residing on the disk are referred to as physical blocks; the blocks residing temporarily in main
memory are referred to as buffer blocks. The area of memory where blocks reside temporarily
is called the disk buffer.

Block movements between disk and main memory are initiated through the following two
operations:
1) Input (B) transfers the physical block B to main memory.
2) Output (B) transfers the buffer block B to the disk, and replaces the appropriate physical
block there.

Each transaction Ti has a private work area in which copies of all the data items accessed
and updated by Ti are kept. The system creates this work area when the transaction is
initiated; the system removes it when the transaction either commits or aborts. Each data
item X kept in the work area of transaction Ti is denoted by xi. Transaction Ti interacts with
the database system by transferring data between its work area and the system buffer. We
transfer data by these two operations:
i) read (X) assigns the value of data item X to the local variable xi. It executes this operation
as follows:
(a) If block BX on which X resides is not in main memory, it issues input (BX).
(b) It assigns to xi, the value of X from the buffer block.

ii) write(X) assigns the value of local variable xi to data item X in the buffer block. It executes
this operation as follows:
(a) If block BX on which X resides is not in main memory, it issues input (BX).
(b) It assigns the value of xi to X in buffer BX.

Note that both operations may require the transfer of a block from disk to main memory. They
do not, however, specifically require the transfer of a block from main memory to disk. A
buffer block is eventually written out to the disk either because the buffer manager needs the
memory space for other purposes or because the database system wishes to reflect the
change to B on the disk. We shall say that the database system performs a force-output of
buffer B if it issues an output (B). When a transaction needs to access a data item X for the
first time, it must execute read (X). The system then performs all updates to X on xi. After the
transaction accesses X for the final time, it must execute write(X) to reflect the change to X
in the database itself.
The output (BX) operation for the buffer block BX on which X resides does not need to take
effect immediately after write (X) is executed, since the block BX may contain other data items
that are still being accessed. Thus, the actual output may take place later. Notice that, if the
system crashes after the write(X) operation was executed but before output (BX) was
executed, the new value of X is never written to disk and, thus, is lost.
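The read(X)/write(X) and input(B)/output(B) behaviour described above can be sketched with in-memory dictionaries standing in for the disk and the buffer (the names and values are assumptions for the example):

disk = {"B_X": {"X": 100}}      # physical blocks on disk
buffer = {}                     # buffer blocks in main memory
block_of = {"X": "B_X"}         # which block each data item resides on

def input_block(b):
    buffer[b] = dict(disk[b])              # bring the physical block into the buffer

def output_block(b):
    disk[b] = dict(buffer[b])              # force the buffer block back to disk

def read(x, local):
    b = block_of[x]
    if b not in buffer:
        input_block(b)                     # issue input(B_X) if needed
    local[x] = buffer[b][x]                # assign the value of X to local variable xi

def write(x, local):
    b = block_of[x]
    if b not in buffer:
        input_block(b)
    buffer[b][x] = local[x]                # update the buffer block only, not the disk

local = {}                                 # the transaction's private work area
read("X", local)
local["X"] -= 50
write("X", local)
print(disk["B_X"]["X"])    # still 100: the write only changed the buffer block
output_block("B_X")
print(disk["B_X"]["X"])    # 50: the change reaches disk only when the block is output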

Recovery and Atomicity


Consider again our simplified banking system and transaction Ti that transfers $50 from
account A to account B, with initial values of A and B being $1000 and $2000, respectively.
Suppose that a system crash has occurred during the execution of Ti, after output(BA) has
taken place, but before output(BB) was executed, where BA and BB denote the buffer blocks on
which A and B reside. Since the memory contents were lost, we do not know the fate of the
transaction; thus, we could invoke one of two possible recovery procedures:
• Re-execute Ti. This procedure will result in the value of A becoming $900, rather than
$950. Thus, the system enters an inconsistent state.
• Do not re-execute Ti. The current system state has values of $950 and $2000 for A and
B, respectively. Thus, the system enters an inconsistent state.

In either case, the database is left in an inconsistent state, and thus this simple recovery
scheme does not work. The reason for this difficulty is that we have modified the database
without having assurance that the transaction will indeed commit. Our goal is to perform
either all or no database modifications made by Ti. However, if Ti performed multiple
database modifications, several output operations may be required, and a failure may occur
after some of these modifications have been made, but before all of them are made. To
achieve our goal of atomicity, we must first output information describing the modifications to
stable storage, without modifying the database itself. As we shall see, this procedure will
allow us to output all the modifications made by a committed transaction, despite failures.
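As a sketch of this idea (not the notes' own algorithm), a simple undo/redo-style log can record each modification on stable storage before the database is touched, so that after a crash uncommitted changes can be undone and committed ones redone; the Python structures below are assumptions for illustration.

log = []                              # stands in for the log on stable storage
db = {"A": 1000, "B": 2000}

def write(txn, item, new_value):
    log.append((txn, item, db[item], new_value))   # <Ti, X, old, new> record written first
    db[item] = new_value                           # only then is the database modified

def commit(txn):
    log.append((txn, "COMMIT"))

def recover():
    committed = {rec[0] for rec in log if rec[1:] == ("COMMIT",)}
    # Undo, in reverse order, the updates of transactions that never committed...
    for txn, *rest in reversed(log):
        if rest != ["COMMIT"] and txn not in committed:
            item, old, _new = rest
            db[item] = old
    # ...and redo, in forward order, the updates of committed transactions.
    for txn, *rest in log:
        if rest != ["COMMIT"] and txn in committed:
            item, _old, new = rest
            db[item] = new

# Ti transfers 50 from A to B but crashes before committing:
write("Ti", "A", db["A"] - 50)
# -- crash here: the log still holds the old value of A, so recovery can undo it --
recover()
print(db)   # {'A': 1000, 'B': 2000}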

Graded Questions:
1. Explain the concept of transaction
2. Describe ACID properties of transaction
3. Explain the difference between the terms serial schedule and serializable schedule with
suitable examples.
4. Explain View and conflict serializability with suitable example
5. Explain the need of concurrency control in transaction management
6. Write a short note on Two phase locking protocol
7. Explain Timestamp based protocol
8. Show that two phase locking protocol ensures conflict serializability
9. What is Deadlock. Explain Deadlock detection
10. Explain Deadlock Recovery
11. Explain how atomicity is related with recovery
12. What are the different storage types in database?
13. Explain different types of failure in database recovery management
14. Write a short note on optimistic concurrency control
15. Explain with example problem of starvation
Multiple Choice Questions
1. Identify the characteristics of transactions
a) Atomicity
b) Durability
c) Isolation
d) All of the mentioned
2. Which of the following has “all-or-none” property?
a) Atomicity
b) Durability
c) Isolation
d) All of the mentioned
3.The database system must take special actions to ensure that transactions operate
properly without interference from concurrently executing database statements. This
property is referred to as
a) Atomicity
b) Durability
c) Isolation
d) All of the mentioned
4. Deadlocks are possible only when one of the transactions wants to obtain a(n) ____ lock
on a data item.
a) binary
b) exclusive
c) shared
d) complete

5. The ____ statement is used to end a successful transaction.


a) COMMIT
b) DONE
c) END
d) QUIT

6. If a transaction acquires a shared lock, then it can perform ___________operation.


a) read
b) write
c) read and write
d) update

7. A system is in a ______ state if there exists a set of transactions such that every
transaction in the set is waiting for another transaction in the set.
a) Idle
b) Waiting
c) Deadlock
d) Ready
8. The deadlock state can be changed back to stable state by using _____________
statement.
a) Commit
b) Rollback
c) Savepoint
d) Deadlock

9. What are the ways of dealing with deadlock?


a) Deadlock prevention
b) Deadlock recovery
c) Deadlock detection
d) All of the mentioned

10. The situation where the lock waits only for a specified amount of time for another lock to
be released is
a) Lock timeout
b) Wait-wound
c) Timeout
d) Wait

11. The log is a sequence of _________ recording all the update activities in the database
a) Log Records
b) Records
c) Columns
d) Log Columns

12. OCC stands for:


a) Occupancy Concurrency Control
b) Optimistic concise Control
c) Occupancy Constant Control
d) Optimistic Concurrency Control

13. Which of the following is not a phase in Optimistic Concurrency Control?


a) Begin
b) Declare
c) Commit
d) Validate

14. ___________ is a distinct identifier formed by DBMS to recognize the relative starting
time of transaction.
a) 2PL
b) Timestamp
c) Shared
d) Strict 2PL

15. In ________ one transaction overwrites the changes of another transaction.


a) Lost Update Problem
b) Uncommitted Dependency Problem
c) Inconsistent Analysis Problem
d) Consistent Analysis Problem
