UNIT IV DBMS by Grishman
UNIT IV - DBMS
Transaction:
o A transaction is a set of logically related operations. It contains a group of tasks.
o A transaction is an action or series of actions, performed by a single user, that
accesses or changes the contents of the database.
Example: Suppose a bank employee transfers Rs 800 from X's account to Y's
account. This small transaction contains several low-level tasks:
X's Account
1. Open_Account(X)
2. Old_Balance = X.balance
3. New_Balance = Old_Balance - 800
4. X.balance = New_Balance
5. Close_Account(X)
Y's Account
1. Open_Account(Y)
2. Old_Balance = Y.balance
3. New_Balance = Old_Balance + 800
4. Y.balance = New_Balance
5. Close_Account(Y)
Operations of Transaction:
The following are the main operations of a transaction:
Read(X): The read operation reads the value of X from the database and stores it
in a buffer in main memory.
Write(X): The write operation writes the value from the buffer back to the
database.
Consider a debit transaction on an account, which consists of the following
operations:
1. R(X);
2. X = X - 500;
3. W(X);
Assume that X's value is 4000 before the transaction starts.
o The first operation reads X's value from the database and stores it in a buffer.
o The second operation decreases the value of X by 500, so the buffer contains
3500.
o The third operation writes the buffer's value to the database, so X's final value
is 3500.
However, a hardware, software, or power failure may cause the transaction to fail
before it has finished all the operations in the set.
For example: If the debit transaction above fails after executing operation 2, then
X's value remains 4000 in the database, which is not acceptable to the bank.
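The debit example can be sketched in Python as follows. This is only an illustration: the dictionary standing in for the database, the `debit` function, and the `SimulatedCrash` exception are hypothetical names, not part of any real DBMS.

```python
class SimulatedCrash(Exception):
    """Stands in for a hardware, software, or power failure."""


def debit(database, item, amount, crash_before_write=False):
    buffer = database[item]      # R(X): copy the value into a main-memory buffer
    buffer = buffer - amount     # X = X - amount: only the buffer changes
    if crash_before_write:
        raise SimulatedCrash("failed after step 2, before W(X)")
    database[item] = buffer      # W(X): copy the buffer back to the database


db = {"X": 4000}
debit(db, "X", 500)
balance_after_success = db["X"]  # the write reached the database

db = {"X": 4000}
try:
    debit(db, "X", 500, crash_before_write=True)
except SimulatedCrash:
    pass
balance_after_crash = db["X"]    # the debit never reached the database
```

In the second run the buffer held 3500, but the database still holds 4000 because the write step was never executed.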
States of Transactions:
A transaction in a database can be in one of the following states −
Active − In this state, the transaction is being executed. This is the initial state
of every transaction.
Partially Committed − When a transaction executes its final operation, it is
said to be in a partially committed state.
Failed − A transaction is said to be in a failed state if any of the checks made by
the database recovery system fails. A failed transaction can no longer proceed
further.
Aborted − If any of the checks fails and the transaction has reached a failed
state, then the recovery manager rolls back all its write operations on the
database to bring the database back to its original state where it was prior to the
execution of the transaction. Transactions in this state are called aborted. The
database recovery module can select one of the two operations after a
transaction aborts −
o Re-start the transaction
o Kill the transaction
Committed − If a transaction executes all its operations successfully, it is said
to be committed. All its effects are now permanently established in the database
system.
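The states above can be modelled as a small state machine. This is a minimal sketch: the `State` enum, the `TRANSITIONS` table, and the `advance` function are hypothetical names. The allowed moves follow the usual textbook state diagram, and it is an assumption here that a failure can occur from either the active or the partially committed state.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"


# Allowed moves between states (assumed from the usual state diagram).
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: set(),      # restart or kill happens outside this model
    State.COMMITTED: set(),
}


def advance(current, target):
    """Return the new state, or raise if the move is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

For example, `advance(State.ACTIVE, State.PARTIALLY_COMMITTED)` succeeds, while `advance(State.COMMITTED, State.ACTIVE)` raises a `ValueError`.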
Transaction property
A transaction has four properties, commonly called the ACID properties. They are used
to maintain consistency in a database before and after the transaction.
Properties of a transaction:
1. Atomicity
2. Consistency
3. Isolation
4. Durability
Atomicity
o It states that either all operations of the transaction take effect or none do; if
the transaction cannot complete, it is aborted.
o There is no midway, i.e., the transaction cannot occur partially. Each transaction
is treated as one unit and either runs to completion or is not executed at all.
Abort: If a transaction aborts, then none of the changes it made are visible.
Commit: If a transaction commits, then all the changes it made are visible.
Example: Assume that transaction T consists of two parts, T1 and T2. Account A holds
Rs 600 and account B holds Rs 300. T transfers Rs 100 from account A to account B.
T1              T2
Read(A)         Read(B)
A := A - 100    B := B + 100
Write(A)        Write(B)
If transaction T fails after T1 completes but before T2 completes, then the amount
is deducted from A but not added to B. This leaves the database in an inconsistent
state. To ensure the correctness of the database state, the transaction must be
executed in its entirety.
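The all-or-nothing behaviour can be sketched with a snapshot that is restored on failure. This is a toy illustration only: the `transfer` function and its `fail_between` flag are hypothetical, and a real DBMS undoes changes through its log rather than by copying the whole database.

```python
def transfer(accounts, source, target, amount, fail_between=False):
    snapshot = dict(accounts)            # remember the state before T starts
    try:
        accounts[source] -= amount       # T1: Read(A), A := A - amount, Write(A)
        if fail_between:
            raise RuntimeError("failure after T1, before T2")
        accounts[target] += amount       # T2: Read(B), B := B + amount, Write(B)
    except RuntimeError:
        accounts.clear()
        accounts.update(snapshot)        # abort: undo every change made by T
        return False
    return True


accounts = {"A": 600, "B": 300}
committed = transfer(accounts, "A", "B", 100)

failed_accounts = {"A": 600, "B": 300}
aborted_result = transfer(failed_accounts, "A", "B", 100, fail_between=True)
```

After the first call the balances are 500 and 400. After the second call the balances are back at 600 and 300, so the partial deduction from A is never visible.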
Consistency
o The integrity constraints are maintained so that the database is consistent before
and after the transaction.
o The execution of a transaction will leave a database in either its prior stable state
or a new stable state.
o The consistency property of the database states that every transaction sees a
consistent database instance.
o The transaction is used to transform the database from one consistent state to
another consistent state.
For example: The total amount must be the same before and after the transaction.
Total before T occurs = 600 + 300 = 900
Total after T occurs = 500 + 400 = 900
Therefore, the database is consistent. If T1 completes but T2 fails, the total
becomes 500 + 300 = 800 and inconsistency occurs.
Isolation
o It states that data used during the execution of one transaction cannot be used
by a second transaction until the first one is completed.
o In isolation, if transaction T1 is being executed and is using the data item X,
then that data item can't be accessed by any other transaction T2 until
transaction T1 ends.
o The concurrency control subsystem of the DBMS enforces the isolation property.
Durability
o When a transaction completes successfully, the database reaches a
state known as the consistent state. That consistent state cannot be lost, even in
the event of a system failure.
o The recovery subsystem of the DBMS is responsible for the durability
property.
Schedule
A sequence of operations from one or more transactions, arranged in the order in which
they execute, is known as a schedule. A schedule preserves the order of the operations
within each individual transaction.
1. Serial Schedule
The serial schedule is a type of schedule where one transaction is executed completely
before another transaction starts. In a serial schedule, when the first transaction
completes its cycle, the next transaction is executed.
For example: Suppose there are two transactions T1 and T2, each with some
operations. If there is no interleaving of operations, then there are the following two
possible outcomes:
1. Execute all the operations of T1, followed by all the operations of T2.
2. Execute all the operations of T2, followed by all the operations of T1.
o In the given (a) figure, Schedule A shows the serial schedule where T1 is followed by
T2.
o In the given (b) figure, Schedule B shows the serial schedule where T2 is followed
by T1.
2. Non-serial Schedule
3. Serializable schedule
o The serializability of schedules is used to find non-serial schedules that allow
transactions to execute concurrently without interfering with one another.
o It identifies which schedules are correct when the executions of the transactions
have interleaved operations.
o A non-serial schedule is serializable if its result is equal to the result of its
transactions executed serially.
Serializability in DBMS-
Serializable Schedules-
If a given non-serial schedule of ‘n’ transactions is equivalent to some serial
schedule of ‘n’ transactions, then it is called a serializable schedule.
Testing of Serializability
Assume a schedule S. For S, we construct a graph known as the precedence graph. This graph
is a pair G = (V, E), where V is a set of vertices and E is a set of edges. The set of
vertices contains all the transactions participating in the schedule. The set of edges
contains all edges Ti → Tj for which one of the three conditions holds:
1. Ti executes write(Q) before Tj executes read(Q).
2. Ti executes read(Q) before Tj executes write(Q).
3. Ti executes write(Q) before Tj executes write(Q).
o If the precedence graph contains an edge Ti → Tj, then in any serial schedule
equivalent to S, all the instructions of Ti are executed before the first instruction of Tj.
o If the precedence graph for schedule S contains a cycle, then S is non-serializable
(it is not conflict serializable). If the precedence graph has no cycle, then S is
serializable.
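The precedence-graph test can be sketched in Python. This is a minimal illustration under stated assumptions: a schedule is written as a list of hypothetical `(transaction, operation, item)` tuples with operation `"R"` or `"W"`, and the function names are chosen here only for illustration.

```python
def precedence_edges(schedule):
    """Return the set of edges (Ti, Tj) of the precedence graph."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Edge Ti -> Tj: different transactions, same item, at least one write.
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(nodes, edges):
    """Depth-first search that reports whether the directed graph has a cycle."""
    successors = {n: [b for a, b in edges if a == n] for n in nodes}
    state = {n: "unvisited" for n in nodes}

    def visit(n):
        state[n] = "in progress"
        for m in successors[n]:
            if state[m] == "in progress":
                return True            # reached a node on the current path again
            if state[m] == "unvisited" and visit(m):
                return True
        state[n] = "done"
        return False

    return any(state[n] == "unvisited" and visit(n) for n in nodes)


def is_conflict_serializable(schedule):
    nodes = {t for t, _, _ in schedule}
    return not has_cycle(nodes, precedence_edges(schedule))


interleaved = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
serial_like = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
```

For `interleaved`, the graph has edges T1 → T2 and T2 → T1, so it contains a cycle. For `serial_like`, the only edge is T1 → T2, so there is no cycle.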
Types of Serializability-
Serializability is mainly of two types-
1. Conflict Serializability
2. View Serializability
Conflict Serializability-
If a given non-serial schedule can be converted into a serial schedule by swapping its
non-conflicting operations, then it is called a conflict serializable schedule.
Conflicting Operations-
Two operations are called conflicting operations if all the following conditions hold true
for them-
Both the operations belong to different transactions
Both the operations are on the same data item
At least one of the two operations is a write operation
Example-
In this schedule,
W1 (A) and R2 (A) are called conflicting operations.
This is because all the above conditions hold true for them.
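The three conditions can be written as a single predicate. In this sketch, each operation is a hypothetical `(transaction, kind, item)` tuple with kind `"R"` or `"W"`; the representation is an assumption made for illustration.

```python
def conflicts(op1, op2):
    """Return True if the two operations form a conflict pair."""
    t1, kind1, item1 = op1
    t2, kind2, item2 = op2
    return (
        t1 != t2                     # different transactions
        and item1 == item2           # same data item
        and "W" in (kind1, kind2)    # at least one of them is a write
    )


w1_a = ("T1", "W", "A")
r2_a = ("T2", "R", "A")
```

Here `conflicts(w1_a, r2_a)` is true, matching W1 (A) and R2 (A) above, while two reads of the same item never conflict.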
Follow these steps to check whether a given non-serial schedule is conflict
serializable or not-
Step-01:
Find and list all the conflicting operations.
Step-02:
Start creating a precedence graph by drawing one node for each transaction.
Step-03:
Draw an edge for each conflict pair such that if Xi (V) and Yj (V) form a
conflict pair, then draw an edge from Ti to Tj.
This ensures that Ti gets executed before Tj.
Step-04:
Check whether the graph contains a cycle. If there is no cycle, the schedule is
conflict serializable; otherwise it is not.
For example:
Explanation:
The precedence graph for schedule S1 contains a cycle, so schedule S1 is
non-serializable.
Explanation:
The precedence graph for schedule S2 contains no cycle, so schedule S2 is
serializable.
Before learning about concurrency control, we should understand concurrent
execution.
o In a multi-user system, multiple users can access and use the same database at
one time, which is known as concurrent execution of the database. It
means that the same database is accessed simultaneously on a multi-user
system by different users.
o The simultaneous execution should be performed in an interleaved manner such
that no operation affects the other executing operations, thus maintaining the
consistency of the database. However, concurrent execution of transaction
operations gives rise to several challenging problems that need to be solved.
In a database transaction, the two main operations are READ and WRITE.
These two operations must be managed during the concurrent execution of
transactions, because if they are interleaved without control, the data may become
inconsistent. The following problems occur with the concurrent execution of
operations:
The lost update problem occurs when two different database transactions perform
read/write operations on the same database items in an interleaved manner (i.e.,
concurrent execution) that makes the values of the items incorrect, hence making the
database inconsistent.
For example:
Consider the below diagram where two transactions TX and TY, are performed on the
same account A where the balance of account A is $300.
o At time t1, transaction TX reads the value of account A, i.e., $300 (only read).
o At time t2, transaction TX deducts $50 from account A, which becomes $250 (only
deducted, not yet written).
o Alternately, at time t3, transaction TY reads the value of account A, which is still
$300 because TX has not written its update yet.
o At time t4, transaction TY adds $100 to account A, which becomes $400 (only
added, not yet written).
o At time t6, transaction TX writes the value of account A, which is updated to
$250, as TY has not written its update yet.
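The interleaving above can be replayed step by step in Python. This is only a sketch: the dictionary stands in for the database, the local variables stand in for each transaction's buffer, and the final write by TY is an assumed later step that is not listed among the steps above.

```python
account = {"A": 300}

tx_local = account["A"]        # t1: TX reads A = 300
tx_local -= 50                 # t2: TX computes 250 in its own buffer
ty_local = account["A"]        # t3: TY reads A = 300 (TX has not written yet)
ty_local += 100                # t4: TY computes 400 in its own buffer
account["A"] = tx_local        # t6: TX writes 250
after_tx_write = account["A"]
account["A"] = ty_local        # assumed later step: TY writes 400, overwriting 250
after_ty_write = account["A"]

serial_result = 300 - 50 + 100 # what either serial order would have produced
```

The final balance is 400, whereas running the two transactions one after the other would give 350. The $50 deduction written by TX is lost.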
The dirty read problem occurs when one transaction updates a database item and
then fails, and before the update is rolled back, the updated item is read by
another transaction. This creates a write-read (W-R) conflict between the two
transactions.
For example:
o At time t3, transaction TX writes the updated value of account A, i.e., $350.
o Then at time t4, transaction TY reads account A, which is read as $350.
o Then at time t5, transaction TX rolls back due to a server problem, and the value
changes back to $300 (as initially).
o But transaction TY continues to treat $350 as the committed value of account A,
which is a dirty read; this is therefore known as the Dirty Read Problem.
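A short replay of these steps in Python makes the stale value explicit. This is a sketch under assumptions: two hypothetical dictionaries separate the last committed state from the working state that includes uncommitted writes, and the $50 increase is inferred from the values $300 and $350 above.

```python
committed = {"A": 300}               # last committed state
working = dict(committed)            # state including uncommitted writes

working["A"] = committed["A"] + 50   # t3: TX writes 350 without committing
ty_seen = working["A"]               # t4: TY reads the uncommitted 350
working = dict(committed)            # t5: TX rolls back, A is 300 again

value_after_rollback = working["A"]
```

TY is left holding 350 even though the database value after the rollback is 300.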
The unrepeatable read problem, also known as the Inconsistent Retrievals Problem,
occurs when, within a single transaction, two different values are read for the same
database item.
For example:
o At time t1, transaction TX reads the value from account A, i.e., $300.
o At time t2, transaction TY reads the value from account A, i.e., $300.
o At time t3, transaction TY updates the value of account A by adding $100 to the
available balance, and then it becomes $400.
o After that, at time t5, transaction TX reads the available value of account A, and
that will be read as $400.
o It means that within the same transaction TX, two different values of
account A are read, i.e., $300 initially, and $400 after the update made by
transaction TY. This is an unrepeatable read and is therefore known as the
Unrepeatable read problem.
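The same steps can be replayed in Python. This is a sketch only; the dictionary stands in for the database and the variable names are hypothetical.

```python
account = {"A": 300}

tx_first_read = account["A"]    # t1: TX reads 300
ty_read = account["A"]          # t2: TY reads 300
account["A"] = ty_read + 100    # t3: TY updates A to 400
tx_second_read = account["A"]   # t5: TX reads again and sees the new value

reads_differ = tx_first_read != tx_second_read
```

TX reads 300 the first time and 400 the second time, so `reads_differ` is true.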
Thus, to maintain consistency in the database and avoid the problems that
arise in concurrent execution, management is needed, and that is where the
concept of Concurrency Control comes into play.
Concurrency Control :
Concurrency Control in Database Management System is a procedure of
managing simultaneous operations without conflicting with each other. It ensures
Lock-based Protocols
Two-phase locking has two phases, one is growing, where all the locks are being
acquired by the transaction; and the second phase is shrinking, where the locks
held by the transaction are being released.
A transaction that already holds a shared (read) lock on an item and needs an
exclusive (write) lock can upgrade the shared lock to an exclusive lock. Under
two-phase locking, such an upgrade is allowed only in the growing phase.
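The two-phase rule can be checked mechanically: once a transaction has released any lock, it may not acquire another. The sketch below assumes a hypothetical representation in which one transaction's lock activity is a list of `("lock", item)` and `("unlock", item)` pairs.

```python
def follows_two_phase_locking(actions):
    """Return True if no lock is acquired after the first unlock."""
    shrinking = False
    for kind, _item in actions:
        if kind == "unlock":
            shrinking = True           # the first release ends the growing phase
        elif kind == "lock" and shrinking:
            return False               # acquiring after a release breaks 2PL
    return True


two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
```

The first sequence acquires both locks before releasing either, so it passes the check. The second sequence locks B after unlocking A, so it fails.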
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in
the first phase, the transaction continues to execute normally. But in contrast to 2PL,
Strict-2PL does not release a lock as soon as it is no longer needed. Strict-2PL holds
all the locks until the commit point and releases them all at once.
Timestamp-based Protocols
A widely used concurrency protocol is the timestamp-based protocol.
This protocol uses either the system time or a logical counter as a timestamp.
Lock-based protocols manage the order between the conflicting pairs among
transactions at the time of execution, whereas timestamp-based protocols start
working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and the ordering is
determined by the age of the transaction. A transaction created at clock time 0002
is older than all transactions that come after it. For example, a
transaction 'y' entering the system at 0004 is two seconds younger, and priority
is given to the older one.
In addition, every data item carries its latest read-timestamp and write-timestamp.
This lets the system know when the last read and the last write operation were
performed on the data item.
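The read- and write-timestamps are typically used as in the following sketch of the basic timestamp-ordering checks. The rules shown are the usual textbook ones and are assumed here rather than taken from the text above; the `Item` class and the `try_read` and `try_write` functions are hypothetical names.

```python
class Item:
    def __init__(self):
        self.read_ts = 0    # timestamp of the youngest transaction that read the item
        self.write_ts = 0   # timestamp of the youngest transaction that wrote the item


def try_read(item, ts):
    """A transaction with timestamp ts asks to read the item."""
    if ts < item.write_ts:
        return False                       # a younger transaction already wrote it
    item.read_ts = max(item.read_ts, ts)
    return True


def try_write(item, ts):
    """A transaction with timestamp ts asks to write the item."""
    if ts < item.read_ts or ts < item.write_ts:
        return False                       # a younger transaction already used it
    item.write_ts = ts
    return True


x = Item()
read_by_4 = try_read(x, 4)      # transaction with timestamp 4 reads x
write_by_2 = try_write(x, 2)    # older transaction 2 tries to write afterwards
write_by_4 = try_write(x, 4)    # transaction 4 writes x
```

A rejected request means the requesting transaction would be rolled back. In this run the write by the older transaction 2 is rejected because the younger transaction 4 has already read the item.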
Deadlock in DBMS:
A deadlock is a condition where two or more transactions are waiting
indefinitely for one another to give up locks. Deadlock is one of the most
feared complications in a DBMS, because no task ever finishes and every task
involved stays in a waiting state forever.
For example: In the Student table, transaction T1 holds a lock on some rows and
needs to update some rows in the Grade table. Simultaneously, transaction T2 holds
locks on some rows in the Grade table and needs to update the rows in the Student
table held by transaction T1.
Now transaction T1 is waiting for T2 to release its lock and, similarly, transaction
T2 is waiting for T1 to release its lock. All activity comes to a halt and
remains at a standstill until the DBMS detects the deadlock and aborts one of the
transactions.
Deadlock Avoidance
Deadlock Detection
The wait-for graph for the above scenario is shown below:
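A wait-for graph can also be written down as data and searched for a cycle. The sketch below assumes a hypothetical dictionary that maps each transaction to the set of transactions it is waiting on; `find_deadlock` is an illustrative name, not a DBMS function.

```python
def find_deadlock(waits_for):
    """Return the sorted list of transactions that lie on a cycle."""
    def reachable(start):
        # Collect every transaction reachable by following wait-for edges.
        seen, stack = set(), list(waits_for.get(start, ()))
        while stack:
            t = stack.pop()
            if t not in seen:
                seen.add(t)
                stack.extend(waits_for.get(t, ()))
        return seen

    # A transaction is deadlocked if it can reach itself.
    return sorted(t for t in waits_for if t in reachable(t))


student_grade = {"T1": {"T2"}, "T2": {"T1"}}   # T1 waits for T2 and T2 waits for T1
no_deadlock = {"T1": {"T2"}, "T2": set()}      # T2 is not waiting for anyone
```

For `student_grade` both transactions are reported, and the DBMS would abort one of them. For `no_deadlock` the result is empty.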
Deadlock Prevention
Wait-Die scheme
In this scheme, if a transaction requests a resource that is already held with a
conflicting lock by another transaction, the DBMS simply compares the
timestamps of both transactions. It allows the older transaction to wait until the
resource is available for execution.
Let's assume there are two transactions Ti and Tj, and let TS(T) be the timestamp of
any transaction T. If Tj holds a lock on some resource and Ti is requesting
that resource, then the following actions are performed by the DBMS:
1. Check if TS(Ti) < TS(Tj) - If Ti is the older transaction and Tj holds
the resource, then Ti is allowed to wait until the data item is available.
That means if the older transaction is waiting for a resource
which is locked by the younger transaction, then the older transaction is
allowed to wait for the resource until it is available.
2. Check if TS(Ti) > TS(Tj) - If Ti is the younger transaction and the older
transaction Tj holds the resource, then Ti is killed and restarted later with
a random delay but with the same timestamp.
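The wait-die decision reduces to one timestamp comparison. In this sketch the function name and the returned strings are hypothetical, and a smaller timestamp is assumed to mean an older transaction.

```python
def wait_die(requester_ts, holder_ts):
    """Decide what happens to the transaction requesting a held resource."""
    if requester_ts < holder_ts:
        return "wait"    # older requester waits for the younger holder
    return "die"         # younger requester is aborted and restarted later


older_requests = wait_die(requester_ts=2, holder_ts=4)
younger_requests = wait_die(requester_ts=4, holder_ts=2)
```

Because a restarted transaction keeps its original timestamp, it eventually becomes the oldest transaction in the system and is then allowed to wait rather than die.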
There are both automatic and non-automatic ways of backing up
data and recovering from failure situations. The techniques used to recover
data lost due to a system crash, transaction errors, viruses, catastrophic failure,
incorrect command execution, etc. are called database recovery techniques. To
prevent data loss, recovery techniques based on deferred update and immediate
update, or on backing up data, can be used.
Recovery techniques are heavily dependent upon the existence of a special file
known as a system log. It contains information about the start and end of each
transaction and any updates which occur in the transaction. The log keeps track
of all transaction operations that affect the values of database items. This
information is needed to recover from transaction failure.
The log is kept on disk.
start_transaction(T): This log entry records that transaction T starts
execution.
read_item(T, X): This log entry records that transaction T reads the value of
database item X.
write_item(T,X):
A transaction T reaches its commit point when all its operations that access the
database have been executed successfully, i.e., the transaction has reached the
point at which it will not abort (terminate without completing). Once committed,
the transaction is permanently recorded in the database. Commitment always
involves writing a commit entry to the log and writing the log to disk. After
a system crash, the log is searched backward for all transactions T that have
written a start_transaction(T) entry into the log but have not written a commit(T)
entry yet; these transactions may have to be rolled back to undo their effect on the
database during the recovery process.
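The log search described above can be sketched as follows. The log format is an assumption made for illustration: each entry is a hypothetical tuple whose first field is the entry type and whose second field is the transaction.

```python
def transactions_to_undo(log):
    """Return transactions that started but never wrote a commit entry."""
    started, committed = set(), set()
    for entry in log:
        entry_type, txn = entry[0], entry[1]
        if entry_type == "start_transaction":
            started.add(txn)
        elif entry_type == "commit":
            committed.add(txn)
    return sorted(started - committed)


log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X"),
    ("commit", "T1"),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y"),
    # the system crashes here, before T2 commits
]
candidates = transactions_to_undo(log)
```

Here T1 has a commit entry and T2 does not, so only T2 is a candidate for rollback during recovery.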
Undoing – If a transaction crashes, then the recovery manager may undo
transactions i.e. reverse the operations of a transaction. This involves
Full database backup – The full database is backed up at predefined intervals,
including the data and the meta information needed to restore the whole database,
such as full-text catalogs.
Differential backup – It stores only the data changes that have occurred since the
last full database backup. When the same data has changed many times since the last
full database backup, a differential backup stores only the most recent version of
the changed data. To restore from it, we first need to restore a full database backup.
Transaction log backup – All events that have occurred in the database, such as a
record of every single statement executed, are backed up. It is the backup of
transaction log entries and contains all transactions that have happened
to the database. Through this, the database can be recovered to a specific point
in time. It is even possible to back up the transaction log when the
data files are destroyed, so that not a single committed transaction is lost.