UNIT IV DBMS by Grishman


UNIT IV - DBMS

UNIT IV - TRANSACTION MANAGEMENT


Transaction Concepts – Transaction Recovery – ACID Properties – Need for Concurrency Control –
Schedule and Recoverability – Serializability and Schedules – Concurrency Control – Types of Locks –
Two-Phase Locking – Deadlock – Timestamp-based Concurrency Control – Recovery Techniques –
Concepts – Immediate Update – Deferred Update – Shadow Paging

Transaction:
o A transaction is a set of logically related operations. It groups several tasks into a
single unit of work.
o A transaction is an action or series of actions carried out by a single user or
application program to read or update the contents of the database.

Example: Suppose a bank employee transfers Rs 800 from X's account to Y's
account. This small transaction consists of several low-level tasks:

X's Account

1. Open_Account(X)
2. Old_Balance = X.balance
3. New_Balance = Old_Balance - 800
4. X.balance = New_Balance
5. Close_Account(X)

Y's Account

1. Open_Account(Y)
2. Old_Balance = Y.balance
3. New_Balance = Old_Balance + 800
4. Y.balance = New_Balance
5. Close_Account(Y)

Operations of Transaction:
Following are the main operations of transaction:

Read(X): The read operation reads the value of X from the database and stores it
in a buffer in main memory.

Write(X): The write operation writes the value from the buffer back to the
database.


Let's take an example of a debit transaction on an account, which consists of the
following operations:

1. R(X);
2. X = X - 500;
3. W(X);

Let's assume the value of X before starting of the transaction is 4000.

o The first operation reads X's value from database and stores it in a buffer.
o The second operation will decrease the value of X by 500. So buffer will contain
3500.
o The third operation will write the buffer's value to the database. So X's final value
will be 3500.

However, a hardware, software, or power failure may cause the transaction to fail
before it has finished all the operations in the set.

For example: If the debit transaction above fails after executing operation 2, then
X's value remains 4000 in the database, which is not acceptable to the bank.

To solve this problem, we have two important operations:

Commit: It is used to save the work done permanently.

Rollback: It is used to undo the work done.
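
The debit example and the two operations above can be sketched in Python. This is only a minimal illustration, not how a real DBMS is implemented: the class name `Transaction` and the dictionary standing in for the database are assumptions made for the example, and the sketch stages every write in the buffer until commit, which is just one possible design.

```python
# Hypothetical in-memory "database": one data item X with value 4000.
database = {"X": 4000}


class Transaction:
    """Toy transaction: reads into a private buffer, writes on commit."""

    def __init__(self, db):
        self.db = db
        self.buffer = {}  # main-memory copies of the items touched so far

    def read(self, item):
        # R(X): copy the value from the database into the buffer.
        self.buffer[item] = self.db[item]
        return self.buffer[item]

    def write(self, item, value):
        # W(X): staged in the buffer until the transaction commits
        # (an assumption of this sketch).
        self.buffer[item] = value

    def commit(self):
        # Save the work done permanently.
        self.db.update(self.buffer)
        self.buffer.clear()

    def rollback(self):
        # Undo the work done: discard the buffered changes.
        self.buffer.clear()


t = Transaction(database)
t.write("X", t.read("X") - 500)
t.rollback()                      # simulated failure before commit
after_rollback = database["X"]    # the database still holds the old value

t = Transaction(database)
t.write("X", t.read("X") - 500)
t.commit()
after_commit = database["X"]      # the debit is now permanent
```

After the rollback the database still holds 4000; only the committed run changes it to 3500.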

States of Transactions:
A transaction in a database can be in one of the following states −

o Active − In this state, the transaction is being executed. This is the initial state
of every transaction.
o Partially Committed − When a transaction executes its final operation, it is
said to be in a partially committed state.
o Failed − A transaction is said to be in a failed state if any of the checks made by
the database recovery system fails. A failed transaction can no longer proceed
further.
o Aborted − If any of the checks fails and the transaction has reached a failed
state, then the recovery manager rolls back all its write operations on the
database to bring the database back to the state it was in prior to the
execution of the transaction. Transactions in this state are called aborted. The
database recovery module can select one of two operations after a
transaction aborts −
    o Re-start the transaction
    o Kill the transaction
o Committed − If a transaction executes all its operations successfully, it is said
to be committed. All its effects are now permanently established on the database
system.
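
The states listed above can be written down as a small transition table. This is a sketch under assumptions: the state names are taken from the list, but the edge from "partially committed" to "failed" follows the usual textbook state diagram and is not stated explicitly in the text, and the helper names `can_move` and `follow` are hypothetical.

```python
# Allowed state transitions, based on the list of states above.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    # Assumption: a failure can still occur before the commit completes.
    "partially committed": {"committed", "failed"},
    "failed": {"aborted"},
    "aborted": set(),      # terminal: the transaction is restarted or killed
    "committed": set(),    # terminal
}


def can_move(current, target):
    """Return True if a transaction may go from `current` to `target`."""
    return target in TRANSITIONS[current]


def follow(path):
    """Check that every consecutive pair of states in `path` is legal."""
    return all(can_move(a, b) for a, b in zip(path, path[1:]))
```

For instance, `follow(["active", "partially committed", "committed"])` is accepted, while a committed transaction cannot return to the active state.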


Transaction property
A transaction has four properties, known together as the ACID properties. They are
used to maintain consistency in a database before and after the transaction.

Property of Transaction
1. Atomicity
2. Consistency
3. Isolation
4. Durability

Atomicity
o It states that all operations of the transaction take effect as a single unit; if
they cannot, the transaction is aborted.
o There is no midway, i.e., the transaction cannot occur partially. Each transaction
is treated as one unit and either runs to completion or is not executed at all.

Atomicity involves the following two operations:

Abort: If a transaction aborts, then none of the changes it made are visible.

Commit: If a transaction commits, then all the changes it made are visible.

Example: Assume a transaction T consisting of two parts, T1 and T2. Account A holds
Rs 600 and account B holds Rs 300. T transfers Rs 100 from account A to account B.

T1              T2

Read(A)         Read(B)
A := A - 100    B := B + 100
Write(A)        Write(B)


After completion of the transaction, A consists of Rs 500 and B consists of Rs 400.

If the transaction T fails after the completion of T1 but before the completion of
T2, then the amount will be deducted from A but not added to B. This leaves the
database in an inconsistent state. To ensure the correctness of the database state,
the transaction must be executed in its entirety.

Consistency
o The integrity constraints are maintained so that the database is consistent before
and after the transaction.
o The execution of a transaction will leave a database in either its prior stable state
or a new stable state.
o The consistency property of the database states that every transaction sees a
consistent database instance.
o The transaction is used to transform the database from one consistent state to
another consistent state.

For example: The total amount must be the same before and after the transaction.

1. Total before T occurs = 600 + 300 = 900
2. Total after T occurs = 500 + 400 = 900

Therefore, the database is consistent. If T1 is completed but T2 fails, the total
becomes 500 + 300 = 800 and inconsistency will occur.
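
A short Python sketch of the A-to-B transfer shows how atomicity protects the total. It is an illustration only: the function `transfer`, its `fail_after_debit` flag, and the snapshot-based rollback are assumptions made for the example, not a description of how a DBMS implements abort.

```python
def transfer(accounts, source, target, amount, fail_after_debit=False):
    """Move `amount` from `source` to `target` as one atomic unit."""
    snapshot = dict(accounts)          # remembered state for rollback
    try:
        accounts[source] -= amount     # T1: debit A
        if fail_after_debit:
            raise RuntimeError("simulated failure between T1 and T2")
        accounts[target] += amount     # T2: credit B
    except RuntimeError:
        accounts.clear()
        accounts.update(snapshot)      # abort: restore the prior state
        return False
    return True                        # commit


accounts = {"A": 600, "B": 300}
ok = transfer(accounts, "A", "B", 100)
total_after_commit = sum(accounts.values())

failed = {"A": 600, "B": 300}
ok_failed = transfer(failed, "A", "B", 100, fail_after_debit=True)
total_after_abort = sum(failed.values())
```

In both runs the total stays at 900: either both halves of the transfer are applied, or neither is.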

Isolation

o It states that the data used during the execution of a transaction cannot be used
by a second transaction until the first one is completed.
o In isolation, if transaction T1 is being executed and is using the data item X,
then that data item cannot be accessed by any other transaction T2 until
transaction T1 ends.
o The concurrency control subsystem of the DBMS enforces the isolation property.

Durability

o The durability property guarantees the permanence of the database's consistent
state. It states that the changes made by a committed transaction are permanent.
o They cannot be lost through the erroneous operation of a faulty transaction or
through a system failure. When a transaction is completed, the database reaches a
state known as the consistent state. That consistent state cannot be lost, even in
the event of a system failure.
o The recovery subsystem of the DBMS is responsible for the durability property.

Schedule
A schedule is a sequence of operations from a set of transactions, listed in the order
in which they are executed. It must preserve the order of the operations within each
individual transaction.

1. Serial Schedule

The serial schedule is a type of schedule where one transaction is executed completely
before starting another transaction. In the serial schedule, when the first transaction
completes its cycle, then the next transaction is executed.

For example: Suppose there are two transactions T1 and T2, each with some
operations. If there is no interleaving of operations, then there are the following two
possible outcomes:

1. Execute all the operations of T1, followed by all the operations of T2.
2. Execute all the operations of T2, followed by all the operations of T1.

o In the given figure (a), Schedule A shows the serial schedule where T1 is followed
by T2.
o In the given figure (b), Schedule B shows the serial schedule where T2 is followed
by T1.

2. Non-serial Schedule

o If interleaving of operations is allowed, then the schedule is a non-serial schedule.
o There are many possible orders in which the system can execute the individual
operations of the transactions.
o In the given figures (c) and (d), Schedule C and Schedule D are non-serial
schedules. They have interleaving of operations.

3. Serializable schedule

o The serializability of schedules is used to find non-serial schedules that allow the
transaction to execute concurrently without interfering with one another.
o It identifies which schedules are correct when executions of the transaction have
interleaving of their operations.
o A non-serial schedule will be serializable if its result is equal to the result of its
transactions executed serially.


Here,

Schedule A and Schedule B are serial schedules.

Schedule C and Schedule D are non-serial schedules.

Serializability in DBMS-

o Some non-serial schedules may lead to inconsistency of the database.
o Serializability is a concept that helps to identify which non-serial schedules are correct
and will maintain the consistency of the database.

Serializable Schedules-

o If a given non-serial schedule of ‘n’ transactions is equivalent to some serial
schedule of ‘n’ transactions, then it is called a serializable schedule.

Testing of Serializability

A precedence graph (also called a serialization graph) is used to test the serializability of a
schedule.

Assume a schedule S. For S, we construct a precedence graph G = (V, E), where V is a set of
vertices and E is a set of edges. The set of vertices contains all the transactions participating
in the schedule. The set of edges contains every edge Ti → Tj for which one of the following
three conditions holds:

1. Create an edge Ti → Tj if Ti executes write(Q) before Tj executes read(Q).
2. Create an edge Ti → Tj if Ti executes read(Q) before Tj executes write(Q).
3. Create an edge Ti → Tj if Ti executes write(Q) before Tj executes write(Q).

o If the precedence graph contains an edge Ti → Tj, then in any serial schedule equivalent
to S, Ti must appear before Tj.
o If the precedence graph for schedule S contains a cycle, then S is non-serializable. If the
precedence graph has no cycle, then S is serializable.
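
The three edge rules and the cycle test can be sketched in Python as follows. This is a minimal illustration: the tuple encoding of a schedule, the function names `precedence_graph` and `has_cycle`, and the two sample schedules are assumptions made for the example and are not the schedules from the figures.

```python
def precedence_graph(schedule):
    """Build the edge set of the precedence graph.

    `schedule` is a list of (transaction, action, item) tuples in
    execution order, where action is "R" (read) or "W" (write).
    """
    edges = set()
    for i, (ti, act_i, item_i) in enumerate(schedule):
        for tj, act_j, item_j in schedule[i + 1:]:
            # Different transactions, same item, at least one write.
            conflict = (ti != tj and item_i == item_j
                        and "W" in (act_i, act_j))
            if conflict:
                edges.add((ti, tj))    # Ti's operation comes first
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {node: "new" for node in graph}

    def visit(node):
        state[node] = "open"
        for nxt in graph[node]:
            if state[nxt] == "open":
                return True            # back edge: cycle found
            if state[nxt] == "new" and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(state[n] == "new" and visit(n) for n in graph)


# Hypothetical schedules chosen for illustration.
s_bad = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
s_ok = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A")]
```

For `s_bad` the graph contains both T1 → T2 and T2 → T1, so the schedule is non-serializable; for `s_ok` the only edge is T1 → T2.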

Types of Serializability-
Serializability is mainly of two types-

1. Conflict Serializability
2. View Serializability

Conflict Serializability-

If a given non-serial schedule can be converted into a serial schedule by swapping its
non-conflicting operations, then it is called a conflict serializable schedule.

Conflicting Operations-

Two operations are called conflicting operations if all the following conditions hold true for
them-
o Both the operations belong to different transactions
o Both the operations are on the same data item
o At least one of the two operations is a write operation

Example-

Consider the following schedule-

In this schedule,
o W1(A) and R2(A) are conflicting operations.
o This is because all the above conditions hold true for them.


Checking Whether a Schedule is Conflict Serializable Or Not-

Use the following steps to check whether a given non-serial schedule is conflict
serializable or not-

Step-01:

Find and list all the conflicting operations.

Step-02:

Start creating a precedence graph by drawing one node for each transaction.

Step-03:

o Draw an edge for each conflict pair such that if Xi(V) and Yj(V) form a
conflict pair, with Xi(V) occurring first, then draw an edge from Ti to Tj.
o This ensures that Ti gets executed before Tj.

Step-04:

o Check if there is any cycle formed in the graph.
o If there is no cycle found, then the schedule is conflict serializable; otherwise
it is not.

For example:


Explanation:

Read(A): In T1, no subsequent writes to A, so no new edges
Read(B): In T2, no subsequent writes to B, so no new edges
Read(C): In T3, no subsequent writes to C, so no new edges
Write(B): B is subsequently read by T3, so add edge T2 → T3
Write(C): C is subsequently read by T1, so add edge T3 → T1
Write(A): A is subsequently read by T2, so add edge T1 → T2
Write(A): In T2, no subsequent reads to A, so no new edges
Write(C): In T1, no subsequent reads to C, so no new edges
Write(B): In T3, no subsequent reads to B, so no new edges


Precedence graph for schedule S1:

The precedence graph for schedule S1 contains a cycle (T1 → T2 → T3 → T1), so
schedule S1 is non-serializable.


Explanation:

Read(A): In T4, no subsequent writes to A, so no new edges
Read(C): In T4, no subsequent writes to C, so no new edges
Write(A): A is subsequently read by T5, so add edge T4 → T5
Read(B): In T5, no subsequent writes to B, so no new edges
Write(C): C is subsequently read by T6, so add edge T4 → T6
Write(B): B is subsequently read by T6, so add edge T5 → T6
Write(C): In T6, no subsequent reads to C, so no new edges
Write(A): In T5, no subsequent reads to A, so no new edges
Write(B): In T6, no subsequent reads to B, so no new edges


Precedence graph for schedule S2:

The precedence graph for schedule S2 contains no cycle, so schedule S2 is
serializable. An equivalent serial order is T4, T5, T6.
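
The edges listed in the two explanations can be checked mechanically. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later); the helper name `serial_order` is hypothetical, and the edge lists are copied from the explanations for S1 and S2.

```python
from graphlib import CycleError, TopologicalSorter


def serial_order(edges):
    """Return an equivalent serial order, or None if the graph has a cycle."""
    sorter = TopologicalSorter()
    for before, after in edges:
        sorter.add(after, before)      # `after` must come after `before`
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


# Edges taken from the explanations for schedules S1 and S2.
s1_edges = [("T2", "T3"), ("T3", "T1"), ("T1", "T2")]
s2_edges = [("T4", "T5"), ("T4", "T6"), ("T5", "T6")]

s1_order = serial_order(s1_edges)      # cycle, so no serial order exists
s2_order = serial_order(s2_edges)      # the only valid order: T4, T5, T6
```

A topological order exists exactly when the precedence graph has no cycle, which is why S2 yields an order and S1 does not.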

DBMS Concurrency Control

Concurrency control is the management procedure required for controlling the
concurrent execution of the operations that take place on a database.

Before looking at concurrency control, we should understand concurrent
execution.

Concurrent Execution in DBMS

o In a multi-user system, multiple users can access and use the same database at
the same time, which is known as concurrent execution. It means that
transactions from different users run against the same database
simultaneously.

o While processing database transactions, multiple users often need the database
at once to perform different operations, and in that case their transactions
are executed concurrently.

o The simultaneous execution should be interleaved in such a way that no
operation affects the other executing operations, so that the consistency of
the database is maintained. Concurrent execution of transaction operations
therefore raises several challenging problems that need to be solved.


Problems with Concurrent Execution

In a database transaction, the two main operations are READ and WRITE.
These two operations must be managed during the concurrent execution of
transactions, because if they are interleaved without control, the data may become
inconsistent. The following problems occur with the concurrent execution of
operations:

Lost Update Problem (W-W Conflict)

The problem occurs when two different database transactions perform the read/write
operations on the same database items in an interleaved manner (i.e., concurrent
execution) that makes the values of the items incorrect hence making the database
inconsistent.

For example:

Consider the below diagram where two transactions TX and TY, are performed on the
same account A where the balance of account A is $300.

o At time t1, transaction TX reads the value of account A, i.e., $300 (only read).

o At time t2, transaction TX deducts $50 from account A, giving $250 (only
deducted, not yet written).

o At time t3, transaction TY reads the value of account A, which is still
$300 because TX has not written its update yet.

o At time t4, transaction TY adds $100 to account A, giving $400 (only
added, not yet written).

o At time t6, transaction TX writes the value of account A, which is updated to
$250, as TY has not written its value yet.

o Similarly, at time t7, transaction TY writes the value it computed at time t4,
i.e., $400. This overwrites the value written by TX, so the update to $250 is
lost.

Hence the data becomes incorrect, and the database is left in an inconsistent state.
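
The interleaving can be replayed step by step in Python. This is only a sketch: the variables `tx_local` and `ty_local` stand for the private working copies of TX and TY, an assumption made for the illustration.

```python
balance = {"A": 300}

# Each transaction works on its own private copy of A.
tx_local = balance["A"]        # t1: TX reads 300
tx_local -= 50                 # t2: TX computes 250 (not yet written)
ty_local = balance["A"]        # t3: TY reads 300, TX has not written yet
ty_local += 100                # t4: TY computes 400 (not yet written)
balance["A"] = tx_local        # t6: TX writes 250
balance["A"] = ty_local        # t7: TY writes 400, overwriting TX's update

interleaved_result = balance["A"]

# A serial execution (TX, then TY) would apply both updates.
serial_result = 300 - 50 + 100
```

The interleaved run ends with $400, whereas either serial order would end with $350, so TX's deduction has been lost.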

Dirty Read Problems (W-R Conflict)

The dirty read problem occurs when one transaction updates a database item and
then fails, and before the update is rolled back, the updated item is read by
another transaction. This creates a write-read conflict between the two
transactions.

For example:

Consider two transactions TX and TY in the below diagram performing read/write
operations on account A where the available balance in account A is $300:


o At time t1, transaction TX reads the value of account A, i.e., $300.

o At time t2, transaction TX adds $50 to account A that becomes $350.

o At time t3, transaction TX writes the updated value in account A, i.e., $350.

o Then at time t4, transaction TY reads account A that will be read as $350.

o Then at time t5, transaction TX rolls back due to a server problem, and the value
changes back to $300 (as initially).

o But transaction TY still holds the value $350 for account A, as if it had been
committed. This is a dirty read, and the situation is therefore known as the
Dirty Read Problem.
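
The same trace can be written out in Python. In this sketch the dictionaries `committed` and `uncommitted` are hypothetical stand-ins for the last committed state and the working state that other transactions are allowed to see.

```python
committed = {"A": 300}          # last committed value
uncommitted = dict(committed)   # working copy that other readers can see

uncommitted["A"] = committed["A"] + 50   # t1-t3: TX reads 300, writes 350
ty_seen = uncommitted["A"]               # t4: TY reads the uncommitted 350
uncommitted = dict(committed)            # t5: TX rolls back, A is 300 again

# TY is left holding a value that was never committed.
dirty = ty_seen != committed["A"]
```

TY continues with $350 even though the database only ever committed $300.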

Unrepeatable Read Problem (R-W Conflict)

Also known as the Inconsistent Retrievals Problem, it occurs when a transaction
reads two different values for the same database item.

For example:

Consider two transactions, TX and TY, performing the read/write operations on
account A, having an available balance = $300. The diagram is shown below:


o At time t1, transaction TX reads the value from account A, i.e., $300.

o At time t2, transaction TY reads the value from account A, i.e., $300.

o At time t3, transaction TY updates the value of account A by adding $100 to the
available balance, and then it becomes $400.

o At time t4, transaction TY writes the updated value, i.e., $400.

o After that, at time t5, transaction TX reads the available value of account A, and
that will be read as $400.

o It means that within the same transaction TX, two different values of account A
are read: $300 initially, and $400 after the update made by transaction TY. It
is an unrepeatable read and is therefore known as the Unrepeatable Read
problem.
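
Replaying the five time steps in Python makes the two different reads visible. The variable names are assumptions made for this sketch.

```python
account = {"A": 300}

tx_first_read = account["A"]     # t1: TX reads 300
ty_value = account["A"]          # t2: TY reads 300
ty_value += 100                  # t3: TY adds 100
account["A"] = ty_value          # t4: TY writes 400
tx_second_read = account["A"]    # t5: TX reads again and sees 400

repeatable = tx_first_read == tx_second_read
```

Within the single transaction TX the two reads disagree, so `repeatable` is false.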

Thus, in order to maintain consistency in the database and avoid the problems that
arise in concurrent execution, management is needed, and that is where the
concept of Concurrency Control comes into play.

Concurrency Control :
Concurrency control in a database management system is a procedure for
managing simultaneous operations so that they do not conflict with each other. It
ensures that database transactions are performed concurrently and accurately,
producing correct results without violating the data integrity of the database.

In a multiprogramming environment where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions.
Concurrency control protocols ensure the atomicity, isolation, and
serializability of concurrent transactions. They can be broadly divided into two
categories −

o Lock-based protocols
o Timestamp-based protocols

Lock-based Protocols

Database systems equipped with lock-based protocols use a mechanism by which
no transaction can read or write data until it acquires an appropriate lock on it.
Locks are of two kinds −
o Binary Locks − A lock on a data item can be in two states; it is either
locked or unlocked.
o Shared/exclusive − This type of locking mechanism differentiates the locks
based on their uses. If a lock is acquired on a data item to perform a write
operation, it is an exclusive lock. Allowing more than one transaction to
write on the same data item would lead the database into an inconsistent
state. Read locks are shared because no data value is being changed.
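
The shared/exclusive rule can be sketched as a tiny lock table. This is a simplified illustration under assumptions: the class `LockTable` and its methods are hypothetical, a refused request simply returns False instead of queueing, and deadlock handling is left out.

```python
class LockTable:
    """Toy shared/exclusive lock table (no queueing, no deadlock handling)."""

    def __init__(self):
        self.holders = {}   # item -> (mode, set of transaction ids)

    def acquire(self, txn, item, mode):
        """Try to lock `item` in mode "S" or "X"; return True if granted."""
        if item not in self.holders:
            self.holders[item] = (mode, {txn})
            return True
        held_mode, owners = self.holders[item]
        if mode == "S" and held_mode == "S":
            owners.add(txn)            # shared locks are compatible
            return True
        if owners == {txn}:
            # The sole holder may re-request or upgrade S -> X.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self.holders[item] = (new_mode, owners)
            return True
        return False                   # conflicting lock: caller must wait

    def release(self, txn, item):
        """Drop `txn`'s lock on `item`."""
        _mode, owners = self.holders[item]
        owners.discard(txn)
        if not owners:
            del self.holders[item]
```

Two readers can hold a shared lock on the same item at once, while a writer is refused until both have released it.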
There are four types of lock protocols available −

Simplistic Lock Protocol

Simplistic lock-based protocols allow transactions to obtain a lock on every object
before a 'write' operation is performed. Transactions may unlock the data item
after completing the ‘write’ operation.

Pre-claiming Lock Protocol

Pre-claiming protocols evaluate their operations and create a list of data items on
which they need locks. Before initiating an execution, the transaction requests the
system for all the locks it needs beforehand. If all the locks are granted, the
transaction executes and releases all the locks when all its operations are over. If
not all the locks are granted, the transaction rolls back and waits until all the locks
are granted.


Two-Phase Locking (2PL)

This locking protocol divides the execution of a transaction into three parts.
In the first part, when the transaction starts executing, it seeks permission for the
locks it requires. In the second part, the transaction acquires all the locks.
The third part starts as soon as the transaction releases its first lock. In this
part, the transaction cannot demand any new locks; it only releases the acquired
locks.


Two-phase locking thus has two phases: a growing phase, in which the transaction
acquires all its locks, and a shrinking phase, in which the locks held by the
transaction are released.
To claim an exclusive (write) lock on an item it already reads, a transaction can
first acquire a shared (read) lock and then upgrade it to an exclusive lock; such an
upgrade is allowed only during the growing phase.
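
Whether one transaction's sequence of lock and unlock steps respects the two phases can be checked with a few lines of Python. The function name `obeys_two_phase_locking` and the tuple encoding of the steps are assumptions made for this sketch.

```python
def obeys_two_phase_locking(actions):
    """Return True if no lock is acquired after the first unlock.

    `actions` is one transaction's sequence of ("lock", item) and
    ("unlock", item) steps, in execution order.
    """
    shrinking = False
    for kind, _item in actions:
        if kind == "unlock":
            shrinking = True           # the shrinking phase has begun
        elif kind == "lock" and shrinking:
            return False               # growing after shrinking: not 2PL
    return True


two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
```

The first sequence acquires everything before releasing anything and passes; the second locks B after unlocking A and is rejected.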
Strict Two-Phase Locking

The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in
the first phase, the transaction continues to execute normally. But in contrast to 2PL,
Strict-2PL does not release a lock after using it. Strict-2PL holds all the locks until
the commit point and then releases them all at once.

Because other transactions cannot read uncommitted data, Strict-2PL does not
suffer from cascading aborts as 2PL does.

Timestamp-based Protocols

Timestamp-based protocols are a widely used alternative to locking. They use
either the system time or a logical counter as a timestamp.
Lock-based protocols manage the order between the conflicting pairs among
transactions at the time of execution, whereas timestamp-based protocols start
working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and the ordering is
determined by the age of the transaction. A transaction created at clock time 0002
is older than all transactions that come after it. For example, a
transaction 'y' entering the system at 0004 is two seconds younger, and priority
is given to the older one.
In addition, every data item carries its latest read-timestamp and write-timestamp.
This lets the system know when the last read and write operations were performed
on the data item.


Timestamp Ordering Protocol

The timestamp-ordering protocol ensures serializability among transactions in
their conflicting read and write operations. It is the responsibility of the
protocol that each conflicting pair of operations is executed in the order of
the timestamps of the transactions.

o The timestamp of transaction Ti is denoted as TS(Ti).
o The read timestamp of data item X is denoted by R-timestamp(X).
o The write timestamp of data item X is denoted by W-timestamp(X).

The timestamp-ordering protocol works as follows −

o If a transaction Ti issues a read(X) operation −
    o If TS(Ti) < W-timestamp(X), the operation is rejected and Ti is rolled back.
    o If TS(Ti) >= W-timestamp(X), the operation is executed and R-timestamp(X)
      is set to the larger of R-timestamp(X) and TS(Ti).
o If a transaction Ti issues a write(X) operation −
    o If TS(Ti) < R-timestamp(X), the operation is rejected and Ti is rolled back.
    o If TS(Ti) < W-timestamp(X), the operation is rejected and Ti is rolled back.
    o Otherwise, the operation is executed and W-timestamp(X) is set to TS(Ti).

Thomas' Write Rule

Under the basic protocol, if TS(Ti) < W-timestamp(X), the write operation is
rejected and Ti is rolled back. Thomas' write rule modifies the timestamp-ordering
rules for this case: instead of rolling Ti back, the outdated 'write' operation
itself is ignored and Ti continues. With this modification the resulting schedules
are view serializable.
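
The read and write checks, together with Thomas' write rule, can be sketched in Python. This is a minimal illustration: the class `Item`, the functions `read` and `write`, the `thomas` flag, and the string results are assumptions made for the example, and integer timestamps stand in for TS(Ti).

```python
class Item:
    """A data item with its read and write timestamps."""

    def __init__(self, value):
        self.value = value
        self.r_ts = 0   # R-timestamp(X)
        self.w_ts = 0   # W-timestamp(X)


def read(ts, item):
    """Timestamp-ordering read; returns "ok" or "rollback"."""
    if ts < item.w_ts:
        return "rollback"              # a younger transaction already wrote X
    item.r_ts = max(item.r_ts, ts)
    return "ok"


def write(ts, item, value, thomas=False):
    """Timestamp-ordering write, optionally with Thomas' write rule."""
    if ts < item.r_ts:
        return "rollback"              # a younger transaction already read X
    if ts < item.w_ts:
        # Outdated write: ignored under Thomas' rule, otherwise rolled back.
        return "ignored" if thomas else "rollback"
    item.value = value
    item.w_ts = ts
    return "ok"


x = Item(10)
results = [
    write(5, x, 50),                   # executed, W-timestamp(X) becomes 5
    read(3, x),                        # rejected: 3 < W-timestamp(X)
    write(4, x, 40),                   # rejected under the basic protocol
    write(4, x, 40, thomas=True),      # ignored under Thomas' write rule
    read(7, x),                        # executed, R-timestamp(X) becomes 7
    write(6, x, 60, thomas=True),      # rejected: 6 < R-timestamp(X)
]
```

Note that Thomas' write rule only relaxes the W-timestamp check; a write that is older than the latest read is still rolled back.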


Deadlock in DBMS:
A deadlock is a condition in which two or more transactions wait
indefinitely for one another to give up locks. Deadlock is one of the most
feared complications in a DBMS, because none of the transactions involved can ever
finish; they remain in the waiting state forever.

For example: Transaction T1 holds a lock on some rows in the Student table and
needs to update some rows in the Grade table. Simultaneously, transaction T2 holds
locks on some rows in the Grade table and needs to update the rows in the Student
table held by transaction T1.

Now transaction T1 is waiting for T2 to release its lock and, similarly, transaction
T2 is waiting for T1 to release its lock. All activity comes to a halt and
remains at a standstill until the DBMS detects the deadlock and aborts one of
the transactions.

Deadlock Avoidance

o It is better to avoid a deadlock than to abort or restart transactions after
the database is already stuck in one, since aborting and restarting waste
time and resources.
o A deadlock avoidance mechanism is used to detect a potential deadlock situation
in advance. A method such as the "wait-for graph" can be used for this, but it
is suitable only for smaller databases. For larger databases, a deadlock
prevention method can be used.


Deadlock Detection

When a transaction in a database waits indefinitely to obtain a lock, the
DBMS should detect whether the transaction is involved in a deadlock. The
lock manager maintains a wait-for graph to detect deadlock cycles in the
database.

Wait-for Graph

o This is a suitable method for deadlock detection. In this method, a graph is
created based on the transactions and their locks. If the graph has a
cycle or closed loop, then there is a deadlock.
o The system maintains the wait-for graph for every transaction that is
waiting for data held by another transaction, and keeps checking whether
the graph contains a cycle.

For the scenario above, the wait-for graph has two nodes, T1 and T2, with an edge
T1 → T2 (T1 waits for T2) and an edge T2 → T1 (T2 waits for T1); these two edges
form a cycle.
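
Cycle detection on a wait-for graph can be sketched in Python. The function name `find_deadlock` and the dictionary encoding of the graph are assumptions for the illustration, and the simple recursive search is only meant for small graphs.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None.

    `waits_for` maps each transaction to the set of transactions
    whose locks it is waiting for.
    """
    def visit(node, path):
        if node in path:
            return path[path.index(node):]      # cycle found
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in waits_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# The Student/Grade scenario: T1 waits for T2 and T2 waits for T1.
deadlocked = {"T1": {"T2"}, "T2": {"T1"}}
no_deadlock = {"T1": {"T2"}, "T2": set()}
```

For `deadlocked` the function reports the cycle formed by T1 and T2; the DBMS would then abort one of them to break it.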

Deadlock Prevention

o The deadlock prevention method is suitable for a large database. If resources
are allocated in such a way that a deadlock can never occur, then deadlock
is prevented.
o The database management system analyzes the operations of a transaction
to determine whether they can create a deadlock situation. If they can, the
DBMS does not allow that transaction to be executed.


Wait-Die scheme

In this scheme, if a transaction requests a resource that is already held with a
conflicting lock by another transaction, the DBMS simply compares the
timestamps of the two transactions. It allows only the older transaction to wait
until the resource is available.

Assume there are two transactions Ti and Tj, and let TS(T) be the timestamp of
any transaction T. If Tj holds a lock on a resource and Ti requests that resource,
the DBMS performs one of the following actions:

1. If TS(Ti) < TS(Tj), then Ti is the older transaction, and Ti is allowed to
wait until the data item is available. That is, if an older transaction
requests a resource locked by a younger transaction, the older transaction
is allowed to wait until the resource is available.
2. If TS(Ti) > TS(Tj), then Ti is the younger transaction, and Ti is killed
("dies") and restarted later after a random delay but with the same
timestamp.

Wound wait scheme


o In the wound-wait scheme, if an older transaction requests a resource
held by a younger transaction, the older transaction forces the younger one
to abort and release the resource. After a short delay, the
younger transaction is restarted, but with the same timestamp.
o If the older transaction holds a resource requested by a younger
transaction, then the younger transaction is asked to wait until the older
one releases it.
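
Both schemes reduce to a comparison of two timestamps, where a smaller timestamp means an older transaction. The functions `wait_die` and `wound_wait` below are a hypothetical sketch of that decision, not part of any real DBMS interface.

```python
def wait_die(ts_requester, ts_holder):
    """Wait-die: an older requester waits, a younger requester dies."""
    return "wait" if ts_requester < ts_holder else "abort requester"


def wound_wait(ts_requester, ts_holder):
    """Wound-wait: an older requester wounds the holder, a younger one waits."""
    return "abort holder" if ts_requester < ts_holder else "wait"
```

In both schemes the transaction that gets aborted is always the younger one, which keeps its original timestamp when restarted and therefore eventually becomes the oldest.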

Database Recovery Techniques in DBMS:


Database systems, like any other computer system, are subject to failures,
but the data stored in them must be available as and when required. When a
database fails, it must possess facilities for fast recovery. It must also guarantee
atomicity, i.e. either a transaction is completed successfully and committed (its
effect is recorded permanently in the database) or the transaction has no effect
on the database.


There are both automatic and non-automatic ways of backing up data and of
recovering from failure situations. Database recovery techniques are the
techniques used to recover data lost due to a system crash, transaction errors,
viruses, catastrophic failure, incorrect command execution, etc. To prevent data
loss, recovery techniques based on deferred update and immediate update, or on
backing up data, can be used.

Recovery techniques are heavily dependent upon the existence of a special file
known as a system log. It contains information about the start and end of each
transaction and any updates which occur in the transaction. The log keeps track
of all transaction operations that affect the values of database items. This
information is needed to recover from transaction failure. The log is kept on disk
and contains entries such as the following:
o start_transaction(T): This log entry records that transaction T starts
execution.
o read_item(T, X): This log entry records that transaction T reads the value of
database item X.
o write_item(T, X, old_value, new_value): This log entry records that
transaction T changes the value of database item X from old_value to new_value.

A transaction T reaches its commit point when all its operations that access the
database have been executed successfully, i.e. the transaction has reached the
point at which it will not abort (terminate without completing). Once committed,
the transaction is permanently recorded in the database. Commitment always
involves writing a commit entry to the log and writing the log to disk. At the time
of a system crash, the log is searched backward for all transactions T that have
written a start_transaction(T) entry into the log but have not yet written a
commit(T) entry; these transactions may have to be rolled back to undo their effect
on the database during the recovery process.
o Undoing – If a transaction crashes, the recovery manager may undo it, i.e.
reverse its operations. This involves examining the log for the transaction's
write_item(T, x, old_value, new_value) entries and setting the value of item x
in the database back to old_value.

There are two major techniques for recovery from non-catastrophic
transaction failures: deferred update and immediate update.
o Deferred update – This technique does not physically update the database on
disk until a transaction has reached its commit point. Before reaching commit,
all transaction updates are recorded in the local transaction workspace. If a
transaction fails before reaching its commit point, it will not have changed the
database in any way, so UNDO is not needed. It may be necessary to REDO the
effect of the operations that are recorded in the local transaction workspace,
because their effect may not yet have been written to the database. Hence,
deferred update is also known as the No-undo/redo algorithm.
o Immediate update – In immediate update, the database may be
updated by some operations of a transaction before the transaction
reaches its commit point. However, these operations are recorded in a log
on disk before they are applied to the database, making recovery still
possible. If a transaction fails to reach its commit point, the effect of its
operations must be undone, i.e. the transaction must be rolled back; hence
we require both undo and redo. This technique is known as the undo/redo
algorithm.
o Caching/Buffering – In this technique, one or more disk pages that include data
items to be updated are cached into main memory buffers and then updated in
memory before being written back to disk. A collection of in-memory buffers
called the DBMS cache is kept under the control of the DBMS for holding these
buffers. A directory is used to keep track of which database items are in the
buffer. A dirty bit is associated with each buffer, which is 0 if the buffer is not
modified and 1 if it is modified.

o Shadow paging – It provides atomicity and durability. A directory with n
entries is constructed, where the ith entry points to the ith database page on
disk. When a transaction begins executing, the current directory is copied into a
shadow directory. When a page is to be modified, a shadow page is allocated in
which the changes are made, and when it is ready to become durable, all pages
that refer to the original are updated to refer to the new replacement page.
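
The undo and redo steps described for immediate update can be sketched as a recovery pass over the log. This is a simplified illustration under assumptions: the tuple layout of the log entries mirrors start_transaction, write_item(T, x, old_value, new_value) and commit, while the function name `recover` and the sample log are hypothetical.

```python
def recover(database, log):
    """Undo/redo recovery from a log of tuples.

    Entries: ("start", T), ("write", T, item, old_value, new_value),
    ("commit", T). Uncommitted writes are undone, committed ones redone.
    """
    committed = {entry[1] for entry in log if entry[0] == "commit"}
    # UNDO pass: scan backward, restoring old values of uncommitted writes.
    for entry in reversed(log):
        if entry[0] == "write" and entry[1] not in committed:
            _, _, item, old_value, _ = entry
            database[item] = old_value
    # REDO pass: scan forward, reapplying new values of committed writes.
    for entry in log:
        if entry[0] == "write" and entry[1] in committed:
            _, _, item, _, new_value = entry
            database[item] = new_value
    return database


log = [
    ("start", "T1"), ("write", "T1", "X", 4000, 3500), ("commit", "T1"),
    ("start", "T2"), ("write", "T2", "Y", 100, 900),   # T2 never committed
]
# Assumed state on disk at the crash: T1's write had not reached disk,
# while T2's uncommitted write had.
disk = {"X": 4000, "Y": 900}
recovered = recover(dict(disk), log)
```

After recovery X holds the committed value 3500 and Y is back at 100. Under deferred update the undo pass would be unnecessary, because uncommitted writes never reach the database.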

Some of the backup techniques are as follows:

o Full database backup – The full database, including the data and the meta
information needed to restore the whole database (including full-text catalogs),
is backed up on a predefined schedule.
o Differential backup – It stores only the data changes that have occurred since
the last full database backup. When the same data has changed many times since
the last full database backup, a differential backup stores only the most recent
version of the changed data. To restore from it, we first need to restore the full
database backup.
o Transaction log backup – All events that have occurred in the
database, such as a record of every single statement executed, are backed up. It is
a backup of the transaction log entries and contains all transactions that have
happened to the database. Through this, the database can be recovered to a
specific point in time. It is even possible to perform a backup from the transaction
log if the data files are destroyed, so that not even a single committed transaction
is lost.
