DBMS

Download as pdf or txt
Download as pdf or txt
You are on page 1of 27

UNIT -IV

ACID PROPERTIES
Transaction: A transaction is a set of actions that forms a single logical unit of task or work.
To maintain the consistency and integrity of the data, there are four properties defined in DBMS
known as the ACID properties.
1. Atomicity
2. Consistency
3. Isolation
4. Durability

Atomicity: It means if any operation is performed on the data, either it should be performed
completely or should not at all occur. In the case of executing operations on the transaction, the
operation should be completely executed and not partially.

Ex: If some money is transferred between A and B, then money debited from A should be
credited in B else money should not at all be debited from A. It means atomicity is preserved.

Consistency: It means that the value should remain preserved always i.e. integrity of the data
should be maintained. The total sum before and after the transaction should be same.

Ex: If Account A wants to transfer 50$ to account B with initial balances of A and B as 500$ and
300$, then the total sum before the transaction is 800$. Then after the successful transaction T, if
the available amount in A becomes 450$ and B becomes $350, then also the total sum is 800$ . It
means the consistency of database is preserved.

Isolation: It is the property of a database where no data should affect the other one and may
occur concurrently. In short, the operation on one database should begin when the operation on
the first database gets complete.

Ex: If two operations are concurrently running on two different accounts, then the value of both
accounts should not get affected.
Durability: Durability ensures the permanency i.e. even if the system fails or leads to a crash,
the database still survives. It becomes the responsibility of the recovery manager for ensuring the
durability of the database.

Transaction states:

In a database, the transaction can be in one of the following states -

Active state
o The active state is the first state of every transaction. In this state, the transaction is being
executed.

Partially committed
o In the partially committed state, a transaction executes its final operation, but the data is
still not saved to the database.
Committed

A transaction is said to be in a committed state if it executes all its operations successfully. In


this state, all the effects are now permanently saved on the database system.

Failed state
o A transaction is said to be in the failed state if it cannot proceed with normal execution.

Aborted: If a transaction has reached a failed state then the database recovery system will make
sure that the database is in its previous consistent state. If not then it will abort or roll back the
transaction to bring the database into a consistent state.

Concurrent Execution of Transaction

In the transaction process, a system usually allows executing more than one transaction
simultaneously. This process is called a concurrent execution.

Advantages of concurrent execution of a transaction

Decrease waiting time or turnaround time.

Improve response time

Increased throughput or resource utilization.

Concurrency Problems in DBMS (Different types of conflicts)-

• When multiple transactions execute concurrently in an uncontrolled or unrestricted manner,


then it might lead to several problems.
• Such problems are called as concurrency problems.
The concurrency problems are-
1. Dirty Read Problem/ Reading uncommitted data (Write-Read Conflict)
2. Unrepeatable Read Problem/ Non repeatable read (Read-Write Conflict)
3. Lost Update Problem/ Overwriting Uncommitted data (Write-Write Conflict)
4. Phantom Read Problem

1. Dirty Read Problem-

Reading the data written by an uncommitted transaction is called as dirty read.

This read is called as dirty read because-

• There is always a chance that the uncommitted transaction might roll back later.
• This leads to inconsistency of the database.
Example-

In this example,

• T2 reads the dirty value of A written by the uncommitted transaction T1.


• T1 fails in later stages and roll backs.
• Thus, the value that T2 read now stands to be incorrect.
• Therefore, database becomes inconsistent.

2. Unrepeatable Read Problem-


This problem occurs when a transaction gets to read unrepeated i.e. different values of the same
variable in its different read operations even when it has not updated its value.

Example-

Here,

1. T1 reads the value of X (= 10 say).


2. T2 reads the value of X (= 10).
3. T1 updates the value of X (from 10 to 15 say) in the buffer.
4. T2 again reads the value of X (but = 15).
In this example,

• T2 gets to read a different value of X in its second reading.


• T2 wonders how the value of X got changed because according to it, it is running in isolation.

3. Lost Update Problem-


This problem occurs when multiple transactions execute concurrently and updates from one or
more transactions get lost.

Example-

Here,

1. T1 reads the value of A (= 10 say).


2. T1 updates the value to A (= 15 say) in the buffer.
3. T2 does blind write A = 25 (write without read) in the buffer.
4. T2 commits.
5. When T1 commits, it writes A = 25 in the database.

In this example,

• T1 writes the over written value of A in the database.


• Thus, update from T1 gets lost.

4. Phantom Read Problem-


This problem occurs when a transaction reads some variable from the buffer and when it reads
the same variable later, it finds that the variable does not exist.
Example-

Here,

1. T1 reads X.
2. T2 reads X.
3. T1 deletes X.
4. T2 tries reading X but does not find it.
In this example,

• T2 finds that there does not exist any variable X when it tries reading X again.
• T2 wonders who deleted the variable X because according to it, it is running in isolation.
Serializability:

o The serializability of schedules is used to find non-serial schedules that allow the
transaction to execute concurrently without interfering with one another.
o It identifies which schedules are correct when executions of the transaction have
interleaving of their operations.
o A non-serial schedule will be serializable if its result is equal to the result of its
transactions executed serially.

There are 3 ways of testing Serializability:

1. Result Serializability
2. Conflict Serializability
3. View Serializability

1.Result Serializability: A non-serial schedule will be result serializable, if its result is equal to
the result of its transactions executed serially.

2. Conflict Serializability: The schedule will be a conflict serializable if it is conflict equivalent


to a serial schedule.

Conflicting Operations:

The two operations become conflicting if all conditions satisfy:

1. Both belong to separate transactions.


2. They have the same data item.
3. They contain at least one write operation.

Note: A precedence graph is used to check conflict serializability. If there exists no cycle in
the precedence graph then the given schedule S is conflict serializable else it is not conflict
serializable.
Example: Check whether the given schedule S is conflict serializable or not-

S : R1(A) , R2(A) , R1(B) , R2(B) , R3(B) , W1(A) , W2(B)

Solution-

T1 T2 T3
R(A)
R(A)
R(B)
R(B)
R(B)
W(A)
W(B)

Step-01:

List all the conflicting operations and determine the dependency between the transactions-

• R1(B) , W2(B) (T1 → T2)


• R2(A) , W1(A) (T2 → T1)
• R3(B) , W2(B) (T3 → T2)

Step-02:

Draw the precedence graph-


• Clearly, there exists a cycle in the precedence graph.
Therefore, the given schedule S is not conflict serializable.

3.View Serializability-

If a given schedule is found to be view equivalent to some serial schedule, then it is called
as a view serializable schedule.

View Equivalent Schedules-

Consider two schedules S1 and S2 each consisting of two transactions T1 and T2.

Schedules S1 and S2 are called view equivalent if the following three conditions hold true for
them-

Condition-01:
For each data item X, if transaction Ti reads X from the database initially in schedule S1, then in
schedule S2 also, Ti must perform the initial read of X from the database.

Thumb Rule
“Initial readers must be same for all the data items”.

Condition-02:

If transaction Ti reads a data item that has been updated by the transaction Tj in schedule S1, then
in schedule S2 also, transaction Ti must read the same data item that has been updated by the
transaction Tj.

Thumb Rule
“Write-read sequence must be same.”.

Condition-03:

For each data item X, if X has been updated at last by transaction Ti in schedule S1, then in
schedule S2 also, X must be updated at last by transaction Ti.

Thumb Rule
“Final writers must be same for all the data items”.

RECOVERABILITY:

Recoverable Schedule:

If a Transaction T2 reads a value written by another Transaction T1 and T2 commits after T1


commits, then it is called recoverable schedule because if T1 fails and rolls back , then T2 also
can rollback.

Example:

T1 T2
R(A)
W(A)
R(A)
W(A)
COMMIT
COMMIT

Irrecoverable Schedule:

If a Transaction T2 reads a value written by another Transaction T1 and T2 commits before T1


commits, then it is called irrecoverable schedule because if T1 fails and rolls back , then T2
cannot rollback.

Example:

T1 T2
R(A)
W(A)
R(A)
W(A)

COMMIT
COMMIT

Lock Based Protocols in DBMS

The protocol utilizes locks, applied by a transaction to data item, which may block other
transactions from accessing the same data during the transaction's life.

Types of locking protocols

1. Simple locking
2. 2-Phase Locking
a. Simple 2-phase locking
b. Strict 2-phase locking
c. Rigorous 2-phase locking
d. Conservative 2-Phase locking

1) Simple locking:

There are two types of lock :

1. Shared Lock
2. Exclusive Lock
Shared Locks are represented by S. The data items can only read without performing
modification to it from the database. S – lock is requested using
lock – s instruction.

Exclusive Locks are represented by X. The data items can be read as well as written. X – lock is
requested using lock – X instruction.

Lock Compatibility Matrix


• Lock Compatibility Matrix controls whether multiple transactions can acquire locks on the same
resource at the same time.

Shared Exclusive

Shared True False

Exclusive False False

LIMITATIONS:

1. If the locks are released too early, it leads to Database Inconsistency.


2. It may not guarantee Serializability.

3. Dead locks are possible.

2) 2-Phase locking protocols:

It is used to overcome the disadvantage of the simple lock. In the 2-phase locking, there are two
phases are growing phase and shrinking phase. 2-phase ensures serializability.

Growing phase: A transaction may obtain locks but not release any locks.

Shrinking phase: A transaction may release lock but may not obtain new locks.
A) Simple 2-phase locking: Once a transaction releases a lock it enters in shrinking phase and in
shrinking phase it cannot issue any more locks.

Example:

T1 COMMENTS
LOCK-S(A) GROWING PHASE
READ(A)

LOCK-X(B)
READ(B)

WRITE(B)

UNLOCK-X(B) SHRINKING PHASE


UNLOCK-S(A)

Advantage:

• Less resource utilization.


• Guarantees serializability based on Lock points.

Disadvantage:

• Deadlock present.
• Cascading roll backs are possible

There are three types of 2-phase locking protocols:

B) Strict 2-phase locking:


Transaction does not release any of the exclusive locks until after its commit or abort. It
means all the exclusive locks held by a transaction are released only after the transaction
commits.

Example:

T1
LOCK-S(A)
READ(A)

LOCK-X(B)
READ(B)

WRITE(B)
UNLOCK-S(A)
COMMIT
UNLOCK-X(B)

Advantage:

• No dirty read problem.


• No Cascading rollbacks.

Disadvantage:

• Deadlock are possible.

C) Rigorous 2-phase locking: Transaction "t" does not release any of its locks until after it
commits or aborts. It means both shared and exclusive locks are released only after the
transaction commits.

T1
LOCK-S(A)
READ(A)

LOCK-X(B)
READ(B)

WRITE(B)

COMMIT
UNLOCK-S(A)
UNLOCK-X(B)

Advantage:

• No Cascading rollbacks

Disadvantage:

• Deadlock can still occur.

D) Conservative 2-phase locking: The transaction will obtain all the locks at the beginning i.e.
before the transaction starts and will release all the locks only after it commits.

Example:

T1
LOCK-S(A)
LOCK-X(B)

READ(A)
READ(B)

WRITE(B)

COMMIT
UNLOCK-S(A)
UNLOCK-X(B)

Advantage:

• No waiting for the data item


• No Cascading rollbacks
• No deadlocks

Disadvantage:

• Practically data implementation is difficult


• It leads to starvation.
GRAPH BASED PROTOCOLS

Graph-based protocols are an alternative to two-phase locking.


A Graph based protocol is implemented as a Tree.

Impose a partial ordering on the set D = {d1, d2 ,..., dh} of all data items.

The tree-protocol is a simple kind of graph protocol.

The set of rules used by a Tree Protocol are:

• Only Exclusive locks are allowed.

• The first lock by Ti may be on any data item. Subsequently, a data Q can be locked by Ti
only if the parent of Q is currently locked by Ti.

• Data items may be unlocked at any time.

• Data item that is locked and unlocked by Ti cannot be subsequently relocked by the same
transaction Ti.

Advantage –

1. Ensures Conflict Serializable Schedule.


2. Ensures Deadlock Free Schedule
3. Shorter waiting time as unlocking can be done anytime
Disadvantage –

1. Unnecessary locking overheads may happen sometimes, like if we want both D and E,
then at least we have to lock B to follow the protocol.
2. Cascading Rollbacks is still a problem. We don’t follow a rule of when Unlock operation
may occur so this problem persists for this protocol.
3. Prior knowledge of data access.

TIME STAMP BASED PROTOCOLS:

Each transaction is issued a timestamp when it enters the system. If an old transaction Ti has
time-stamp TS(Ti), a new transaction Tj is assigned time-stamp TS(Tj) such that

TS(Ti) <TS(Tj).

• The protocol manages concurrent execution such that the time-stamps determine the
serializability order.

Timestamps can be used in two ways:

1. To determine the outdatedness of a request with respect to the data object it is


operating on.

2. To order events with respect to one another.

• In order to assure such behavior, the protocol maintains for each data Q two timestamp
values:

• W-timestamp(A) is the largest time-stamp of any transaction that executed write(A)


successfully.

R-timestamp(Q) is the largest time-stamp of any transaction that executed read(Q)


successfully.

There are two types of Time stamp based protocols:


1. Time stamp ordering protocol
2. Thomas Write Rule.

1. TIMESTAMP ORDERING PROTOCOL:

The timestamp ordering protocol provides an ordering among the transactions so that any
conflicting read and write operations are executed in timestamp order.

If not such an operation is rejected and the transaction is rolled back and the rolled back
transaction is given a new timestamp.

Timestamp ordering protocol works as follows −

• If a transaction Ti issues a read(X) operation −

o If TS(Ti) < W-timestamp(X)

▪ Operation rejected.

T1(9) T2(10)

W(X)

R(X)-REJECTED

o If TS(Ti) >= W-timestamp(X)

▪ Operation executed.

T1(10) T2(9)

W(X)

R(X)-ALLOWED
• If a transaction Ti issues a write(X) operation −

o If TS(Ti) < R-timestamp(X) then the Operation rejected.

T1(9) T2(10)

R(X)

W(X)-REJECT

▪ If TS(Ti) < W-timestamp(X) then the Operation rejected and Ti rolled


back.

T1(9) T2(10)

W(X)

W(X)-REJECT

2.THOMAS WRITE RULE:

It ignores the outdated write operations.

It improves the Basic Timestamp Ordering Algorithm.

If transaction Ti wants to execute the WRITE operation on X:

The basic Thomas write rules are as follows:

1. If TS(T) < R_TS(X) then transaction T is aborted and rolled back, and operation is
rejected.

Example:
T1(9) T2(10)

R(X)

W(X)-REJECT

2. If TS(T) < W_TS(X) then don't execute the W_item(X) operation of the transaction and
continue processing.

T1(9) T2(10)

W(X)

COMMIT

W(X)-OUTDATED

COMMIT

3. If neither condition 1 nor condition 2 occurs, then allowed to execute the WRITE
operation by transaction Ti and set W_TS(X) to TS(T).
VALIDATION BASED PROTOCOLS:

Validation based protocols are also known as optimistic concurrency control technique or
Certification based protocols

It is called optimistic concurrency control protocols because it hopes that all the operations of
transaction are correct.

It is used to validate/ check the update operation (read/ write operation) on a data item by a
Transaction.

In the validation based protocol, the transaction is executed in the following three phases:

1. Read phase: In this phase, the transaction T reads committed values of various data items
and stores them in temporary local variables. It can perform all the write operations on
temporary variables without an update to the actual database.
2. Validation phase: In this phase, the temporary variable value will be validated against the
actual data to see if it violates the serializability and everything is correct.
3. Write phase: If the validation of the transaction is validated, then the temporary results
are written to the database or system otherwise the transaction is rolled back and
restarted.

Here each phase has the following different timestamps:

Start(Ti): It contains the time when Ti started its execution.

Validation (Ti): It contains the time when Ti finishes its read phase and starts its validation
phase.

Finish(Ti): It contains the time when Ti finishes its write phase.

o This protocol is used to determine the time stamp for the transaction for serialization
using the time stamp of the validation phase, as it is the actual phase which determines if
the transaction will commit or rollback.
o Hence TS(T) = validation(T).
o The serializability is determined during the validation process. It can't be decided in
advance.
o While executing the transaction, it ensures a greater degree of concurrency and also less
number of conflicts.
o Thus it contains transactions which have less number of rollbacks.

Advantages:

➢ The validation based protocol considers high priority to the greater degree of
concurrency resulting in fewer conflict scenarios.
➢ This protocol is known for less number of rollback while maintaining the
concurrency control mechanism.

Disadvantages:

➢ The validation based protocol may arise in the scenario of transaction starvation.
This could be due to the short transaction conflicts associated with this protocol.
➢ The validation based protocol may not suitable for large transactions as it efficient
for maintaining the shorter conflicts in the transactions.
MULTIPLE GRANULARITY

Locking Granularity: It is the size of the data item that can be locked.

There are 2 types of locking granularity:

1. Coarse granularity: It refers to large data item sizes


Ex: Entire relation or Table
2. Fine granularity: It refers to small data item sizes
Ex: Tuple or attribute of a relation

Multiple Granularity: It allows the data items of any size to be locked.

o It can be defined as hierarchically breaking up the database into blocks which can be
locked.
o The Multiple Granularity protocol enhances concurrency and reduces lock overhead.
o It maintains the track of what to lock and how to lock.
o It makes easy to decide either to lock a data item or to unlock a data item. This type of
hierarchy can be graphically represented as a tree.

Note : In multiple-granularity, the locks are acquired in top-down order, and locks must be
released in bottom-up order.

There are five lock modes with multiple granularity:

Shared lock: Shared locks support read integrity.

Exclusive lock: Exclusive locks support write integrity.


Intention-shared (IS): Intention-shared (IS) indicates that one or more shared locks will be
requested on some descendant node(s) i.e. It contains explicit locking at a lower level of the tree
but only with shared locks.

Intention-Exclusive (IX): Intention-exclusive (IX) indicates that one or more exclusive locks will
be requested on some descendant node(s) i.e. It contains explicit locking at a lower level with
exclusive or shared locks.

Shared & Intention-Exclusive (SIX): Shared-intention-exclusive (SIX) indicates that the current
node is locked in shared mode but that one or more exclusive locks will be requested on some
descendant node(s). i.e.

MULTIPLE GRANULARITY LOCKING PROTOCOL RULES:

The multiple granularity locking (MGL) protocol consists of the following rules i.e. It requires
that if a transaction attempts to lock a node, then that node must follow these rules:

1. Transaction T1 should follow the lock-compatibility matrix.


2. The root of the tree must be locked first, in any mode.

3. A node N can be locked by a transaction T in S or IS mode only if the parent node N is


already locked by transaction T in either IS or IX mode.

4. A node N can be locked by a transaction T in X, IX, or SIX mode only if the parent of
node N is already locked by transaction T in either IX or SIX mode.

5. A transaction T can lock a node only if it has not unlocked any node (to enforce the
2PL protocol).
6. A transaction T can unlock a node, N, only if none of the children of node N are
currently locked by T.

You might also like