UNIT-V
Transactions: Transaction concept – Transaction State – Implementation of Atomicity and Durability
– Concurrent Executions – Serializability – Testing for Serializability. Concurrency Control: Lock-
Based Protocols – Timestamp-Based Protocols. Recovery System: Failure Classification – Storage
Structure – Recovery and Atomicity – Log-Based Recovery – Shadow Paging
TRANSACTIONS
A collection of operations that forms a single logical unit of work is called a transaction.
A database system must ensure proper execution of transactions despite failures:
either the entire transaction executes, or none of it does.
Transaction Concept:
A transaction must possess four properties: atomicity, consistency, isolation, and durability.
These properties are often called the ACID properties; the acronym is derived from
the first letter of each of the four properties.
• Atomicity: Suppose that, just before the execution of a transaction Ti that transfers
$50 from account A to account B, the values of accounts A and B are $1000 and $2000,
respectively. Suppose that the failure happened after the write(A) operation but before
the write(B) operation. In this
case, the values of accounts A and B reflected in the database are $950 and $2000.
The system destroyed $50 as a result of this failure. We term such a state an
inconsistent state. If the atomicity property is present, all actions of the
transaction are reflected in the database, or none are. Ensuring atomicity is the
responsibility of the database system itself; specifically, it is handled by a
component called the transaction-management component.
• Isolation: Even if the consistency and atomicity properties are ensured for each
transaction, if several transactions are executed concurrently, their operations may
interleave in some undesirable way, resulting in an inconsistent state. A way to avoid
the problem of concurrently executing transactions is to execute transactions serially,
that is, one after the other. Ensuring the isolation property is the responsibility of a
component of the database system called the concurrency-control component.
Transaction State:
A transaction may not always complete its execution successfully. Such a transaction
is termed aborted. Any changes that the aborted transaction made to the database must be
undone. Once the changes caused by an aborted transaction have been undone, we say that
the transaction has been rolled back. A transaction that completes its execution successfully
is said to be committed.
A committed transaction that has performed updates transforms the database into a new
consistent state, which must persist even if there is a system failure. The only way to undo the
effects of a committed transaction is to execute a compensating transaction. A transaction
must be in one of the following states:
• Active, the initial state; the transaction stays in this state while it is
executing.
• Partially committed, after the final statement has been executed.
• Failed, after the discovery that normal execution can no longer
proceed.
• Aborted, after the transaction has been rolled back and the
database has been restored to its state prior to the start of the
transaction.
• Committed, after successful completion.
Concurrent Executions:
The motivation for using concurrent execution in a database is essentially the same as
the motivation for using multiprogramming in an operating system.
Let T1 and T2 be two transactions that transfer funds from one account to another.
Transaction T1 transfers $50 from account A to account B. It is defined as:
T1: read(A);
A := A − 50;
write(A);
read(B);
B := B + 50;
write(B).
Transaction T2 transfers 10 percent of the balance from account A to account B. It is
defined as:
T2: read(A);
temp := A * 0.1;
A := A − temp;
write(A);
read(B);
B := B + temp;
write(B).
Suppose the current values of accounts A and B are $1000 and $2000, respectively.
Suppose also that the two transactions are executed one at a time in the order T1
followed by T2. In this serial execution, the final values of accounts A and B are
$855 and $2145, respectively.
Similarly, if the transactions are executed one at a time in the order T2 followed by
T1, then again, as expected, the sum A + B is preserved, and the final values of accounts
A and B are $850 and $2150, respectively.
The execution sequences just described are called schedules.
Now suppose that the two transactions are executed concurrently, with their instructions
interleaved. After one such interleaved execution, we arrive at the same state as the one
in which the transactions are executed serially in the order T1 followed by T2. The sum
A + B is indeed preserved.
Not all concurrent executions result in a correct state. To illustrate, consider a
schedule in which the read and write operations of T1 and T2 are interleaved so that one
transaction overwrites the other's update.
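As a rough illustration only (the specific interleaving and the Python representation below are chosen for this sketch, not taken from the original schedule), such a bad interleaving can be simulated directly:

# Sketch of a "lost update" interleaving of T1 and T2.
# T1 transfers $50 from A to B; T2 transfers 10 percent of A from A to B.
accounts = {"A": 1000, "B": 2000}

# T1 reads A and computes its new value ...
t1_a = accounts["A"] - 50            # T1: read(A); A := A - 50 (still local)

# ... but before T1 writes A, T2 runs completely on the old value of A.
temp = accounts["A"] * 0.1           # T2: read(A); temp := A * 0.1
accounts["A"] -= temp                # T2: write(A)
accounts["B"] += temp                # T2: read(B); B := B + temp; write(B)

# T1 resumes and overwrites A with its stale local value,
# wiping out T2's withdrawal from A.
accounts["A"] = t1_a                 # T1: write(A)
accounts["B"] += 50                  # T1: read(B); B := B + 50; write(B)

print(accounts, "sum =", accounts["A"] + accounts["B"])
# sum = 950 + 2150 = 3100, not the 3000 we started with: an inconsistent state.

Here T2's withdrawal from A is lost, so the final sum is $3100 instead of $3000.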
The database system must control concurrent execution of transactions, to ensure that the
database state remains consistent. Since transactions are programs, it is computationally
difficult to determine exactly what operations a transaction performs and how operations of
various transactions interact. For this reason, we shall not interpret the type of operations that
a transaction can perform on a data item.
Instead, we consider only two operations: read and write. We thus assume that, between a
read(Q) instruction and a write(Q) instruction on a data item Q, a transaction may perform an
arbitrary sequence of operations on the copy of Q that is residing in the local buffer of the
transaction. Thus, the only significant operations of a transaction, from a scheduling point of
view, are its read and write instructions.
T1                      T2
read(A)
write(A)
                        read(A)
                        write(A)
read(B)
write(B)
                        read(B)
                        write(B)
Schedule 3: showing only the read and write instructions.
Conflict Serializability
Let us consider a schedule S in which there are two consecutive instructions Ii and Ij, of
transactions Ti and Tj, respectively (i ≠ j). If Ii and Ij refer to different data items, then we
can swap Ii and Ij without affecting the results of any instruction in the schedule.
However, if Ii and Ij refer to the same data item Q, then the order of the two steps
may matter.
Since we are dealing with only read and write instructions, there are four cases that we need
to consider:
1. Ii = read(Q), Ij = read(Q). The order of Ii and Ij does not matter, since the same value of Q
is read by Ti and Tj , regardless of the order.
2. Ii = read(Q), Ij = write(Q). If Ii comes before Ij, then Ti does not read the value of Q that is
written by Tj in instruction Ij. If Ij comes before Ii, then Ti reads the value of Q that is written
by Tj. Thus, the order of Ii and Ij matters.
3. Ii = write(Q), Ij = read(Q). The order of Ii and Ij matters for reasons similar to those of the
previous case.
4. Ii = write(Q), Ij = write(Q). Since both instructions are write operations, the order of these
instructions does not affect either Ti or Tj . However, the value obtained by the next read(Q)
instruction of S is affected, since the result of only the latter of the two write instructions is
preserved in the database. If there is no other write(Q) instruction after Ii and Ij in S, then the
order of Ii and Ij directly affects the final value of Q in the database state that results from
schedule S.
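As a quick check of these four cases, the following small Python helper (a hypothetical function, not part of the text) decides whether two instructions conflict, i.e. whether they belong to different transactions, access the same data item, and at least one of them is a write:

# Sketch: two instructions conflict iff they are from different transactions,
# access the same data item, and at least one of them is a write.
def conflicts(op1, op2):
    txn1, action1, item1 = op1
    txn2, action2, item2 = op2
    return (txn1 != txn2
            and item1 == item2
            and ("write" in (action1, action2)))

print(conflicts(("T1", "write", "A"), ("T2", "read", "A")))   # True  (case 3)
print(conflicts(("T2", "write", "A"), ("T1", "read", "B")))   # False (different items)
print(conflicts(("T1", "read", "Q"), ("T2", "read", "Q")))    # False (case 1)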
The write(A) instruction of T1 conflicts with the read(A) instruction of T2. However, the
write(A) instruction of T2 does not conflict with the read(B) instruction of T1, because the
two instructions access different data items.
Since the write(A) instruction of T2 in schedule 3 does not conflict with the read(B)
instruction of T1, we can swap these instructions to generate an equivalent schedule.
T1                      T2
read(A)
write(A)
                        read(A)
read(B)
                        write(A)
write(B)
                        read(B)
                        write(B)
Schedule 5: schedule 3 after swapping the write(A) of T2 and the read(B) of T1.

By continuing to swap nonconflicting instructions, we obtain the serial schedule:

T1                      T2
read(A)
write(A)
read(B)
write(B)
                        read(A)
                        write(A)
                        read(B)
                        write(B)
Schedule 6: a serial schedule that is equivalent to schedule 3.
T3                      T4
read(Q)
                        write(Q)
write(Q)
Schedule 7.
Schedule 7 is not conflict serializable, since it is equivalent to neither the serial
schedule <T3, T4> nor <T4, T3>.
View Serializability
Consider two schedules S and S′, where the same set of transactions participates in both
schedules. The schedules S and S′ are said to be view equivalent if three conditions are met:
1. For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then
transaction Ti must, in schedule S′, also read the initial value of Q.
2. For each data item Q, if transaction Ti executes read(Q) in schedule S, and if that value was
produced by a write(Q) operation executed by transaction Tj, then the read(Q) operation of
transaction Ti must, in schedule S′, also read the value of Q that was produced by the same
write(Q) operation of transaction Tj.
3. For each data item Q, the transaction (if any) that performs the final write(Q) operation in
schedule S must perform the final write(Q) operation in schedule S′.
T1                      T2
read(A)
A := A − 50
write(A)
                        read(B)
                        B := B − 10
                        write(B)
read(B)
B := B + 50
write(B)
                        read(A)
                        A := A + 10
                        write(A)
Schedule 8.
The concept of view equivalence leads to the concept of view serializability. We say that a
schedule S is view serializable if it is view equivalent to a serial schedule.
T3              T4              T6
read(Q)
                write(Q)
write(Q)
                                write(Q)
Schedule 9: a view-serializable schedule.
Schedule 9 is view equivalent to the serial schedule <T3, T4, T6>, since the single read(Q)
reads the initial value of Q in both schedules and T6 performs the final write(Q) in both;
it is, however, not conflict serializable.
Recoverability
We now address the effect of transaction failures during concurrent execution.If a transaction
Ti fails, for whatever reason, we need to undo the effect of thistransaction to ensure the
atomicity property of the transaction.
In a system that allowsconcurrent execution, it is necessary also to ensure that any transaction
Tj that isdependent on Ti (that is, Tj has read data written by Ti) is also aborted. To achieve
this surety, we need to place restrictions on the type of schedules permitted in the system.
Recoverable Schedules
Most database systems require that all schedules be recoverable. A recoverable schedule is
one where, for each pair of transactions Ti and Tj such that Tj reads a data item previously
written by Ti, the commit operation of Ti appears before the commit operation of Tj.
Cascadeless Schedules
Even if a schedule is recoverable, to recover correctly from the failure of a transaction Ti, we
may have to roll back several transactions. Such situations occur if transactions have read
data written by Ti. As an illustration, consider the partial schedule
T8                      T9
read(A)
write(A)
                        read(A)
read(B)
Schedule 10.

T10             T11             T12
read(A)
read(B)
write(A)
                read(A)
                write(A)
                                read(A)
Schedule 11.
Transaction T10 writes a value of A that is read by transaction T11. Transaction T11 writes a
value of A that is read by transaction T12. Suppose that, at this point, T10 fails. T10 must be
rolled back. Since T11 is dependent on T10, T11 must be rolled back. Since T12 is dependent
on T11, T12 must be rolled back. This phenomenon, in which a single transaction failure leads
to a series of transaction rollbacks, is called cascading rollback.
Formally, a cascadeless schedule is one where, for each pair of transactions Ti and Tj such
that Tj reads a data item previously written by Ti, the commit operation of Ti appears before
the read operation of Tj. It is easy to verify that every cascadeless schedule is also
recoverable.
Implementation of Isolation
There are various concurrency-control schemes that we can use to ensure that, even when
multiple transactions are executed concurrently, only acceptable schedules are generated,
regardless of how the operating-system time-shares resources (such as CPU time) among the
transactions.
One trivial scheme is to have a transaction lock the entire database before it starts and
release the lock after it commits. As a result of this locking policy, only one transaction can
execute at a time. Therefore, only serial schedules are generated. These are trivially
serializable, and it is easy to verify that they are cascadeless as well.
A data-manipulation language must include a construct for specifying the set of actions that
constitute a transaction.
The SQL standard specifies that a transaction begins implicitly. Transactions are ended by
one of these SQL statements:
• Commit work commits the current transaction and begins a new one.
• Rollback work causes the current transaction to abort.
The keyword work is optional in both the statements. If a program terminates without either
of these commands, the updates are either committed or rolled back; which of the two
happens is not specified by the standard and depends on the implementation.
The standard also specifies that the system must ensure both serializability and freedom from
cascading rollback.
The definition of serializability used by the standard is that a schedule must have the same
effect as would some serial schedule. Thus, conflict and view serializability are both
acceptable.
The SQL-92 standard also allows a transaction to specify that it may be executed in a manner
that causes it to become nonserializable with respect to other transactions.
When designing concurrency-control schemes, we must show that schedules generated by the
scheme are serializable. To do that, we must first understand how to determine, given a
particular schedule S, whether the schedule is serializable. We now present a simple and
efficient method for determining conflict serializability of a schedule.
This precedence graph consists of a pair G = (V, E), where V is a set of vertices and E is a
set of edges. The set of vertices consists of all the transactions participating in the schedule.
The set of edges consists of all edges Ti → Tj for which one of three conditions holds:
1. Ti executes write(Q) before Tj executes read(Q),
2. Ti executes read(Q) before Tj executes write(Q), or
3. Ti executes write(Q) before Tj executes write(Q).
Figure: precedence graphs for (a) schedule 1, with edge T1 → T2, and (b) schedule 2, with
edge T2 → T1.
If an edge Ti → Tj exists in the precedence graph, then, in any serial schedule S′ equivalent to
S, Ti must appear before Tj.
For example, a precedence graph contains the edge T1 → T2 if T1 executes read(A) before T2
executes write(A), and the edge T2 → T1 if T2 executes read(B) before T1 executes write(B).
If the precedence graph for S has a cycle, then schedule S is not conflict serializable. If the
graph contains no cycles, then the schedule S is conflict serializable.
Thus, to test for conflict serializability, we need to construct the precedence graph and to
invoke a cycle-detection algorithm.
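A minimal Python sketch of this test is shown below. It assumes a schedule is represented as a list of (transaction, action, data item) triples, a simplification introduced here for illustration:

# Sketch: build the precedence graph of a schedule and test it for cycles.
def conflict_serializable(schedule):
    # schedule: list of (txn, action, item), action in {"read", "write"}
    edges = set()
    for i, (ti, ai, qi) in enumerate(schedule):
        for tj, aj, qj in schedule[i + 1:]:
            # An edge Ti -> Tj exists if an earlier operation of Ti
            # conflicts with a later operation of Tj.
            if ti != tj and qi == qj and "write" in (ai, aj):
                edges.add((ti, tj))

    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)

    # Depth-first search for a cycle in the precedence graph.
    def has_cycle(node, visiting, done):
        visiting.add(node)
        for nxt in graph.get(node, ()):
            if nxt in visiting or (nxt not in done and has_cycle(nxt, visiting, done)):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    done = set()
    cyclic = any(t not in done and has_cycle(t, set(), done) for t in graph)
    return not cyclic

# Schedule 3 above: conflict serializable (equivalent to <T1, T2>).
s3 = [("T1", "read", "A"), ("T1", "write", "A"), ("T2", "read", "A"),
      ("T2", "write", "A"), ("T1", "read", "B"), ("T1", "write", "B"),
      ("T2", "read", "B"), ("T2", "write", "B")]
print(conflict_serializable(s3))   # True

For schedule 3, the only edges are T1 → T2, so the graph is acyclic and the schedule is conflict serializable.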
Testing for view serializability is rather complicated. In fact, it has been shown that the
problem of testing for view serializability is itself NP-complete. Thus, almost certainly there
exists no efficient algorithm to test for view serializability.
Concurrency Control
Lock-Based Protocols
Graph-Based Protocols
Timestamp-Based Protocols
Multiple Granularity
Multiversion Protocols
Deadlock Handling
Lock-Based Protocols:
shared mode (S): the data item can only be read; an S-lock is requested using the lock-S instruction.
exclusive mode (X): the data item can be both read and written; an X-lock is requested using the lock-X instruction.
Locking protocol:
A set of rules followed by all transactions while requesting and releasing locks.
Locking protocols restrict the set of possible schedules. They ensure serializable schedules by
delaying transactions that might violate serializability.
The lock-compatibility matrix tells whether two locks are compatible or not. Any number of
transactions can hold shared locks on a data item. If any transaction holds an exclusive
lock on a data item, no other transaction may hold any lock on that item.
Locking Rules/Protocol:
Early unlocking can cause incorrect results and non-serializable schedules: if A and B get
updated in between the read of A and the read of B, the displayed sum would be wrong, as in
the following schedule.
T1                              T2
1. X-lock(B)
2. read B
3. B := B - 50
4. write B
5. U-lock(B)
                                6. S-lock(A)
                                7. read A
                                8. U-lock(A)
                                9. S-lock(B)
                                10. read B
                                11. U-lock(B)
                                12. display A + B
13. X-lock(A)
14. read A
15. A := A + 50
16. write A
17. U-lock(A)
Late unlocking can cause deadlocks. In the following schedule, neither T1 nor T2 can make
progress: executing lock-S(B) causes T2 to wait for T1 to release its lock on B, while executing
lock-X(A) causes T1 to wait for T2 to release its lock on A. To handle the deadlock, one of
T1 or T2 must be rolled back and its locks released.
T1                              T2
1. X-lock(B)
2. read(B)
3. B := B - 50
4. write(B)
                                5. S-lock(A)
                                6. read(A)
                                7. S-lock(B)
8. X-lock(A)
Two-Phase Locking Protocol:
The two-phase locking protocol requires that each transaction issue lock and unlock requests in two phases:
Phase 1, the growing phase: the transaction may obtain locks but may not release locks.
Phase 2, the shrinking phase: the transaction may release locks but may not obtain locks.
When the first lock is released, the transaction moves from phase 1 to phase 2.
Properties of the Two-Phase Locking Protocol: It ensures serializability. It can be shown
that the transactions can be serialized in the order of their lock points (i.e., the point
where a transaction acquires its final lock).
Under rigorous two-phase locking, all locks are held till commit/abort; transactions can then
be serialized in the order in which they commit.
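As an illustration of the two-phase discipline (the class below is a hypothetical sketch, not a real DBMS component), a transaction-side guard can simply refuse any lock request issued after the first unlock:

# Sketch: a transaction-side guard that enforces the two-phase rule.
class TwoPhaseTxn:
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False        # becomes True after the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot acquire locks in the shrinking phase")
        self.locks.add(item)          # growing phase: acquiring is allowed

    def unlock(self, item):
        self.locks.discard(item)
        self.shrinking = True         # first release ends the growing phase

t = TwoPhaseTxn("T1")
t.lock("B"); t.lock("A")              # growing phase
t.unlock("B")                         # lock point passed; shrinking phase begins
# t.lock("C")                         # would raise: violates two-phase locking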
The two-phase locking protocol can be refined with lock conversions:
Phase 1 (growing): a transaction can acquire a lock-S on an item, can acquire a lock-X on an
item, and can convert a lock-S to a lock-X (upgrade).
Phase 2 (shrinking): a transaction can release a lock-S, can release a lock-X, and can convert
a lock-X to a lock-S (downgrade).
* This ensures serializability, but still relies on the programmer to insert the various locking
instructions.
*Strict and rigorous two-phase locking (with lock conversions) are used extensively in
DBMS.
Locks can also be acquired and released automatically, on the basis of the read and write
requests a transaction Ti issues:

read(D):
  if Ti has a lock on D then
    read(D)
  else begin
    if necessary, wait until no other transaction has a lock-X on D;
    grant Ti a lock-S on D;
    read(D)
  end

write(D):
  if Ti has a lock-X on D then
    write(D)
  else begin
    if necessary, wait until no other transaction has any lock on D;
    if Ti has a lock-S on D then
      upgrade the lock on D to a lock-X
    else
      grant Ti a lock-X on D;
    write(D)
  end
Implementation of Locking:
The lock table is implemented as an in-memory hash table indexed on the data item being
locked. (In the usual diagram, black rectangles indicate granted locks and white rectangles
indicate waiting requests.) The table also records the type of lock granted or requested.
Processing of requests:
New request is added to the end of the queue of requests for the data item, and
granted if it is compatible with all earlier locks.
Unlock requests result in the request being deleted, and later requests are
checked to see if they can now be granted.
If a transaction aborts, all waiting or granted requests of the transaction are
deleted. An index on transactions is maintained to implement this efficiently.
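A minimal in-memory sketch of such a lock table in Python, assuming only S and X modes and leaving out blocking and wakeup of waiting transactions (a simplification for illustration), might look like this:

# Sketch: lock table as a hash map from data item to a queue of requests.
from collections import defaultdict

COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

class LockTable:
    def __init__(self):
        # each entry: list of [txn, mode, granted] requests, in arrival order
        self.table = defaultdict(list)

    def request(self, txn, item, mode):
        queue = self.table[item]
        # grant only if compatible with every earlier request on this item
        granted = all(COMPATIBLE[(req_mode, mode)] for _, req_mode, _ in queue)
        queue.append([txn, mode, granted])
        return granted                      # False means the transaction waits

    def unlock(self, txn, item):
        queue = self.table[item]
        queue[:] = [r for r in queue if r[0] != txn]
        # re-check waiting requests in FIFO order now that locks were released
        for req in queue:
            if not req[2]:
                earlier = queue[:queue.index(req)]
                req[2] = all(COMPATIBLE[(m, req[1])] for _, m, _ in earlier)

lt = LockTable()
print(lt.request("T1", "A", "X"))   # True: granted
print(lt.request("T2", "A", "S"))   # False: must wait behind T1's X-lock
lt.unlock("T1", "A")                # T2's request can now be granted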
Graph-Based Protocols:
Graph-based protocols impose a partial ordering → on the set D = {d1, d2, ..., dh} of all data items.
If di → dj, then any transaction accessing both di and dj must access di before
accessing dj.
This implies that the set D may be viewed as a directed acyclic graph, called a
database graph. Graph-based protocols are an alternative to two-phase locking and ensure
conflict serializability.
In the tree protocol, which works on a rooted tree database graph, only exclusive locks (lock-X) are allowed.
The first lock by Ti may be on any data item.
Subsequently, a data item Q can be locked by Ti only if the parent of Q is
currently locked by Ti.
Data items may be unlocked at any time. A data item that has been locked and
unlocked by Ti cannot subsequently be relocked by Ti.
Example: four transactions following the tree protocol on a database graph (figure omitted).
The tree protocol ensures conflict serializability and freedom from deadlock. However, the
abort of a transaction might lead to cascading rollbacks. Unlocking may occur earlier in the
tree-locking protocol than in the two-phase locking protocol, giving shorter waiting times and
an increase in concurrency. On the other hand, in the tree protocol a transaction may have to
lock data items that it does not access, causing increased locking overhead, additional waiting
time, and a potential decrease in concurrency. Schedules not possible under two-phase locking
are possible under the tree protocol, and vice versa.
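A small Python sketch of the tree-protocol locking rules, assuming the database graph is represented as a child-to-parent map (an illustrative encoding, not from the text):

# Sketch: checking the tree-protocol rules when Ti asks to lock-X a data item.
parent = {"B": "A", "C": "A", "D": "B", "E": "B"}   # example tree rooted at A

def can_lock(txn_state, item):
    held, released = txn_state["held"], txn_state["released"]
    if item in released:
        return False                 # rule: an unlocked item cannot be relocked
    if not held and not released:
        return True                  # rule: the first lock may be on any item
    return parent.get(item) in held  # rule: the parent must currently be locked

t = {"held": set(), "released": set()}
print(can_lock(t, "B")); t["held"].add("B")     # True: first lock
print(can_lock(t, "D")); t["held"].add("D")     # True: parent B is held
print(can_lock(t, "C"))                         # False: parent A is not held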
TIMESTAMP BASED PROTOCOL:
The locking protocols described so far determine the order between every pair of conflicting
transactions at execution time, by the first lock that both members of the pair request that
involves incompatible modes.
Another method for determining the serializability order is to select an ordering
among transactions in advance. The most common method for doing so is to use a
timestamp-ordering scheme.
TIMESTAMPS:
With each transaction Ti we associate a unique fixed timestamp, denoted TS(Ti), assigned
before Ti starts execution. To implement this scheme, we associate with each data item Q two
timestamp values:
• W-timestamp(Q) denotes the largest timestamp of any transaction that executed write(Q) successfully.
• R-timestamp(Q) denotes the largest timestamp of any transaction that executed read(Q) successfully.
The timestamp-ordering protocol ensures that any conflicting read and write operations
are executed in timestamp order. This protocol operates as follows:
1. Suppose that transaction Ti issues read(Q).
a. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that was already
overwritten. Hence, the read operation is rejected and Ti is rolled back.
b. Otherwise, the system executes the read operation and sets R-timestamp(Q) to the
maximum of R-timestamp(Q) and TS(Ti).
2. Suppose that transaction Ti issues write(Q).
a. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was needed
previously, and the system assumed that the value would never be produced. Hence, the
write operation is rejected and Ti is rolled back.
b. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q.
Hence, the write operation is rejected and Ti is rolled back.
c. Otherwise, the system executes the write operation and sets W-timestamp(Q) to
TS(Ti).
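A compact Python sketch of these read and write checks (the dictionary-based bookkeeping below is an assumption made for illustration):

# Sketch: the timestamp-ordering checks for read(Q) and write(Q).
W_ts = {}   # W-timestamp(Q): largest timestamp of any successful write(Q)
R_ts = {}   # R-timestamp(Q): largest timestamp of any successful read(Q)

def read(ts, q):
    if ts < W_ts.get(q, 0):
        return "rollback"            # Q was already overwritten by a younger txn
    R_ts[q] = max(R_ts.get(q, 0), ts)
    return "read executed"

def write(ts, q):
    if ts < R_ts.get(q, 0):
        return "rollback"            # a younger txn already read the old value
    if ts < W_ts.get(q, 0):
        return "rollback"            # (Thomas' write rule would ignore this write)
    W_ts[q] = ts
    return "write executed"

print(read(5, "Q"))    # read executed; R-timestamp(Q) = 5
print(write(7, "Q"))   # write executed; W-timestamp(Q) = 7
print(write(5, "Q"))   # rollback: TS = 5 < W-timestamp(Q) = 7

The last call mirrors the T16/T17 situation discussed below: a write arriving with a timestamp smaller than W-timestamp(Q) is rejected.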
The protocol can generate schedules that are not recoverable. However, it can be
extended to make the schedules recoverable, in one of several ways.
The following schedule is possible under the timestamp-ordering protocol. Since
TS(T14) < TS(T15), the schedule must be conflict equivalent to the serial schedule <T14, T15>.
T14                     T15
read(B)
                        read(B)
                        B := B − 50
                        write(B)
read(A)
                        read(A)
display(A + B)
                        A := A + 50
                        write(A)
                        display(A + B)
Consider a schedule in which T16 issues read(Q), T17 then issues write(Q), and T16 finally
issues write(Q). The read(Q) operation of T16 succeeds, as does the write(Q) operation of
T17. When T16 attempts its write(Q) operation, we find that TS(T16) < W-timestamp(Q),
since W-timestamp(Q) = TS(T17).
Thus the write(Q) by T16 is rejected and transaction T16 must be rolled back.
Although the rollback of T16 is required by the timestamp-ordering protocol, it
is unnecessary. Since T17 has already written Q, the value that T16 is
attempting to write is one that will never need to be read.
Any transaction Ti with TS(Ti) < TS(T17) that attempts a read(Q) will be rolled
back, since TS(Ti) < W-timestamp(Q).
This observation leads to a modified version of the timestamp-ordering protocol, called
Thomas' write rule, in which the rules for write(Q) become:
1. If TS(Ti) < R-timestamp(Q), the write operation is rejected and Ti is rolled back.
2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q;
hence, this write operation can be ignored.
3. Otherwise, the system executes the write operation and sets W-timestamp(Q) to TS(Ti).
Under Thomas' write rule, the write(Q) operation of T16 would be ignored. The
result is a schedule that is view equivalent to the serial schedule <T16, T17>.
VALIDATION-BASED PROTOCOLS:
Each transaction Ti executes in three phases:
1. Read phase: Ti reads the values of the data items it needs and stores them in local
variables; all write operations are performed on temporary local variables, without updating
the actual database.
2. Validation phase: the system performs a validation test to determine whether Ti can copy
its temporary local variables to the database without violating serializability.
3. Write phase: if transaction Ti succeeds in validation (step 2), then the system applies
the actual updates to the database. Otherwise, the system rolls back Ti.
To perform the validation test, we need to know when the various phases of transaction
Ti took place; we therefore associate three different timestamps with transaction Ti:
1. Start(Ti), the time when Ti started its execution.
2. Validation(Ti), the time when Ti finished its read phase and started its validation phase.
3. Finish(Ti), the time when Ti finished its write phase.
The validation test for transaction Tj requires that, for every transaction Ti with
TS(Ti) < TS(Tj), one of the following two conditions must hold:
1. Finish(Ti) < Start(Tj): Ti completes its execution before Tj starts, so the serializability
order is maintained.
2. The set of data items written by Ti does not intersect with the set of data items read
by Tj, and Ti completes its write phase before Tj starts its validation
phase (Finish(Ti) < Validation(Tj)). This condition ensures that the writes of Ti and Tj do
not overlap.
The validation scheme is called the optimistic concurrency-control scheme, since
transactions execute optimistically, assuming they will be able to finish execution and
validate at the end. In contrast, locking and timestamp ordering are pessimistic in that
they force a wait or a rollback whenever a conflict is detected, even though there is a
chance that the schedule may be conflict serializable.
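A minimal sketch of this validation test, assuming each transaction records its read set, write set, and the three timestamps in a dictionary (illustrative field names):

# Sketch: validation test of Tj against every Ti with TS(Ti) < TS(Tj).
def valid(ti, tj):
    # Condition 1: Ti finished before Tj started.
    if ti["finish"] < tj["start"]:
        return True
    # Condition 2: Ti's writes do not touch Tj's reads, and Ti finished
    # its write phase before Tj started its validation phase.
    return (not (ti["write_set"] & tj["read_set"])
            and ti["finish"] < tj["validation"])

t1 = {"write_set": {"A"}, "read_set": {"A"}, "start": 1, "validation": 4, "finish": 5}
t2 = {"write_set": {"B"}, "read_set": {"B"}, "start": 2, "validation": 6, "finish": 8}
print(valid(t1, t2))   # True: disjoint sets and Finish(T1) < Validation(T2)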
Multiple Granularity
The granularity of locking is the level in the granularity tree where locking is done.
Fine granularity (lower in the tree) gives high concurrency but high locking overhead;
coarse granularity (higher in the tree) gives low locking overhead but low concurrency.
Multiversion Protocols:
Concurrency control protocols studied thus far ensure serializability by either delaying an
operation or aborting the transaction.
Multiversion schemes keep old versions of data items to increase concurrency. Each
successful write(Q) creates a new version of Q. Timestamps are used to label versions. When
a read(Q) operation is issued, an appropriate version of Q is selected based on the timestamp
of the transaction, so reads never have to wait, as an appropriate version is always available.
There are two types of multiversion protocols: multiversion timestamp ordering and
multiversion two-phase locking.
In multiversion timestamp ordering:
1. If transaction Ti issues a read(Q), then the value returned is the content of version Qk,
where Qk is the version of Q with the largest write timestamp less than or equal to TS(Ti).
2. If transaction Ti issues a write(Q), and a transaction with a timestamp greater than TS(Ti)
has already read the version Qk that this write should have preceded (TS(Ti) < R-timestamp(Qk)),
then Ti is rolled back; if TS(Ti) = W-timestamp(Qk), the contents of Qk are overwritten;
otherwise, a new version of Q is created.
Disadvantages
Reading of a data item also requires updating the R-timestamp of the version read,
resulting in two disk accesses rather than one.
Conflicts between transactions are resolved through rollbacks rather than through
waits.
DEADLOCK HANDLING:
A system is in a deadlock state if there is a set of transactions such that every transaction
in the set is waiting for another transaction in the set. Consider the following partial schedule:

T1                              T2
lock-X on X
write(X)
                                lock-X on Y
                                write(Y)
                                lock-X on X (wait)
lock-X on Y (wait)

Neither transaction can proceed: T2 is waiting for T1 to release its lock on X, and T1 is
waiting for T2 to release its lock on Y.
DEADLOCK PREVENTION:
There are two approaches to deadlock prevention. One approach ensures that no cyclic
waits can occur, by ordering the requests for locks or by requiring all locks to be acquired
together.
The other approach is closer to deadlock recovery: it performs transaction rollback
instead of waiting for a lock, whenever the wait could potentially result in a deadlock.
The first approach requires that each transaction lock all its data items before it begins
execution. There are two main disadvantages to this protocol:
1. It is often hard to predict, before the transaction begins, what data items need to be
locked.
2. Data-item utilization may be very low, since many of the data items may be locked
but unused for a long time.
Two different deadlock-prevention schemes using timestamps have been proposed.
1. The wait-die scheme is a non-preemptive technique. When transaction Ti
requests a data item currently held by Tj, Ti is allowed to wait only if it has a timestamp
smaller than that of Tj (that is, Ti is older than Tj); otherwise, Ti is rolled back (dies).
2. The wound-wait scheme is a preemptive technique. When transaction Ti requests a data item
currently held by Tj, Ti is allowed to wait only if it has a timestamp larger than that of Tj
(that is, Ti is younger than Tj); otherwise, Tj is rolled back (Tj is wounded by Ti). A sketch
of both decisions follows below.
Whenever the system rolls back a transaction, it is important to ensure that there
is no starvation, i.e., no transaction gets rolled back repeatedly and is never allowed to
make progress.
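A small sketch of the two decisions, using numeric timestamps where a smaller value means an older transaction (an illustrative encoding):

# Sketch: deadlock-prevention decisions when Ti requests an item held by Tj.
def wait_die(ts_i, ts_j):
    # Non-preemptive: an older requester waits, a younger requester dies.
    return "Ti waits" if ts_i < ts_j else "Ti is rolled back"

def wound_wait(ts_i, ts_j):
    # Preemptive: an older requester wounds (rolls back) the holder,
    # a younger requester waits.
    return "Tj is rolled back" if ts_i < ts_j else "Ti waits"

print(wait_die(10, 20))     # Ti waits           (Ti older than Tj)
print(wait_die(20, 10))     # Ti is rolled back  (Ti younger than Tj)
print(wound_wait(10, 20))   # Tj is rolled back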
TIMEOUT-BASED SCHEMES:
In the timeout-based scheme, a transaction that has requested a lock waits for at most a
specified amount of time; if the lock has not been granted within that time, the transaction
is rolled back and restarted.
When a deadlock does occur and must be resolved, recovery involves three actions:
1. Selection of a victim: select the transaction(s) to roll back that will incur minimum
cost.
2. Rollback: determine how far to roll back the transaction. A total rollback aborts the
transaction and then restarts it; it is more effective to roll back the transaction only as
far as necessary to break the deadlock.
3. Check starvation: starvation happens if the same transaction is always chosen as victim.
Including the number of rollbacks in the cost factor avoids starvation.
DEADLOCK DETECTION:
Deadlocks can be detected by maintaining a wait-for graph, with an edge Ti → Tj whenever Ti
is waiting for a data item held by Tj, and periodically invoking an algorithm that searches
this graph for cycles.
When a detection algorithm determines that a deadlock exists, the system must
recover from the deadlock. The most common solution is to roll back one or more
transactions to break the deadlock.
The simplest solution is a total rollback: abort the transaction and then restart it. However,
it is more effective to roll back the transaction only as far as necessary to break the
deadlock. Such partial rollback requires the system to maintain additional information about
the state of all the running transactions.
In a system where the selection of victims is based primarily on cost factors, it may happen
that the same transaction is always picked as a victim. As a result, this transaction never
completes its designated task, and starvation results. The most common solution is to include
the number of rollbacks in the cost factor.
Some transactions require, in addition to read and write, the ability to insert new data items;
others require the ability to delete data items. To examine how such transactions affect
concurrency control, we introduce these additional operations:
delete(Q) deletes data item Q from the database.
insert(Q) inserts a new data item Q into the database and assigns Q an initial value.
DELETION:
The presence of delete instructions affects concurrency control, and we must decide
when a delete instruction conflicts with another instruction. Let Ii and Ij be instructions
of Ti and Tj, respectively, that appear in schedule S in consecutive order, and let
Ii = delete(Q). A delete(Q) conflicts with a read(Q), write(Q), insert(Q), or delete(Q) of
another transaction on the same data item, so the order of the two instructions matters.
INSERTION:
Consider a transaction T29 that executes the following SQL query on the bank
database:
select sum(balance)
from account
where branch_name = 'perryridge'
Transaction T29 requires access to all tuples of the account relation pertaining to
the Perryridge branch.
The major disadvantage of locking a data item corresponding to the entire relation is the
low degree of concurrency: two transactions that insert different tuples into a relation
are prevented from executing concurrently. A better solution is the index-locking
technique. The index-locking protocol takes advantage of the availability of indices
on a relation by turning instances of the phantom phenomenon into conflicts on locks
on index leaf nodes. The protocol operates as follows. Every relation must have at least one
index.
A transaction Ti can access tuples of a relation only after first finding them
through one or more of the indices on the relation.
A transaction Ti that performs a lookup must acquire a shared lock on all the
index leaf nodes that it accesses.
A transaction Ti may not insert, delete, or update a tuple ti in a relation r without
updating all indices on r; it must obtain exclusive locks on all index leaf nodes affected
by the insertion, deletion, or update.
The rules of the two-phase locking protocol must be observed.
RECOVERY SYSTEM
Recovery system:
Restoring the database to a consistent state after a failure or crash is called recovery
(crash recovery); the part of the database system that does this is the recovery system.
Failure Classification
There are various types of failure that may occur in a system:
Transaction failure:
1. Logical errors: the transaction cannot complete due to some internal error condition.
2. System errors: the database system must terminate an active transaction due to an error
condition (e.g., deadlock).
System crash:
A power failure or other hardware or software failure causes the system to crash.
Fail-stop assumption: non-volatile storage contents are assumed not to be corrupted by a
system crash. Database systems have numerous integrity checks to prevent corruption of disk data.
Disk failure:
A head crash or similar disk failure destroys all or part of the disk. Destruction is assumed
to be detectable: disk drives use checksums to detect failures.
Storage Structure
Recovery algorithms are based on the following types of storage:
Volatile storage: does not survive system crashes; examples are main memory and cache memory.
Non-volatile storage: survives system crashes; examples are disk, tape, flash memory, and
non-volatile (battery-backed-up) RAM.
Stable storage: a mythical form of storage that survives all failures, approximated by
maintaining multiple copies on distinct non-volatile media.
Stable Storage Implementation
Maintain multiple copies of each block on separate disks; copies can be at remote
sites to protect against disasters such as fire or flooding.
A block transfer between memory and disk can result in successful completion, partial failure
(the destination block has incorrect information), or total failure (the destination block was
never updated). To implement stable storage, each output is executed as follows:
1. Write the information onto the first physical block.
2. When the first write successfully completes, write the same information onto the
second physical block.
3. The output is completed only after the second write successfully completes.
Copies of a block may differ due to a failure during an output operation. To recover from
such a failure, compare the two copies of each block: if one copy has a detectable error
(bad checksum), replace it with the contents of the other; if both are error-free but differ,
replace the second copy with the first.
Data Access:
Block movements between disk and main memory are initiated through the
following two operations:
input(B) transfers the physical block B to main memory.
output(B) transfers the buffer block B to the disk, and replaces the
appropriate physical block there.
Each transaction Ti has its private work-area in which local copies of all
data items accessed and updated by it are kept.
We assume, for simplicity, that each data item fits in, and is stored inside, a single
block.
Transaction transfers data items between system buffer blocks and its private work-area
using the following operations :
read(X) assigns the value of data item X to the local variable xi.
write(X) assigns the value of local variable xi to data item X in the buffer block.
Recovery and Atomicity:
Modifying the database without ensuring that the transaction will commit may
leave the database in an inconsistent state.
Consider a transaction Ti that transfers $50 from account A to account B. Several output
operations may be required for Ti (to output A and B). A failure may occur after one of
these modifications has been made but before all of them are made.
To ensure atomicity despite failures, we first output information describing the modifications
to stable storage, without modifying the database itself. There are two approaches:
1. log-based recovery, and
2. shadow paging.
We assume (initially) that transactions run serially, that is, one after the other.
Log-Based Recovery:
The log is a sequence of log records; it maintains a record of update activities on
the database.
When transaction Ti starts, it registers itself by writing a <Ti start> log record.
Before Ti executes write(X), a log record <Ti, X, V1, V2> is written, where V1 is
the value of X before the write, and V2 is the value to be written to X.
The log record notes that Ti has performed a write on data item X; X had value V1
before the write, and will have value V2 after the write.
When Ti finishes its last statement, the log record <Ti commit> is written.
We assume for now that log records are written directly to stable storage (that is,
they are not buffered).
The deferred database modification scheme records all modifications to the log, but
defers all the writes to after partial commit.
T0: read(A)                     T1: read(C)
    A := A − 50                     C := C − 100
    write(A)                        write(C)
    read(B)
    B := B + 50
    write(B)

If a crash occurs, the recovery actions under deferred modification depend on which commit
records have reached stable storage:
(a) if the crash occurs before <T0 commit> is written, no redo is needed;
(b) if the crash occurs after <T0 commit> but before <T1 commit>, redo(T0) is performed;
(c) if the crash occurs after <T1 commit>, redo(T0) must be performed followed by redo(T1),
since <T0 commit> and <T1 commit> are both present.
Example: the log at the time of a crash might contain
<T0 start>
<T0, A, 950>
<T0, B, 2050>
<T0 commit>
<T1 start>
<T1, C, 600>
<T1 commit>
in which case both T0 and T1 must be redone.
undo(Ti) restores the value of all data items updated by Ti to their old values,
going backwards from the last log record for Ti.
redo(Ti) sets the value of all data items updated by Ti to the
new values, going forward from the first log record for Ti.
Both operations must be idempotent; that is, even if the operation is executed
multiple times, the effect is the same as if it were executed once.
Depending on when the crash occurs, recovery may require:
(a) undo(T0): B is restored to 2000 and A to 1000.
(b) undo(T1) and redo(T0): C is restored to 700, and then A and B are set to 950 and 2050,
respectively.
(c) redo(T0) and redo(T1): A and B are set to 950 and 2050, respectively; then C is set to 600.
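As a rough sketch of these two operations over a simplified in-memory log of (transaction, item, old value, new value) records plus start/commit markers (a simplified format, not the exact record syntax used above):

# Sketch: idempotent redo and undo over a simple in-memory log.
# Update records are (txn, item, old_value, new_value); markers are
# ("start", txn) and ("commit", txn).
db = {"A": 1000, "B": 2000, "C": 700}
log = [("start", "T0"), ("T0", "A", 1000, 950), ("T0", "B", 2000, 2050),
       ("commit", "T0"), ("start", "T1"), ("T1", "C", 700, 600)]   # crash before <T1 commit>

def redo(txn):
    # go forward from the first log record of txn, writing the new values
    for rec in log:
        if len(rec) == 4 and rec[0] == txn:
            db[rec[1]] = rec[3]

def undo(txn):
    # go backwards from the last log record of txn, restoring the old values
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] == txn:
            db[rec[1]] = rec[2]

# T0 has a commit record, T1 does not: case (b) above.
undo("T1")
redo("T0")
print(db)   # {'A': 950, 'B': 2050, 'C': 700}

Because both routines simply reassign values, running them more than once leaves the database in the same state, which is the idempotence property required above.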
Checkpoints:
When a system failure occurs, we must consult the log to determine those transactions
that need to be redone and those that need to be undone. In principle, we need to search
the entire log to determine this information. There are two major difficulties with this approach:
1. the search process is time-consuming, and
2. most of the transactions that need to be redone have already written their updates into
the database; redoing them causes no harm, but makes recovery take longer.
To reduce these types of overhead, the system periodically performs a checkpoint:
1. Output onto stable storage all log records currently residing in main memory.
2. Output to the disk all modified buffer blocks.
3. Output onto stable storage a log record <checkpoint>.
Shadow Paging
Shadow paging maintains two page tables during the lifetime of a transaction: the current
page table and the shadow page table.
The shadow page table is stored in non-volatile storage, so that the state of the database
prior to transaction execution may be recovered.
The shadow page table is never modified during execution. To start with, both page
tables are identical. Only the current page table is used for data-item accesses during
execution of the transaction.
To commit a transaction:
1. Flush all modified pages in main memory to disk.
2. Output the current page table to disk.
3. Make the current page table the new shadow page table, as follows:
keep a pointer to the shadow page table at a fixed (known) location on disk;
to make the current page table the new shadow page table, simply update the pointer
to point to the current page table on disk.
Once the pointer to the shadow page table has been written, the transaction is committed.
No recovery is needed after a crash: new transactions can start right away, using the
shadow page table.
Pages not pointed to from current/shadow page table should be freed (garbage
collected).
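A toy Python sketch of the commit step, modelling the page table as a dictionary and the fixed disk location as a single variable (illustrative names; real shadow paging operates on disk pages):

# Sketch: shadow paging commit = switching one pointer from the shadow
# page table to the current page table.
disk_pages = {1: "old A", 2: "old B"}                   # page store
shadow_page_table = {"A": 1, "B": 2}                    # state before the transaction
db_pointer = shadow_page_table                          # fixed (known) location on disk
current_page_table = dict(shadow_page_table)            # starts out identical

# The transaction updates A: write a fresh page and change only the current table.
disk_pages[3] = "new A"
current_page_table["A"] = 3

# Commit: flush modified pages and the current page table, then atomically
# update the pointer; a crash before this line leaves the shadow state intact.
db_pointer = current_page_table

print(disk_pages[db_pointer["A"]])    # new A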
Disadvantages:
Copying the entire page table and flushing all updated pages make the commit overhead high;
updated data pages lose physical locality (data fragmentation); old page versions must be
garbage collected; and the scheme is hard to extend to allow transactions to run concurrently.