Databases 2 Course Material

Course type: Undergraduate

Module delivery: Semester
ECTS: 7.5 credits
Assessment:
• Assignment (40%)
• Mid-term examination (20%)
• Final examination (40%)
Course overview
1) Introduction to database management system (DBMS)
2) Normalization
3) Transactions
4) DBMS languages
5) SQL create database
6) SQL queries
Topic 1 Introduction to database management system

Databases
 A database is a collection of data stored in tables

 The tables contain rows of data (records)
 The records can be processed to produce information
Database Management System (DBMS)

 A database management system controls the logical and physical organization of the database.
 It manages different kinds of data: user data, metadata, indexes, and other overhead data.
 It also includes a collection of programs that allow users to:
 Create a database
 Administer and maintain a database
 Query and process a database
Functions of the DBMS
 Data storage, retrieval and update

 Enforce rules
 Transaction support
 Concurrency control
 Backup and recovery
 Provide security
Why learn DBMS?

 Real-world entity: A modern DBMS models real-world entities, together with their behavior
and attributes, in its architecture. For example, a school database may use students as an
entity and their age as an attribute.
 Relation-based tables: DBMS allows entities and relations among them to form tables. A user
can understand the architecture of a database just by looking at the table names.
 Isolation of data and application: A database system is entirely different from its data. A
database is an active entity, whereas data is passive: it is what the database works on
and organizes. A DBMS also stores metadata, which is data about data, to ease its own processing.
 Less redundancy: A DBMS follows the rules of normalization, which split a relation when any
of its attributes has redundant values. Normalization is a mathematically rich and
scientific process that reduces data redundancy.
 Consistency: Consistency is a state in which every relation in a database remains consistent.
There are methods and techniques that can detect an attempt to leave the database in an
inconsistent state. A DBMS can provide greater consistency than earlier forms of
data storage such as file-processing systems.
 Query language: A DBMS is equipped with a query language, which makes it more efficient to
retrieve and manipulate data. A user can apply as many different filtering options as
required to retrieve a set of data. This was not possible with traditional file-processing
systems.
Users
A DBMS has different kinds of users, who have different rights and permissions and use it for
different purposes.
 Administrators: Maintain the DBMS and are responsible for administering the database.
They oversee how it is used and by whom. They create access profiles for users and apply
limitations to maintain isolation and enforce security. Administrators also look after DBMS
resources such as the system license, required tools, and other software and hardware
maintenance.
 Designers: A group of people who actually work on the designing part of the database.
They keep a close watch on what data should be kept and in what format. They identify and
design the whole set of entities, relations, constraints, and views.
 End users: Those who actually reap the benefits of having a DBMS. End users can
range from simple viewers who pay attention to the logs or market rates to sophisticated
users such as business analysts.
Architecture of DBMS
 DBMS architecture helps in the design, development, implementation, and maintenance of a
database that stores and organizes information.
 It is the base of any database management system, which allows it to perform the functions
effectively and efficiently.
 It can be designed as centralized, decentralized, or hierarchical.
 The architecture of a DBMS can be seen as either single tier or multi-tier.
1-tier Architecture
In 1-tier architecture, the database is directly available to the DBMS user for executing the SQL
queries and storing data in it. Any changes or updates that are done here will be reflected directly
to the database in the database management system.
Generally, 1-tier architecture is used for the development of applications where a programmer or
developer directly communicates with the database for a quick response.
2-tier Architecture
The 2-tier architecture of a DBMS is based on the client-server model. In this type of architecture,
applications on the client side can interact directly with the database on the server side. For this
interaction between the client and the server, application programming interfaces (APIs) such as
Open Database Connectivity (ODBC) and Java Database Connectivity (JDBC) are used.
This architecture gives poor performance when a large number of client users access the
database.
3-tier Architecture
The DBMS 3-tier architecture consists of another layer between the client and the server. In this
architecture, the client cannot directly interact with the server. Its features, such as data backup,
recovery, security, and concurrency control make it the most commonly used architecture for
designing the database management system.
The 3-tier architecture consists of the following layers:
 Presentation layer: This layer is also known as the client layer. It is the front end layer in
the 3-tier architecture and consists of a user interface. The main purpose of this layer is to
communicate with the application layer.
 Application layer: This layer is also known as the business logic layer. It acts as a
middle layer between the client and the database server for exchange of partially
processed data.
 Database layer: The data or information is stored in this layer. This layer contains a
method to connect with the database and to perform operations such as insert, update, and
delete.
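The three layers can be sketched as three groups of functions. This is only an illustration, assuming Python's built-in `sqlite3` module as the database; the `student` table, the function names, and the minimum-age rule are hypothetical and not part of the course material.

```python
import sqlite3

# Database layer: the only code that talks to the database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, age INTEGER)")

def db_insert_student(name, age):
    with conn:  # commits the insert
        conn.execute("INSERT INTO student VALUES (?, ?)", (name, age))

def db_count_students():
    return conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# Application layer: business logic between the client and the database.
def register_student(name, age):
    if not name or age < 16:  # hypothetical business rule
        return "rejected"
    db_insert_student(name.strip(), age)
    return "registered"

# Presentation layer: formats results for the user and never touches conn.
def show_registration(name, age):
    return f"{name}: {register_student(name, age)} ({db_count_students()} students on record)"

message = show_registration("Ana", 20)
```

The presentation function reaches the data only through the application layer, which mirrors the rule that the client cannot interact with the server directly.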
Normalization
 Normalization is an important tool that allows quality database design
 It helps design the most appropriate structures for data
 It is derived from the work of Dr. Edgar Codd
 It re-organizes the relations based on rules
Why normalize?
 Reduces the chance of redundant data
 Leads to flexible database design
 Allows future changes to structure
 Addition of entities, attributes and relationships
 Easy insertion, modification and deletion of data
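A small sketch of the idea, using plain Python data rather than a real DBMS. The order rows, names, and cities are invented for illustration. A relation with a repeated attribute value is split into two relations, so that a change is made in one place only.

```python
# Unnormalized: the customer's city is repeated on every order row.
orders_flat = [
    (1, "Ana", "Lisbon", "keyboard"),
    (2, "Ana", "Lisbon", "monitor"),
    (3, "Ben", "Oslo", "mouse"),
]

# Split into two relations: one row per customer, one row per order.
customers = {name: city for _, name, city, _ in orders_flat}
orders = [(order_id, name, item) for order_id, name, _, item in orders_flat]

# A change of city is now a single update instead of one update per order.
customers["Ana"] = "Porto"

# Joining the two relations again gives consistent rows.
rejoined = [(oid, name, customers[name], item) for oid, name, item in orders]
```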
Integrity
Integrity = correctness and consistency of data.
Entity Integrity: In each table, each row has a unique and non-null primary key.
Data Integrity: Every attribute must have correct and meaningful data.
Referential Integrity: Data of one table does not contradict the data in another table.
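The three kinds of integrity can be tried out with declarative constraints. The sketch below assumes SQLite through Python's `sqlite3` module (the course itself uses SQL Server, where the syntax differs slightly); the `student` and `enrolment` tables and the helper `rejected` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,      -- entity integrity
        name       TEXT NOT NULL,
        age        INTEGER CHECK (age > 0)   -- data integrity
    )""")
conn.execute("""
    CREATE TABLE enrolment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),  -- referential integrity
        course     TEXT NOT NULL,
        PRIMARY KEY (student_id, course)
    )""")
conn.execute("INSERT INTO student VALUES (1, 'Ana', 20)")

def rejected(sql):
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

entity_violation = rejected("INSERT INTO student VALUES (1, 'Ben', 21)")       # duplicate key
data_violation = rejected("INSERT INTO student VALUES (2, 'Cy', -5)")          # meaningless age
referential_violation = rejected("INSERT INTO enrolment VALUES (99, 'Databases 2')")  # no such student
```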
Transactions
 A transaction is a fundamental mechanism in database management.
 It is a very small unit of a program.
 It may contain several low-level tasks.
 In a database system, it must maintain the ACID properties.
 It is a major tool in the preservation of integrity and accuracy.
 It is crucial for multi-user access.
SQL Server Management Studio (SSMS)
Topic 2 Normalization
Topic 3 Transactions
 A transaction is a fundamental mechanism in database management.
 It is a very small unit of a program.
 It may contain several low-level tasks.
 In a database system, it must maintain the ACID properties.
 It is a major tool in the preservation of integrity and accuracy.
 It is crucial for multi-user access.
ACID properties
 Atomicity: A transaction must be treated as an atomic unit: either all or none of its
updates are performed.
 Consistency: The database must be left in a consistent state when the transaction terminates.
 Isolation: Concurrent transactions are kept isolated from each other, so that no transaction
interferes with the execution of any other transaction.
 Durability: Once a transaction has committed, its changes survive even a serious failure of the
system, such as loss of power to the computer.
Example:
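A minimal sketch of atomicity and consistency, assuming SQLite through Python's `sqlite3` module. The `account` table, the balances, and the `transfer` function are hypothetical. The transfer credits one account and debits another; if the debit would break the rule that a balance cannot be negative, the whole transaction is rolled back, including the credit that had already been applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()

def transfer(src, dst, amount):
    """Move money between accounts as one atomic unit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]

ok = transfer(1, 2, 30)        # both updates are performed
failed = transfer(1, 2, 500)   # the debit is refused, so the credit is undone as well
```

After both calls the balances are 70 and 80: the failed transfer left no trace.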
Operations of transaction
 Read Operation: This operation transfers the data item from the database and then stores it in a
buffer in main memory.
 Write Operation: This operation writes the updated data value back to the database from the
buffer.
 Commit Operation: This operation is used to save the work done permanently in the database.
 Rollback Operation: This operation is used to undo the work done.
Commit example
Rollback
The rollback will effectively delete the new rows and correct the updated row.
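The effect can be reproduced with a small sketch, assuming SQLite through Python's `sqlite3` module; the `product` table and its values are invented. One row is committed, then two rows are inserted and the committed row is updated inside a transaction that is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, price INTEGER)")
conn.execute("INSERT INTO product VALUES (1, 10)")
conn.commit()  # the first row is now saved permanently

# A new transaction: two new rows and one updated row, not yet committed.
conn.execute("INSERT INTO product VALUES (2, 20)")
conn.execute("INSERT INTO product VALUES (3, 30)")
conn.execute("UPDATE product SET price = 99 WHERE id = 1")
before = conn.execute("SELECT id, price FROM product ORDER BY id").fetchall()

conn.rollback()  # undo the work done since the last commit
after = conn.execute("SELECT id, price FROM product ORDER BY id").fetchall()
```

Here `before` holds three rows with the changed price, while `after` holds only the originally committed row with its original price.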
Concurrency control
 Allows multiple transactions to be executed simultaneously.
 Ensures the ACID properties.
 Decreases waiting time or turnaround time.
 Improves response time.
 Increases throughput or resource utilization.
Concurrency control problems

 The lost update: Data lost due to overlapping updates to the same record.
 Uncommitted dependency: Database corruption due to use of an update made by a transaction
that has not yet committed.
 Inconsistent retrieval: Erroneous query result due to overlapping transactions.
The Lost Update problem

If two or more people are updating a record:
 Each takes copy of record
 Performs their updates
 Last one to commit overwrites others
 Integrity lost
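The sequence can be simulated in a few lines of plain Python. The `stock` record and the quantities are hypothetical; the two "users" are simply two copies of the record.

```python
record = {"stock": 10}

# Each user takes a copy of the record before either one writes.
copy_a = dict(record)
copy_b = dict(record)

# Each performs their update on their own copy.
copy_a["stock"] -= 3   # user A sells 3 units
copy_b["stock"] -= 4   # user B sells 4 units

# The last one to commit overwrites the other.
record = copy_a        # A commits: stock is 7
record = copy_b        # B commits: stock is 6, A's update is lost

expected = 10 - 3 - 4  # what the stock should be if both updates had counted
```

The record ends at 6 instead of 3, so three sold units have vanished from the books.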
Uncommitted Dependency problem

 Occurs when a transaction is allowed to see intermediate results
 Intermediate results used as basis for another transaction
 If then rolled back, integrity lost
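A plain Python simulation of the problem, with invented values: transaction T2 reads a value that T1 has written but not committed, and T1 then rolls back.

```python
committed = {"balance": 100}

# T1 writes an intermediate result but has not committed yet.
t1_workspace = dict(committed)
t1_workspace["balance"] += 50

# T2 is allowed to see T1's intermediate result and uses it as its basis.
seen_by_t2 = t1_workspace["balance"]   # 150, a value that was never committed
bonus = seen_by_t2 // 10

# T1 rolls back: its workspace is discarded and the balance stays at 100.
t1_workspace = None

# T2 commits a bonus derived from a balance that never existed.
committed["bonus"] = bonus
correct_bonus = committed["balance"] // 10
```

The stored bonus is 15, although the committed balance of 100 only justifies 10.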
Inconsistent Retrieval problem

 Occurs when a transaction is allowed to read partial results of incomplete transactions
 Database integrity is not lost, but the results of the reading transaction are inconsistent
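A plain Python simulation with three hypothetical accounts: T1 adds up all balances while T2 is in the middle of transferring 50 from account C to account A.

```python
accounts = {"A": 100, "B": 100, "C": 100}
true_total = sum(accounts.values())   # 300

running = accounts["A"]               # T1 reads A = 100
accounts["C"] -= 50                   # T2 debits C (transfer not finished yet)
running += accounts["B"]              # T1 reads B = 100
running += accounts["C"]              # T1 reads C = 50, a partial result of T2
accounts["A"] += 50                   # T2 credits A and commits

total_after = sum(accounts.values())  # the database itself is still correct
```

T1 reports 250, although the total was 300 both before and after the transfer.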
Concurrency control solution protocols

 Lock-based protocols use a mechanism by which no transaction can read or write data until
it acquires an appropriate lock on it. They manage the order between the conflicting pairs among
transactions at the time of execution, whereas timestamp-based protocols start working as soon
as a transaction is created.
 Timestamp-based protocols use either the system time or a logical counter as a
timestamp.
Lock-based protocols
Simplistic Lock Protocol
Simplistic lock-based protocols require a transaction to obtain a lock on every object before a
'write' operation is performed. The transaction may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol

Pre-claiming protocols evaluate a transaction's operations and create a list of data items on which
it needs locks. Before initiating an execution, the transaction requests the system for all the locks
it needs beforehand. If all the locks are granted, the transaction executes and releases all the locks
when all its operations are over. If not all the locks are granted, the transaction rolls back and
waits until all the locks are granted.
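A minimal sketch of the all-or-nothing request, under the assumption of a single in-memory lock table with exclusive locks only. The class `LockTable` and the item names are hypothetical, and the waiting is left to the caller.

```python
class LockTable:
    def __init__(self):
        self.owner = {}  # data item -> transaction holding its lock

    def claim_all(self, txn, items):
        """Grant every requested lock, or none of them."""
        if any(self.owner.get(item, txn) != txn for item in items):
            return False                  # at least one item is held by someone else
        for item in items:
            self.owner[item] = txn
        return True

    def release_all(self, txn):
        self.owner = {i: t for i, t in self.owner.items() if t != txn}

table = LockTable()
t1_runs = table.claim_all("T1", ["x", "y"])    # all granted: T1 may execute
t2_runs = table.claim_all("T2", ["y", "z"])    # y is held by T1: T2 gets nothing
z_untouched = "z" not in table.owner           # not even the free item z was locked
table.release_all("T1")                        # T1 has finished all its operations
t2_retry = table.claim_all("T2", ["y", "z"])   # now T2 can start
```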
Two-Phase Locking 2PL

When the transaction starts executing, it seeks permission for the locks it requires. As soon as the
transaction releases its first lock, the transaction cannot demand any new locks; it only releases the
acquired locks.
Two-phase locking has two phases:
 The growing phase, where all the locks are being acquired by the transaction.
 The shrinking phase, where the locks held by the transaction are being released.
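The two-phase rule itself fits in a few lines. The sketch below only tracks the phases of one transaction and ignores conflicts with other transactions; the class name and items are hypothetical.

```python
class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False   # becomes True at the first release

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot acquire {item!r} after releasing a lock")
        self.held.add(item)      # growing phase

    def unlock(self, item):
        self.held.remove(item)
        self.shrinking = True    # shrinking phase has begun

t = TwoPhaseTransaction("T1")
t.lock("x")
t.lock("y")
t.unlock("x")                    # first release ends the growing phase
try:
    t.lock("z")
    violation_detected = False
except RuntimeError:
    violation_detected = True
```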
Strict Two-Phase Locking

The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase,
the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a
lock after using it. Strict-2PL holds all the locks until the commit point and releases them all at once.
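A sketch of the difference from plain 2PL, assuming exclusive locks kept in a shared dictionary. The class and item names are hypothetical. There is deliberately no `unlock` method: the only way to release a lock is `commit`.

```python
class StrictTwoPhaseTransaction:
    def __init__(self, lock_owner, name):
        self.lock_owner = lock_owner   # shared table: data item -> holder
        self.name = name
        self.held = set()

    def lock(self, item):
        holder = self.lock_owner.setdefault(item, self.name)
        if holder != self.name:
            return False               # held by another transaction: caller must wait
        self.held.add(item)
        return True

    def commit(self):
        # The only place locks are released: all of them, at once.
        for item in self.held:
            del self.lock_owner[item]
        self.held.clear()

lock_owner = {}
t1 = StrictTwoPhaseTransaction(lock_owner, "T1")
t2 = StrictTwoPhaseTransaction(lock_owner, "T2")
t1.lock("x")
t1.lock("y")
blocked = not t2.lock("x")   # T1 may be done with x, but it keeps the lock
t1.commit()
granted = t2.lock("x")       # available only after T1's commit point
```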
Timestamp based protocols
 The timestamp-based protocol is a widely used alternative to locking.
 This protocol uses either the system time or a logical counter as a timestamp.
 Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created.
 Every data item records the timestamps of its latest read and its latest write.
 This lets the system know when the last 'read' and 'write' operations were performed on the data
item.
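The notes do not spell out how the timestamps are compared, so the sketch below assumes the textbook basic timestamp-ordering rules: an operation is rejected if a younger transaction has already written the item, and a write is also rejected if a younger transaction has already read it. All names are hypothetical, and a logical counter serves as the timestamp.

```python
import itertools

clock = itertools.count(1)   # logical counter used as the timestamp source

class RollbackRequired(Exception):
    """The operation arrived too late and its transaction must restart."""

class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0     # timestamp of the youngest transaction that read the item
        self.write_ts = 0    # timestamp of the youngest transaction that wrote it

def read(item, ts):
    if ts < item.write_ts:
        raise RollbackRequired("a younger transaction already wrote this item")
    item.read_ts = max(item.read_ts, ts)
    return item.value

def write(item, ts, value):
    if ts < item.read_ts or ts < item.write_ts:
        raise RollbackRequired("a younger transaction already used this item")
    item.value, item.write_ts = value, ts

t1, t2 = next(clock), next(clock)    # T1 is older than T2
x = Item(10)
read(x, t2)                          # T2 reads x, so x.read_ts becomes 2
try:
    write(x, t1, 5)                  # the older T1 writes too late
    t1_rejected = False
except RollbackRequired:
    t1_rejected = True
write(x, t2, 20)                     # T2's own write is in timestamp order
```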
Preventing the Lost Update Problem
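One way to prevent the lost update is to hold an exclusive lock for the whole read-modify-write sequence, so that the second user cannot take a copy of the record until the first has written it back. The sketch below uses Python threads and `threading.Lock` as a stand-in for a DBMS row lock; the record and quantities are hypothetical.

```python
import threading

record = {"stock": 10}
record_lock = threading.Lock()

def sell(units):
    with record_lock:                     # exclusive lock for the whole update
        current = record["stock"]         # read
        record["stock"] = current - units # write back before anyone else can read

threads = [threading.Thread(target=sell, args=(n,)) for n in (3, 4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever thread runs first, both updates are applied and the stock ends at 3.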
Preventing the Uncommitted Dependency Problem
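The problem disappears when a transaction is never shown uncommitted data. The sketch below assumes SQLite through Python's `sqlite3` module with its default settings, where a second connection sees only committed data. The file name, table, and values are hypothetical, and other DBMSs reach the same result through their own locking or versioning rules.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.db")
    writer = sqlite3.connect(path)   # plays the role of T1
    reader = sqlite3.connect(path)   # plays the role of T2

    writer.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    writer.execute("INSERT INTO account VALUES (1, 100)")
    writer.commit()

    # T1 updates the row but does not commit.
    writer.execute("UPDATE account SET balance = 150 WHERE id = 1")

    # T2 reads while T1 is still open and is given the committed value only.
    seen_by_reader = reader.execute("SELECT balance FROM account").fetchall()[0][0]

    writer.rollback()                # T1 is rolled back
    final = reader.execute("SELECT balance FROM account").fetchall()[0][0]

    writer.close()
    reader.close()
```

T2 never sees the intermediate value 150, so nothing it does can depend on it.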

Deadlock
 Locking can produce an undesirable effect known as 'deadlock'.
 This occurs in situations where users can hold two or more locks at the same time; this can
cause a circular wait situation.
 Example: Transaction T1 holds a lock on some rows in the account table and needs to update
some rows in the order table. Simultaneously, transaction T2 holds a lock on some rows in the
order table but needs to update the rows in the account table held by transaction T1.
 Neither transaction can proceed: T1 cannot complete its execution because it is waiting for
T2 to release its lock, and T2 is in turn waiting for T1 to
release its lock.
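A circular wait can be found by recording which transaction waits for which and looking for a cycle in that graph. This detection technique is not described in the notes; it is shown only to make the circular wait concrete, and the function name is hypothetical.

```python
def has_deadlock(waits_for):
    """Return True if the wait-for graph contains a cycle."""
    visiting, done = set(), set()

    def visit(txn):
        if txn in done:
            return False
        if txn in visiting:
            return True                  # reached a transaction already on the path
        visiting.add(txn)
        found = any(visit(nxt) for nxt in waits_for.get(txn, ()))
        visiting.discard(txn)
        done.add(txn)
        return found

    return any(visit(txn) for txn in waits_for)

# T1 waits for T2 (order table) and T2 waits for T1 (account table).
circular = has_deadlock({"T1": ["T2"], "T2": ["T1"]})
# T1 waits for T2, but T2 waits for nobody and will eventually finish.
harmless = has_deadlock({"T1": ["T2"], "T2": []})
```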
Deadlock Prevention
 Deadlocks are caused by the interleaving of lock requests.
 This can be avoided by obtaining all required locks at the same time.
 If a lock is required later in the transaction, all currently held locks must first be released,
and the full set requested again together.
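A sketch of these rules applied to the T1/T2 example, assuming exclusive table-level locks in a single dictionary. All function names are hypothetical, and a refused request simply returns `False` instead of waiting.

```python
lock_owner = {}  # data item -> transaction holding it

def acquire_together(txn, items):
    """Take every lock in one step, or take none."""
    if any(lock_owner.get(i, txn) != txn for i in items):
        return False
    lock_owner.update({i: txn for i in items})
    return True

def release_all(txn):
    for item in [i for i, t in lock_owner.items() if t == txn]:
        del lock_owner[item]

def acquire_later(txn, held, extra):
    """A lock needed mid-transaction: release everything, then request the full set."""
    release_all(txn)
    return acquire_together(txn, set(held) | set(extra))

acquire_together("T1", ["account"])
acquire_together("T2", ["order"])

# T1 now needs 'order' as well. It gives up 'account' first and is refused,
# so it holds nothing and cannot block T2.
t1_ok = acquire_later("T1", ["account"], ["order"])
# T2 needs 'account'. Nothing is held by T1, so T2 gets both locks.
t2_ok = acquire_later("T2", ["order"], ["account"])
```

No transaction ever waits while holding a lock, so the circular wait cannot form.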
