Unit 1 Module
REGULATION 2019 ACADEMIC YEAR 2022-2023
Fast Retrieval of data: Along with storing the data in an optimized and systematic manner, it
is also important that we retrieve the data quickly when needed. Database systems ensure that
the data is retrieved as quickly as possible.
Example: Suppose we are storing customer information in a customer table. At the physical level
these records can be described as blocks of storage (bytes, gigabytes, terabytes etc.) in
memory. These details are often hidden from the programmers.
At the logical level these records can be described as fields and attributes along with their
data types, and the relationships among them can be logically implemented. Programmers
generally work at this level because they are aware of such things about database systems.
At the view level, users just interact with the system through a GUI and enter details on the
screen. They are not aware of how the data is stored or what data is stored; such details are
hidden from them.
Definition of schema: The design of a database is called the schema. Schema is of three types:
physical schema, logical schema and view schema.
For example: The following diagram is a schema that shows the relationship between three
tables: Course, Student and Section. The diagram only shows the design of the database; it
doesn’t show the data present in those tables. A schema is only a structural view (design) of a
database, as shown in the diagram below.
The design of a database at the physical level is called the physical schema; how the data is
stored in blocks of storage is described at this level.
The design of a database at the logical level is called the logical schema. Programmers and
database administrators work at this level. Here data is described as certain types of data
records stored in data structures; however, internal details such as the implementation of the
data structures are hidden at this level (they are available at the physical level).
The design of a database at the view level is called the view schema. This generally describes
end-user interaction with database systems.
Definition of instance: The data stored in a database at a particular moment of time is called
an instance of the database. The database schema defines the variable declarations in the tables
that belong to a particular database; the values of these variables at a moment of time are
called the instance of that database. For example, let's say we have a single table student in
the database. Today the table has 100 records, so today the instance of the database has 100
records. Let's say we are going to add another 100 records to this table by tomorrow, so the
instance of the database tomorrow will have 200 records in the table. In short, the data stored
in the database at a particular moment is called the instance, and it changes over time as we
add or delete data.
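The difference between schema and instance can be sketched with Python's built-in sqlite3 module. The table name student comes from the example above; the column names and the generated rows are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The schema: a declaration of the table's structure (column names are assumed).
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")


def instance_size(connection):
    """Return the number of rows held by student at this moment."""
    return connection.execute("SELECT COUNT(*) FROM student").fetchone()[0]


# Today's instance: 100 records.
conn.executemany(
    "INSERT INTO student (roll_no, name) VALUES (?, ?)",
    [(i, f"student_{i}") for i in range(1, 101)],
)
size_today = instance_size(conn)

# Tomorrow's instance: another 100 records are added.
conn.executemany(
    "INSERT INTO student (roll_no, name) VALUES (?, ?)",
    [(i, f"student_{i}") for i in range(101, 201)],
)
size_tomorrow = instance_size(conn)

# The schema (the list of columns) is the same for both instances.
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]
print(size_today, size_tomorrow, columns)
```

The row count changes from one moment to the next, while the column list returned by the schema query stays the same.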
well as its own. Inheritance is defined as the ability of a lower-level object to inherit, or
access, the data items and behaviors associated with all classes which are above it in the class
hierarchy.
by means of an application. Here the application tier is entirely independent of the database
in terms of operation, design, and programming.
1.5.1 3-tier Architecture
A 3-tier architecture separates its tiers from each other based on the complexity of the users
and how they use the data present in the database. It is the most widely used architecture to
design a DBMS.
Fig 1.3 3-tier architecture
Database (Data) Tier − At this tier, the database resides along with its query
processing languages. We also have the relations that define the data and their
constraints at this level.
Application (Middle) Tier − At this tier reside the application server and the
programs that access the database. For a user, this application tier presents an
abstracted view of the database. End-users are unaware of any existence of the
database beyond the application. At the other end, the database tier is not aware of
any other user beyond the application tier. Hence, the application layer sits in the
middle and acts as a mediator between the end-user and the database.
User (Presentation) Tier − End-users operate on this tier and they know nothing
about any existence of the database beyond this layer. At this layer, multiple views
of the database can be provided by the application. All views are generated by
applications that reside in the application tier.
Multiple-tier database architecture is highly modifiable, as almost all its components are
independent and can be changed independently.
1.6 APPLICATIONS OF DBMS
Telecom: A database keeps track of information regarding calls made, network usage,
customer details etc. Without database systems it is hard to maintain such a huge amount
of data, which is updated every millisecond.
Banking System: For storing customer info, tracking day-to-day credit and debit
transactions, generating bank statements etc. All this work is done with the help of
database management systems.
Sales: To store customer information, production information and invoice details.
Airlines: To travel through airlines, we make early reservations. This reservation information,
along with the flight schedule, is stored in a database.
Education sector: Database systems are frequently used in schools and colleges to store and
retrieve the data regarding student details, staff details, course details, exam details, payroll
data, attendance details, fees details etc.
Online shopping: Online shopping websites such as Amazon and Flipkart store product
information, your addresses and preferences, and credit details, and provide you with a
relevant list of products based on your query. All this involves a database management
system.
Table 1
Product_code   Description   Price
Table 2
Invoice_code   Invoice_line   Product_code   Quantity
3804           1              A416           10
3804           2              C923           15
Let's take a closer look at the previous two tables to see how they are organized:
Given this kind of structure, the database gives you a way to manipulate this data: SQL. SQL
(structured query language) is a powerful way to search for records or make changes. Almost
all DBMSs use SQL, although many have added their own enhancements to it.
1.7.1 Relational Model
Relational Model represents how data is stored in Relational Databases. A relational
database stores data in the form of relations (tables). Consider a relation STUDENT with
attributes ROLL_NO, NAME, ADDRESS, PHONE and AGE, shown in the table below.
STUDENT
ROLL_NO   NAME     ADDRESS   PHONE        AGE
4         SURESH   DELHI                  18
1. Attribute: Attributes are the properties that define a relation.
e.g.; ROLL_NO, NAME
2. Relation Schema: A relation schema represents name of the relation with its
attributes. e.g.; STUDENT (ROLL_NO, NAME, ADDRESS, PHONE and AGE) is
relation schema for STUDENT. If a schema has more than 1 relation, it is called
Relational Schema.
3. Tuple: Each row in the relation is known as tuple. The above relation contains 4
tuples, one of which is shown as:
1 RAM DELHI 9455123451 18
4. Relation Instance: The set of tuples of a relation at a particular instant of time is
called the relation instance. Table 1 shows the relation instance of STUDENT at a
particular time. It can change whenever there is an insertion, deletion or update in the
database.
5. Degree: The number of attributes in the relation is known as degree of the relation.
The STUDENT relation defined above has degree 5.
6. Cardinality: The number of tuples in a relation is known as cardinality.
The STUDENT relation defined above has cardinality 4.
7. Column: Column represents the set of values for a particular attribute. The
column ROLL_NO is extracted from relation STUDENT.
8. NULL Values: The value which is not known or unavailable is called NULL value. It
is represented by blank space. e.g.; PHONE of STUDENT having ROLL_NO 4 is
NULL.
1.7.2 Constraints in Relational Model
When designing a relational model, we define conditions that must hold for the data present in
the database; these conditions are called constraints. Constraints are checked before performing
any operation (insertion, deletion or update) on the database. If any constraint is violated,
the operation fails.
Domain Constraints: These are attribute-level constraints. An attribute can only take values
which lie inside its domain range. e.g.; if the constraint AGE>0 is applied on the STUDENT
relation, inserting a negative value of AGE will result in failure.
Key Integrity: Every relation in the database should have at least one set of attributes which
defines a tuple uniquely. Such a set of attributes is called a key. e.g.; ROLL_NO in STUDENT
is a key. No two students can have the same roll number. So a key has two properties:
It should be unique for all tuples.
It can’t have NULL values.
Referential Integrity: When an attribute of a relation can only take values that appear in
another attribute of the same relation or of another relation, it is called referential integrity.
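The three kinds of constraint can be tried out with Python's built-in sqlite3 module. This is only a sketch: the enrollment table, its columns and the student RAMESH are hypothetical additions, and SQLite enforces foreign keys only after the PRAGMA shown is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,       -- key integrity
        name    TEXT NOT NULL,
        age     INTEGER CHECK (age > 0)    -- domain constraint
    )
""")
# Hypothetical second table, used only to show referential integrity.
conn.execute("""
    CREATE TABLE enrollment (
        roll_no INTEGER NOT NULL REFERENCES student(roll_no),
        course  TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO student VALUES (1, 'RAM', 18)")


def violates(sql):
    """Run one statement; return True if a constraint rejects it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


negative_age = violates("INSERT INTO student VALUES (2, 'SURESH', -5)")
duplicate_key = violates("INSERT INTO student VALUES (1, 'RAMESH', 20)")
dangling_ref = violates("INSERT INTO enrollment VALUES (99, 'DBMS')")
valid_row = violates("INSERT INTO enrollment VALUES (1, 'DBMS')")
print(negative_age, duplicate_key, dangling_ref, valid_row)
```

The first three statements are rejected by the domain, key and referential constraints respectively; the last one is accepted because roll number 1 exists in student.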
1.8 KEYS
A DBMS key is an attribute or set of attributes which helps us to identify a row (tuple) in a
relation (table). Keys also allow us to find the relation between two tables. Keys help you
uniquely identify a row in a table by a combination of one or more columns in that table.
• Surrogate Key
Let's take a simple Student table, with fields student_id, name, phone and age.
Super Key:
A super key is a set of one or more columns (attributes) which uniquely identifies each
record in a table. A super key may have additional attributes that are not needed for unique
identification. Every candidate key is a super key, and every super key contains a candidate key.
In the table defined above, super keys include student_id, (student_id, name), phone etc.
• The first one is pretty simple: student_id is unique for every row of data, hence it
can be used to identify each row uniquely.
• Next comes (student_id, name). The names of two students can be the same, but
their student_id can't be the same, hence this combination can also be a key.
• Similarly, the phone number of every student will be unique, hence phone can
also be a key. So they are all super keys.
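A quick way to test these claims against sample data is to check whether any two rows agree on all the chosen columns. The helper is_super_key and the three rows below are hypothetical; the column names follow the Student table described above.

```python
def is_super_key(rows, columns):
    """Return True if no two rows agree on every column in `columns`."""
    seen = set()
    for row in rows:
        signature = tuple(row[c] for c in columns)
        if signature in seen:
            return False
        seen.add(signature)
    return True


# Hypothetical rows; two students deliberately share the same name.
students = [
    {"student_id": 1, "name": "Asha", "phone": "9000000001", "age": 17},
    {"student_id": 2, "name": "Asha", "phone": "9000000002", "age": 18},
    {"student_id": 3, "name": "Bala", "phone": "9000000003", "age": 17},
]

by_id = is_super_key(students, ["student_id"])
by_id_and_name = is_super_key(students, ["student_id", "name"])
by_phone = is_super_key(students, ["phone"])
by_name = is_super_key(students, ["name"])
print(by_id, by_id_and_name, by_phone, by_name)
```

A check like this can only refute a key for the rows at hand. Whether a set of columns really is a super key is a property of the design, not of one particular instance.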
Primary Key:
A column or group of columns in a table which helps us to uniquely identify every row in
that table is called a primary key. A primary key value can't be duplicated: the same value
can't appear more than once in the table. A primary key is a minimal set of attributes (columns)
in a table that uniquely identifies tuples (rows) in that table.
Rules for defining a primary key:
Two rows can't have the same primary key value.
Every row must have a primary key value.
The primary key field cannot be null.
The value in a primary key column can never be modified or updated if any foreign
key refers to that primary key.
Example:
Candidate Key:
A super key with no redundant attribute (a minimal super key) is called a candidate key. The
primary key should be selected from the candidate keys. Candidate keys are the keys which are
candidates for the primary key of a table. In simple words, any key that fulfils all the
requirements of a primary key, that is, it is not null and has unique values, is a candidate
for the primary key, and is therefore known as a candidate key. Every table must have at least
one candidate key but can have several.
Example: In the given table Stud ID, Roll No, and email are candidate keys which help us to
uniquely identify the student record in the table.
Alternate Key:
All the candidate keys which are not chosen as the primary key are called alternate keys. A
table may have a single choice or multiple choices for the primary key. If a table has more
than one candidate key, then after choosing the primary key from those candidate keys, the
rest of the candidate keys are known as the alternate keys of that table. A very simple
example illustrates the concept. Suppose we have a table named Employee which has two columns,
EmpID and EmpMail, both of which are not null and hold unique values. So both columns are
treated as candidate keys. If we make EmpID the primary key of that table, then EmpMail is
known as an alternate key.
Example: In this table, StudID, Roll No and Email are qualified to become a primary key. But
since StudID is the primary key, Roll No and Email become the alternate keys.
StudID Roll No First Name LastName Email
Foreign Key:
A foreign key is a column which is added to create a relationship with another table. Foreign
keys help us to maintain data integrity and also allow navigation between two different
instances of an entity. Every relationship in the model needs to be supported by a foreign key.
Example:
DeptCode DeptName
001 Science
002 English
005 Computer
In this example, we have two tables, Teacher and Department, in a school. However, there is no
way to see which teacher works in which department. By adding DeptCode to the Teacher table as
a foreign key, we can create a relationship between the two tables.
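As a rough sketch, the relationship can be built with Python's built-in sqlite3 module. The department rows are the ones listed above; the teacher table's columns and its two rows are hypothetical, since the original teacher data is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute(
    "CREATE TABLE department (dept_code TEXT PRIMARY KEY, dept_name TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE teacher (
        teacher_id   TEXT PRIMARY KEY,
        teacher_name TEXT NOT NULL,
        dept_code    TEXT REFERENCES department(dept_code)  -- foreign key
    )
""")
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [("001", "Science"), ("002", "English"), ("005", "Computer")],
)
# Hypothetical teachers, each pointing at an existing department code.
conn.executemany(
    "INSERT INTO teacher VALUES (?, ?, ?)",
    [("T1", "Teacher A", "001"), ("T2", "Teacher B", "005")],
)

# The foreign key lets us navigate from a teacher to the department.
pairs = conn.execute("""
    SELECT t.teacher_name, d.dept_name
    FROM teacher AS t
    JOIN department AS d ON d.dept_code = t.dept_code
    ORDER BY t.teacher_id
""").fetchall()
print(pairs)
```

The department code is stored as text here so that the leading zeros in codes such as 001 are preserved.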
Composite Key:
A composite key is a combination of two or more columns that uniquely identifies rows in a
table. The combination of columns guarantees uniqueness, though individually uniqueness is
not guaranteed. Hence, they are combined to uniquely identify records in a table.
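The invoice rows from Table 2 make a convenient illustration: Invoice_code alone repeats, but the pair (Invoice_code, Invoice_line) does not. The sketch below uses Python's built-in sqlite3 module; the table name invoice_detail and the extra rows passed to try_insert are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE invoice_detail (
        invoice_code INTEGER NOT NULL,
        invoice_line INTEGER NOT NULL,
        product_code TEXT    NOT NULL,
        quantity     INTEGER NOT NULL,
        PRIMARY KEY (invoice_code, invoice_line)   -- composite key
    )
""")
conn.executemany(
    "INSERT INTO invoice_detail VALUES (?, ?, ?, ?)",
    [(3804, 1, "A416", 10), (3804, 2, "C923", 15)],
)


def try_insert(row):
    """Attempt an insert and report whether the key constraint allowed it."""
    try:
        conn.execute("INSERT INTO invoice_detail VALUES (?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


repeated_pair = try_insert((3804, 1, "A416", 5))  # same (code, line) pair again
new_pair = try_insert((3805, 1, "A416", 5))       # line 1 of a different invoice
print(repeated_pair, new_pair)
```

Invoice code 3804 already appears twice and line number 1 is reused by the second invoice, yet only the repeated pair is refused.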
Compound Key:
It has two or more attributes that allow you to uniquely recognize a specific record. It is
possible that each column may not be unique by itself within the database. However, when
combined with the other column or columns, the combination becomes unique. The purpose of a
compound key is to uniquely identify each record in the table.
The difference between a compound and a composite key is that any part of the compound
key can be a foreign key, whereas the parts of a composite key may or may not be foreign keys.
Surrogate Key
An artificial key which aims to uniquely identify each record is called a surrogate key. This
kind of key is created when you don't have any natural primary key, and its values are
generated so that they are unique. Surrogate keys do not lend any meaning to the data in the
table. A surrogate key is usually an integer.
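A minimal sketch of a surrogate key, using Python's built-in sqlite3 module: the database generates an integer for each row, so even rows with identical data stay distinguishable. The visitor table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE visitor (
        visitor_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        name       TEXT NOT NULL
    )
""")

ids = []
# Two visitors share a name, so name cannot serve as a natural key.
for name in ("Visitor A", "Visitor A", "Visitor B"):
    cursor = conn.execute("INSERT INTO visitor (name) VALUES (?)", (name,))
    ids.append(cursor.lastrowid)
print(ids)
```

The generated numbers carry no meaning about the visitors themselves; they exist only to identify the rows.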
Example 1:
Syntax:
ALTER TABLE <table_name>
ADD <column_name datatype>;
OR
ALTER TABLE <table_name>
CHANGE <old_column_name> <new_column_name> <datatype>;
OR
ALTER TABLE <table_name>
DROP COLUMN <column_name>;
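The exact ALTER TABLE spelling varies between database systems. As an assumption-laden sketch, the block below runs the SQLite flavour through Python's built-in sqlite3 module, where a column is renamed with RENAME COLUMN rather than CHANGE. The table and column names are hypothetical, and the rename and drop steps are skipped on SQLite versions older than 3.35.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")


def column_names():
    """List the current columns of student."""
    return [row[1] for row in conn.execute("PRAGMA table_info(student)")]


# Add a column.
conn.execute("ALTER TABLE student ADD COLUMN phone TEXT")
after_add = column_names()

# RENAME COLUMN needs SQLite 3.25+, DROP COLUMN needs SQLite 3.35+.
if sqlite3.sqlite_version_info >= (3, 35, 0):
    conn.execute("ALTER TABLE student RENAME COLUMN phone TO mobile")
    after_rename = column_names()
    conn.execute("ALTER TABLE student DROP COLUMN mobile")
    after_drop = column_names()
    print(after_add, after_rename, after_drop)
else:
    print(after_add)
```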
SELECT/FROM/WHERE
INSERT INTO/VALUES
UPDATE/SET/WHERE
DELETE FROM/WHERE
These basic constructs allow database programmers and users to enter data and information
into the database and retrieve it efficiently using a number of filter options.
SELECT/FROM/WHERE
SELECT − This is one of the fundamental query commands of SQL. It is similar to
the projection operation of relational algebra. It selects the attributes based on the
condition described by the WHERE clause.
FROM − This clause takes a relation name as an argument from which attributes are
to be selected/projected. If more than one relation name is given, this clause
corresponds to a Cartesian product.
WHERE − This clause defines predicate or conditions, which must match in order to
qualify the attributes to be projected.
For example −
SELECT author_name
FROM book_author
WHERE age > 50;
This command will yield the names of authors from the relation book_author whose age is
greater than 50.
INSERT INTO/VALUES
This command is used for inserting values into the rows of a table (relation).
Syntax:
INSERT INTO table (column1 [, column2, column3 ... ]) VALUES (value1 [, value2,
value3 ... ])
Or
INSERT INTO table VALUES (value1, [value2, ... ])
For example −
INSERT INTO tutorialspoint (Author, Subject) VALUES ('anonymous', 'computers');
UPDATE/SET/WHERE
This command is used for updating or modifying the values of columns in a table (relation).
Syntax −
UPDATE table_name SET column_name = value [, column_name = value ...] [WHERE
condition]
For example −
UPDATE tutorialspoint SET Author='webmaster' WHERE Author='anonymous';
DELETE FROM/WHERE
This command is used for removing one or more rows from a table (relation).
Syntax −
DELETE FROM table_name [WHERE condition];
For example −
DELETE FROM tutorialspoint
WHERE Author='unknown';
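The INSERT, UPDATE and DELETE examples above can be run end to end with Python's built-in sqlite3 module. The table definition and the second inserted row (author 'unknown') are assumptions added so that every statement has something to act on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tutorialspoint (Author TEXT, Subject TEXT)")


def authors():
    """Return the current authors in alphabetical order."""
    query = "SELECT Author FROM tutorialspoint ORDER BY Author"
    return [row[0] for row in conn.execute(query)]


conn.execute(
    "INSERT INTO tutorialspoint (Author, Subject) VALUES ('anonymous', 'computers')"
)
# Assumed extra row, so the DELETE below has a matching row.
conn.execute(
    "INSERT INTO tutorialspoint (Author, Subject) VALUES ('unknown', 'networks')"
)
after_insert = authors()

conn.execute("UPDATE tutorialspoint SET Author = 'webmaster' WHERE Author = 'anonymous'")
after_update = authors()

conn.execute("DELETE FROM tutorialspoint WHERE Author = 'unknown'")
after_delete = authors()

print(after_insert, after_update, after_delete)
```

Each statement changes the instance of the table while the schema stays the same.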
Advanced SQL features
Accessing SQL From a Programming Language
Dynamic SQL
JDBC and ODBC
Embedded SQL
SQL Data Types and Schemas
Functions and Procedural Constructs
Triggers
Advanced Aggregation Features
OLAP
Accessing SQL From a Programming Language
Database languages are used to read, update and store data in a database. There are several
such languages that can be used for this purpose; one of them is SQL (Structured Query
Language).
All of these commands either define or update the database schema; that’s why they come
under Data Definition Language.
Data Control Language (DCL)-DCL is used for granting and revoking user access on a
database.
Transaction Control Language (TCL)-The changes in the database that we make using DML
commands are either committed or rolled back using TCL.
Select
Project
Union
Set difference
Cartesian product
Rename
Select Operation (σ)-It selects tuples that satisfy the given predicate from a relation.
Notation − σp(r)
Where σ stands for the selection operator, p for the selection predicate and r for the relation.
p is a propositional logic formula which may use connectors like and, or, and not. Its terms
may use relational operators like − =, ≠, ≥, <, >, ≤.
For example −
(i) σ subject = "database" (Books)
Output − Selects tuples from Books where subject is 'database'.
(ii) σ subject = "database" and price = 450 (Books)
Output − Selects tuples from Books where subject is 'database' and price is 450.
Union Operation (∪)-It performs binary union between two given relations and is defined as −
r ∪ s = { t | t ∈ r or t ∈ s}
Notation − r ∪ s
Where r and s are either database relations or relation result sets (temporary relations).
For a union operation to be valid, the following conditions must hold − r and s must have the
same number of attributes. Attribute domains must be compatible. Duplicate tuples are
automatically eliminated.
∏ author (Books) ∪ ∏ author (Articles)
Output − Projects the names of the authors who have either written a book or an article or
both.
Set Difference (−)-The result of the set difference operation is the set of tuples which are
present in one relation but not in the second relation.
Notation − r − s
Finds all the tuples that are present in r but not in s.
∏ author (Books) − ∏ author (Articles)
Output − Provides the name of authors who have written books but not articles.
Cartesian Product (×)-Combines information of two different relations into one.
Notation − r × s
Where r and s are relations and their output will be defined as −
r × s = { q t | q ∈ r and t ∈ s}
Rename Operation (ρ)-The results of relational algebra are also relations but without any
name. The rename operation allows us to name the output relation. The rename operation is
denoted by the small Greek letter rho, ρ.
Notation − ρ x (E)
Where the result of expression E is saved with the name x.
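The basic operations can be mimicked in a few lines of Python, treating a relation as a list of dictionaries and a projection as a set of tuples. This is only an illustrative sketch: the function names and the sample Books and Articles rows are assumptions, not part of any library.

```python
def select(relation, predicate):
    """σ: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """∏: keep the named attributes; a set drops duplicate tuples."""
    return {tuple(t[a] for a in attributes) for t in relation}


def product(r, s):
    """×: pair every tuple q of r with every tuple t of s."""
    return [(q, t) for q in r for t in s]


# Hypothetical sample relations.
books = [
    {"author": "A", "subject": "database", "price": 450},
    {"author": "B", "subject": "networks", "price": 300},
]
articles = [
    {"author": "B", "topic": "indexing"},
    {"author": "C", "topic": "sql"},
]

database_books = select(
    books, lambda t: t["subject"] == "database" and t["price"] == 450
)

# Rename (ρ) corresponds to binding a result to a name.
book_authors = project(books, ["author"])
article_authors = project(articles, ["author"])

either = book_authors | article_authors       # union
books_only = book_authors - article_authors   # set difference
pairs = product(books, articles)              # Cartesian product

print(sorted(either), sorted(books_only), len(pairs))
```

Python's set operators already remove duplicates, which matches the behaviour of union and set difference on relations.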
Additional operations are:
Set intersection
Assignment
Natural join
Relational Calculus
In contrast to Relational Algebra, Relational Calculus is a non-procedural query language,
that is, it tells what to do but never explains how to do it. Instead of algebra, it uses
mathematical predicate calculus. The relational calculus is not the same as the differential
and integral calculus of mathematics; it takes its name from a branch of symbolic logic termed
predicate calculus.
Relational calculus exists in two forms −
Tuple Relational Calculus (TRC)
Filtering variable ranges over tuples
Notation − {T | Condition}
Returns all tuples T that satisfy a condition.
For example −
Output − The above query will yield the same result as the previous one.
In the tuple relational calculus, you will have to find tuples for which a predicate is true. The
calculus is dependent on the use of tuple variables. A tuple variable is a variable that 'ranges
over' a named relation: i.e., a variable whose only permitted values are tuples of the relation.
For example, to specify the range of a tuple variable S as the Staff relation, we write:
Staff(S)
To express the query 'Find the set of all tuples S such that F(S) is true,' we can write:
{S | F(S)}
Here, F is called a formula (well-formed formula, or wff in mathematical logic). For example,
to express the query 'Find the staffNo, fName, lName, position, sex, DOB, salary, and
branchNo of all staff earning more than £10,000', we can write:
{S | Staff(S) ∧ S.salary > 10000}
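A set comprehension in Python reads much like a tuple relational calculus expression: a variable ranges over a relation and a condition filters it. The sketch below reads the salary query above as "the set of S such that S is in Staff and S.salary > 10000"; the three staff rows are hypothetical.

```python
from collections import namedtuple

Staff = namedtuple("Staff", "staffNo fName lName position sex DOB salary branchNo")

# Hypothetical Staff relation.
staff = {
    Staff("S1", "Ann", "Lee", "Manager", "F", "1980-01-01", 30000, "B1"),
    Staff("S2", "Bob", "Ray", "Assistant", "M", "1990-05-05", 9000, "B1"),
    Staff("S3", "Cal", "Fox", "Supervisor", "M", "1985-03-03", 18000, "B2"),
}

# The tuple variable S ranges over staff; the condition plays the role of F(S).
high_earners = {S for S in staff if S.salary > 10000}

staff_numbers = sorted(S.staffNo for S in high_earners)
print(staff_numbers)
```

As in the calculus, the comprehension states which tuples are wanted and leaves the order of evaluation unspecified.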
Notation :
{ a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)}
Where a1, a2 are attributes and P stands for formulae built by inner attributes.
Output − Yields Article, Page, and Subject from the relation TutorialsPoint, where subject is
database.
Just like TRC, DRC can also be written using existential and universal quantifiers. DRC also
involves relational operators. The expressive power of Tuple Relational Calculus and Domain
Relational Calculus is equivalent to that of Relational Algebra.
ODBC
Open Database Connectivity (ODBC) is an open standard application programming interface
(API) that allows application programmers to access any database. ODBC consists of four
components, working together to enable functions. ODBC allows programs to use SQL
requests that access databases without knowing the proprietary interfaces to the databases.
ODBC handles the SQL request and converts it into a request each database system
understands.
Application: Processes and calls the ODBC functions and submits the SQL statements;
Driver manager: Loads drivers for each application;
Driver: Handles ODBC function calls, and then submits each SQL request to a data source; and
Data source: The data being accessed and its database management system (DBMS) OS.
JDBC
The Java Database Connectivity (JDBC) API uses the Java programming language to access
a database. When writing programs in the Java language using JDBC APIs, users can employ
software that includes a JDBC-ODBC Bridge to access ODBC-supported databases.
However, the JDBC-ODBC Bridge (or JDBC type 1 driver) should be viewed as a
transitional approach, as it creates performance overhead because API calls must pass
through the JDBC bridge to the ODBC driver, then to the native database connectivity
interface. In addition, it was removed in Java Development Kit (JDK) 8, and Oracle does not
support the JDBC-ODBC Bridge. The use of JDBC drivers provided by database vendors,
rather than the JDBC-ODBC Bridge, is the recommended approach.
Static SQL has the benefit that its statements are optimized ahead of time, which results in
an application with high performance. Access plans for dynamic statements are generated at
run time, so those statements must be prepared in the application, a step that static SQL
never needs. These are not the only differences between them. Dynamic SQL has one clear
advantage over static statements, which is noticed once the application is edited or upgraded:
because the access plans are generated at run time, there is no need for pre-compilation or
rebuilding, whereas static statements require their access plans to be regenerated whenever
they are modified. Dynamic SQL also requires more permissions and might be a way to execute
unauthorized code. We do not know what kind of users we will have, so it can be dangerous for
security if the programmer does not handle it.
When the pattern of database access is known in advance, static SQL is adequate. In many
applications, however, we may not know the pattern of database access in advance. For example,
a report writer must be able to decide at run time which SQL statements will be needed to
access the database. Such a need can’t be fulfilled with static SQL and requires a more
flexible form of SQL known as dynamic SQL.
There are several limitations in static SQL. Using host variables (which allow us to input
values for a search condition at run time), we can achieve a small degree of dynamic
behaviour, e.g.,
exec sql select tname, sex from teacher where salary > :sal;
Here the salary will be supplied at run time. But choosing a column name or table name at run
time is not possible with static embedded SQL. For such a feature we need dynamic SQL.
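The same contrast can be sketched in Python with the built-in sqlite3 module. This is an analogy, not embedded SQL with a precompiler: names_above keeps the statement text fixed and binds only a value, much like the host variable :sal, while column_values builds the statement text at run time. The sample rows, both function names and the whitelist are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teacher (tname TEXT, sex TEXT, salary INTEGER)")
# Hypothetical rows.
conn.executemany(
    "INSERT INTO teacher VALUES (?, ?, ?)",
    [("Teacher A", "F", 52000), ("Teacher B", "M", 38000)],
)


def names_above(sal):
    """Fixed statement text; only the value is supplied at run time."""
    sql = "SELECT tname, sex FROM teacher WHERE salary > :sal"
    return conn.execute(sql, {"sal": sal}).fetchall()


ALLOWED_COLUMNS = {"tname", "sex", "salary"}


def column_values(column):
    """Statement text is built at run time; the column name is checked first."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {column!r}")
    sql = f"SELECT {column} FROM teacher ORDER BY tname"
    return [row[0] for row in conn.execute(sql)]


print(names_above(40000))
print(column_values("salary"))
```

Checking the column name against a fixed list before it is placed in the statement is one way to handle the security concern mentioned above, since a column name cannot be passed as a bound parameter.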
Embedded SQL is a mixture of SQL and a host programming language, so it cannot be fed
directly to a general-purpose programming language compiler. Program execution is therefore
a multi-step process, which is as follows:
exit();
}