DBMS Personal Notes

DBMS
A database management system (DBMS) refers to the technology for creating and managing databases. DBMS
is a software tool to organize (create, retrieve, update, and manage) data in a database.
The proper understanding of data structures and algorithms will help you to understand the DBMS quickly.
The main aim of a DBMS is to provide a way to store and retrieve database information that is both
convenient and efficient. By data, we mean known facts that can be recorded and that have implicit meaning.
People commonly use software such as dBASE IV or V, Microsoft Access, or Excel to store data in the form
of a database (although Excel is a spreadsheet program rather than a true DBMS). A datum is a single unit of
data. Meaningful data is combined to form information; hence, information is interpreted data, that is, data
provided with semantics. Microsoft Access is one of the most common examples of database management
software.
More on Data, Information, and Knowledge
Knowledge refers to the practical use of information. Information can be transported, stored, and shared
without difficulty, but the same cannot be said of knowledge, which necessarily involves personal experience
and practice.
Database systems are meant to handle large collections of information. Management of data involves both
defining structures for the storage of information and providing mechanisms for manipulating the stored
information. Moreover, the database system must ensure the safety of the information stored, despite system
crashes or attempts at unauthorized access.
WHY USE DBMS
• To develop software applications in less time.
• For data independence and efficient use of data.
• For uniform data administration.
• For data integrity and security.
• For concurrent access to data, and data recovery from crashes.
• To use a user-friendly declarative query language.
Where is a Database Management System (DBMS) being Used?
• Airlines: reservations, schedules, etc.
• Telecom: calls made, customer details, network usage, etc.
• Universities: registration, results, grades, etc.
• Sales: products, purchases, customers, etc.
• Banking: all transactions, etc.
Advantages of DBMS
A DBMS manages data and has many benefits. These are:
• Data independence: Application programs should be as independent as possible from details of data
representation and storage. A DBMS can supply an abstract view of the data that insulates application
code from such details.
• Efficient data access: A DBMS uses a mixture of sophisticated concepts and techniques for storing and
retrieving data efficiently. This feature becomes important in cases where the data is stored on external
storage devices.
• Data integrity and security: If data is accessed through the DBMS, the DBMS can enforce integrity
constraints on the data and control which users may access it.
• Data administration: When several users share the data, centralizing the administration of data can offer
significant improvements. Experienced professionals who understand the nature of the data being managed
can be responsible for organizing the data representation to reduce redundancy and to make the data
efficient to retrieve.
Components of DBMS
• Users: Users may be of any kind, such as database administrators, system developers, or end users.
• Database application: A database application may be personal, departmental, organizational,
and/or internal.
• DBMS: Software that allows users to create, access, and manipulate databases.
• Database: A collection of logically related data treated as a single unit.
Introduction to DataBase
The name indicates what a database is. A database is one of the essential components of many applications
and is used for storing a collection of data as a single set. In other words, it is a group of information that
is put in order so that it can be easily accessed, managed, and updated.
There are different types of databases. They are:
• Bibliographic
• Full-text
• Numeric
• Images
In a database, even the smallest portion of information is data. For example, a student is data, a roll number
is data, and an address, height, weight, and marks are all data. In brief, facts about all the living and
non-living objects in this world can be recorded as data. In this chapter of the database, you will learn about
the fundamental terminologies that are used in DBMS.
DATABASE ENVIRONMENT
One of the primary aims of a database is to supply users with an abstract view of data, hiding certain
details of how data is stored and manipulated. Therefore, the starting point for the design of a database
should be an abstract and general description of the information needs of the organization that is to be
represented in the database. You therefore need an environment in which to store data and make it work as a
database. In this chapter, you will learn about the database environment and its architecture.
A database environment is the collective system of components that comprise and regulate the collection,
management, and use of data. It consists of software, hardware, people, the techniques of handling the
database, and the data itself.
Here, the hardware in a database environment means the computers and computer peripherals that are
used to manage a database, and the software means everything from the operating system (OS) to the
application programs, including database management software such as MS Access or SQL Server. The
people in a database environment are those who administer and use the system. The techniques are
the rules, concepts, and instructions given to both the people and the software, and the data is the
group of facts and information held within the database environment.
DATABASE ARCHITECTURE
Three-Level ANSI-SPARC Architecture

An early proposal for a standard terminology and general architecture for database systems was produced in
1971 by the DBTG (Data Base Task Group) appointed by the Conference on Data Systems and Languages
(CODASYL, 1971). The DBTG recognized the need for a two-level approach with a system view called the
schema and user views called sub-schemas. In 1975, the ANSI Standards Planning and Requirements Committee
(ANSI-SPARC) produced a similar terminology and architecture, recognizing the need for a three-level approach.
The levels form a three-level architecture that includes an external, a conceptual, and an internal level. The
way users perceive the data is called the external level. The way the DBMS and the operating system perceive
the data is the internal level, where the data is actually stored using data structures and files. The
conceptual level provides both the mapping and the desired independence between the external and internal
levels.
What is Database Architecture?
A DBMS architecture depends on its design and can be of the following types:
• Centralized
• Decentralized
• Hierarchical
DBMS architecture can be seen as either single-tier or multi-tier. An n-tier architecture splits the entire
system into n related but independent modules that can be independently customized, changed, altered, or
replaced.
The architecture of a database system is strongly influenced by the underlying computer system on which the
database system runs. Database systems can be centralized, or client-server, where one server machine executes
work on behalf of multiple client machines. Database systems can also be designed to exploit parallel computer
architectures. Distributed databases span multiple geographically separated machines.
The Three-Tier Architecture
A 3-tier application is an application program that is structured into three major parts, each of which can
be distributed to a different place or places in a network. These three divisions are as follows:
• The workstation or presentation layer
• The business or application logic layer
• The database layer, together with the programming related to managing it
RELATIONAL DATA MODEL

Nowadays, the relational model is the primary data model for commercial data processing applications. It
achieved this position because of its simplicity, which makes the job of the programmer easier in contrast
to earlier data models such as the network model or the hierarchical model. In this chapter, you will study the
essential and primary uses of the relational model. A substantial theory exists for relational databases.
The Relational Database Management System (RDBMS) has become the leading data-processing software in
use nowadays, with estimated new license sales of between US$6 billion and US$10 billion per year. This
software represents the second generation of DBMSs and is based on the relational data model proposed by
E. F. Codd in 1970.
What is Relational Model?
The relational model is the theoretical basis of relational databases. It is a way of structuring data using
relations, which are grid-like mathematical structures consisting of columns and rows. Codd proposed the
relational model while working at IBM, and the idea became so prominent that his work became the basis of
relational databases. You are probably familiar with the physical representation of a relation in a
database, which is known as a table.
In the relational model, all data is logically structured within relations, i.e., tables, as mentioned above. Each
relation has a name and is formed from named attributes or columns of data. Each tuple or row holds one value
per attribute. The greatest strength of the relational model is the simple logical structure that it forms. Behind
this simple structure is a sophisticated theoretical foundation that was lacking in the first generation of DBMSs.
Objectives of the Relational Model
The relational model's objectives were specified as follows:
• To allow a high degree of data independence. Application programs must not be affected by alterations
to the internal data representation, particularly by changes to file organizations or access paths.
• To provide substantial grounds for dealing with data semantics, consistency, and redundancy problems.
In particular, Codd's theory for the relational model introduced the concept of normalized relations, that is,
relations that have no repeating groups. The process of producing them is called normalization.
• To enable the expansion of set-oriented data manipulation languages.
Real-life Structure of a Relational Database
In general, a row in a table signifies a relationship among a group of values. Since a table is a collection of such
relationships, there is a close connection between the concept of a table and the mathematical concept of a
relation, from which the relational data model gets its name. In mathematical terminology, a tuple is simply a
sequence or list of values. A relationship between n values is represented mathematically by an n-tuple of
values, i.e., a tuple with n values, which corresponds to a row in a table.
Database Schema
When you talk about a database, you must distinguish between the database schema, which is the logical
blueprint of the database, and the database instance, which is a snapshot of the data in the database at a given
instant in time. The concept of a relation corresponds to the programming language notion of a variable. In
contrast, the concept of a relation schema corresponds to the programming language notion of a type
definition. In other words, a database schema is a skeletal structure that represents the logical view of the
complete database. It describes how the data is organized and how the relations among them are associated, and
it formulates all the constraints that are to be applied to the data.
In general, a relation schema consists of a list of attributes and their corresponding domains.
Some Common Relational Model Terms
• Relation: A relation is a table with columns and rows.
• Attribute: An attribute is a named column of a relation.
• Domain: A domain is the set of allowable values for one or more attributes.
• Tuple: A tuple is a row of a relation.
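As a rough illustration of these terms, the sketch below models a relation schema as a Python type and a
relation instance as a set of tuples. The Student relation, its attributes, and the sample rows are hypothetical
and are not taken from these notes.

```python
from typing import NamedTuple

# Relation schema: named attributes, each with a domain (here, a Python type).
class Student(NamedTuple):
    roll_no: int
    name: str
    marks: float

# Relation instance: the set of tuples (rows) held at one moment in time.
students = {
    Student(1, "Asha", 81.5),
    Student(2, "Ben", 67.0),
}

# Inserting a tuple changes the instance; the schema stays the same.
students.add(Student(3, "Chen", 74.0))

attribute_names = Student._fields
```

Here the class plays the role of the type definition (schema), and the `students` variable plays the role of
the relation whose contents change over time.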
DBMS DATA SCHEMAS

In this chapter, you will learn about the basic concepts of data schemas and how data are independent of one
another within a database.
What is Schema in the Database Management System?
A schema can be defined as the design of a database. The overall description of the database is called the
database schema. It can be categorized into three parts. These are:
• Physical Schema
• Logical Schema
• View Schema
A physical schema can be defined as the design of a database at its physical level. At this level, it describes
how data is stored in blocks of storage.
A logical schema can be defined as the design of the database at its logical level. Programmers, as well as the
database administrator (DBA), work at this level. Here, data is described as certain types of data records
that can be stored in the form of data structures. However, the internal details (such as the implementation
of a data structure) remain hidden at this level.
A view schema can be defined as the design of the database at the view level, which generally describes
end-user interaction with database systems.
For example: suppose you are storing students' information in a student table. At the physical level, these
records are described as blocks of storage (measured in bytes, gigabytes, terabytes, or more), and these
details often remain hidden from the programmers. Then comes the logical level; here, these records can be
described as fields and attributes along with their data types, and their relationships with each other can be
logically implemented. Programmers generally work at this level because they are aware of such things
about database systems. At the view level, a user is able to interact with the system with the help of a GUI
and enter the details on the screen. The users are not aware of how the data is stored or of what data is
stored; such details are hidden from them.
Detail Explanation on 3 Layers of Schema
As we have seen, there are three different types of schema in the database, and these are defined according to
the levels of abstraction of the three-level architecture. At the highest level, there are multiple external
schemas (view-level schemas, also called sub-schemas) that correspond to different views of the data. At the
conceptual level, there is the conceptual schema or logical schema, which describes all the entities,
attributes, and relationships together with integrity constraints. At the lowest level of abstraction, there is
the internal schema or physical schema, which is a complete description of the internal model, containing the
definitions of stored records, the methods of representation, the data fields, the storage structures used,
etc. It is to be noted that there is only one conceptual schema and one internal schema per database. The
DBMS is responsible for mapping between these three types of schema.
It must also check the schemas for consistency; that is, the DBMS must verify that each external schema
is derivable from the conceptual schema, and it must use the information in the conceptual schema to map
between each external schema and the internal schema. This mapping also allows any differences in entity
names, attribute names, attribute order, data types, and so on, to be resolved. Lastly, each external schema
is related to the conceptual schema by the external/conceptual mapping. This enables the DBMS to map names in
the user's view onto the relevant part of the conceptual schema.
DBMS Data Independence

An important objective of the three-level architecture is to provide data independence, which means that the
upper levels are unaffected by changes in the lower levels. There are two kinds of data independence: logical
and physical.
Logical Data Independence
Logical data independence can be defined as the immunity of the external schemas to changes in the conceptual
schema.
Physical data independence
Physical data independence can be defined as the immunity of the conceptual schema to changes in the internal
schema.
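Both kinds of independence can be sketched with Python's built-in sqlite3 module. The table, view, and index
names below are hypothetical; the view stands in for an external schema, the base table for the conceptual
schema, and the index for a change at the internal level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the base table (hypothetical schema).
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks REAL)")
conn.execute("INSERT INTO student VALUES (1, 'Asha', 81.5)")

# External level: a view that exposes only what one group of users needs.
conn.execute("CREATE VIEW student_names AS SELECT roll_no, name FROM student")
before = conn.execute("SELECT * FROM student_names").fetchall()

# Logical data independence: the conceptual schema gains an attribute,
# yet the external view is unaffected.
conn.execute("ALTER TABLE student ADD COLUMN address TEXT")
after_logical_change = conn.execute("SELECT * FROM student_names").fetchall()

# Physical data independence: a new access path (an index) changes how the
# data can be reached internally, but not what queries return.
conn.execute("CREATE INDEX idx_student_name ON student (name)")
after_physical_change = conn.execute("SELECT * FROM student_names").fetchall()

conn.close()
```

In this sketch the same query against the view returns the same rows before and after each change.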
DBMS RELATIONAL CALCULUS
In this chapter, you will learn about the relational calculus and its role in the database management
system. In a relational algebra expression, an order of operations is explicitly stated, and a plan for
evaluating the query is implied. In the relational calculus, there is no description of how to evaluate a
query; instead, a relational calculus query focuses on what is to be retrieved rather than how to retrieve it.
What is Relational Calculus?

Relational calculus is a non-procedural query language; instead of algebra, it uses mathematical predicate
calculus. It is unrelated to the differential and integral calculus of mathematics and takes its name from a
branch of symbolic logic called predicate calculus. When applied to databases, it is found in two forms:
• Tuple relational calculus, which was originally proposed by Codd in 1972, and
• Domain relational calculus, which was proposed by Lacroix and Pirotte in 1977.
In first-order logic or predicate calculus, a predicate is a truth-valued function with arguments. When we
substitute values for the arguments, the function yields an expression, called a proposition, which is either
true or false.
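Example:
For example, the procedural (relational algebra) steps involved in listing all the employees who attend the
'Networking' course would be:
SELECT the tuples from the COURSE relation with COURSENAME = 'NETWORKING'
PROJECT the COURSE_ID from the above result
SELECT the tuples from the EMP relation whose COURSE_ID appears in the above result.
A relational calculus query would instead state only the condition that the required EMP tuples must satisfy.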
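The three procedural steps can be sketched in Python as follows. The sample rows and the EMP_NAME attribute
are hypothetical; only COURSE, EMP, COURSENAME, and COURSE_ID come from the text.

```python
# Hypothetical sample data for the COURSE and EMP relations.
COURSE = [
    {"COURSE_ID": 10, "COURSENAME": "NETWORKING"},
    {"COURSE_ID": 20, "COURSENAME": "DATABASES"},
]
EMP = [
    {"EMP_NAME": "Ravi", "COURSE_ID": 10},
    {"EMP_NAME": "Lena", "COURSE_ID": 20},
    {"EMP_NAME": "Omar", "COURSE_ID": 10},
]

# Step 1: SELECT the COURSE tuples with COURSENAME = 'NETWORKING'.
networking = [c for c in COURSE if c["COURSENAME"] == "NETWORKING"]

# Step 2: PROJECT the COURSE_ID attribute from that result.
course_ids = {c["COURSE_ID"] for c in networking}

# Step 3: SELECT the EMP tuples whose COURSE_ID appears in step 2.
attendees = [e for e in EMP if e["COURSE_ID"] in course_ids]
names = [e["EMP_NAME"] for e in attendees]
```

Each intermediate variable corresponds to one step of the plan, which is what makes this style procedural.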
Tuple Relational Calculus
In the tuple relational calculus, you will have to find tuples for which a predicate is true. The calculus is
dependent on the use of tuple variables. A tuple variable is a variable that 'ranges over' a named relation: i.e., a
variable whose only permitted values are tuples of the relation.
Example:
For example, to specify the range of a tuple variable S as the Staff relation, we write:
Staff(S)
To express the query 'Find the set of all tuples S such that F(S) is true,' we can write:
{S | F(S)}
Here, F is called a formula (well-formed formula, or wff in mathematical logic). For example, to express the
query 'Find the staffNo, fName, lName, position, sex, DOB, salary, and branchNo of all staff earning more than
£10,000', we can write:
{S | Staff(S) ∧ S.salary > 10000}
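A Python set comprehension has nearly the same shape as this expression, so it makes a convenient sketch. The
three staff rows are hypothetical, and only a few of the Staff attributes are modelled.

```python
from typing import NamedTuple

class StaffRow(NamedTuple):
    staffNo: str
    lName: str
    salary: int

# Hypothetical instance of the Staff relation.
Staff = {
    StaffRow("S1", "White", 30000),
    StaffRow("S2", "Beech", 9000),
    StaffRow("S3", "Howe", 18000),
}

def F(S: StaffRow) -> bool:
    """The formula: true for staff earning more than 10000."""
    return S.salary > 10000

# {S | Staff(S) ∧ S.salary > 10000}
result = {S for S in Staff if F(S)}
```

The `for S in Staff` part plays the role of the range expression Staff(S), and the `if` part is the formula.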
Example:
{t | TEACHER(t) ∧ t.SALARY > 20000}
- This selects the tuples from TEACHER whose salary is higher than 20000. It is an example of selecting a
range of values.
{t | TEACHER(t) ∧ t.DEPT_ID = 8}
- This selects all the tuples of teachers who work in Department 8.
Any tuple variable qualified by a 'for all' (∀) or 'there exists' (∃) quantifier is termed a bound variable.
The meaning of a formula does not alter if a bound variable is consistently replaced by another tuple variable.
Any tuple variable that is not qualified by 'for all' or 'there exists' is called a free variable. In both
examples above, t is a free variable: the values it takes are the tuples that appear in the result, whether the
condition is SALARY > 20000 or DEPT_ID = 8.
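The difference between free and bound variables can be sketched in Python, where `any` stands in for the
existential quantifier. The sample rows, the NAME attribute, and the DEPT relation are hypothetical.

```python
# Hypothetical sample data.
TEACHER = [
    {"NAME": "Karlos", "SALARY": 25000, "DEPT_ID": 8},
    {"NAME": "Mina", "SALARY": 18000, "DEPT_ID": 8},
    {"NAME": "Theo", "SALARY": 32000, "DEPT_ID": 3},
]
DEPT = [
    {"DEPT_ID": 8, "DEPT_NAME": "Computer Application"},
    {"DEPT_ID": 3, "DEPT_NAME": "Physics"},
    {"DEPT_ID": 5, "DEPT_NAME": "History"},
]

# {t | TEACHER(t) ∧ t.SALARY > 20000}: t is free, so its values form the result.
well_paid = [t["NAME"] for t in TEACHER if t["SALARY"] > 20000]

# {d | DEPT(d) ∧ ∃t (TEACHER(t) ∧ t.DEPT_ID = d.DEPT_ID)}:
# d is free; t is bound by the quantifier and never appears in the result.
staffed = [
    d["DEPT_NAME"]
    for d in DEPT
    if any(t["DEPT_ID"] == d["DEPT_ID"] for t in TEACHER)
]
```

Renaming the inner `t` to any other name would not change `staffed`, which matches the remark about bound
variables above.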
Domain Relational Calculus
In the tuple relational calculus, you use variables that range over the tuples of a relation. In the domain
relational calculus, you also use variables, but in this case, the variables take their values from the domains
of attributes rather than from the tuples of relations. A domain relational calculus expression has the
following general format:
{d1, d2, . . . , dn | F(d1, d2, . . . , dm)} m ≥ n
where d1, d2, . . . , dn, . . . , dm stand for domain variables and F(d1, d2, . . . , dm) stands for a formula
composed of atoms.
Example:
Select TCHR_ID and TCHR_NAME of teachers who work for department 8 (suppose department 8 is the
Computer Application Department):
{<TCHR_ID, TCHR_NAME> | <TCHR_ID, TCHR_NAME> ∈ TEACHER ∧ DEPT_ID = 8}
Get the name of the department where Karlos works:
{DEPT_NAME | <DEPT_NAME> ∈ DEPT ∧ ∃ DEPT_ID (<DEPT_ID> ∈ TEACHER ∧ TCHR_NAME = 'Karlos')}
It is to be noted that these queries are safe. The use of domain relational calculus is restricted to safe
expressions; moreover, it is equivalent to the tuple relational calculus, which in turn is equivalent in
expressive power to relational algebra.
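In Python, domain variables can be sketched by unpacking each tuple into one variable per attribute. The
attribute order of the two relations and the sample rows are assumptions made for this illustration, and the
comments write out every attribute position rather than using the abbreviated notation above.

```python
# Hypothetical relations stored as sets of plain value tuples.
# TEACHER(TCHR_ID, TCHR_NAME, DEPT_ID) and DEPT(DEPT_ID, DEPT_NAME)
TEACHER = {(1, "Karlos", 8), (2, "Mina", 8), (3, "Theo", 3)}
DEPT = {(8, "Computer Application"), (3, "Physics")}

# {<tchr_id, tchr_name> | <tchr_id, tchr_name, dept_id> ∈ TEACHER ∧ dept_id = 8}
dept8 = {
    (tchr_id, tchr_name)
    for (tchr_id, tchr_name, dept_id) in TEACHER
    if dept_id == 8
}

# {<dept_name> | ∃ dept_id (<dept_id, dept_name> ∈ DEPT
#                 ∧ ∃ tchr_id (<tchr_id, 'Karlos', dept_id> ∈ TEACHER))}
karlos_dept = {
    dept_name
    for (dept_id, dept_name) in DEPT
    if any(name == "Karlos" and d == dept_id for (_, name, d) in TEACHER)
}
```

Each loop variable ranges over the values of a single attribute, in contrast to the tuple calculus, where one
variable stands for a whole row.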
DBMS DATABASE LANGUAGES

In the previous chapters, you have learned about the various forms of relational algebra and relational calculus
and their uses with the database management system. In this chapter, you will get to know about the various
forms of languages that are used to deal with the database.
What are database Sub languages?

A data sublanguage mainly has two parts:
• Data Definition Language (DDL) and
• Data Manipulation Language (DML).
The Data Definition Language is used for specifying the database schema, and the Data Manipulation Language
is used for both reading and updating the database. These languages are called data sub-languages as they do
not include constructs for all computational requirements.
Such computational requirements include the conditional or iterative statements that high-level programming
languages support. Many DBMSs can embed the sublanguage in a high-level programming language such
as Fortran, C, C++, Java, or Visual Basic. Here, the high-level language is sometimes referred to as the host
language, as it acts as a host for the sublanguage. To compile the embedded file, the commands in the data
sub-language are first removed from the host-language program and replaced by function calls. The
pre-processed file is then compiled and placed in an object module, which is linked with a DBMS-specific
library containing the replacement functions and executed when required. Most data sub-languages also
supply non-embedded or interactive commands, which can be entered directly at a terminal.
Data Definition Language
Data Definition Language (DDL) statements are used to define the database structure or schema. It is a type of
language that allows the DBA or user to describe and name the entities, attributes, and relationships that are
required for the application, along with any associated integrity and security constraints. Here is the list of
tasks that come under DDL:
• CREATE - used to create objects in the database
• ALTER - used to alter the structure of the database
• DROP - used to delete objects from the database
• TRUNCATE - used to remove all records from a table and release the space allocated for them
• COMMENT - used to add comments to the data dictionary
• RENAME - used to rename an object
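A few of these statements can be tried with Python's built-in sqlite3 module, as in the sketch below. The
table and column names are hypothetical. SQLite is used only because it ships with Python; it has no TRUNCATE
or COMMENT statement, so those two are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names(c):
    """Return the names of all tables currently defined in the database."""
    rows = c.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]

# CREATE: define a new table.
conn.execute("CREATE TABLE teacher (tchr_id INTEGER PRIMARY KEY, tchr_name TEXT NOT NULL)")

# ALTER: change the structure by adding a column.
conn.execute("ALTER TABLE teacher ADD COLUMN dept_id INTEGER")

# RENAME (written as ALTER TABLE ... RENAME TO in SQLite).
conn.execute("ALTER TABLE teacher RENAME TO staff")
after_rename = table_names(conn)
columns = [row[1] for row in conn.execute("PRAGMA table_info(staff)")]

# DROP: remove the table definition and its data.
conn.execute("DROP TABLE staff")
after_drop = table_names(conn)

conn.close()
```

None of these statements touches individual rows; they change only the definitions held in the schema.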
Data Manipulation Language
A Data Manipulation Language (DML) is a language that offers a set of operations to support the fundamental
data manipulation operations on the data held in the database. DML statements are used to manage data within
schema objects. Here is the list of tasks that come under DML:
• SELECT - It retrieves data from a database
• INSERT - It inserts data into a table
• UPDATE - It updates existing data within a table
• DELETE - It deletes records from a table (all of them, or only those matching a condition); the space
for the records remains
• MERGE - UPSERT operation (insert or update)
• CALL - It calls a PL/SQL or Java subprogram
• EXPLAIN PLAN - It explains the access path to data
• LOCK TABLE - It controls concurrency
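The four core DML statements can be sketched with Python's built-in sqlite3 module. The teacher table and its
rows are hypothetical; MERGE, CALL, EXPLAIN PLAN, and LOCK TABLE are specific to other systems and are not
shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teacher (tchr_id INTEGER PRIMARY KEY, tchr_name TEXT, salary INTEGER)")

# INSERT: add rows to the table.
conn.executemany(
    "INSERT INTO teacher VALUES (?, ?, ?)",
    [(1, "Karlos", 25000), (2, "Mina", 18000), (3, "Theo", 32000)],
)

# UPDATE: change existing rows that match a condition.
conn.execute("UPDATE teacher SET salary = salary + 1000 WHERE tchr_id = 2")

# DELETE: remove only the rows that match a condition.
conn.execute("DELETE FROM teacher WHERE tchr_id = 3")

# SELECT: retrieve data.
rows = conn.execute("SELECT tchr_name, salary FROM teacher ORDER BY tchr_id").fetchall()

conn.close()
```

Leaving out the WHERE clause in the DELETE statement would remove every row while keeping the table itself.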
Data Control Language
Besides DDL and DML, there are two other forms of database sub-languages. The Data Control Language (DCL) is
used to control privileges in databases. To perform any operation in the database, such as creating tables,
sequences, or views, we need privileges. Privileges are of two types:
• System - creating a session, a table, etc. are all types of system privilege.
• Object - any command or query that works on tables comes under object privilege.
DCL defines two commands. These are:
• GRANT - It gives a user access privileges to a database.
• REVOKE - It takes back permissions from the user.
Transaction Control Language (TCL)
Transaction Control statements are used to manage the changes made by DML statements. They allow statements
to be grouped into logical transactions.
• COMMIT - It saves the work done
• SAVEPOINT - It identifies a point in a transaction to which you can later roll back
• ROLLBACK - It undoes the changes made since the last COMMIT
• SET TRANSACTION - It changes transaction options such as the isolation level and which rollback segment
to use
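COMMIT, SAVEPOINT, and ROLLBACK can be sketched with Python's built-in sqlite3 module. The account table is
hypothetical. The connection is opened with isolation_level=None so that the module does not start
transactions on its own and every transaction statement below is explicit. SET TRANSACTION is not available in
SQLite and is not shown.

```python
import sqlite3

# isolation_level=None leaves transaction control to the explicit statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")

# First transaction: undo part of the work, then commit the rest.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 80 WHERE id = 1")
conn.execute("SAVEPOINT after_first_update")
conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
conn.execute("ROLLBACK TO SAVEPOINT after_first_update")  # undoes only the second update
conn.execute("COMMIT")
committed = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

# Second transaction: roll back everything since the last COMMIT.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 5 WHERE id = 1")
conn.execute("ROLLBACK")
final = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

conn.close()
```

After the first transaction the balance is the value set before the savepoint, and the second transaction
leaves it unchanged.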
DATABASE NORMALISATION
Database normalization is a database schema design technique by which an existing schema is modified to
minimize redundancy and dependency of data.
Normalization splits a large table into smaller tables and defines relationships between them to increase
the clarity of the data organization.
Some Facts About Database Normalization
• The words normalization and normal form refer to the structure of a database.
• Normalization was developed by IBM researcher E. F. Codd in the 1970s.
• Normalization increases clarity in organizing data in databases.
Normalization of a database is achieved by following a set of rules called 'forms' when creating the database.
Database Normalization Rules
The database normalization process is divided into the following normal forms:
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
• Fourth Normal Form (4NF)
• Fifth Normal Form (5NF)
First Normal Form (1NF)
In 1NF, each column holds a single (atomic) value, with no repeating groups, and each row is unique.
Example:
The sample Employee table below shows employees who work in multiple departments; each department an
employee works in gets its own row.
Employee table following 1NF:
Employee  Age  Department
Melvin    32   Marketing
Melvin    32   Sales
Edward    45   Quality Assurance
Alex      36   Human Resource
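The move to 1NF can be sketched in Python. The unnormalized starting point, in which the Department column
holds a list of values, is an assumption made for illustration; the notes show only the 1NF result.

```python
# Hypothetical unnormalized rows: the Department column holds several values.
unnormalized = [
    ("Melvin", 32, ["Marketing", "Sales"]),
    ("Edward", 45, ["Quality Assurance"]),
    ("Alex", 36, ["Human Resource"]),
]

# 1NF: one atomic Department value per row, so Melvin appears twice.
first_normal_form = [
    (name, age, dept)
    for name, age, depts in unnormalized
    for dept in depts
]
```

The resulting rows are the four rows of the Employee table above.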
Second Normal Form (2NF)

The entity should already be in 1NF, and every non-key attribute within the entity should depend on the
whole of the entity's unique identifier (primary key), not on just a part of it.
Example:
Sample Products table:
productID  product     Brand
1          Monitor     Apple
2          Monitor     Samsung
3          Scanner     HP
4          Head phone  JBL
Product table following 2NF:

Products Category table:
productID  product
1          Monitor
2          Scanner
3          Head phone
Brand table:
brandID  brand
1        Apple
2        Samsung
3        HP
4        JBL
Products Brand table:

pbID  productID  brandID
1     1          1
2     1          2
3     2          3
4     3          4
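One way to check that this decomposition loses no information is to join the three tables back together. The
sketch below does so with plain Python dictionaries keyed by the IDs shown in the tables.

```python
# The three tables from the 2NF example, keyed by their IDs.
products = {1: "Monitor", 2: "Scanner", 3: "Head phone"}
brands = {1: "Apple", 2: "Samsung", 3: "HP", 4: "JBL"}
products_brand = [(1, 1, 1), (2, 1, 2), (3, 2, 3), (4, 3, 4)]  # (pbID, productID, brandID)

# Join: look up each product and brand name through the linking table.
rebuilt = [(products[p], brands[b]) for _, p, b in products_brand]
```

The rebuilt pairs match the product and Brand columns of the original sample Products table, while each
product name and each brand name is now stored only once.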
Third Normal Form (3NF)

The entity should already be in 2NF, and no non-key column should depend on any column other than the key
of the table (that is, there should be no transitive dependencies).
If such a column exists, move it out into a new table.
Once 3NF is achieved, the database is generally considered normalized.
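As a small sketch of the 3NF step, the hypothetical table below stores a department location next to each
employee. The location depends on the department rather than on the employee, so it is moved into its own
table. All names and values are invented for illustration.

```python
# Hypothetical table: (employee, department, dept_location).
# dept_location depends on department, not directly on the employee key.
employee = [
    ("Melvin", "Sales", "London"),
    ("Edward", "Quality Assurance", "Leeds"),
    ("Alex", "Sales", "London"),
]

# 3NF: keep only attributes that depend on the key in the employee table...
employee_3nf = [(name, dept) for name, dept, _ in employee]

# ...and move the transitively dependent attribute into a new table.
department = sorted({(dept, location) for _, dept, location in employee})
```

The location of the Sales department is now recorded once instead of once per employee.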
Boyce-Codd Normal Form (BCNF)

The table should already be in 3NF, and every determinant (an attribute, or set of attributes, on which some
other attribute depends) should be a candidate key.
Fourth Normal Form (4NF)

The table should already be in BCNF and should have no non-trivial multi-valued dependencies other than on a
candidate key.
Fifth Normal Form (5NF)
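The table should already be in 4NF and should not be decomposable into smaller tables without loss of
information, other than by decompositions that follow from its candidate keys.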
Well, this is a highly simplified explanation for Database Normalization. One can study this process
extensively, though. After working with databases for some time, you'll automatically create Normalized
databases, as it's logical and practical.
