Unit 1 Module

The document outlines the fundamentals of Database Management Systems (DBMS), focusing on relational databases, their architecture, and the importance of data organization and retrieval. It discusses the purpose of DBMS, the need for data abstraction, and the various data models including relational, entity-relationship, and object-oriented models. Additionally, it highlights the advantages of using DBMS over traditional file-processing systems, such as reducing data redundancy and improving data integrity and security.

Uploaded by

Aruna

REGULATION 2019 ACADEMIC YEAR 2022-2023

IFET COLLEGE OF ENGINEERING


DEPARTMENT OF CSE & IT
19UCSPC301- DATABASE MANAGEMENT SYSTEMS
UNIT 1
RELATIONAL DATABASES
UNIT I RELATIONAL DATABASES 11
Purpose of Database System – Views of data – Data Models – Database System Architecture
– Applications of DBMS – Introduction to relational databases – Keys – SQL fundamentals –
DDL, DML, DCL – Relational Algebra – Embedded SQL – Dynamic SQL – Activity: Creating
a simple database and performing basic SQL commands.
1.1 INTRODUCTION
Data is a collection of distinct small units of information. It can take a variety of forms,
such as text, numbers, media, and bytes. A database is an organized collection of data that
can be easily accessed and managed. Data can be organized into tables, rows, and columns,
and indexed to make relevant information easier to find. The main purpose of a database
is to handle a large amount of information by storing, retrieving, and managing data. Many
database products are available, such as MySQL, Sybase, Oracle, MongoDB, Informix,
PostgreSQL, and SQL Server. Modern databases are managed by a database management
system (DBMS). SQL (Structured Query Language) is used to operate on the data stored in
a relational database.
DBMS:
DBMS stands for Database Management System. DBMS = Database + Management System.
The database is a collection of interrelated data, and the management system is a set of
programs to store and retrieve those data. The primary goal of a DBMS is to provide a
way to store and retrieve database information that is both convenient and efficient.
Where is a DBMS being used?
• Airlines: reservations, schedules, etc
• Telecom: calls made, customer details, network usage, etc
• Universities: registration, results, grades, etc
• Sales: products, purchases, customers, etc
• Banking: all transactions etc
What is the need for a DBMS?
Database systems are developed to handle large amounts of data. When dealing with a huge
amount of data, two things require optimization: storage of data and retrieval of
data.
Storage: According to the principles of database systems, data is stored in such a way that
it takes up much less space, because redundant (duplicate) data is removed before
storage.
• In a banking system, suppose a customer has two accounts: a savings account and a
salary account.
• Suppose the bank stores savings account data in one place and salary account data in
another. If customer information such as name and address is stored in both places,
storage is wasted (redundancy, or duplication of data). To organize the data better,
the customer information should be stored in one place and both accounts should be
linked to it.
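The banking example above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names and sample values are assumptions made for this sketch, not part of any real banking schema. The customer's details are stored once, and both accounts are linked to them through customer_id.

```python
import sqlite3

# Hypothetical schema: customer details live in one table, and each
# account row points to its owner through customer_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        address     TEXT NOT NULL
    );
    CREATE TABLE account (
        account_no   INTEGER PRIMARY KEY,
        account_type TEXT NOT NULL,
        customer_id  INTEGER NOT NULL REFERENCES customer(customer_id)
    );
    INSERT INTO customer VALUES (1, 'Asha', 'Chennai');
    INSERT INTO account VALUES (1001, 'savings', 1);
    INSERT INTO account VALUES (1002, 'salary', 1);
""")

# The address is stored in one place, so one UPDATE is enough.
conn.execute("UPDATE customer SET address = 'Puducherry' WHERE customer_id = 1")

# Both accounts see the new address through the link.
rows = conn.execute("""
    SELECT a.account_no, a.account_type, c.name, c.address
    FROM account AS a
    JOIN customer AS c ON c.customer_id = a.customer_id
    ORDER BY a.account_no
""").fetchall()
for row in rows:
    print(row)
```

Because the address exists in only one row, the two accounts cannot disagree about it.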


Fast Retrieval of data: Along with storing the data in an optimized and systematic manner, it
is also important that we retrieve the data quickly when needed. Database systems ensure that
the data is retrieved as quickly as possible.

1.2 PURPOSE OF DATABASE SYSTEM


A database management system (DBMS) is a software tool that makes it possible to organize
data in a database. As an example consider part of a university organization that, among other
data, keeps information about all instructors, students, departments, and course offerings. One
way to keep the information on a computer is to store it in operating system files. To allow
users to manipulate the information, the system has a number of application programs that
manipulate the files, including programs to:
1. Add new students, instructors, and courses
2. Register students for courses and generate class rosters
3. Assign grades to students, compute grade point averages (GPA), and generate transcripts
System programmers wrote these application programs to meet the needs of the university.
This typical file-processing system is supported by a conventional operating system. The
system stores permanent records in various files, and it needs different application programs
to extract records from, and add records to, the appropriate files. Before database
management systems (DBMSs) were introduced, organizations usually stored information in
such systems.
Keeping organizational information in a file-processing system has a number of major
disadvantages:
• Data redundancy and inconsistency: Since different programmers create the files
and application programs over a long period, the various files are likely to have
different structures and the programs may be written in several programming
languages. Moreover, the same information may be duplicated in several places
(files). In addition, it may lead to data inconsistency; that is, the various copies of the
same data may no longer agree. For example, a changed student address may be
reflected in the Music department records but not elsewhere in the system.
• Difficulty in accessing data: Suppose that one of the university clerks needs to find
out the names of all students who live within a particular postal-code area. The clerk
asks the data-processing department to generate such a list. Because the designers of
the original system did not anticipate this request, there is no application program on
hand to meet it. There is, however, an application program to generate the list of all
students. The clerk must therefore either extract the needed information manually
from that list or ask a programmer to write a new application program, and neither
alternative is satisfactory. Conventional file-processing environments do not allow
needed data to be retrieved in a convenient and efficient manner. More responsive
data-retrieval systems are required for general use.
• Data isolation: Because data are scattered in various files, and files may be in
different formats, writing new application programs to retrieve the appropriate data is
difficult.
• Integrity problems: The data values stored in the database must satisfy certain types
of consistency constraints. Developers enforce these constraints in the system by
adding appropriate code in the various application programs. However, when new
constraints are added, it is difficult to change the programs to enforce them. The
problem is compounded when constraints involve several data items from different
files.
• Atomicity problems: A computer system, like any other device, is subject to failure.
In many applications, it is crucial that, if a failure occurs, the data be restored to the
consistent state that existed prior to the failure.
• Concurrent-access anomalies: For the sake of overall performance of the system
and faster response, many systems allow multiple users to update the data
simultaneously. Indeed, today, the largest Internet retailers may have millions of
accesses per day to their data by shoppers. In such an environment, interaction of
concurrent updates is possible and may result in inconsistent data.
• Security problems: Not every user of the database system should be able to access
all the data. For example, in a university, payroll personnel need to see only that part
of the database that has financial information. They do not need access to information
about academic records. But, since application programs are added to the file-
processing system in an ad hoc manner, enforcing such security constraints is
difficult.
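To make the atomicity problem from the list above concrete, the following sketch shows how a DBMS transaction restores the earlier consistent state after a failure. It uses Python's sqlite3 module. The account table, the transfer function and the simulated crash are assumptions invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 100)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one all-or-nothing unit (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
            if fail_midway:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
    except RuntimeError:
        pass  # the debit was rolled back together with the transaction


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


transfer(conn, "A", "B", 50, fail_midway=True)
after_failure = balances(conn)   # unchanged: the partial debit was undone
transfer(conn, "A", "B", 50)
after_success = balances(conn)   # both updates applied together
print(after_failure, after_success)
```

In a file-processing system, the application program itself would have to detect the half-finished update and repair the files.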
1.3 VIEWS OF DATA
Abstraction is one of the main features of database systems. Hiding irrelevant details from
users and providing them with an abstract view of the data makes user-database interaction
easier and more efficient. A DBMS architecture has three levels, and the top level is the
“view level”. The view level provides the “view of data” to the users and hides irrelevant
details such as data relationships, database schema, constraints, and security from them.

1.3.1 Data Abstraction in DBMS


Database systems are made up of complex data structures. To ease user interaction with the
database, the developers hide irrelevant internal details from users. This process of hiding
irrelevant details from the user is called data abstraction.
There are three levels of abstraction:
Physical level: This is the lowest level of data abstraction. It describes how data is actually
stored in the database. The complex low-level data structures are described in detail at this
level.
Logical level: This is the middle level of the 3-level data abstraction architecture. It describes
what data are stored in the database and what relationships exist among those data. The
logical level thus describes the entire database in terms of a small number of relatively
simple structures. Database administrators, who must decide what information to keep in the
database, use the logical level of abstraction.
View level: The highest level of abstraction describes only part of the entire database. Even
though the logical level uses simpler structures, complexity remains because of the variety of
information stored in a large database. Many users of the database system do not need all this
information; instead, they need to access only a part of the database. The view level of
abstraction exists to simplify their interaction with the system. The system may provide
many views for the same database.


Example: Suppose we store customer information in a customer table. At the physical level,
these records can be described as blocks of storage (measured in bytes, gigabytes, terabytes,
etc.). These details are often hidden from programmers.
At the logical level, these records can be described as fields and attributes along with their
data types, and the relationships among them can be logically implemented. Programmers
generally work at this level because they are aware of such details of database systems.
At the view level, users just interact with the system through a GUI and enter details on the
screen. They are not aware of how the data is stored or what data is stored; such details are
hidden from them.
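The difference between the logical level and the view level can be sketched with an SQL view, run here through Python's sqlite3 module. The customer columns and the view name customer_contact are assumptions chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Logical level: the full customer table with all of its fields.
    CREATE TABLE customer (
        customer_id  INTEGER PRIMARY KEY,
        name         TEXT,
        city         TEXT,
        credit_limit INTEGER
    );
    INSERT INTO customer VALUES (1, 'Ravi', 'Delhi', 50000);
    INSERT INTO customer VALUES (2, 'Meena', 'Chennai', 80000);

    -- View level: one group of users sees only names and cities.
    CREATE VIEW customer_contact AS
        SELECT name, city FROM customer;
""")

cursor = conn.execute("SELECT * FROM customer_contact ORDER BY name")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

Users of the view never see customer_id or credit_limit, and the physical storage format is hidden from both levels.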

1.3.2 Instance and Schema


Schema and instance are essential terms related to databases. The major difference
between them lies in their definitions: the schema is the formal description of the structure
of the database, whereas the instance is the set of information stored in the database at a
specific time.

Definition of schema: The design of a database is called the schema. Schema is of three types:
physical schema, logical schema and view schema.
For example: The following diagram shows a schema that relates three tables: Course,
Student and Section. The diagram shows only the design of the database; it does not show
the data present in those tables. A schema is only a structural view (design) of a
database, as shown in the diagram below.

Fig 1.2 Design of database

The design of a database at the physical level is called the physical schema; it describes how
the data is stored in blocks of storage.
The design of a database at the logical level is called the logical schema. Programmers and
database administrators work at this level. Here data is described as certain types of records
stored in data structures, but internal details such as the implementation of those data
structures are hidden at this level (they are available at the physical level).


Design of database at view level is called view schema. This generally describes end user
interaction with database systems.

Definition of instance: The data stored in a database at a particular moment of time is called
an instance of the database. The database schema defines the variable declarations in the
tables that belong to a particular database; the values of these variables at a moment of time
are called the instance of that database. For example, let's say we have a single table student
in the database. Today the table has 100 records, so today the instance of the database has
100 records. If we add another 100 records to this table by tomorrow, the instance of the
database tomorrow will have 200 records in the table. In short, the data stored in the
database at a particular moment is called the instance, and it changes over time as we add or
delete data.
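The student example above can be replayed in a few lines of Python with sqlite3. The column names and the generated student names are assumptions for this sketch. The point is that the schema text stays the same while the instance grows from 100 to 200 records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")


def schema(conn):
    # SQLite keeps each table's CREATE statement (its schema) in sqlite_master.
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'student'"
    ).fetchone()[0]


def instance_size(conn):
    return conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]


schema_before = schema(conn)

# "Today": 100 records.
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(i, f"student{i}") for i in range(1, 101)],
)
size_today = instance_size(conn)

# "Tomorrow": another 100 records are added.
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(i, f"student{i}") for i in range(101, 201)],
)
size_tomorrow = instance_size(conn)

schema_after = schema(conn)
print(size_today, size_tomorrow, schema_before == schema_after)
```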

1.4 DATA MODEL


A data model is defined as the logical structure of a database. It describes the design of the
database to reflect entities, attributes, relationships among data, constraints, etc.
The four major categories of data models are:
• Relational model
• Entity-relationship model
• Object-oriented data model
• Semi-structured data model
In the relational model, the data and relationships are represented by a collection of
inter-related tables. Each table is a group of columns and rows, where a column represents an
attribute of an entity and a row represents a record.
An entity-relationship model (ER model) describes the structure of a database with the
help of a diagram, which is known as an Entity Relationship Diagram (ER Diagram). An ER
model is a design or blueprint of a database that can later be implemented as a database. The
main components of the E-R model are the entity set and the relationship set.
The object-oriented data model is based upon real-world situations. These situations are
represented as objects with different attributes. These objects have multiple relationships
between them.
1.4.1 Elements of Object oriented data model
Objects
The real world entities and situations are represented as objects in the Object oriented
database model.
Attributes and Method
Every object has certain characteristics. These are represented using Attributes. The
behaviour of the objects is represented using Methods.
Class
Similar attributes and methods are grouped together using a class. An object is an instance
of a class.
Inheritance
Inheritance is the process of basing a class on another class, i.e. building a class on an
existing class. The new class contains all the features and functionality of the old class in
addition to its own. The newly created class is known as the subclass or child class,
and the original class is the parent class or superclass. Inheritance can also be defined as the
ability of a lower-level object to inherit, or access, the data items and behaviors associated
with all classes above it in the class hierarchy.

In a DBMS, inheritance can be achieved by the “ISA relationship”, which is an extended
E-R feature. Two terms are relevant here:
• Generalization
• Specialization
Generalization: Different entity sets may be similar in the sense that they have several
attributes in common. This commonality can be expressed by generalization, which is
a containment relationship that exists between a higher-level entity set and one or more
lower-level entity sets.
Specialization: The process of designating subgroupings within an entity set is called
specialization.
Ex: Suppose a person has attributes such as name, address, phone number and email. The
person may be an employee and may also be a customer. As a customer or an employee, the
person has some additional attributes specific to that role. The common attributes of
the person (name, phone, email, etc.) are inherited by the lower-level entity sets (i.e. the
specialized ones). In this example, the person entity set with its name, address and email
attributes is the generalization part of the E-R design, and the customer and employee entity
sets with their additional attributes form the specialization part.
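The person, employee and customer example can be sketched with Python classes, where the ISA relationship becomes class inheritance. The extra attributes salary and credit_limit and the sample values are assumptions added only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Generalization: attributes shared by every person."""
    name: str
    address: str
    phone: str
    email: str


@dataclass
class Employee(Person):
    """Specialization: an Employee ISA Person with an extra attribute."""
    salary: int  # hypothetical employee-only attribute


@dataclass
class Customer(Person):
    """Specialization: a Customer ISA Person with an extra attribute."""
    credit_limit: int  # hypothetical customer-only attribute


emp = Employee("Kumar", "Villupuram", "9000000001", "kumar@example.com", salary=30000)
cust = Customer("Latha", "Chennai", "9000000002", "latha@example.com", credit_limit=10000)

# Both specialized objects inherit the common attributes of Person.
print(emp.name, emp.salary)
print(cust.name, cust.credit_limit)
```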
Semi-structured data is data that does not conform to a data model but has some
structure. It lacks a fixed or rigid schema. It is data that does not reside in a relational
database but that has some organisational properties that make it easier to analyse. With
some processing, it can be stored in a relational database.
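As a small illustration of data that has some structure but no fixed schema, the sketch below parses a JSON document with Python's json module. The records and their field names are invented for this example. Each record carries its own tags, and records in the same group do not share the same attributes or attribute types.

```python
import json

# Hypothetical semi-structured document: every record is tagged with
# field names, but no two records have exactly the same fields.
document = """
[
  {"name": "Ram", "phone": "9455123451", "address": {"city": "Delhi"}},
  {"name": "Suresh", "email": "suresh@example.com"},
  {"name": "Sujit", "phone": ["9156253131", "9000000000"]}
]
"""

records = json.loads(document)

# The attributes present differ from record to record.
attribute_sets = [sorted(record.keys()) for record in records]

# Even the same attribute can differ in type and size.
phone_types = [type(record["phone"]).__name__ for record in records if "phone" in record]

print(attribute_sets)
print(phone_types)
```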
1.4.2 Characteristics of semi-structured Data
• Data does not conform to a data model but has some structure.
• Data cannot be stored directly in the form of rows and columns as in relational databases.
• Semi-structured data contains tags and elements (metadata) which are used to group
data and describe how the data is stored.
• Similar entities are grouped together and organised in a hierarchy.
• Entities in the same group may or may not have the same attributes or properties.
• It does not contain sufficient metadata, which makes automation and management of
data difficult.
• Size and type of the same attributes in a group may differ.
• Due to the lack of a well-defined structure, it cannot be used by computer programs easily.
1.5 DATABASE SYSTEM ARCHITECTURE
The design of a DBMS depends on its architecture. It can be centralized or decentralized or
hierarchical. The architecture of a DBMS can be seen as either single-tier or multi-tier. An
n-tier architecture divides the whole system into n related but independent modules, which
can be independently modified, altered, changed, or replaced.
In 1-tier architecture, the user works directly on the DBMS itself. Any changes made here
are made directly on the DBMS. It does not provide handy tools for end-users. Database
designers and programmers normally prefer to use single-tier architecture.
If the architecture of the DBMS is 2-tier, then it must have an application through which the
DBMS can be accessed. Programmers use 2-tier architecture where they access the DBMS
by means of an application. Here the application tier is entirely independent of the database
in terms of operation, design, and programming.
1.5.1 3-tier Architecture
A 3-tier architecture separates its tiers from each other based on the complexity of the users
and how they use the data present in the database. It is the most widely used architecture to
design a DBMS.

Fig 1.3 3-tier architecture

• Database (Data) Tier: At this tier, the database resides along with its query
processing languages. We also have the relations that define the data and their
constraints at this level.
• Application (Middle) Tier: At this tier reside the application server and the
programs that access the database. For a user, this application tier presents an
abstracted view of the database. End-users are unaware of any existence of the
database beyond the application. At the other end, the database tier is not aware of
any other user beyond the application tier. Hence, the application layer sits in the
middle and acts as a mediator between the end-user and the database.
• User (Presentation) Tier: End-users operate on this tier and they know nothing
about any existence of the database beyond this layer. At this layer, multiple views
of the database can be provided by the application. All views are generated by
applications that reside in the application tier.
Multiple-tier database architecture is highly modifiable, as almost all its components are
independent and can be changed independently.
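The separation between the three tiers can be sketched in a single Python file, with one function or class standing in for each tier. This is a toy illustration under stated assumptions: in a real deployment the tiers usually run as separate programs, and the names open_database, StudentService and render_grade are invented for this sketch.

```python
import sqlite3


# Database (data) tier: holds the relations and their constraints.
def open_database():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, grade TEXT)"
    )
    conn.executemany(
        "INSERT INTO student VALUES (?, ?, ?)",
        [(1, "Ram", "A"), (2, "Ramesh", "B")],
    )
    return conn


# Application (middle) tier: the only code that talks to the database.
class StudentService:
    def __init__(self, conn):
        self._conn = conn

    def grade_of(self, roll_no):
        row = self._conn.execute(
            "SELECT name, grade FROM student WHERE roll_no = ?", (roll_no,)
        ).fetchone()
        return None if row is None else {"name": row[0], "grade": row[1]}


# User (presentation) tier: formats results and never sees SQL or tables.
def render_grade(service, roll_no):
    result = service.grade_of(roll_no)
    if result is None:
        return f"No student with roll number {roll_no}"
    return f"{result['name']}: grade {result['grade']}"


service = StudentService(open_database())
print(render_grade(service, 1))
print(render_grade(service, 9))
```

Replacing the storage engine would only require changes in the data tier and the service class, which is the independence the text describes.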
1.6 APPLICATIONS OF DBMS
Telecom: A database keeps track of the information regarding calls made, network usage,
customer details, etc. Without database systems it is hard to maintain such a huge amount
of data that keeps updating every millisecond.
Banking System: For storing customer information, tracking day-to-day credit and debit
transactions, generating bank statements, etc. All this work is done with the help of
database management systems.
Sales: To store customer information, production information and invoice details.
Airlines: To travel by air, we make reservations in advance. This reservation information,
along with the flight schedule, is stored in a database.
Education sector: Database systems are frequently used in schools and colleges to store and
retrieve data regarding student details, staff details, course details, exam details, payroll
data, attendance details, fee details, etc.
Online shopping: Online shopping websites such as Amazon and Flipkart store the product
information, your addresses and preferences, and credit details, and provide you with the
relevant list of products based on your query. All this involves a database management
system.

1.7 INTRODUCTION TO RELATIONAL DATABASES


A relational database consists of a collection of tables, each of which is assigned a unique
name. In the relational model, data is stored in relations (tables) and is represented in the
form of tuples (rows). A relational database is an organized collection of tables that are
related to each other and from which data can be accessed easily. The relational database is
the most commonly used kind of database these days.
A relational database has following major components:
1. Table
2. Record or Tuple
3. Field or Column name or Attribute
4. Domain
5. Instance
6. Schema
7. Keys
The easiest way to understand a database is as a collection of related files. Imagine a file
(either paper or digital) of sales orders in a shop. Then there's another file of products,
containing stock records. To fulfil an order, you have to look up the product in the order file
and then adjust the stock levels for that particular product in the product file. A database and
the software that controls the database, called a database management system (DBMS), help
with this kind of task. Most databases today are relational databases, named such because
they deal with tables of data related by a common field. For example, Table 1 below shows
the product table, and Table 2 shows the invoice table. The relation between the two tables is
based on the common field product_code. Any two tables can relate to each other simply by
having a field in common.

Table 1
Product_code   Description         Price
A416           Nails, box          $0.14
C923           Drawing pins, box   $0.08

Table 2
Invoice_code   Invoice_line   Product_code   Quantity
3804           1              A416           10
3804           2              C923           15
Let's take a closer look at the previous two tables to see how they are organized:
• Each table consists of many rows and columns.
• Each row contains data about one single entity (such as one product or one order
line). This is called a record, or tuple. For example, the first row in Table 1 is a record;
it describes the A416 product, which is a box of nails that costs fourteen cents. The
terms row and record are interchangeable.
• Each column contains one piece of data that relates to the record, called an attribute.
Examples of attributes are the quantity of an item sold or the price of a product. An
attribute, when referring to a database table, is called a field. For example, the
Description column in Table 1 is a field. The terms attribute and field are
interchangeable.

Given this kind of structure, the database gives you a way to manipulate this data: SQL. SQL
(structured query language) is a powerful way to search for records or make changes. Almost
all DBMSs use SQL, although many have added their own enhancements to it.
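As a hedged sketch of how SQL uses the common field, the two tables above can be loaded into SQLite from Python and joined on product_code. The table names product and invoice and the column price_cents are choices made for this illustration. Prices are stored in cents as integers so that the line totals are exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        product_code TEXT PRIMARY KEY,
        description  TEXT,
        price_cents  INTEGER          -- $0.14 is stored as 14
    );
    CREATE TABLE invoice (
        invoice_code INTEGER,
        invoice_line INTEGER,
        product_code TEXT REFERENCES product(product_code),
        quantity     INTEGER,
        PRIMARY KEY (invoice_code, invoice_line)
    );
    INSERT INTO product VALUES ('A416', 'Nails, box', 14);
    INSERT INTO product VALUES ('C923', 'Drawing pins, box', 8);
    INSERT INTO invoice VALUES (3804, 1, 'A416', 10);
    INSERT INTO invoice VALUES (3804, 2, 'C923', 15);
""")

# The join matches each invoice line to its product through the
# common field product_code.
lines = conn.execute("""
    SELECT i.invoice_line, p.description, i.quantity,
           i.quantity * p.price_cents AS line_total_cents
    FROM invoice AS i
    JOIN product AS p ON p.product_code = i.product_code
    WHERE i.invoice_code = 3804
    ORDER BY i.invoice_line
""").fetchall()
for line in lines:
    print(line)
```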
1.7.1 Relational Model
The relational model represents how data is stored in relational databases. A relational
database stores data in the form of relations (tables). Consider a relation STUDENT with
attributes ROLL_NO, NAME, ADDRESS, PHONE and AGE, shown in the table below.
STUDENT
ROLL_NO   NAME     ADDRESS   PHONE        AGE
1         RAM      DELHI     9455123451   18
2         RAMESH   GURGAON   9652431543   18
3         SUJIT    ROHTAK    9156253131   20
4         SURESH   DELHI                  18
1. Attribute: Attributes are the properties that define a relation,
e.g. ROLL_NO, NAME.
2. Relation Schema: A relation schema represents the name of the relation with its
attributes, e.g. STUDENT (ROLL_NO, NAME, ADDRESS, PHONE, AGE) is the
relation schema for STUDENT. If a schema has more than one relation, it is called a
relational schema.
3. Tuple: Each row in the relation is known as a tuple. The above relation contains 4
tuples, one of which is shown as:
1 RAM DELHI 9455123451 18
4. Relation Instance: The set of tuples of a relation at a particular instant of time is
called the relation instance. The STUDENT table above shows the relation instance of
STUDENT at a particular time. It can change whenever there is an insertion, deletion
or update in the database.
5. Degree: The number of attributes in the relation is known as the degree of the relation.
The STUDENT relation defined above has degree 5.
6. Cardinality: The number of tuples in a relation is known as its cardinality.
The STUDENT relation defined above has cardinality 4.
7. Column: A column represents the set of values for a particular attribute. For example,
the column ROLL_NO can be extracted from the relation STUDENT.
8. NULL Values: A value that is not known or is unavailable is called a NULL value. It
is represented by a blank space, e.g. the PHONE of the STUDENT having ROLL_NO 4
is NULL.
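The terms degree, cardinality and NULL can be checked against the STUDENT relation with a short Python sketch using sqlite3. The lower-case table and column names are a choice made for this sketch. The missing PHONE of roll number 4 is stored as None, which SQLite treats as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER, name TEXT, address TEXT, phone TEXT, age INTEGER
    )
""")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        (1, "RAM", "DELHI", "9455123451", 18),
        (2, "RAMESH", "GURGAON", "9652431543", 18),
        (3, "SUJIT", "ROHTAK", "9156253131", 20),
        (4, "SURESH", "DELHI", None, 18),  # PHONE is unknown, so it is NULL
    ],
)

cursor = conn.execute("SELECT * FROM student")
degree = len(cursor.description)      # number of attributes
cardinality = len(cursor.fetchall())  # number of tuples

# NULL is tested with IS NULL, not with the = operator.
missing_phone = [
    row[0] for row in conn.execute("SELECT roll_no FROM student WHERE phone IS NULL")
]
print(degree, cardinality, missing_phone)
```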
1.7.2 Constraints in Relational Model
To design a relational model, some conditions are defined that must hold for the data present
in the database; these are called constraints. They are checked before any operation
(insertion, deletion or update) is performed on the database. If any constraint is violated, the
operation fails.
Domain Constraints: These are attribute-level constraints. An attribute can only take values
that lie inside its domain, e.g. if the constraint AGE>0 is applied to the STUDENT
relation, inserting a negative value of AGE will fail.

Key Integrity: Every relation in the database should have at least one set of attributes that
identifies a tuple uniquely. Such a set of attributes is called a key, e.g. ROLL_NO in STUDENT
is a key. No two students can have the same roll number. So a key has two properties:
• It should be unique for all tuples.
• It can't have NULL values.

Referential Integrity: When an attribute of a relation can only take values from another
attribute of the same relation or of any other relation, it is called referential integrity.
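The three kinds of constraint can be tried out with SQLite from Python. This is a sketch under stated assumptions: the branch table and the branch_code column are hypothetical additions used only to demonstrate referential integrity, and SQLite checks foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE branch (branch_code TEXT PRIMARY KEY);
    CREATE TABLE student (
        roll_no     INTEGER PRIMARY KEY,                      -- key integrity
        name        TEXT NOT NULL,
        age         INTEGER CHECK (age > 0),                  -- domain constraint
        branch_code TEXT REFERENCES branch(branch_code)       -- referential integrity
    );
    INSERT INTO branch VALUES ('CS');
    INSERT INTO student VALUES (1, 'RAM', 18, 'CS');
""")


def is_rejected(sql):
    """Return True if the DBMS refuses the statement because of a constraint."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


domain_violation = is_rejected("INSERT INTO student VALUES (2, 'RAMESH', -5, 'CS')")
key_violation = is_rejected("INSERT INTO student VALUES (1, 'SUJIT', 20, 'CS')")
referential_violation = is_rejected("INSERT INTO student VALUES (3, 'SURESH', 18, 'EE')")
valid_insert_rejected = is_rejected("INSERT INTO student VALUES (4, 'SURESH', 18, 'CS')")
print(domain_violation, key_violation, referential_violation, valid_insert_rejected)
```

Only the last statement satisfies all three constraints, so it is the only row added to the table.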

1.8 KEYS
A DBMS key is an attribute or set of attributes that helps us identify a row (tuple) in a
relation (table). Keys also allow us to find the relationship between two tables. A key
uniquely identifies a row in a table by a combination of one or more columns in that table.

1.8.1 Needs of Key


• In real-world applications, the number of tables required for storing the data is huge,
and the different tables are related to each other as well.
• Tables also store a lot of data. A table generally extends to thousands of records,
unsorted and unorganised.
• Keys allow us to establish and identify the relationships between tables.
• Keys help to enforce identity and integrity in the relationship.
• To cope with all this, keys are defined to easily identify any row of data in a table.

Various Keys in Database Management System


A DBMS has the following eight types of keys, each with a different function:
• Super Key
• Candidate Key
• Primary Key
• Alternate Key
• Foreign Key
• Compound Key
• Composite Key
• Surrogate Key

Let's take a simple Student table, with fields student_id, name, phone and age.

Super Key:
A super key is a set of one or more columns (attributes) that uniquely identifies each
record in a table. A super key may have additional attributes that are not needed for unique
identification. A super key is a superset of a candidate key.
In the table defined above, the super keys include student_id, (student_id, name) and phone.
• The first one is simple: student_id is unique for every row of data, hence it
can be used to identify each row uniquely.
• Next comes (student_id, name). The names of two students can be the same, but
their student_id can't be the same, hence this combination can also be a key.
• Similarly, the phone number of every student is assumed to be unique, hence phone
can also be a key. So they are all super keys.
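One way to test the super key claims above on sample data is to ask the database whether any combination of values repeats. In the Python sketch below, the three student rows and the helper is_super_key are assumptions made for illustration. Such a check only shows that a set of columns is unique in the current instance; whether it is really a key depends on the meaning of the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER, name TEXT, phone TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (1, "Akon", "9876723452", 17),   # hypothetical sample rows
        (2, "Akon", "9991165674", 19),
        (3, "Bkon", "7898756543", 18),
    ],
)


def is_super_key(conn, columns):
    """True if no two rows of student share the same values in `columns`.

    The column names come from the fixed lists below, not from user
    input, so building the SQL text with an f-string is acceptable here.
    """
    column_list = ", ".join(columns)
    duplicate = conn.execute(
        f"SELECT 1 FROM student GROUP BY {column_list} HAVING COUNT(*) > 1 LIMIT 1"
    ).fetchone()
    return duplicate is None


print(is_super_key(conn, ["student_id"]))
print(is_super_key(conn, ["student_id", "name"]))
print(is_super_key(conn, ["phone"]))
print(is_super_key(conn, ["name"]))
```

In this sample, name alone fails the test because two students share the same name.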

Primary Key:
A column or group of columns in a table that uniquely identifies every row in that table is
called a primary key. A primary key value cannot be duplicated: the same value can't
appear more than once in the table. A primary key is a minimal set of attributes (columns) in
a table that uniquely identifies tuples (rows) in that table.
Rules for defining Primary key:
• Two rows can't have the same primary key value.
• Every row must have a primary key value.
• The primary key field cannot be null.
• The value in a primary key column should not be modified or updated if any foreign
key refers to that primary key.

Points to Note regarding Primary Key


• The value of the primary key should be unique for each row of the table. The column(s)
that make up the key cannot contain duplicate values.
• The attribute(s) marked as the primary key are not allowed to have null values.
• A primary key does not have to be a single attribute (column). It can be a set of
more than one attribute (column).


Example:

In the following example, StudID is a Primary Key.

StudID   Roll No   First Name   LastName   Email
1        11        Tom          Price      abc@gmail.com
2        12        Nick         Wright     xyz@gmail.com
3        13        Dana         Natan      mno@yahoo.com

Candidate Key:
A super key with no redundant attributes is called a candidate key. The primary key should
be selected from the candidate keys. Candidate keys are the keys that are candidates for the
primary key of a table. In simple words, any key that fulfils all the requirements of a primary
key, that is, it is not null and has unique values, is a candidate for the primary key. Every
table must have at least one candidate key but can have several.

Properties of Candidate key:


• It must contain unique values
• Candidate key may have multiple attributes
• Must not contain null values
• It should contain minimum fields to ensure uniqueness
• Uniquely identify each record in a table

Example: In the given table, StudID, Roll No and Email are candidate keys, which help us to
uniquely identify the student record in the table.

StudID   Roll No   First Name   LastName   Email
1        11        Tom          Price      abc@gmail.com
2        12        Nick         Wright     xyz@gmail.com
3        13        Dana         Natan      mno@yahoo.com


Fig 1.4 schematic representation of keys

Alternate Key:

A candidate key which is currently not the primary key is called an alternate key. A table may
have one or several choices for the primary key. If a table has more than one candidate key,
then after one of them is chosen as the primary key, the rest of the candidate keys are known
as alternate keys of that table. A simple example makes the concept clear. Suppose we have a
table named Employee which has two columns, EmpID and EmpMail, both of which are not null
and hold unique values. Both columns are therefore candidate keys. If we make EmpID the
primary key of the table, then EmpMail is known as an alternate key.

Example: In this table, StudID, Roll No and Email are qualified to become a primary key. But
since StudID is the primary key, Roll No and Email become alternate keys.

StudID  Roll No  First Name  LastName  Email
1       11       Tom         Price     abc@gmail.com
2       12       Nick        Wright    xyz@gmail.com
3       13       Dana        Natan     mno@yahoo.com


Fig 1.4 Examples of candidate key

Foreign Key:
A foreign key is a column (or set of columns) in one table that refers to the primary key of
another table and so creates a relationship between the two tables. Foreign keys help us to
maintain data integrity and allow navigation between related rows of the two tables. Every
relationship in the model needs to be supported by a foreign key.

Example:

DeptCode  DeptName
001       Science
002       English
005       Computer

Teacher ID  Fname  Lname
B002        David  Warner
B017        Sara   Joseph
B009        Mike   Brunton

In this example, we have two tables in a school, Teacher and Department. However, there is no
way to see which teacher works in which department. By adding DeptCode as a foreign key to
the Teacher table, we can create a relationship between the two tables.

Teacher ID  DeptCode  Fname  Lname
B002        002       David  Warner
B017        002       Sara   Joseph
B009        001       Mike   Brunton
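
The Department and Teacher tables above can be tried out with a small script. The following is only a sketch, assuming Python's built-in sqlite3 module and an in-memory database; the SQLite column types and the extra teacher (B020, Ann Lee) are assumptions made for illustration and are not part of the example tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on

conn.execute("CREATE TABLE Department (DeptCode TEXT PRIMARY KEY, DeptName TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE Teacher (
        TeacherID TEXT PRIMARY KEY,
        DeptCode  TEXT NOT NULL REFERENCES Department(DeptCode),
        Fname     TEXT,
        Lname     TEXT
    )
""")
conn.executemany("INSERT INTO Department VALUES (?, ?)",
                 [("001", "Science"), ("002", "English"), ("005", "Computer")])
conn.executemany("INSERT INTO Teacher VALUES (?, ?, ?, ?)",
                 [("B002", "002", "David", "Warner"),
                  ("B017", "002", "Sara", "Joseph"),
                  ("B009", "001", "Mike", "Brunton")])

# The foreign key lets us navigate from each teacher to the department row.
rows = conn.execute("""
    SELECT t.Fname, d.DeptName
    FROM Teacher t JOIN Department d ON t.DeptCode = d.DeptCode
    ORDER BY t.TeacherID
""").fetchall()
print(rows)

# Hypothetical teacher pointing at department 999, which does not exist.
try:
    conn.execute("INSERT INTO Teacher VALUES ('B020', '999', 'Ann', 'Lee')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print(fk_rejected)
```

The join shows the navigation that the foreign key makes possible, and the rejected insert shows the data-integrity role described above.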

Composite Key:
COMPOSITE KEY is a combination of two or more columns that uniquely identify rows in a
table. The combination of columns guarantees uniqueness, though individually uniqueness is
not guaranteed. Hence, they are combined to uniquely identify records in a table.


Compound Key:
A compound key has two or more attributes that together allow you to uniquely recognize a
specific record. Each column may not be unique by itself within the table. However, when
combined with the other column or columns, the combination becomes unique. The purpose of
a compound key is to uniquely identify each record in the table.
The difference between a compound key and a composite key is that any part of a compound
key can be a foreign key, whereas the parts of a composite key may or may not be foreign keys.

Surrogate Key
An artificial key which aims to uniquely identify each record is called a surrogate key. This
kind of key is created when a table does not have any natural primary key, and its values are
generated so that they are unique. Surrogate keys do not lend any meaning to the data in the
table. A surrogate key is usually an integer.
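
A composite key and a surrogate key can be compared side by side in a short sketch. It assumes Python's sqlite3 module; the Enrollment and Visitor tables and their rows are hypothetical and used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Composite key: neither column is unique on its own, but the pair is.
conn.execute("""
    CREATE TABLE Enrollment (
        StudID     INTEGER NOT NULL,
        CourseCode TEXT    NOT NULL,
        Grade      TEXT,
        PRIMARY KEY (StudID, CourseCode)
    )
""")
conn.executemany("INSERT INTO Enrollment VALUES (?, ?, ?)",
                 [(1, "CS301", "A"), (1, "MA201", "B"), (2, "CS301", "A")])
try:
    conn.execute("INSERT INTO Enrollment VALUES (1, 'CS301', 'C')")  # same pair again
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Surrogate key: an integer generated by the DBMS that carries no meaning.
conn.execute("CREATE TABLE Visitor (VisitorID INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT)")
conn.executemany("INSERT INTO Visitor (Name) VALUES (?)", [("Tom",), ("Nick",)])
surrogate_ids = [r[0] for r in conn.execute("SELECT VisitorID FROM Visitor ORDER BY VisitorID")]
print(duplicate_rejected, surrogate_ids)
```

StudID 1 and course CS301 each appear more than once, yet the pair (1, CS301) is accepted only once, which is the uniqueness guarantee of a composite key.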

1.9 SQL FUNDAMENTALS


SQL is a programming language for relational databases. It is designed over relational
algebra and tuple relational calculus. SQL comes as a package with all major distributions of
RDBMS.
SQL comprises both data definition and data manipulation languages. Using the data
definition properties of SQL, one can design and modify a database schema, whereas the data
manipulation properties allow SQL to store data in and retrieve data from the database.
1.9.1 Data Definition Language
Data definition language (DDL) refers to the set of SQL commands that create and modify
the structures of a database. DDL statements are used to create, change, and remove objects
including indexes, triggers, tables, and views. Common DDL statements include:

 CREATE (generates a new table)
 ALTER (alters a table)
 DROP (removes a table from the database)
SQL uses the following set of commands to define database schema −
CREATE
The syntax for creating a table is this:


CREATE TABLE table_name (field_name data_type);

Example 1:

CREATE TABLE Artists (artistName VARCHAR(50));

The semicolon at the end of the statement tells the system to process everything before it.
If it is left out, you may get strange results or even receive errors.
When creating a table, the data types most often used include strings (VARCHAR or
CHAR); numbers (NUMBER or INTEGER); and dates (DATE). Each system varies in how
to specify the data type.
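The CREATE command creates new databases, tables and views in the RDBMS.
Example 2 (general forms only; a real CREATE TABLE also needs a column list and a real
CREATE VIEW needs a defining query):
Create database tutorialspoint;
Create table article;
Create view for_students;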
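
A complete CREATE TABLE and CREATE VIEW can be run as in the sketch below. It assumes Python's sqlite3 module, where the in-memory connection plays the role of the database; the columns of article (article_id, title, published) are hypothetical, since the notes do not define them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stands in for "Create database tutorialspoint"

conn.execute("""
    CREATE TABLE article (
        article_id INTEGER PRIMARY KEY,
        title      VARCHAR(100) NOT NULL,
        published  DATE
    )
""")
conn.execute("CREATE VIEW for_students AS SELECT title FROM article")

# SQLite's catalog table sqlite_master lists the objects that the DDL created.
objects = conn.execute(
    "SELECT type, name FROM sqlite_master "
    "WHERE name IN ('article', 'for_students') ORDER BY name"
).fetchall()
print(objects)
```

Reading the catalog afterwards is a convenient way to confirm that DDL changes the schema rather than the data.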
DROP
The DROP command removes an entire database object, such as a table, index or view, from the
database. It removes the data structure itself together with the data stored in it.
Syntax:
DROP TABLE <table_name>;
OR
DROP DATABASE <database_name>;

Example 1 : DROP Command


DROP TABLE employee;
OR
DROP DATABASE employees;
The DROP command drops views, tables, and databases from the RDBMS. Its general form is:
Drop object_type object_name;
Example 2:
Drop database tutorialspoint;
Drop table article;
Drop view for_students;
ALTER
The ALTER command alters or modifies the structure of the database, that is, it modifies an
existing database object. Using this command, you can add a column, rename or drop an
existing column, and even change the data type of columns. The exact keywords vary slightly
between database systems.

Syntax:
ALTER TABLE <table_name>
ADD <column_name> <datatype>;
OR
ALTER TABLE <table_name>
RENAME COLUMN <old_column_name> TO <new_column_name>;
OR
ALTER TABLE <table_name>
DROP COLUMN <column_name>;


Example : ALTER Command

ALTER TABLE employee
ADD (address varchar2(50));
OR
ALTER TABLE employee
RENAME COLUMN phone_no TO contact_no;
OR
ALTER TABLE employee
DROP COLUMN age;
The ALTER command modifies the database schema. Its general form is:
Alter object_type object_name parameters;
For example −
Alter table article add subject varchar;
This command adds an attribute in the relation article with the name subject of string type.
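
The effect of ALTER and DROP on the schema can be observed with the following sketch. It assumes Python's sqlite3 module with SQLite 3.25 or later (needed for RENAME COLUMN); the helper columns() and the emp_id column are introduced here for illustration only.

```python
import sqlite3

def columns(conn, table):
    """Return the column names of a table using SQLite's table_info pragma."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, phone_no TEXT, age INTEGER)")

# ALTER ... ADD: a new column appears at the end of the table definition.
conn.execute("ALTER TABLE employee ADD COLUMN address VARCHAR(50)")
after_add = columns(conn, "employee")

# ALTER ... RENAME COLUMN: the column keeps its data but gets a new name.
conn.execute("ALTER TABLE employee RENAME COLUMN phone_no TO contact_no")
after_rename = columns(conn, "employee")

# DROP TABLE: the structure and its data are removed from the database.
conn.execute("DROP TABLE employee")
remaining = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
print(after_add, after_rename, remaining)
```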
1.9.2 Data Manipulation Language
DML is the short name of Data Manipulation Language. It deals with data manipulation and
includes the most common SQL statements such as SELECT, INSERT, UPDATE and DELETE.
It is used to store, modify, retrieve, delete and update data in a database.

 SELECT - retrieves data from a database
 INSERT - inserts data into a table
 UPDATE - updates existing data within a table
 DELETE - deletes records from a table (all records if no condition is given)
 MERGE - UPSERT operation (insert or update)
 CALL - calls a PL/SQL or Java subprogram
 EXPLAIN PLAN - shows the data access path chosen for a statement
 LOCK TABLE - concurrency control
SQL is equipped with data manipulation language (DML). DML modifies the database
instance by inserting, updating and deleting its data. DML is responsible for all forms of data
modification in a database. SQL contains the following set of commands in its DML section

 SELECT/FROM/WHERE
 INSERT INTO/VALUES
 UPDATE/SET/WHERE
 DELETE FROM/WHERE

These basic constructs allow database programmers and users to enter data and information
into the database and retrieve efficiently using a number of filter options.
SELECT/FROM/WHERE
 SELECT − This is one of the fundamental query commands of SQL. It is similar to
the projection operation of relational algebra: it lists the attributes to be returned from
the rows that satisfy the condition described by the WHERE clause.
 FROM − This clause takes a relation name as an argument, from which attributes are
to be selected/projected. If more than one relation name is given, this clause
corresponds to the Cartesian product of those relations.
 WHERE − This clause defines a predicate or conditions that a row must satisfy for its
attributes to be projected.
For example −
Select author_name
From book_author
Where age > 50;
This command will yield the names of authors from the relation book_author whose age is
greater than 50.
INSERT INTO/VALUES
This command is used for inserting values into the rows of a table (relation).
Syntax:
INSERT INTO table (column1 [, column2, column3 ... ]) VALUES (value1 [, value2,
value3 ... ])
Or
INSERT INTO table VALUES (value1, [value2, ... ])
For example −
INSERT INTO tutorialspoint (Author, Subject) VALUES ('anonymous', 'computers');
UPDATE/SET/WHERE
This command is used for updating or modifying the values of columns in a table (relation).
Syntax −
UPDATE table_name SET column_name = value [, column_name = value ...] [WHERE
condition]
For example −
UPDATE tutorialspoint SET Author='webmaster' WHERE Author='anonymous';
DELETE FROM/WHERE
This command is used for removing one or more rows from a table (relation).
Syntax −
DELETE FROM table_name [WHERE condition];
For example −
DELETE FROM tutorialspoint
WHERE Author='unknown';
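
The four DML commands can be run end to end as in the sketch below, assuming Python's sqlite3 module. The table layout (two text columns) and the second row ('unknown', 'physics') are assumptions added so that DELETE has something to remove.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tutorialspoint (Author TEXT, Subject TEXT)")

# INSERT INTO/VALUES
conn.execute("INSERT INTO tutorialspoint (Author, Subject) VALUES ('anonymous', 'computers')")
conn.execute("INSERT INTO tutorialspoint (Author, Subject) VALUES ('unknown', 'physics')")

# UPDATE/SET/WHERE
conn.execute("UPDATE tutorialspoint SET Author = 'webmaster' WHERE Author = 'anonymous'")

# DELETE FROM/WHERE; rowcount reports how many rows were removed
deleted = conn.execute("DELETE FROM tutorialspoint WHERE Author = 'unknown'").rowcount

# SELECT/FROM
remaining = conn.execute("SELECT Author, Subject FROM tutorialspoint").fetchall()
print(deleted, remaining)
```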
Advanced SQL features
 Accessing SQL From a Programming Language
 Dynamic SQL
 JDBC and ODBC
 Embedded SQL
 SQL Data Types and Schemas
 Functions and Procedural Constructs
 Triggers
 Advanced Aggregation Features
 OLAP
Accessing SQL From a Programming Language

Database languages are used to read, update and store data in a database. There are several
such languages that can be used for this purpose; one of them is SQL (Structured Query
Language).

1.9.3 Data Definition Language (DDL)


DDL is used for specifying the database schema. It is used for creating tables, schemas,
indexes, constraints, etc. in a database. The operations that we can perform on a database
using DDL are:
 To create a database and its objects (tables, views, indexes) – CREATE
 To alter the structure of the database – ALTER
 To drop objects such as tables, or an entire database – DROP
 To remove all rows from a table while keeping its structure – TRUNCATE
 To rename database objects – RENAME
 To add comments to the data dictionary – COMMENT

All of these commands either define or update the database schema, which is why they come
under Data Definition Language.

1.9.4 Data Control language


Data Manipulation Language (DML)-DML is used for accessing and manipulating data in a
database. The following operations on a database come under DML:

 To read records from table(s) – SELECT
 To insert record(s) into table(s) – INSERT
 To update the data in table(s) – UPDATE
 To delete records from a table (all of them if no condition is given) – DELETE

Data Control Language (DCL)-DCL is used for granting and revoking user access on a
database.

 To grant access to a user – GRANT
 To revoke access from a user – REVOKE

Transaction Control Language (TCL)-The changes made to the database using DML
commands are either made permanent or rolled back using TCL.

 To persist the changes made by DML commands in the database – COMMIT
 To roll back the changes made to the database – ROLLBACK
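
COMMIT and ROLLBACK can be observed with a short sketch, assuming Python's sqlite3 module in its default transaction mode; the account table and its values are hypothetical. SQLite has no GRANT or REVOKE, so the DCL commands are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 100)")
conn.commit()      # COMMIT: the insert is now permanent

conn.execute("UPDATE account SET balance = balance - 70 WHERE name = 'A'")
conn.rollback()    # ROLLBACK: the uncommitted update is undone
after_rollback = conn.execute("SELECT balance FROM account WHERE name = 'A'").fetchone()[0]

conn.execute("UPDATE account SET balance = balance - 30 WHERE name = 'A'")
conn.commit()      # this update is kept
after_commit = conn.execute("SELECT balance FROM account WHERE name = 'A'").fetchone()[0]
print(after_rollback, after_commit)
```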

1.10 RELATIONAL ALGEBRA


Relational algebra is a procedural query language, which takes instances of relations as input
and yields instances of relations as output. It uses operators to perform queries. An operator
can be either unary or binary. They accept relations as their input and yield relations as their
output. Relational algebra is performed recursively on a relation and intermediate results are
also considered relations.
The fundamental operations of relational algebra are as follows −
 Select
 Project
 Union
 Set difference
 Cartesian product
 Rename
Select Operation (σ)-It selects tuples that satisfy the given predicate from a relation.
Notation − σ_p(r)
Where σ is the selection operator, p is the selection predicate and r stands for the relation.
p is a propositional logic formula which may use connectors like and, or, and not. Its terms
may use relational operators like =, ≠, ≥, <, >, ≤.
For example −
(i) σ_{subject = "database"}(Books)
Output − Selects tuples from Books where subject is 'database'.
(ii) σ_{subject = "database" and price = 450}(Books)
Output − Selects tuples from Books where subject is 'database' and price is 450.
(iii) σ_{subject = "database" and price = 450 or year > 2010}(Books)
Output − Selects tuples from Books where subject is 'database' and price is 450, or those
books published after 2010.
Project Operation (∏)-It keeps only the listed column(s) of a relation and discards the rest.
Notation − ∏_{A1, A2, ..., An}(r)
Where A1, A2, ..., An are attribute names of relation r. Duplicate rows are automatically
eliminated, as a relation is a set.
For example − ∏_{subject, author}(Books)
Selects and projects columns named as subject and author from the relation Books.
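
Selection and projection can be imitated with ordinary Python collections. This is only a sketch: the functions select() and project() and the three rows of Books are hypothetical, chosen so that the examples above return something.

```python
# Each tuple of the relation is a dict; the relation is a list of such tuples.
Books = [
    {"title": "B1", "subject": "database", "author": "Asha", "price": 450, "year": 2008},
    {"title": "B2", "subject": "database", "author": "Asha", "price": 600, "year": 2012},
    {"title": "B3", "subject": "networks", "author": "Ravi", "price": 300, "year": 2015},
]

def select(relation, predicate):
    """Selection: keep the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]

def project(relation, attributes):
    """Projection: keep only the named attributes and drop duplicate rows."""
    seen, result = set(), []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result

# subject = "database" and price = 450
db_450 = select(Books, lambda t: t["subject"] == "database" and t["price"] == 450)
# (subject = "database" and price = 450) or year > 2010
mixed = select(Books, lambda t: (t["subject"] == "database" and t["price"] == 450)
               or t["year"] > 2010)
subject_author = project(Books, ["subject", "author"])
print([t["title"] for t in db_450], [t["title"] for t in mixed], subject_author)
```

Two books share the same subject and author, so the projection returns two rows rather than three, which is the duplicate elimination mentioned above.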
Union Operation (∪)-It performs binary union between two given relations and is defined as

r ∪ s = { t | t ∈ r or t ∈ s}

Notation − r ∪ s
Where r and s are either database relations or relation result sets (temporary relations).
For a union operation to be valid, the following conditions must hold −
 r and s must have the same number of attributes.
 Attribute domains must be compatible.
Duplicate tuples are automatically eliminated.
For example −

∏_{author}(Books) ∪ ∏_{author}(Articles)

Output − Projects the names of the authors who have either written a book or an article or
both.
Set Difference (−)-The result of the set difference operation is the set of tuples which are
present in one relation but are not in the second relation.
Notation − r − s
Finds all the tuples that are present in r but not in s.
For example −
∏_{author}(Books) − ∏_{author}(Articles)
Output − Provides the name of authors who have written books but not articles.
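
Because a relation is a set of tuples, union and set difference map directly onto Python's set operators. The sketch below uses hypothetical Books and Articles relations with the attributes (author, title); project_author() is a helper introduced here.

```python
# Hypothetical relations with attributes (author, title)
Books = {("Asha", "DB Basics"), ("Ravi", "Networks 101")}
Articles = {("Ravi", "Routing"), ("Meena", "Indexing")}

def project_author(relation):
    """Project the author attribute; using a set removes duplicates."""
    return {author for (author, _title) in relation}

# Authors who have written a book, an article, or both (union)
either = project_author(Books) | project_author(Articles)
# Authors who have written books but not articles (set difference)
books_only = project_author(Books) - project_author(Articles)
print(sorted(either), sorted(books_only))
```

Both operands are projected onto the single attribute author first, so they have the same number of attributes and compatible domains, as the union rule requires.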
Cartesian Product (×)-Combines information of two different relations into one.
Notation − r × s
Where r and s are relations and their output will be defined as −

r × s = { q t | q ∈ r and t ∈ s}

For example −

σ_{author = 'tutorialspoint'}(Books × Articles)

Output − Yields a relation, which shows all the books and articles written by tutorialspoint.
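
The Cartesian product followed by a selection can be sketched with itertools.product. The rows below are hypothetical, and the selection is assumed to compare the author attribute of both relations with 'tutorialspoint', since the expression above does not say which author column is meant.

```python
from itertools import product

# Hypothetical relations with attributes (author, title)
Books = [("Asha", "DB Basics"), ("tutorialspoint", "SQL Guide")]
Articles = [("tutorialspoint", "Joins"), ("Ravi", "Routing")]

# r × s: concatenate every tuple q of r with every tuple t of s
cartesian = [q + t for q, t in product(Books, Articles)]

# Selection on the product: both the book author (position 0) and the
# article author (position 2) must be 'tutorialspoint'
by_tutorialspoint = [row for row in cartesian
                     if row[0] == "tutorialspoint" and row[2] == "tutorialspoint"]
print(len(cartesian), by_tutorialspoint)
```

With 2 books and 2 articles the product has 2 × 2 = 4 tuples, which is why a selection is normally applied to it straight away.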
Rename Operation (ρ)


The results of relational algebra are also relations but without any name. The rename
operation allows us to name the output relation. The rename operation is denoted by the small
Greek letter rho (ρ).
Notation − ρ_x(E)
Where the result of expression E is saved under the name x.
Additional operations are:
 Set intersection
 Assignment
 Natural join

Relational Calculus
In contrast to relational algebra, relational calculus is a non-procedural query language,
that is, it tells what to retrieve but never explains how to retrieve it. Instead of algebra, it
uses mathematical predicate calculus. Relational calculus is not the same as the differential
and integral calculus of mathematics; it takes its name from a branch of symbolic logic termed
predicate calculus.
Relational calculus exists in two forms: Tuple Relational Calculus (TRC) and Domain
Relational Calculus (DRC).
Tuple Relational Calculus (TRC)
In TRC, the filtering variable ranges over tuples.
Notation − {T | Condition}
Returns all tuples T that satisfy the condition.
For example −

{ T.name | Author(T) AND T.article = 'database' }

Output − Returns tuples with 'name' from Author who has written an article on 'database'.
TRC can be quantified. We can use Existential (∃) and Universal (∀) quantifiers.
For example −

{ R | ∃T ∈ Authors (T.article = 'database' AND R.name = T.name) }

Output − The above query will yield the same result as the previous one.

Tuple Relational Calculus

In the tuple relational calculus, you will have to find tuples for which a predicate is true. The
calculus is dependent on the use of tuple variables. A tuple variable is a variable that 'ranges
over' a named relation: i.e., a variable whose only permitted values are tuples of the relation.
For example, to specify the range of a tuple variable S as the Staff relation, we write:

Staff(S)

To express the query 'Find the set of all tuples S such that F(S) is true,' we can write:

{S | F(S)}

Here, F is called a formula (well-formed formula, or wff in mathematical logic). For example,
to express the query 'Find the staffNo, fName, lName, position, sex, DOB, salary, and
branchNo of all staff earning more than £10,000', we can write:


{S | Staff(S) ∧ S.salary > 10000}
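
A tuple relational calculus expression reads almost like a Python set comprehension, where the loop variable plays the part of the tuple variable. The sketch below assumes a cut-down Staff relation with four attributes and three hypothetical rows; the second query, with an existential quantifier, is an added illustration.

```python
from collections import namedtuple

# Hypothetical, shortened Staff relation
Staff = namedtuple("Staff", "staffNo fName salary branchNo")
staff_relation = {
    Staff("S1", "Ann", 30000, "B1"),
    Staff("S2", "Raj", 9000, "B1"),
    Staff("S3", "Lee", 12000, "B2"),
}

# {S | Staff(S) ∧ S.salary > 10000}: S ranges over the tuples of Staff
well_paid = {S for S in staff_relation if S.salary > 10000}

# {S.fName | Staff(S) ∧ ∃T (Staff(T) ∧ T.branchNo = S.branchNo ∧ T.salary < 10000)}
# any() plays the role of the existential quantifier ∃
names = {S.fName for S in staff_relation
         if any(T.branchNo == S.branchNo and T.salary < 10000 for T in staff_relation)}
print(sorted(S.staffNo for S in well_paid), sorted(names))
```

The comprehension states only the condition a tuple must satisfy, not the order in which tuples are examined, which mirrors the non-procedural nature of the calculus.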

Domain Relational Calculus (DRC)


In DRC, the filtering variables range over the domains of attributes instead of entire tuples
(as done in TRC, mentioned above).

Notation −
{ <a1, a2, a3, ..., an> | P (a1, a2, a3, ..., an) }
Where a1, a2, ..., an are domain variables (attribute values) and P stands for a formula built
from these variables.
For example −

{<article, page, subject> | <article, page, subject> ∈ TutorialsPoint ∧ subject = 'database'}

Output − Yields Article, Page, and Subject from the relation TutorialsPoint, where subject is
database.

Select TCHR_ID and TCHR_NAME of teachers who work for department 8 (suppose
department 8 is the Computer Application Department):

{<TCHR_ID, TCHR_NAME> | <TCHR_ID, TCHR_NAME> ∈ TEACHER ∧ DEPT_ID = 8}

Get the name of the department where Karlos works:

{DEPT_NAME | <DEPT_NAME> ∈ DEPT ∧ ∃ DEPT_ID (<DEPT_ID> ∈ TEACHER ∧ TCHR_NAME = 'Karlos')}

It is to be noted that these queries are safe. The use of domain relational calculus is
restricted to safe expressions.
Just like TRC, DRC can also be written using existential and universal quantifiers. DRC also
involves relational operators. The expressive power of Tuple Relational Calculus and Domain
Relational Calculus, restricted to safe expressions, is equivalent to that of Relational Algebra.
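
In a domain relational calculus query the variables stand for attribute values, which corresponds to unpacking each row into separate Python variables. The sketch below uses three hypothetical rows of the relation TutorialsPoint(article, page, subject).

```python
# Hypothetical rows of the relation TutorialsPoint(article, page, subject)
TutorialsPoint = {
    ("Keys", 12, "database"),
    ("Joins", 30, "database"),
    ("Sockets", 7, "networks"),
}

# {<article, page, subject> | <article, page, subject> ∈ TutorialsPoint ∧ subject = 'database'}
# article, page and subject are domain variables: one variable per attribute
drc_result = {
    (article, page, subject)
    for (article, page, subject) in TutorialsPoint
    if subject == "database"
}

# {<article, subject> | ∃ page (<article, page, subject> ∈ TutorialsPoint)}
# page is bound by the quantifier, so it does not appear in the answer
article_subject = {(article, subject) for (article, _page, subject) in TutorialsPoint}
print(sorted(drc_result), sorted(article_subject))
```

Compared with the tuple form, the only change is that the comprehension binds one variable per attribute instead of one variable per row.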

ODBC
Open Database Connectivity (ODBC) is an open standard application programming interface
(API) that allows application programmers to access any database. ODBC consists of four
components, working together to enable functions. ODBC allows programs to use SQL
requests that access databases without knowing the proprietary interfaces to the databases.
ODBC handles the SQL request and converts it into a request each database system
understands.


Fig 1.6 Components of ODBC


The four different components of ODBC are:

Application: Processes and calls the ODBC functions and submits the SQL statements.
Driver manager: Loads drivers for each application.
Driver: Handles ODBC function calls, and then submits each SQL request to a data source.
Data source: The data being accessed, together with its database management system (DBMS)
and operating system.

JDBC
The Java Database Connectivity (JDBC) API uses the Java programming language to access
a database. When writing programs in the Java language using JDBC APIs, users can employ
software that includes a JDBC-ODBC Bridge to access ODBC-supported databases.

However, the JDBC-ODBC Bridge (or JDBC type 1 driver) should be viewed as a
transitional approach, as it creates performance overhead because API calls must pass
through the JDBC bridge to the ODBC driver, then to the native database connectivity
interface. In addition, it was removed in Java Development Kit (JDK) 8, and Oracle does not
support the JDBC-ODBC Bridge. The use of JDBC drivers provided by database vendors,
rather than the JDBC-ODBC Bridge, is the recommended approach.

1.11 DYNAMIC SQL


Dynamic SQL is a programming technique that enables you to build SQL statements
dynamically at runtime. You can create more general purpose, flexible applications by using
dynamic SQL because the full text of a SQL statement may be unknown at compilation.

Static SQL has the benefit that its statements are optimized before the program runs, which
gives the application high performance, whereas dynamic SQL offers greater flexibility.
Access plans for dynamic statements are generated at run time, so the statements must be
prepared in the application; this step is never needed in static SQL.

These are not the only differences between them. A clear advantage of dynamic SQL over
static statements shows up when the application is edited or upgraded: because the access
plans are generated at run time, dynamic statements need no pre-compilation or re-building,
whereas static statements require their access plans to be regenerated whenever they are
modified. On the other hand, dynamic SQL requires more permissions, and it can become a way
to execute unauthorized code (SQL injection). Since we do not know what kind of users the
application will have, dynamic SQL can be dangerous for security unless the programmer
handles user input carefully.

When the pattern of database access is known in advance, static SQL is adequate. In many
applications, however, we may not know the pattern of database access in advance. For
example, a report writer must be able to decide at run time which SQL statements will be
needed to access the database. Such a need cannot be fulfilled with static SQL and requires an
advanced form of embedded SQL known as dynamic SQL.

There are several limitations in static SQL. Using host variables (which allow us to supply
values for a search condition at run time), we can achieve a limited degree of dynamic
behaviour, e.g.,

exec sql select tname, sex from teacher where salary > :sal;

Here the salary value is supplied at run time. But choosing the column name or table name at
run time is not possible with static embedded SQL. For such a feature we need dynamic SQL.

Dynamic SQL Concepts

 In dynamic SQL, the SQL statements are not hard coded in the program. The text of
the SQL statement is obtained at run time, for example from the user.
 Because the SQL statements that are to be executed are not known until run time, the
DBMS cannot prepare for executing the statements in advance.
 When the program is executed, it passes the text of the SQL statement, called the
statement string, to the DBMS. Once the DBMS receives the text, it goes through a
five-step execution process as illustrated below.


Fig 1.7 Dynamic SQL


1.12 EMBEDDED SQL
This is a method for combining the data manipulation capabilities of SQL with the computing
power of a programming language. The embedded SQL statements are written inline with the
program source code of the host language. A preprocessor parses the embedded SQL code and
replaces it with host-language calls to a code library; the result is then compiled by the
compiler of the host language.

The structured query language provides us 2 features:

 It allows us to write queries.
 It allows us to use it in programming languages so that the database can be accessed
through application programs also.
Due to this duality SQL is sometimes called a dual mode language. Not all queries can be
expressed in SQL alone. There are many queries that can be expressed in programming
languages like C, C++ and Java but cannot be expressed in SQL. For writing such queries we
need to embed SQL in a general purpose programming language. The mixture of SQL and a
general purpose programming language is called embedded SQL.
There are some special embedded SQL statements which are used to retrieve data into the
program. A special SQL precompiler accepts the combined source code and, together with
other programming tools, converts it into an executable program.


Fig 1.8 Embedded SQL

An embedded SQL program is a mixture of SQL and a programming language, so it cannot be fed
directly to a general purpose programming language compiler. The program is prepared for
execution in several steps, as follows:

1. First, the embedded SQL source code is fed to the SQL precompiler. The precompiler scans
the program and processes the embedded SQL statements present in the code. There can be
different precompilers for different programming languages.
2. After processing the source code, the precompiler produces 2 files as its output. The first
file contains the source program without the embedded SQL statements, and the second file,
the database request module, contains all the embedded SQL statements used in the program.
3. The first file produced by the precompiler (the one that contains the source program) is fed
to the compiler for the host programming language (like a C compiler). The compiler
processes the source code and produces object code as its output.
4. Now the linker takes the object modules produced by the compiler, links them with various
library routines and produces an executable program.
5. The database request modules produced by the precompiler (in step 2) are submitted to a
special BIND program. The BIND program examines the SQL statements, parses them,
validates them, optimizes them and finally produces an application plan for each statement.
The result is a combined application plan for the entire program that represents a
DBMS-executable version of its embedded SQL statements. The BIND program stores the
plan in the database, usually assigning it the name of the application program that has
created it.
Need for Embedded SQL in DBMS
When SQL is embedded in another language, the language in which the SQL statements are
embedded is known as the host language, and the SQL constructs permitted in the host
language are known as embedded SQL.
 The result of a query is made available to the host program one tuple (record) at a
time.
 Embedded SQL statements are identified to the preprocessor by the EXEC SQL statement:
EXEC SQL <embedded SQL statement> END-EXEC
(in C, the statement ends with a semicolon instead of END-EXEC).
 Query results are processed with the declare cursor, open and fetch statements.
 Embedded SQL can also execute update, insert and delete statements.
An Embedded SQL Example in C
Although SQL statements can be embedded in any general purpose programming language, we
take an example in the C language so that a clear picture can be drawn. We take an
interactive SQL statement and see how it can be embedded in C.
Eg:
Increase by 10% the salary of the teachers whose qualification is B.Tech:
update teacher set salary=1.1*salary where qualification='B.Tech';
The embedded SQL program for the above SQL statement will be:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    exec sql include sqlca;
    // Describe the teacher table to the precompiler
    exec sql declare teacher table (tid char(6) not null, tname char(20), sex char(1),
        age number(3), qualification char(7), salary number(7), city varchar(15));

    // Display a message to the user
    printf("updating salary of teachers who are B.Tech\n");

    // This statement executes the SQL update
    exec sql update teacher set salary=1.1*salary where qualification='B.Tech';
    printf("update done\n");

    exit(0);
}
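
Compiling the C program above needs a vendor-specific SQL precompiler. For quick experiments, the same update can be tried through a call-level interface instead; the sketch below assumes Python's sqlite3 module, a shortened teacher table and two hypothetical rows, and is an alternative to embedded SQL, not an instance of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE teacher (tid TEXT NOT NULL, tname TEXT, qualification TEXT, salary NUMERIC)")
conn.executemany("INSERT INTO teacher VALUES (?, ?, ?, ?)",
                 [("T001", "Anu", "B.Tech", 50000),
                  ("T002", "Bala", "M.Tech", 60000)])

print("updating salary of teachers who are B.Tech")
# The qualification is passed as a bound parameter, much like a host variable.
cur = conn.execute(
    "UPDATE teacher SET salary = 1.1 * salary WHERE qualification = ?", ("B.Tech",))
conn.commit()
print("update done,", cur.rowcount, "row(s) changed")

salaries = dict(conn.execute("SELECT tid, salary FROM teacher"))
```

Here the SQL text is handed to the DBMS as a string at run time, so no precompile or BIND step is involved.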
