Notes

Database Management Systems
Database
• A database is a collection of data organized so that the data can be stored and retrieved.
• A database consists of data, which can be in numeric, alphabetic or alphanumeric form.
• Analyzing data is a key feature of a database management system (DBMS).
Basic Concepts
Why do we need DBMS ?
• A DBMS provides mechanisms to deal with data inconsistency while allowing users to access data concurrently.
• A DBMS enforces the ACID properties (atomicity, consistency, isolation, durability) to ensure reliable transaction management without data corruption.
Benefits Of Using Database Management Systems
• Improved Data Sharing and Data Security
• Effective Data Integration
• Consistent Data That Complies With Regulations
• Increase In Productivity Of The End User
• Quick Decision Making
Applications of DBMS
Introduction to DBMS
Data:
– Known facts and statistics that can be recorded and stored; generally raw and unprocessed.
For example: Id, Name, Designation, Address, etc.
Record:
– A collection of related data items. Taken individually, the data items in the above example carry little meaning, but if we organize them in the following way, they collectively represent meaningful information.
Table or Relation: A collection of related records.
Roll No   Name   Address
1         X      Xxxxxxx
2         Y      Yyyyyyyy
3         Z      zzzzzzz
Overview
 A database-management system (DBMS) is a collection of interrelated
data and a set of programs to access those data.
 The collection of data, usually referred to as the database, contains
information relevant to an enterprise.
 The primary goal of a DBMS is to provide a way to store and retrieve
database information that is both convenient and efficient.
 Database systems are designed to manage large bodies of information.
 Management of data involves both defining structures for storage of
information and providing mechanisms for the manipulation of information.
 The database system must ensure the safety of the information stored,
despite system crashes or attempts at unauthorized access.
 If data are to be shared among several users, the system must avoid
possible anomalous results.
Database is a collection of related data organised in a way that data can be
easily accessed, managed and updated. Database can be software based or
hardware based, with one sole purpose, storing data.
Definition: A database-management system (DBMS) is a collection of interrelated data and a set of programs to access those data. The collection of data is usually referred to as the database.
• A DBMS is software that allows creation, definition and manipulation of a database, allowing users to store, process and analyse data easily.
 DBMS provides us with an interface or a tool, to perform various
operations like creating database, storing data in it, updating data, creating
tables in the database and a lot more.
 Here are some examples of popular DBMS used these days:
• MySQL
• Oracle
• SQL Server
• IBM DB2
• PostgreSQL
• Amazon SimpleDB (cloud based) etc.
Database-System Applications
Databases are widely used. Here are some representative applications:
Enterprise Information
 Sales: For customer, product, and purchase information.
 Accounting: For payments, receipts, account balances, assets and other
accounting information.
 Human resources: For information about employees, salaries, payroll
taxes, and benefits, and for generation of paychecks.
 Manufacturing: For management of the supply chain and for tracking
production of items in factories, inventories of items in warehouses and
stores, and orders for items.
 Online retailers: For sales data noted above plus online order tracking,
generation of recommendation lists, and maintenance of online product
evaluations.
Banking and Finance:
Ø Banking: For customer information, accounts, loans, and banking
transactions.
 Credit card transactions: For purchases on credit cards and generation of
monthly statements.
 Finance: For storing information about holdings, sales, and purchases of
financial instruments such as stocks and bonds.
 Universities: For student information, course registrations, and grades (in
addition to standard enterprise information such as human resources and
accounting).
 Airlines: For reservations and schedule information. Airlines were among
the first to use databases in a geographically distributed manner.
 Telecommunication: For keeping records of calls made, generating
monthly bills, maintaining balances on prepaid calling cards, and storing
information about the communication networks.
Real time Applications:
 Online bookstore
 Online food delivery sites: Swiggy, Zomato
 Online E-Commerce – Amazon, Flipkart.
Characteristics of DBMS
• Data stored into Tables: Data is stored in tables created inside the database. A DBMS also allows relationships between tables, which makes the data more meaningful and connected.
• Reduced Redundancy: In the modern world hard drives are very cheap, but earlier when hard
drives were too expensive, unnecessary repetition of data in database was a big problem. But
DBMS follows Normalisation which divides the data in such a way that repetition is minimum.
• Data Consistency: On Live data, i.e. data that is being continuously updated and added,
maintaining the consistency of data can become a challenge. But DBMS handles it all by itself.
• Support Multiple user and Concurrent Access: DBMS allows multiple users to work on
it(update, insert, delete data) at the same time and still manages to maintain the data
consistency.
• Query Language: DBMS provides users with a simple Query language, using which data can be
easily fetched, inserted, deleted and updated in a database.
• Security: The DBMS also takes care of the security of data, protecting the
data from un-authorised access. In a typical DBMS, we can create user
accounts with different access permissions, using which we can easily
secure our data by restricting user access.
• DBMS supports transactions, which allows us to better handle and
manage data integrity in real world applications where multi-threading is
extensively used.
Purpose of DBMS
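Consider a university that keeps information about students, instructors and courses in operating-system files. To allow users to manipulate the information, the system has a number of application programs that manipulate the files, including programs to: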
• Add new students, instructors, and courses
• Register students for courses and generate class rosters
• Assign grades to students, compute grade point averages (GPA), and
generate Transcripts.
Traditional storage/Conventional storage
• A file system is a technique for arranging files on a storage medium such as a hard disk, pen drive or DVD. It helps you organize data and allows easy retrieval of files when they are required. It mostly consists of different types of files, such as mp3, mp4, txt and doc, that are grouped into directories.
• A file system lets you control how data is read from and written to the storage medium. It is installed on the computer together with the operating system, such as Windows or Linux.
Drawbacks of Using File Processing Systems
1.Data redundancy and inconsistency
 Data Redundancy and Data Inconsistency are the important terms used in
the Database. A good Database Design is the one in which there is minimum
Data Redundancy and Data Inconsistency.
 Data Redundancy
Data redundancy means duplicate data: the same pieces of data exist in multiple locations in the database.
 Problems with Data Redundancy :
- Wasted Storage Space.
- More Difficult Database Update.
- It will lead to Data Inconsistency.
- Retrieval of data is slow and inefficient.
Data Inconsistency :
Data inconsistency arises when the same data exists in different formats in multiple tables. It means that different files contain different information about a particular object or person, which can make the information unreliable and meaningless. Data redundancy leads to data inconsistency.
2.Difficulty in accessing data
Ø Suppose that one of the university clerks needs to find out the names of all students who
live within a particular postal-code area.
Ø The clerk asks the data-processing department to generate such a list.
Ø Because the designers of the original system did not anticipate this request, there is no
application program on hand to meet it. There is, however, an application program to
generate the list of all students. The university clerk has now two choices: either obtain
the list of all students and extract the needed information manually or ask a programmer
to write the necessary application program.
 Several days later, the same clerk needs to trim that list to include only those students who have taken at least 60 credit hours. This becomes an iterative process: every new request needs either manual work or a new application program.
Note: The point here is that conventional file-processing environments do not allow needed
data to be retrieved in a convenient and efficient manner.
3.Data isolation.
Because data are scattered in various files, and files may be in
different formats, writing new application programs to retrieve the appropriate
data is difficult.
4.Integrity problems.
The data values stored in the database must satisfy certain types of
consistency constraints. Suppose the university maintains an account for each
department, and records the balance amount in each account.
 Suppose also that the university requires that the account balance of
a department may never fall below zero.
Note: The problem is compounded when constraints involve several data items
from different files.
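As a contrast with the file-processing case, the following is a minimal sketch of how a DBMS can enforce the "balance may never fall below zero" rule itself. It uses Python's built-in sqlite3 module; the table name dept_account, its columns and the helper withdraw are hypothetical names chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one account per department; the CHECK constraint
# states the rule that a balance may never fall below zero.
conn.execute(
    "CREATE TABLE dept_account ("
    " dept_name TEXT PRIMARY KEY,"
    " balance NUMERIC NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO dept_account VALUES ('Physics', 500)")
conn.commit()


def withdraw(dept, amount):
    """Try to reduce a balance; return True if the DBMS accepted the change."""
    try:
        with conn:  # commits on success, rolls back if an error is raised
            conn.execute(
                "UPDATE dept_account SET balance = balance - ? WHERE dept_name = ?",
                (amount, dept),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok_small = withdraw("Physics", 200)   # leaves 300, accepted
ok_large = withdraw("Physics", 1000)  # would leave -700, rejected by the CHECK
balance = conn.execute(
    "SELECT balance FROM dept_account WHERE dept_name = 'Physics'"
).fetchone()[0]
print(ok_small, ok_large, balance)
```

Because the rule lives in the table definition, every program that updates the table is subject to it, instead of each application program having to repeat the check.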
5.Atomicity problems
Ø A computer system, like any other device, is subject to failure. In many
applications, it is crucial that, if a failure occurs, the data be restored to the
consistent state that existed prior to the failure.
Ø Consider a program to transfer $500 from the account balance of department A
to the account balance of department B.
Ø If a system failure occurs during the execution of the program, it is possible that
the $500 was removed from the balance of department A but was not credited to
the balance of department B, resulting in an inconsistent database state.
Ø Clearly, it is essential to database consistency that either both the credit and
debit occur, or that neither occur.
Ø That is, the funds transfer must be atomic—it must happen in its entirety or not
at all.
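The funds-transfer example can be sketched with a transaction, here using Python's built-in sqlite3 module. The table dept_account, the starting balances and the fail_midway flag (which simulates a crash between the debit and the credit) are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept_account (dept_name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO dept_account VALUES (?, ?)", [("A", 1000), ("B", 200)])
conn.commit()


def transfer(src, dst, amount, fail_midway=False):
    """Move amount from src to dst as a single transaction."""
    try:
        with conn:  # commit if the block finishes, roll back if it raises
            conn.execute(
                "UPDATE dept_account SET balance = balance - ? WHERE dept_name = ?",
                (amount, src),
            )
            if fail_midway:
                raise RuntimeError("simulated failure between debit and credit")
            conn.execute(
                "UPDATE dept_account SET balance = balance + ? WHERE dept_name = ?",
                (amount, dst),
            )
    except RuntimeError:
        pass  # the debit has already been rolled back at this point


def balances():
    return dict(conn.execute("SELECT dept_name, balance FROM dept_account"))


transfer("A", "B", 500, fail_midway=True)
after_failure = balances()  # neither the debit nor the credit took effect
transfer("A", "B", 500)
after_success = balances()  # both the debit and the credit took effect
print(after_failure, after_success)
```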
6.Concurrent-access anomalies.
Ø For the sake of overall performance of the system and faster response, many
systems allow multiple users to update the data simultaneously.
Ø Today, the largest Internet retailers may have millions of accesses per day to their
data by shoppers.
Ø In such environment, interaction of concurrent updates is possible and may result in
inconsistent data.
7.Security problems.
Ø Not every user of the database system should be able to access all the data.
Ø For example, in a university, payroll personnel need to see only that part of the
database that has financial information.
Ø They do not need access to information about academic records.
Ø But, since application programs are added to the file-processing system in an ad
hoc manner, enforcing such security constraints is difficult.
Advantages of DBMS
• Segregation of application programs.
• Minimal data duplication or data redundancy.
• Easy retrieval of data using the Query Language.
• Reduced development time and maintenance need.
• With cloud datacenters, we now have Database Management Systems capable of storing very large volumes of data.
• Seamless integration into application programming languages, which makes it much easier to add a database to almost any application or website.
Disadvantages of DBMS
• Its complexity.
• Apart from open-source systems such as MySQL and PostgreSQL, licensed DBMSs are generally costly.
• They are large in size.
Components of DBMS
The database management system can be divided into five major components, they are:
• Hardware - computer, hard disks, I/O channels for data, and any other physical component
involved before any data is successfully stored into the memory.
• Software - provides us with an easy-to-use interface to store, access and update data.
• Data - the resource around which the DBMS is built; metadata (data about data) is stored as well.
• Procedures - general instructions to use a database management system.
• Database Access Language - simple language designed to write commands to
access, insert, update and delete data stored in any database.
Users
• Database Administrators: Database Administrator or DBA is the one who manages the
complete database management system.
• Application Programmer or Software Developer: This user group is involved in
developing and designing the parts of DBMS.
• End User: These days all the modern applications, web or mobile, store user data.
DBMS ARCHITECTURE
• A Database Management system is not always directly available for users and applications
to access and store data in it.
• A Database Management system can be Centralised(all the data stored at one
location), Decentralised(multiple copies of database at different locations) or hierarchical,
depending upon its architecture.
3 Tier Architecture
2 Tier Architecture
DBMS Schemas-Level of Abstraction
A major purpose of a database system is to provide users with an abstract view of the data
Three Levels of Abstraction
• Internal or Physical level: Physical storage structures and access paths (how the data are actually stored).
• Conceptual or Logical level: Structure and constraints for the entire database (what data are stored in the database, and what relationships exist among those data). The ability to change the physical level without affecting this level is referred to as physical data independence.
• External or View level: The highest level of abstraction describes only part of the entire database. The view level of abstraction exists to simplify users' interaction with the system.
Three Levels
Database Schema
Schema
1. A schema is a structural view of a database.
2. A schema, once declared, should not be changed often.
3. A schema includes table names, field names, their types and constraints.
4. For a database, the schema is specified by DDL.
Database
1. A database is a collection of interrelated data.
2. Data in a database keeps changing all the time, so the database is modified often.
3. A database includes the schema, the records and the constraints for the data.
4. In a database, operations such as updates and additions are done using DML.
Database Schema
Ø A database schema is the skeleton structure that represents the logical view of the entire
database. It defines how the data is organized and how the relations among them are
associated. It formulates all the constraints that are to be applied on the data.
Ø A database schema defines its entities and the relationship among them. It contains a
descriptive detail of the database, which can be depicted by means of schema diagrams.
It’s the database designers who design the schema to help programmers understand the
database and make it useful.
Database Schema
A database schema can be divided broadly into two categories −
 Physical Database Schema − This schema pertains to the actual storage of data and its
form of storage like files, indices, etc. It defines how the data will be stored in a
secondary storage.
 Logical Database Schema − This schema defines all the logical constraints that need to
be applied on the data stored. It defines tables, views, and integrity constraints.
Database Instance
 It is important that we distinguish these two terms individually. Database schema is the
skeleton of database. It is designed when the database doesn't exist at all. Once the
database is operational, it is very difficult to make any changes to it. A database
schema does not contain any data or information.
 A database instance is a state of operational database with data at any given time. It
contains a snapshot of the database. Database instances tend to change with time. A
DBMS ensures that its every instance (state) is in a valid state, by diligently following all
the validations, constraints, and conditions that the database designers have imposed.
Data Independence
 A database system normally contains a lot of data in addition to users’ data.
 For example, it stores data about data, known as metadata, to locate and retrieve data
easily.
 It is rather difficult to modify or update a set of metadata once it is stored in the database.
 But as a DBMS expands, it needs to change over time to satisfy the requirements of the
users.
 If the entire data is dependent, it would become a tedious and highly complex job.
Ø Metadata itself follows a layered architecture, so that when we change data at one
layer, it does not affect the data at another level. This data is independent but
mapped to each other.
Logical Data Independence
Ø Logical data is data about the database, that is, it stores information about how data is managed inside. For example, a table (relation) stored in the database and all the constraints applied on that relation.
Ø Logical data independence is a mechanism that decouples the logical schema from the actual data stored on the disk. If we make changes to the table format, the data residing on the disk should not change.
Physical Data Independence
Ø All the schemas are logical, and the actual data is stored in bit format on the disk. Physical data
independence is the power to change the physical data without impacting the schema or logical
data.
Ø For example, in case we want to change or upgrade the storage system itself − suppose we want
to replace hard-disks with SSD − it should not have any impact on the logical data or schemas.
FORMAL RELATIONAL QUERY LANGUAGES
Relational Query Languages
Ø Relational query languages use relational algebra to break the user requests and instruct the
DBMS to execute the requests.
Ø It is the language by which user communicates with the database. These relational query languages
can be procedural or non-procedural.
Procedural Query Language
Ø A procedural query language will have set of queries instructing the DBMS to perform various
transactions in the sequence to meet the user request.
Ø For example, get_CGPA procedure will have various queries to get the marks of student in each
subject, calculate the total marks, and then decide the CGPA based on his total marks. This
procedural query language tells the database what is required from the database and how to get
them from the database. Relational algebra is a procedural query language.
Non-Procedural Query Language
Ø Non-procedural queries will have single query on one or more tables to get result
from the database.
Ø For example, get the name and address of the student with particular ID will
have single query on STUDENT table. Relational Calculus is a non procedural
language which informs what to do with the tables, but doesn’t inform how to
accomplish this.
Ø These query languages basically will have queries on tables in the database.
Ø In the relational database, a table is known as relation. Records / rows of the
table are referred as tuples. Columns of the table are also known as attributes.
All these names are used interchangeably in relational database.
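The difference between the two styles can be sketched as follows, using Python's built-in sqlite3 module. The lower-case table student and its three rows are assumptions made for this illustration; the loop plays the role of the procedural request and the WHERE query that of the non-procedural one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER, name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "RAM", "DELHI"), (2, "RAMESH", "GURGAON"), (3, "SUJIT", "ROHTAK")],
)

# Procedural style: spell out how to find the tuple, step by step.
procedural = None
for roll_no, name, address in conn.execute(
    "SELECT roll_no, name, address FROM student"
):
    if roll_no == 2:
        procedural = (name, address)
        break

# Non-procedural style: state what is wanted; the DBMS decides how to find it.
declarative = conn.execute(
    "SELECT name, address FROM student WHERE roll_no = ?", (2,)
).fetchone()

print(procedural, declarative)
```

Both approaches return the same tuple; only the second leaves the choice of access path to the DBMS.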
Structured Query Language (SQL)
 Structured Query Language is a standard Database language which is used to create, maintain and
retrieve the relational database. Following are some interesting facts about SQL.
Ø SQL keywords are case insensitive. But it is a recommended practice to use keywords (like SELECT, UPDATE, CREATE, etc.) in capital letters and user-defined names (like table names, column names, etc.) in small letters.
Ø We can write comments in SQL using "--" (double hyphen) at the beginning of any line.
Ø SQL is the programming language for relational databases (explained below) like MySQL, Oracle, Sybase, SQL Server, PostgreSQL, etc. Non-relational databases (also called NoSQL databases) like MongoDB, DynamoDB, etc. do not use SQL as their primary query language.
Ø Although there is an ISO standard for SQL, most implementations vary slightly in syntax. So we may encounter queries that work in SQL Server but do not work in MySQL.
SQL | Datatypes
1. Binary Datatypes : There are four subtypes of this datatype.
2. Exact Numeric Datatype
3. Character String Datatype
4. Unicode Character String Datatype
5. Date and Time Datatype
SQL COMMANDS| DDL, DML, TCL and DCL
DDL (Data Definition Language) :
Data Definition Language is used to define the database structure or schema. DDL is also used to specify additional properties of the data. The storage structure and access methods used by the database system are specified by a set of statements in a special type of DDL called a data storage and definition language.
Some Commands:
CREATE : to create objects in database
ALTER : alters the structure of database
DROP : delete objects from database
RENAME : rename an object
create table department (dept_name char(20), building char(15), budget numeric(12,2));
SQL | Create
There are two CREATE statements available in SQL:
1. CREATE DATABASE
2. CREATE TABLE
A Database is defined as a structured set of data. So, in SQL the very first step to store the data
in a well structured manner is to create a database. The CREATE DATABASE statement is used
to create a new database in SQL.
Syntax:
CREATE DATABASE database_name;
database_name: name of the database.
CREATE TABLE
• The CREATE TABLE statement is used to create a table in SQL. A table consists of rows and columns, so while creating a table we have to provide SQL with the names of the columns, the type of data to be stored in each column, the size of the data, etc. Let us now look at how to use the CREATE TABLE statement to create tables in SQL.
Syntax:
CREATE TABLE table_name ( column1 data_type(size), column2 data_type(size), column3 data_type(size), .... );
table_name: name of the table.
column1: name of the first column.
data_type: type of data we want to store in the particular column, for example int for integer data.
size: size of the data we can store in a particular column. For example, if for a column we specify the data_type as int and the size as 10, then this column can store an integer number of maximum 10 digits (the exact meaning of the size on numeric types varies between DBMSs).
Example Query:
This query will create a table named Students with three columns, ROLL_NO, NAME and SUBJECT.
CREATE TABLE Students ( ROLL_NO int(3), NAME varchar(20), SUBJECT varchar(20) );
The ROLL_NO field is of type int and can store an integer number of size 3. The next two columns, NAME and SUBJECT, are of type varchar and can store characters; the size 20 specifies that these two fields can hold a maximum of 20 characters.
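To try the CREATE TABLE example, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption made for convenience: it has no CREATE DATABASE statement, so opening a connection stands in for that step, and PRAGMA table_info is a SQLite-specific way to list the columns.

```python
import sqlite3

# Connecting plays the role of CREATE DATABASE here: SQLite keeps one
# database per file (or per in-memory connection, as below).
conn = sqlite3.connect(":memory:")

conn.execute(
    "CREATE TABLE Students (ROLL_NO int(3), NAME varchar(20), SUBJECT varchar(20))"
)

# PRAGMA table_info is SQLite-specific; each row it returns is
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Students)")]
for name, declared_type in columns:
    print(name, declared_type)
```

Note that SQLite records the declared sizes but does not enforce them, so length limits behave differently here than in MySQL or Oracle.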
SQL | ALTER (ADD, DROP, MODIFY)
ALTER TABLE is used to add, delete/drop or modify columns in the existing table. It is also used to add and drop
various constraints on the existing table.
ALTER TABLE – ADD
ADD is used to add columns to an existing table. Sometimes we need to store additional information; in that case we do not have to create the whole table again, because ADD comes to our rescue.
Syntax:
ALTER TABLE table_name ADD (Columnname_1 datatype, Columnname_2 datatype, … Columnname_n datatype);
ALTER TABLE – DROP
DROP COLUMN is used to delete unwanted columns from a table.
Syntax:
ALTER TABLE table_name DROP COLUMN column_name;
ALTER TABLE-MODIFY
It is used to modify the existing columns in a table. Multiple columns can also be modified at once.
*Syntax may vary slightly in different databases.
Syntax(Oracle,MySQL,MariaDB):
ALTER TABLE table_name MODIFY column_name column_type;
Syntax(SQL Server):
ALTER TABLE table_name ALTER COLUMN column_name column_type;
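As an illustration of how ALTER TABLE changes the structure while keeping existing rows, here is a minimal sketch using Python's built-in sqlite3 module. SQLite's dialect is an assumption of this example: it adds one column per statement without parentheses and has no MODIFY clause, so only ADD is shown. The helper column_names is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (ROLL_NO int, NAME varchar(20))")
conn.execute("INSERT INTO Students VALUES (1, 'RAM')")


def column_names():
    """List the table's column names via the SQLite-specific PRAGMA."""
    return [row[1] for row in conn.execute("PRAGMA table_info(Students)")]


before = column_names()
# SQLite syntax: one column per statement, no parentheses around it.
conn.execute("ALTER TABLE Students ADD COLUMN AGE int")
after = column_names()

# The row inserted before the change is kept; the new column is NULL for it.
existing_row = conn.execute("SELECT ROLL_NO, NAME, AGE FROM Students").fetchone()
print(before, after, existing_row)
```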
SQL | DROP, TRUNCATE
DROP is used to delete a whole database or just a table.The DROP statement destroys the objects like an existing
database, table, index, or view.
A DROP statement in SQL removes a component from a relational database management system (RDBMS).
Syntax:
DROP object object_name
Examples:
DROP TABLE table_name;
table_name: Name of the table to be deleted.
DROP DATABASE database_name;
database_name: Name of the database to be deleted.
TRUNCATE
TRUNCATE statement is a Data Definition Language (DDL) operation that is used to mark the extents of a table for
deallocation (empty for reuse). The result of this operation quickly removes all data from a table, typically bypassing a
number of integrity enforcing mechanisms.
The TRUNCATE TABLE mytable statement is logically (though not physically) equivalent to the DELETE FROM
mytable statement (without a WHERE clause).
Syntax:
TRUNCATE TABLE table_name;
table_name: Name of the table to be truncated.
DATABASE name - student_data
SQL | RENAME
The SQL RENAME TABLE syntax is used to change the name of a table. Sometimes we choose a non-meaningful name for a table, so it needs to be changed.
SYNTAX
ALTER TABLE table_name RENAME TO new_table_name;
OR
RENAME old_table_name TO new_table_name;
Examples
ALTER TABLE STUDENTS RENAME TO ARTISTS;
OR
RENAME STUDENTS TO ARTISTS;
DML (Data Manipulation Language) :
DML statements are used for managing data within schema objects.
DMLs are of two types –
• Procedural DMLs : require a user to specify what data are needed and how to get those data.
• Declarative DMLs (also referred to as Non-procedural DMLs) : require a user to specify what data are needed without specifying how to get those data.
Declarative DMLs are usually easier to learn and use than procedural DMLs. However, since a user does not have to
specify how to get the data, the database system has to figure out an efficient means of accessing data.
Some Commands:
SELECT: retrieve data from the database
INSERT: insert data into a table
UPDATE: update existing data within a table
DELETE: deletes records from a table; the space for the records remains
Case 1: If we want to retrieve attributes ROLL_NO and NAME of all students, the query will be:
SELECT ROLL_NO, NAME FROM STUDENT;
ROLL_NO   NAME
1         RAM
2         RAMESH
3         SUJIT
4         SURESH
Case 2: If we want to retrieve ROLL_NO and NAME of the students whose ROLL_NO is greater than 2, the query
will be:
SELECT ROLL_NO, NAME FROM STUDENT WHERE ROLL_NO>2;
ROLL_NO NAME
3 SUJIT
4 SURESH
CASE 3: If we want to retrieve all attributes of students, we can write * in place of writing all attributes. For example, for the students whose ROLL_NO is greater than 2:
SELECT * FROM STUDENT WHERE ROLL_NO>2;
ROLL_NO   NAME     ADDRESS   PHONE        AGE
3         SUJIT    ROHTAK    9156253131   20
4         SURESH   DELHI     9156768971   18
CASE 4: If we want to represent the relation in ascending order by AGE, we can use ORDER BY clause as:
SELECT * FROM STUDENT ORDER BY AGE;
ROLL_NO   NAME     ADDRESS   PHONE        AGE
1         RAM      DELHI     9455123451   18
2         RAMESH   GURGAON   9652431543   18
4         SURESH   DELHI     9156768971   18
3         SUJIT    ROHTAK    9156253131   20
Note: ORDER BY AGE is equivalent to ORDER BY AGE ASC. If we want to retrieve the results in descending order of AGE, we can use ORDER BY AGE DESC.
CASE 5: If we want to retrieve distinct values of an attribute or group of attributes, DISTINCT is used as in:
SELECT DISTINCT ADDRESS FROM STUDENT;
ADDRESS
DELHI
GURGAON
ROHTAK
If DISTINCT is not used, DELHI will be repeated twice in the result set. Before understanding GROUP BY and HAVING, we need to understand aggregation functions in SQL.
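The SELECT cases above can be reproduced with a short sketch using Python's built-in sqlite3 module. The four rows are taken from the result tables in these notes; the helper run and the column types are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT "
    "(ROLL_NO INTEGER, NAME TEXT, ADDRESS TEXT, PHONE TEXT, AGE INTEGER)"
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?, ?, ?)",
    [
        (1, "RAM", "DELHI", "9455123451", 18),
        (2, "RAMESH", "GURGAON", "9652431543", 18),
        (3, "SUJIT", "ROHTAK", "9156253131", 20),
        (4, "SURESH", "DELHI", "9156768971", 18),
    ],
)


def run(sql):
    """Execute a query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()


case1 = run("SELECT ROLL_NO, NAME FROM STUDENT")                     # projection
case2 = run("SELECT ROLL_NO, NAME FROM STUDENT WHERE ROLL_NO > 2")   # selection
case3 = run("SELECT * FROM STUDENT WHERE ROLL_NO > 2")               # all attributes
case4 = run("SELECT NAME, AGE FROM STUDENT ORDER BY AGE DESC")       # sorting
case5 = run("SELECT DISTINCT ADDRESS FROM STUDENT ORDER BY ADDRESS") # duplicates removed

for label, rows in [("case1", case1), ("case2", case2), ("case3", case3),
                    ("case4", case4), ("case5", case5)]:
    print(label, rows)
```

Without an ORDER BY clause the order of the returned rows is not guaranteed, which is why only the sorted queries should be relied on for a particular order.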
AGGREGATION FUNCTIONS: Aggregation functions are used to perform mathematical operations on data values of a relation. Some of the common aggregation functions used in SQL are:
• COUNT: Count function is used to count the number of rows in a relation. e.g.;
SELECT COUNT (PHONE) FROM STUDENT;
COUNT(PHONE)
4
• SUM: SUM function is used to add the values of an attribute in a relation. e.g.;
SELECT SUM (AGE) FROM STUDENT;
SUM(AGE)
74
In the same way, MIN, MAX and AVG can be used. As we have seen above, each aggregation function returns only one row when applied to the whole relation.
AVERAGE: It gives the average value of an attribute over the tuples. It is also defined as the sum divided by the count of values.
Syntax: AVG(attributename)
OR
Syntax: SUM(attributename)/COUNT(attributename)
The above syntax also retrieves the average value over the tuples.
MAXIMUM: It extracts the maximum value among the set of tuples.
Syntax: MAX(attributename)
MINIMUM: It extracts the minimum value among the set of all the tuples.
Syntax: MIN(attributename)
GROUP BY: Group by is used to group the tuples of a relation based on an attribute or group of attributes. It is usually combined with an aggregation function, which is computed for each group. e.g.;
SELECT ADDRESS, SUM(AGE) FROM STUDENT GROUP BY (ADDRESS);
ADDRESS   SUM(AGE)
DELHI     36
GURGAON   18
ROHTAK    20
In this query, SUM(AGE) is computed not for the entire table but for each address, i.e., the sum of AGE for address DELHI (18+18=36), and similarly for the other addresses.
NOTE: An attribute which is not a part of the GROUP BY clause can't be used for selection. Any attribute which is part of the GROUP BY clause can be used for selection, but it is not mandatory. However, attributes which are not a part of the GROUP BY clause can be used inside an aggregate function.
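The aggregation functions and GROUP BY can be checked with the same STUDENT data, again as a sketch using Python's built-in sqlite3 module. The helper scalar is a hypothetical name. One detail is specific to this setup: in SQLite, dividing two integers truncates, so the SUM/COUNT form is multiplied by 1.0 to match AVG.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT "
    "(ROLL_NO INTEGER, NAME TEXT, ADDRESS TEXT, PHONE TEXT, AGE INTEGER)"
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?, ?, ?)",
    [
        (1, "RAM", "DELHI", "9455123451", 18),
        (2, "RAMESH", "GURGAON", "9652431543", 18),
        (3, "SUJIT", "ROHTAK", "9156253131", 20),
        (4, "SURESH", "DELHI", "9156768971", 18),
    ],
)


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


count_phone = scalar("SELECT COUNT(PHONE) FROM STUDENT")
sum_age = scalar("SELECT SUM(AGE) FROM STUDENT")
avg_age = scalar("SELECT AVG(AGE) FROM STUDENT")
# Integer / integer truncates in SQLite, hence the multiplication by 1.0.
avg_manual = scalar("SELECT SUM(AGE) * 1.0 / COUNT(AGE) FROM STUDENT")
min_age = scalar("SELECT MIN(AGE) FROM STUDENT")
max_age = scalar("SELECT MAX(AGE) FROM STUDENT")

# One result row per distinct ADDRESS, with SUM(AGE) computed per group.
by_address = conn.execute(
    "SELECT ADDRESS, SUM(AGE) FROM STUDENT GROUP BY ADDRESS ORDER BY ADDRESS"
).fetchall()

print(count_phone, sum_age, avg_age, avg_manual, min_age, max_age)
print(by_address)
```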
SQL | INSERT INTO Statement
The INSERT INTO statement of SQL is used to insert a new row in a table. There are two ways of using INSERT INTO statement for
inserting rows:
1.Only values: First method is to specify only the value of data to be inserted without the column names.
INSERT INTO table_name VALUES (value1, value2, value3,…);
table_name: name of the table.
value1, value2,.. : value of first column, second column,… for the new record
2.Column names and values both: In the second method we will specify both the columns which we want to fill and their
corresponding values as shown below:
INSERT INTO table_name (column1, column2, column3,..) VALUES ( value1, value2, value3,..);
table_name: name of the table.
column1: name of first column, second column …
value1, value2, value3 : value of first column, second column,… for the new record
INSERT INTO Student VALUES ('5','HARSH','WEST BENGAL','XXXXXXXXXX','19');
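Both forms of INSERT INTO can be compared in a minimal sketch using Python's built-in sqlite3 module. The column types and the second row (roll number 6) are assumptions made for this illustration; numeric values are written without quotes here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student "
    "(ROLL_NO INTEGER, NAME TEXT, ADDRESS TEXT, PHONE TEXT, AGE INTEGER)"
)

# Form 1: values only, listed in the order of the table's columns.
conn.execute(
    "INSERT INTO Student VALUES (5, 'HARSH', 'WEST BENGAL', 'XXXXXXXXXX', 19)"
)

# Form 2: column names and values; columns that are not listed become NULL.
conn.execute("INSERT INTO Student (ROLL_NO, NAME) VALUES (6, 'PRATIK')")

rows = conn.execute("SELECT * FROM Student ORDER BY ROLL_NO").fetchall()
for row in rows:
    print(row)
```

The second form is less fragile when columns are later added to the table, since it does not depend on the column order.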
SQL | UPDATE
The UPDATE statement in SQL is used to update the data of an existing table in database. We can update single
columns as well as multiple columns using UPDATE statement as per our requirement.
Basic Syntax
UPDATE table_name SET column1 = value1, column2 = value2,... WHERE condition;
table_name: name of the table
column1: name of first, second, third column....
value1: new value for first, second, third column....
condition: condition to select the rows for which the values of columns need to be updated.
Updating single column: Update the column NAME and set the value to ‘PRATIK’ in all the rows where Age is 20.
UPDATE Student SET NAME = 'PRATIK' WHERE Age = 20;
Updating all rows: If the WHERE clause is omitted, the update is applied to every row of the table.
UPDATE Student SET NAME = 'NAKSH';
Updating multiple columns: Update the columns NAME to ‘PRATIK’ and ADDRESS to ‘SIKKIM’ where ROLL_NO
is 1.
UPDATE Student SET NAME = 'PRATIK', ADDRESS = 'SIKKIM' WHERE ROLL_NO = 1;
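The effect of the WHERE clause on UPDATE can be seen in a short sketch using Python's built-in sqlite3 module. The table contents are assumed for this illustration (they reuse the STUDENT rows shown earlier), and cursor.rowcount reports how many rows each statement changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (ROLL_NO INTEGER, NAME TEXT, ADDRESS TEXT, AGE INTEGER)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [
        (1, "RAM", "DELHI", 18),
        (2, "RAMESH", "GURGAON", 18),
        (3, "SUJIT", "ROHTAK", 20),
        (4, "SURESH", "DELHI", 18),
    ],
)

# Single column, restricted by WHERE: only the rows with AGE = 20 change.
single = conn.execute("UPDATE Student SET NAME = 'PRATIK' WHERE AGE = 20").rowcount

# Multiple columns in one statement.
multi = conn.execute(
    "UPDATE Student SET NAME = 'PRATIK', ADDRESS = 'SIKKIM' WHERE ROLL_NO = 1"
).rowcount
row1 = conn.execute(
    "SELECT NAME, ADDRESS FROM Student WHERE ROLL_NO = 1"
).fetchone()

# No WHERE clause: every row in the table is updated.
everyone = conn.execute("UPDATE Student SET NAME = 'NAKSH'").rowcount
names = {row[0] for row in conn.execute("SELECT NAME FROM Student")}

print(single, multi, row1, everyone, names)
```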
SQL | DELETE
The DELETE Statement in SQL is used to delete existing records from a table. We can delete a single record or
multiple records depending on the condition we specify in the WHERE clause.
Basic Syntax:
DELETE FROM table_name WHERE some_condition;
table_name: name of the table
some_condition: condition to choose particular record.
Deleting single record: Delete the rows where NAME = ‘Ram’. This will delete only the first row.
DELETE FROM Student WHERE NAME = 'Ram';
Deleting multiple records: Delete the rows from the table Student where Age is 20. This will delete 2 rows(third row
and fifth row).
DELETE FROM Student WHERE Age = 20;
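Delete all of the records: There are two queries to do this as shown below,
query1: "DELETE FROM Student";
query2: "DELETE * FROM Student";
Note: query1 is standard SQL. The query2 form is not part of standard SQL and is rejected by most DBMSs (it is accepted by Microsoft Access).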
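A minimal sketch of the three DELETE variants, using Python's built-in sqlite3 module. The table contents are assumed for this illustration, so the row counts differ from the ones quoted above. String comparison with = is case sensitive in SQLite by default, which is why the name is written as 'RAM' to match the stored value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (ROLL_NO INTEGER, NAME TEXT, AGE INTEGER)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "RAM", 18), (2, "RAMESH", 18), (3, "SUJIT", 20), (4, "SURESH", 18)],
)


def count():
    """Number of rows currently in the Student table."""
    return conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]


start = count()

conn.execute("DELETE FROM Student WHERE NAME = 'RAM'")  # one matching row
after_name = count()

conn.execute("DELETE FROM Student WHERE AGE = 20")      # every row with AGE = 20
after_age = count()

conn.execute("DELETE FROM Student")                     # no WHERE: all rows
after_all = count()  # the table still exists, it is just empty

print(start, after_name, after_age, after_all)
```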
TCL (Transaction Control Language):
Transaction Control Language commands are used to manage transactions in the database. They manage the changes made by DML statements and allow statements to be grouped together into logical transactions.
COMMIT: permanently saves the changes made by the current transaction to the database.
ROLLBACK: restores the database to the last committed state. It is also used with the SAVEPOINT command to return to a savepoint within a transaction.
SAVEPOINT: marks a point within a transaction so that you can roll back to that point whenever necessary, without undoing the whole transaction.
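The interaction of COMMIT, SAVEPOINT and ROLLBACK can be sketched with SQLite through Python's sqlite3 module. The Account table and the savepoint name are hypothetical. The connection is opened with isolation_level=None so that the script issues BEGIN, COMMIT and ROLLBACK itself.

```python
import sqlite3

# isolation_level=None: Python does not open transactions implicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Account (id INTEGER PRIMARY KEY, balance INTEGER)")

def balance(connection):
    """Return the current balance of account 1."""
    return connection.execute("SELECT balance FROM Account WHERE id = 1").fetchone()[0]

conn.execute("BEGIN")
conn.execute("INSERT INTO Account VALUES (1, 100)")
conn.execute("COMMIT")                       # balance 100 is now permanent

conn.execute("BEGIN")
conn.execute("UPDATE Account SET balance = 150 WHERE id = 1")
conn.execute("SAVEPOINT before_second_update")
conn.execute("UPDATE Account SET balance = 999 WHERE id = 1")
conn.execute("ROLLBACK TO SAVEPOINT before_second_update")
balance_after_savepoint_rollback = balance(conn)   # second update undone

conn.execute("ROLLBACK")                     # undo everything since BEGIN
balance_after_full_rollback = balance(conn)  # back to the last committed state
print(balance_after_savepoint_rollback, balance_after_full_rollback)
```

Rolling back to the savepoint keeps the transaction open, which is why the final ROLLBACK can still undo the first update.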
DCL (Data Control Language):
Data Control Language is the part of Structured Query Language (SQL) used to control access to data stored in a database (authorization).
GRANT: allows specified users to perform specified tasks.
REVOKE: cancels previously granted or denied permissions.
Structure of Relational Database
Ø A relational database consists of a collection of tables, each of which is assigned a unique name.
Ø A row in a table represents a relationship among a set of values.
Ø A tuple is simply a sequence (or list) of values. A relationship between n values is represented mathematically by an n-tuple of values.
Ø The term relation instance refers to a specific instance of a relation, i.e., one containing a specific set of rows.
Ø A domain is atomic if elements of the domain are considered to be indivisible units.
KEYS-DBMS
Ø A key in a DBMS is an attribute or set of attributes that helps you identify a row (tuple) in a relation (table).
Ø Keys allow you to find the relation between two tables.
Ø Keys help you uniquely identify a row in a table by a combination of one or more columns in that table.
Example:
Employee ID   FirstName   LastName
11            Andrew      Johnson
22            Tom         Wood
33            Alex        Hale
In the above-given example, employee ID is a primary key because it uniquely identifies an employee record. In this
table, no other employee can have the same employee ID.
Why do we need a key?
Here are some reasons for using keys in a DBMS.
Ø Keys help you identify any row of data in a table. In a real-world application, a table could contain thousands of records, and some records could contain duplicate values. Keys ensure that you can uniquely identify a table record despite these challenges.
Ø Keys allow you to establish and identify relationships between tables.
Ø Keys help you enforce identity and integrity in those relationships.
Types of Keys in DBMS (Database Management System)
There are mainly seven different types of keys in a DBMS, and each key has its own functionality:
Ø Super Key
Ø Primary Key
Ø Candidate Key
Ø Alternate Key
Ø Foreign Key
Ø Compound Key
Ø Composite Key
Ø Super Key - is a set of one or more attributes that identifies rows in a table.
Ø Primary Key - is a column or group of columns in a table that uniquely identifies every row in that table.
Ø Candidate Key - is a set of attributes that uniquely identifies tuples in a table. A candidate key is a super key with no redundant attributes.
Ø Alternate Key - is a candidate key that was not chosen as the primary key.
Ø Foreign Key - is a column that creates a relationship between two tables. The purpose of foreign keys is to maintain data integrity and allow navigation between two different instances of an entity.
Ø Compound Key - has two or more attributes that together allow you to uniquely recognize a specific record. Each column may not be unique by itself within the database.
Ø Composite Key - is a combination of two or more columns that uniquely identifies rows in a table. The combination of columns guarantees uniqueness, though individual uniqueness is not guaranteed.
What is the Super key?
A superkey is a set of one or more attributes that identifies rows in a table. A super key may have additional attributes that are not needed for unique identification.
Example:
EmpSSN       EmpNum   Empname
9812345098   AB05     Shown
9876512345   AB06     Roslyn
199937890    AB07     James
In the above example, EmpSSN and EmpNum are each superkeys. Any set of columns that contains one of them, such as EmpNum together with Empname, is also a superkey.
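The superkey idea can be checked against sample data with a few lines of Python. This is only a sketch: the helper name is_superkey is made up here, and a check on one set of rows can only show that the data does not contradict a key. Whether a column set really is a key is a rule about the schema, not about a single snapshot.

```python
def is_superkey(rows, columns):
    """Return True if no two rows share the same values in `columns`."""
    seen = set()
    for row in rows:
        key = tuple(row[col] for col in columns)
        if key in seen:
            return False
        seen.add(key)
    return True

employees = [
    {"EmpSSN": "9812345098", "EmpNum": "AB05", "Empname": "Shown"},
    {"EmpSSN": "9876512345", "EmpNum": "AB06", "Empname": "Roslyn"},
    {"EmpSSN": "199937890", "EmpNum": "AB07", "Empname": "James"},
]

ssn_is_superkey = is_superkey(employees, ["EmpSSN"])
num_and_name_is_superkey = is_superkey(employees, ["EmpNum", "Empname"])

# A hypothetical fourth employee with a repeated name shows that
# Empname alone cannot identify a row.
with_namesake = employees + [
    {"EmpSSN": "9800000000", "EmpNum": "AB08", "Empname": "James"}
]
name_is_superkey = is_superkey(with_namesake, ["Empname"])
print(ssn_is_superkey, num_and_name_is_superkey, name_is_superkey)
```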
What is a Primary Key?
A PRIMARY KEY in a DBMS is a column or group of columns in a table that uniquely identifies every row in that table. Primary key values can't be duplicated, meaning the same value can't appear more than once in the table. A table cannot have more than one primary key.
Rules for defining Primary key:
•Two rows can't have the same primary key value.
•Every row must have a primary key value.
•The primary key field cannot be null.
•The value in a primary key column should not be modified or updated while a foreign key refers to it, unless the foreign key defines an ON UPDATE action that handles the change.
Example: In the following example, StudID is a Primary Key.
StudID   Roll No   First Name   LastName   Email
1        11        Tom          Price      abc@gmail.com
2        12        Nick         Wright     xyz@gmail.com
3        13        Dana         Natan      mno@yahoo.com
Properties:
Ø No duplicate values are allowed, i.e. a column assigned as the primary key should have UNIQUE values only.
Ø No NULL values are allowed in a primary key column. Hence every row must have a value in that column.
Ø Only one primary key can exist per table, although the primary key may consist of multiple columns.
Ø No new row can be inserted with an already existing primary key value.
Ø Classified as: a) Simple primary key, which has a single column. b) Composite primary key, which has multiple columns.
The primary key can be created in a table using the PRIMARY KEY constraint. It can be created at two levels.
1.Column
2.Table.
SQL PRIMARY KEY at Column Level :
If the primary key contains just one column, it can be defined at column level. The following code creates the primary key Id on the Person table.
Syntax :
Create Table Person
(
Id int NOT NULL PRIMARY KEY,
Name varchar2(20),
Address varchar2(50)
);
Here NOT NULL is stated explicitly to make sure Id always has a value. In standard SQL, PRIMARY KEY already implies that the column is both unique and not null, so a primary key column never accepts NULL.
Whenever the primary key contains multiple columns, it has to be specified at table level.
Syntax:
Create Table Person (Id int NOT NULL, Name varchar2(20), Address varchar2(50), PRIMARY KEY(Id, Name) );
Since multiple columns make up the primary key, two rows are considered different as long as they differ in at least one of those columns. SQL permits either of the two values to be duplicated, but the combination must be unique.
SQL PRIMARY KEY with ALTER TABLE :
Most of the time, the primary key is defined when the table is created, but sometimes an existing table has no primary key. In that case we can add one using the ALTER TABLE statement.
Syntax :
Alter Table Person add Primary Key(Id);
To add a primary key on multiple columns, use the following query.
Alter Table Person add Primary Key(Id, Name);
DELETE PRIMARY KEY CONSTRAINT :
To remove the primary key constraint from a table, use the following SQL.
ALTER table Person DROP PRIMARY KEY;
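How a table-level primary key on (Id, Name) behaves can be sketched with SQLite through Python's sqlite3 module. The helper try_insert and the sample values are made up for illustration, and varchar is used instead of varchar2 because varchar2 is specific to Oracle. The ALTER TABLE forms shown above are not supported by SQLite, so this sketch declares the key in CREATE TABLE.

```python
import sqlite3

def try_insert(connection, row):
    """Insert a row into Person; return False if a constraint rejects it."""
    try:
        connection.execute(
            "INSERT INTO Person (Id, Name, Address) VALUES (?, ?, ?)", row
        )
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Person (
        Id int NOT NULL,
        Name varchar(20),
        Address varchar(50),
        PRIMARY KEY (Id, Name)
    )
    """
)

first_row = try_insert(conn, (1, "Tom", "Pune"))
same_id_other_name = try_insert(conn, (1, "Nick", "Pune"))   # Id repeats, pair is new
duplicate_pair = try_insert(conn, (1, "Tom", "Delhi"))       # (1, 'Tom') already exists
null_id = try_insert(conn, (None, "Dana", "Goa"))            # violates NOT NULL
print(first_row, same_id_other_name, duplicate_pair, null_id)
```

The second insert succeeds because only the combination of Id and Name has to be unique.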
What is a Candidate Key?
A CANDIDATE KEY is a set of attributes that uniquely identifies tuples in a table. A candidate key is a super key with no redundant attributes. The primary key should be selected from the candidate keys. Every table must have at least one candidate key. A table can have multiple candidate keys but only a single primary key.
Properties of Candidate key:
•It must contain unique values
•A candidate key may have multiple attributes
•It must not contain null values
•It should contain the minimum fields needed to ensure uniqueness
•It uniquely identifies each record in a table
Candidate key Example: In the given table StudID, Roll No, and Email are candidate keys which help us to uniquely identify the student record in the table.
StudID   Roll No   First Name   LastName   Email
1        11        Tom          Price      abc@gmail.com
2        12        Nick         Wright     xyz@gmail.com
3        13        Dana         Natan      mno@yahoo.com
All candidate keys that are not chosen as the primary key are called alternate keys.
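One common way to declare alternate keys in SQL is a UNIQUE constraint combined with NOT NULL. The sketch below assumes StudID is chosen as the primary key and treats Roll No and Email as alternate keys. It uses SQLite through Python's sqlite3 module, and the helper name is_rejected and the extra rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Student (
        StudID    INTEGER PRIMARY KEY,      -- chosen candidate key
        RollNo    INTEGER NOT NULL UNIQUE,  -- alternate key
        FirstName TEXT,
        LastName  TEXT,
        Email     TEXT NOT NULL UNIQUE      -- alternate key
    )
    """
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?, ?)",
    [
        (1, 11, "Tom", "Price", "abc@gmail.com"),
        (2, 12, "Nick", "Wright", "xyz@gmail.com"),
        (3, 13, "Dana", "Natan", "mno@yahoo.com"),
    ],
)

def is_rejected(connection, row):
    """Return True if inserting `row` violates a key constraint."""
    try:
        connection.execute("INSERT INTO Student VALUES (?, ?, ?, ?, ?)", row)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_email_rejected = is_rejected(conn, (4, 14, "Alex", "Hale", "abc@gmail.com"))
duplicate_roll_rejected = is_rejected(conn, (5, 11, "Alex", "Hale", "alex@example.com"))
# First names are not a key, so a repeated first name is accepted.
repeated_first_name_rejected = is_rejected(conn, (6, 15, "Tom", "Stone", "tom2@example.com"))
print(duplicate_email_rejected, duplicate_roll_rejected, repeated_first_name_rejected)
```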
What is the Foreign key?
Ø FOREIGN KEY is a column that creates a relationship between two tables.
Ø The purpose of Foreign keys is to maintain data integrity and allow navigation between two different instances of an
entity.
Ø It acts as a cross-reference between two tables as it references the primary key of another table.
Example:
DeptCode   DeptName
001        Science
002        English
005        Computer

Teacher ID   Fname   Lname
B002         David   Warner
B017         Sara    Joseph
B009         Mike    Brunton
In this example, we have two tables, Teacher and Department, in a school. However, there is no way to see which teacher works in which department.
By adding DeptCode as a foreign key in the Teacher table, we can create a relationship between the two tables.
Teacher ID   DeptCode   Fname   Lname
B002         002        David   Warner
B017         002        Sara    Joseph
B009         001        Mike    Brunton
This concept is also known as Referential Integrity.
Rules for FOREIGN KEY
Ø NULL is allowed in a foreign key column.
Ø The table being referenced is called the Parent Table.
Ø The table with the foreign key is called the Child Table.
Ø The foreign key in the child table references the primary key in the parent table.
Ø This parent-child relationship enforces the rule which is known as "Referential Integrity."
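These rules can be observed on the Teacher and Department example. The following is a sketch using SQLite through Python's sqlite3 module. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, and the helper try_add_teacher and the rows B099 and B100 are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

conn.execute("CREATE TABLE Department (DeptCode TEXT PRIMARY KEY, DeptName TEXT)")
conn.execute(
    """
    CREATE TABLE Teacher (
        TeacherID TEXT PRIMARY KEY,
        DeptCode  TEXT,
        Fname     TEXT,
        Lname     TEXT,
        FOREIGN KEY (DeptCode) REFERENCES Department (DeptCode)
    )
    """
)
conn.executemany(
    "INSERT INTO Department VALUES (?, ?)",
    [("001", "Science"), ("002", "English"), ("005", "Computer")],
)

def try_add_teacher(connection, row):
    """Insert a teacher; return False if referential integrity rejects the row."""
    try:
        connection.execute("INSERT INTO Teacher VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

existing_department = try_add_teacher(conn, ("B002", "002", "David", "Warner"))
unknown_department = try_add_teacher(conn, ("B099", "009", "Test", "Teacher"))
null_department = try_add_teacher(conn, ("B100", None, "New", "Teacher"))
print(existing_department, unknown_department, null_department)
```

The row with DeptCode 009 is rejected because the parent table has no such department, while the row with a NULL DeptCode is accepted.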
Syntax:
CREATE TABLE childTable
( column_1 datatype [ NULL |NOT NULL ], column_2 datatype [ NULL |NOT NULL ], ...
CONSTRAINT fkey_name
FOREIGN KEY (child_column1, child_column2, ... child_column_n)
REFERENCES parentTable (parent_column1, parent_column2, ... parent_column_n)
[ ON DELETE { NO ACTION |CASCADE |SET NULL |SET DEFAULT } ]
[ ON UPDATE { NO ACTION |CASCADE |SET NULL |SET DEFAULT } ]
);
A foreign key references the primary key of another table. It helps connect your tables.
Ø A foreign key can have a different name from the primary key it references
Ø It ensures rows in one table have corresponding rows in another
Ø Unlike primary keys, foreign keys do not have to be unique. Most often they aren't
Ø Foreign keys can be null even though primary keys cannot
Why do you need a foreign key? You will only be able to insert values into your foreign key that exist in the unique key of the parent table. This helps maintain referential integrity.
Suppose a novice inserts a record in Table2 with a membership id that does not exist in Table1. This problem can be overcome by declaring membership id in Table2 as a foreign key that references membership id in Table1.
Now, if somebody tries to insert a value in the membership id field that does not exist in the parent table, an error will be shown!
Here is a description of the parameters in the CREATE TABLE syntax above:
Ø childTable is the name of the table that is to be created.
Ø column_1, column_2- the columns to be added to the table.
Ø fkey_name- the name of the foreign key constraint to be created.
Ø child_column1, child_column2…child_column_n- the names of the childTable columns that reference the primary key in parentTable.
Ø parentTable- the name of the parent table whose key is to be referenced in the child table.
Ø parent_column1, parent_column2, ... parent_column_n- the columns making up the primary key of the parent table.
Ø ON DELETE- An optional parameter. It specifies what happens to the child data when the parent data is deleted. The values for this parameter include NO ACTION, SET NULL, CASCADE, or SET DEFAULT.
Ø ON UPDATE- An optional parameter. It specifies what happens to the child data when the parent data is updated. The values for this parameter include NO ACTION, SET NULL, CASCADE, or SET DEFAULT.
Ø NO ACTION- used together with ON DELETE and ON UPDATE. The child data is left unchanged, and the update or deletion of the parent data is rejected with an error if child rows still reference it.
Ø CASCADE- used together with ON DELETE and ON UPDATE. The child data is deleted or updated when the parent data is deleted or updated.
Ø SET NULL- used together with ON DELETE and ON UPDATE. The foreign key columns of the child rows are set to NULL when the parent data is updated or deleted.
Ø SET DEFAULT- used together with ON DELETE and ON UPDATE. The foreign key columns of the child rows are set to their default values when the parent data is updated or deleted.
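The difference between the ON DELETE options can be sketched with three child tables that reference the same parent. All table and column names below are hypothetical, and the behaviour shown is that of SQLite (through Python's sqlite3 module) with foreign key enforcement switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")

# One child table per ON DELETE action.
for table, action in [
    ("child_cascade", "CASCADE"),
    ("child_set_null", "SET NULL"),
    ("child_no_action", "NO ACTION"),
]:
    conn.execute(
        f"""
        CREATE TABLE {table} (
            id INTEGER PRIMARY KEY,
            parent_id INTEGER,
            FOREIGN KEY (parent_id) REFERENCES parent (id) ON DELETE {action}
        )
        """
    )

conn.executemany("INSERT INTO parent VALUES (?)", [(1,), (2,)])
conn.execute("INSERT INTO child_cascade VALUES (10, 1)")
conn.execute("INSERT INTO child_set_null VALUES (20, 1)")
conn.execute("INSERT INTO child_no_action VALUES (30, 2)")

# Deleting parent 1 removes the CASCADE child and nulls the SET NULL child.
conn.execute("DELETE FROM parent WHERE id = 1")
cascade_rows_left = conn.execute("SELECT COUNT(*) FROM child_cascade").fetchone()[0]
set_null_value = conn.execute("SELECT parent_id FROM child_set_null").fetchone()[0]

# Deleting parent 2 is refused because a NO ACTION child still references it.
try:
    conn.execute("DELETE FROM parent WHERE id = 2")
    no_action_blocked = False
except sqlite3.IntegrityError:
    no_action_blocked = True

parents_left = [row[0] for row in conn.execute("SELECT id FROM parent")]
print(cascade_rows_left, set_null_value, no_action_blocked, parents_left)
```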
Example Child Table
CREATE TABLE Course_Strength_TSQL
(
Course_ID Int, Course_Strength Varchar(20),
CONSTRAINT FK FOREIGN KEY (Course_ID)
REFERENCES COURSE (Course_ID)
)
Here COURSE is the parent table.
Using ALTER TABLE
To add a foreign key to an existing table in SQL Server, use the ALTER TABLE statement with the syntax given below:
ALTER TABLE childTable
ADD CONSTRAINT fkey_name
FOREIGN KEY (child_column1, child_column2, ... child_column_n)
REFERENCES parentTable (parent_column1, parent_column2, ... parent_column_n);
Ø childTable is the name of the existing table to which the foreign key is added.
Ø fkey_name- the name of the foreign key constraint to be created.
Ø child_column1, child_column2…child_column_n- the names of the childTable columns that reference the primary key in parentTable.
Ø parentTable- the name of the parent table whose key is to be referenced in the child table.
Ø parent_column1, parent_column2, ... parent_column_n- the columns making up the primary key of the parent table.
What is the Compound key?
Ø A COMPOUND KEY has two or more attributes that together allow you to uniquely recognize a specific record.
Ø It is possible that each column may not be unique by itself within the database.
Ø However, when combined with the other column or columns, the combination becomes unique.
Ø The purpose of the compound key in a database is to uniquely identify each record in the table.
OrderNo   ProductID   Product Name    Quantity
B005      JAP102459   Mouse           5
B005      DKT321573   USB             10
B005      OMG446789   LCD Monitor     20
B004      DKT321573   USB             15
B002      OMG446789   Laser Printer   3
In this example, neither OrderNo nor ProductID can be a primary key on its own, because neither uniquely identifies a record. However, a compound key of OrderNo and ProductID could be used, as the combination uniquely identifies each record.
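The claim about the order table can be verified directly on its five rows. This is a small Python sketch; the helper name is_unique is made up for illustration.

```python
order_lines = [
    ("B005", "JAP102459", "Mouse", 5),
    ("B005", "DKT321573", "USB", 10),
    ("B005", "OMG446789", "LCD Monitor", 20),
    ("B004", "DKT321573", "USB", 15),
    ("B002", "OMG446789", "Laser Printer", 3),
]

def is_unique(rows, positions):
    """Return True if the values at `positions` differ for every row."""
    keys = [tuple(row[p] for p in positions) for row in rows]
    return len(set(keys)) == len(keys)

order_no_unique = is_unique(order_lines, [0])      # OrderNo alone
product_id_unique = is_unique(order_lines, [1])    # ProductID alone
pair_unique = is_unique(order_lines, [0, 1])       # OrderNo + ProductID
print(order_no_unique, product_id_unique, pair_unique)
```

Each column repeats on its own (B005 appears three times, DKT321573 twice), but no pair of OrderNo and ProductID occurs more than once.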
Composite Key in SQL
To understand what a composite key is, we first need to know what a primary key is: a primary key is a column that has a unique, non-null value for every row in an SQL table.
A composite key is also a primary key, but it is made by combining more than one column to identify a particular row in the table.
Composite Key:
A composite key is a combination of two or more columns in a table that can be used to uniquely identify each row. When the columns are combined, uniqueness of a row is guaranteed, but a column taken individually does not guarantee uniqueness. It can also be understood as a primary key made of two or more attributes that together uniquely identify every row in a table.
CREATE TABLE student
(
rollNumber INT,
name VARCHAR(30),
class VARCHAR(30),
section VARCHAR(1),
mobile VARCHAR(10),
PRIMARY KEY (rollNumber, mobile)
);
Database Schema, Relational Operations and Database Design
Database Schema
 Differentiate between the database schema, which is the logical design of the database, and
the database instance, which is a snapshot of the data in the database at a given instant in
time.
 The concept of a relation corresponds to the programming-language notion of a variable,
while the concept of a relation schema corresponds to the programming-language notion of
type definition.
 The concept of a relation instance corresponds to the programming-language notion of a
value of a variable.
 The value of a given variable may change with time.
 Similarly the contents of a relation instance may change with time as the relation is updated.
Fig. Instructor Relation
Fig. Department Relation
The schema for the department relation is
department (dept_name, building, budget)
For example-
 suppose we wish to find the information about all the instructors who work in the Watson
building.
 We look first at the department relation to find the dept_name of all the departments housed in Watson.
 Then, for each such department, we look in the instructor relation to find the information about the instructors associated with the corresponding dept_name.
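The two-step lookup described above maps naturally onto a subquery: the inner query finds the departments in Watson, and the outer query finds their instructors. The sketch below uses SQLite through Python's sqlite3 module. Because the instructor and department figures are not reproduced in these notes, the rows are assumed sample data, and the instructor columns (ID, name, dept_name, salary) are an assumption as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY, building TEXT, budget INTEGER)")
conn.execute("CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary INTEGER)")

# Assumed sample rows, for illustration only.
conn.executemany(
    "INSERT INTO department VALUES (?, ?, ?)",
    [("Biology", "Watson", 90000), ("Physics", "Watson", 70000), ("Comp. Sci.", "Taylor", 100000)],
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [
        ("22222", "Einstein", "Physics", 95000),
        ("33456", "Gold", "Physics", 87000),
        ("76766", "Crick", "Biology", 72000),
        ("45565", "Katz", "Comp. Sci.", 75000),
    ],
)

# Step 1 (inner query): departments housed in Watson.
# Step 2 (outer query): instructors of those departments.
watson_instructors = [
    row[0]
    for row in conn.execute(
        """
        SELECT name
        FROM instructor
        WHERE dept_name IN (SELECT dept_name FROM department WHERE building = 'Watson')
        ORDER BY name
        """
    )
]
print(watson_instructors)
```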
Let us continue with our university database example.
 Each course in a university may be offered multiple times, across different semesters, or
even within a semester.
 We need a relation to describe each individual offering, or section, of the class.
 The schema is
section (course_id, sec_id, semester, year, building, room_number, time_slot_id)
Fig. Section Relation
We also use the following relations in this text:
• student (ID, name, dept_name, tot_cred)
• takes (ID, course_id, sec_id, semester, year, grade)
• classroom (building, room_number, capacity)
• time_slot (time_slot_id, day, start_time, end_time)
Relational Operations
 All procedural relational query languages provide a set of operations that can be applied
to either a single relation or a pair of relations.
 These operations have the nice and desired property that their result is always a single
relation.
 This property allows one to combine several of these operations in a modular way.
 Specifically, since the result of a relational query is itself a relation, relational operations
can be applied to the results of queries as well as to the given set of relations.
 The most frequent operation is the selection of specific tuples from a single relation (say instructor) that satisfy some particular predicate (say salary > $85,000). The result is a new relation that is a subset of the original relation (instructor).
RELATIONAL ALGEBRA
 The relational algebra defines a set of operations on relations, paralleling the usual
algebraic operations such as addition, subtraction or multiplication, which operate on
numbers.
 Just as algebraic operations on numbers take one or more numbers as input and
return a number as output, the relational algebra operations typically take one or two
relations as input and return a relation as output.
 The relational algebra is a procedural query language.
 It consists of a set of operations that take one or two relations as input and
produce a new relation as their result.
 The fundamental operations in the relational algebra are select, project, union,
set difference, Cartesian product, and rename.
 In addition to the fundamental operations, there are several other operations—
namely, set intersection, natural join, and assignment
The Select Operation
 The select operation selects tuples that satisfy a given predicate.
 We use the lowercase Greek letter sigma (σ) to denote selection.
 The predicate appears as a subscript to σ.
 The argument relation is in parentheses after the σ. Thus, to select those tuples of the instructor relation where the instructor is in the “Physics” department, we write:
σ_{dept_name = “Physics”} (instructor)
Find all instructors with salary greater than $90,000 by writing:
σ_{salary > 90000} (instructor)
 In general, we allow comparisons using =, ≠, <, ≤, >, and ≥ in the selection predicate.
 Furthermore, we can combine several predicates into a larger predicate by using the connectives and (∧), or (∨), and not (¬).
 Thus, to find the instructors in Physics with a salary greater than $90,000, we write:
σ_{dept_name = “Physics” ∧ salary > 90000} (instructor)
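The select operation can be mimicked in a few lines of Python by treating a relation as a list of tuples (here, dictionaries) and the predicate as a function. This is only a sketch: the function name select and the four instructor rows are assumptions made for illustration.

```python
def select(relation, predicate):
    """Relational-algebra selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

# Assumed sample instructor relation.
instructor = [
    {"ID": "22222", "name": "Einstein", "dept_name": "Physics", "salary": 95000},
    {"ID": "33456", "name": "Gold", "dept_name": "Physics", "salary": 87000},
    {"ID": "12121", "name": "Wu", "dept_name": "Finance", "salary": 90000},
    {"ID": "83821", "name": "Brandt", "dept_name": "Comp. Sci.", "salary": 92000},
]

# Selection on dept_name = "Physics"
physics = select(instructor, lambda t: t["dept_name"] == "Physics")

# Selection on salary > 90000
high_paid = select(instructor, lambda t: t["salary"] > 90000)

# Both predicates combined with "and"
physics_high_paid = select(
    instructor, lambda t: t["dept_name"] == "Physics" and t["salary"] > 90000
)

# The result of a selection is itself a relation, so selections compose.
composed = select(physics, lambda t: t["salary"] > 90000)

print([t["name"] for t in physics_high_paid])
```

Applying one selection to the result of another gives the same tuples as the combined predicate, which is the modular property described earlier for relational operations.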
Database Model
Data Model - Definition
A database model defines the logical design and structure of a database, and how data will be stored, accessed and updated in a database management system.
Types of Data Model
Hierarchical Model
Network Model
Entity-Relationship Model
Relational Model
Hierarchical Model
It organises data into a tree-like structure, with a single root, to which all the other data is linked.
The hierarchy starts from the root data and expands like a tree, adding child nodes to the parent nodes.
In this model, a child node will only have a single parent node.
This model efficiently describes many real-world relationships like the index of a book, recipes etc.
one-to-many relationship
Network Model
It organises data more like a graph, and child nodes are allowed to have more than one parent node.
In this database model data is more closely related, as more relationships are established.
Also, because the data is more closely related, accessing it is easier and faster.
This was the most widely used database model before the Relational Model was introduced.
many-to-many relationship
Entity-relationship Model
In this database model, relationships are created by dividing the object of interest into an entity and its characteristics into attributes.
Different entities are related using relationships.
E-R models represent the relationships in pictorial form to make them easier for different stakeholders to understand.
This model is good for designing a database, which can then be turned into tables in the relational model.
Entity – Student
Attributes – Name, Age, Address, id
Relational Model
In this database model, data is organised in two-dimensional tables and the relationship is maintained by storing a common field.
This model was introduced by E. F. Codd in 1970, and since then it has been the most widely used database model.
The basic structure of data in the relational model is the table. All the information related to a particular type is stored in the rows of that table. Tables are also known as relations in the relational model.
E-R MODEL
The entity-relationship model is a model used for the design and representation of relationships between data.
To understand the ER model, we must understand:
• Entity and Entity Set
• Attributes and Types of Attributes
• Keys
• Relationships
ER Model: Entity and Entity Set
An entity is generally a real-world object with a physical existence (a particular person, car, house, or employee), or it may be an object with a conceptual existence (a company, a job, or a university course). An entity has characteristics and holds relationships in a DBMS.
If a Student is an Entity, then the complete dataset of all the students will be the Entity Set.
Types of Attributes and their descriptions:
Simple attribute: Simple attributes can't be divided any further. For example, a student's contact number. It is also called an atomic value.
Composite attribute: A composite attribute can be broken down further. For example, a student's full name may be further divided into first name, second name, and last name.
Derived attribute: This type of attribute is not stored in the physical database. Instead, its value is derived from other attributes present in the database. For example, age should not be stored directly. Instead, it should be derived from the DOB of that employee.
Multivalued attribute: Multivalued attributes can have more than one value. For example, a student can have more than one mobile number, email address, etc.
ER Model: Attributes
Attributes are the properties which define the entity type. For example, Roll_No, Name, DOB, Age, Address, Mobile_No are the attributes which define the entity type Student. In an ER diagram, an attribute is represented by an oval.
Key Attribute –
The attribute which uniquely identifies each entity in the entity set is called the key attribute. For example, Roll_No will be unique for each student. In an ER diagram, a key attribute is represented by an oval with the attribute name underlined.
Composite Attribute –
An attribute composed of many other attributes is called a composite attribute. For example, the Address attribute of the Student entity type consists of Street, City, State, and Country. In an ER diagram, a composite attribute is represented by an oval connected to further ovals.
Multivalued Attribute –
An attribute that can hold more than one value for a given entity. For example, Phone_No (can be more than one for a given student). In an ER diagram, a multivalued attribute is represented by a double oval.
Derived Attribute –
An attribute which can be derived from other attributes of the entity type is known as a derived attribute, e.g. Age (can be derived from DOB). In an ER diagram, a derived attribute is represented by a dashed oval.
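The attribute types above can be illustrated with a small Python model of the Student entity type. This is only a sketch with made-up sample values: Address is modelled as a composite attribute, phone_numbers as a multivalued attribute, and age as a derived attribute computed from DOB rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Address:
    """Composite attribute: made up of several simpler attributes."""
    street: str
    city: str
    state: str
    country: str

@dataclass
class Student:
    roll_no: int                      # key attribute
    name: str                         # simple attribute
    dob: date
    address: Address                  # composite attribute
    phone_numbers: List[str] = field(default_factory=list)  # multivalued attribute

    def age_on(self, today: date) -> int:
        """Derived attribute: computed from dob instead of being stored."""
        years = today.year - self.dob.year
        # Subtract one year if the birthday has not yet occurred this year.
        if (today.month, today.day) < (self.dob.month, self.dob.day):
            years -= 1
        return years

student = Student(
    roll_no=1,
    name="X",
    dob=date(2000, 6, 15),
    address=Address("Street 1", "City", "State", "Country"),
    phone_numbers=["1111111111", "2222222222"],
)
print(student.age_on(date(2024, 6, 14)), student.age_on(date(2024, 6, 15)))
```

Because age is computed on demand, it can never go out of date the way a stored Age column could.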
Relationship Type and Relationship Set:
A relationship type represents the association between entity types. For example, ‘Enrolled in’ is a relationship type that exists between the entity types Student and Course. In an ER diagram, a relationship type is represented by a diamond connected to the entities with lines.
A set of relationships of the same type is known as a relationship set. The following relationship set depicts S1 is enrolled in C2, S2 is enrolled in C1 and S3 is enrolled in C3.
Degree of a relationship set:
The number of different entity sets participating in a relationship set is called the degree of the relationship set.
Unary Relationship –
When there is only ONE entity set participating in a relationship, it is called a unary relationship. For example, one person is married to only one person.
Binary Relationship –
When there are TWO entity sets participating in a relationship, it is called a binary relationship. For example, Student is enrolled in Course.
n-ary Relationship –
When there are n entity sets participating in a relationship, it is called an n-ary relationship.
Cardinality
Defines the numerical relationship between two entities or entity sets.
Different types of cardinal relationships are:
•One-to-One Relationships
•One-to-Many Relationships
•Many-to-One Relationships
•Many-to-Many Relationships
Cardinality:
The number of times an entity of an entity set participates in a relationship set is known as cardinality.
Cardinality can be of different types:
One to one – When each entity in each entity set can take part only once in the relationship, the cardinality is one to one. Let us assume that a male can marry one female and a female can marry one male. So the relationship will be one to one.
Using Sets, it can be represented as:
One-to-many:
One entity from entity set X can be associated with multiple entities of entity set Y, but an entity from entity set Y can be associated with at most one entity of entity set X.
For example, one class consists of multiple students.
Many to one – When entities in one entity set can take part only once in the relationship set and entities in other
entity set can take part more than once in the relationship set, cardinality is many to one. Let us assume that a
student can take only one course but one course can be taken by many students. So the cardinality will be n to 1. It
means that for one course there can be n students but for one student, there will be only one course.
Using Sets, it can be represented as:
Many to many – When entities in all entity sets can take part more than once in the relationship, the cardinality is many to many. Let us assume that a student can take more than one course and one course can be taken by many students. So the relationship will be many to many.
Using sets, it can be represented as:
In this example, student S1 is enrolled in C1 and C3, and course C3 is taken by S1, S3 and S4. So it is a many-to-many relationship.
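In a relational database, a many-to-many relationship is commonly stored in a separate table that holds one row per pair. The sketch below does this for the enrolments just described (S1 in C1 and C3, and S1, S3 and S4 in C3), using SQLite through Python's sqlite3 module. The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE Student (student_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE Course (course_id TEXT PRIMARY KEY)")
# One row per (student, course) pair; the pair itself is the primary key.
conn.execute(
    """
    CREATE TABLE Enrolled_in (
        student_id TEXT REFERENCES Student (student_id),
        course_id  TEXT REFERENCES Course (course_id),
        PRIMARY KEY (student_id, course_id)
    )
    """
)

conn.executemany("INSERT INTO Student VALUES (?)", [("S1",), ("S2",), ("S3",), ("S4",)])
conn.executemany("INSERT INTO Course VALUES (?)", [("C1",), ("C2",), ("C3",)])
conn.executemany(
    "INSERT INTO Enrolled_in VALUES (?, ?)",
    [("S1", "C1"), ("S1", "C3"), ("S3", "C3"), ("S4", "C3")],
)

courses_of_s1 = [
    row[0]
    for row in conn.execute(
        "SELECT course_id FROM Enrolled_in WHERE student_id = 'S1' ORDER BY course_id"
    )
]
students_in_c3 = [
    row[0]
    for row in conn.execute(
        "SELECT student_id FROM Enrolled_in WHERE course_id = 'C3' ORDER BY student_id"
    )
]
print(courses_of_s1, students_in_c3)
```

Student S2 and course C2 have no rows in Enrolled_in, which corresponds to entities that do not take part in the relationship.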
Participation Constraint:
Participation Constraint is applied on the entity participating in the relationship set.
1.Total Participation – Each entity in the entity set must participate in the relationship. If each student must enroll in a course, the participation of student will be total. Total participation is shown by a double line in an ER diagram.
2.Partial Participation – The entity in the entity set may or may NOT participate in the relationship. If some courses have no students enrolled, the participation of course will be partial.
The diagram depicts the ‘Enrolled in’ relationship set with the Student entity set having total participation and the Course entity set having partial participation.
Weak Entity Type and Identifying Relationship:
An entity type usually has a key attribute which uniquely identifies each entity in the entity set. But there exist some entity types for which a key attribute can't be defined. These are called weak entity types.
For example, a company may store the information of dependents (parents, children, spouse) of an employee. But the dependents don't have existence without the employee. So Dependent will be a weak entity type and Employee will be the identifying entity type for Dependent.
A weak entity type is represented by a double rectangle. The participation of a weak entity type is always total. The relationship between a weak entity type and its identifying strong entity type is called the identifying relationship and it is represented by a double diamond.
Example-1:
In the below ER Diagram, ‘Payment’ is the weak entity. ‘Loan Payment’ is the identifying relationship and ‘Payment Number’ is the partial key. The primary key of Loan along with the partial key would be used to identify the records.
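One way to map the Loan and Payment example to tables is to give Payment a primary key made of the loan's key plus the partial key. This is a sketch using SQLite through Python's sqlite3 module. The column names and sample rows are assumptions, and ON DELETE CASCADE is one possible way to express that a payment cannot exist without its loan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE Loan (loan_number TEXT PRIMARY KEY, amount INTEGER)")
conn.execute(
    """
    CREATE TABLE Payment (
        loan_number    TEXT NOT NULL,      -- key of the identifying (strong) entity
        payment_number INTEGER NOT NULL,   -- partial key of the weak entity
        payment_amount INTEGER,
        PRIMARY KEY (loan_number, payment_number),
        FOREIGN KEY (loan_number) REFERENCES Loan (loan_number) ON DELETE CASCADE
    )
    """
)
conn.executemany("INSERT INTO Loan VALUES (?, ?)", [("L1", 1000), ("L2", 5000)])

def try_payment(connection, row):
    """Insert a payment; return False if a key or foreign key rule rejects it."""
    try:
        connection.execute("INSERT INTO Payment VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

first_payment = try_payment(conn, ("L1", 1, 100))
same_number_other_loan = try_payment(conn, ("L2", 1, 500))   # payment 1 of another loan
duplicate_payment = try_payment(conn, ("L1", 1, 200))        # same loan, same number
orphan_payment = try_payment(conn, ("L9", 1, 50))            # no such loan

conn.execute("DELETE FROM Loan WHERE loan_number = 'L1'")    # payments of L1 go with it
payments_left = conn.execute(
    "SELECT loan_number, payment_number FROM Payment"
).fetchall()
print(first_payment, same_number_other_loan, duplicate_payment, orphan_payment, payments_left)
```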
Comparison of Strong Entity Set and Weak Entity Set
Strong entity set: It always has a primary key.
Weak entity set: It does not have enough attributes to build a primary key.

Strong entity set: It is represented by a rectangle symbol.
Weak entity set: It is represented by a double rectangle symbol.

Strong entity set: It contains a primary key, represented by the underline symbol.
Weak entity set: It contains a partial key, represented by a dashed underline symbol.

Strong entity set: A member of a strong entity set is called a dominant entity.
Weak entity set: A member of a weak entity set is called a subordinate entity.

Strong entity set: The primary key is one of its attributes and helps to identify its members.
Weak entity set: Its members are identified by a combination of the primary key of the strong entity set and the partial key of the weak entity set.

Strong entity set: In the ER diagram, the relationship between two strong entity sets is shown by a diamond symbol.
Weak entity set: The relationship between a strong and a weak entity set is shown by a double diamond symbol.

Strong entity set: The line connecting the strong entity set with the relationship is single.
Weak entity set: The line connecting the weak entity set with the identifying relationship is double.
Example- ER Diagram
 In the diagram, entity Doctor has key attribute 'doctor_id' which will be used to identify the doctors.
 It also has two multivalued attributes, 'specialization' and 'qualification', as a doctor may have more than one qualification and may be specialized in more than one field.
 The Doctor and Patient entities have a one-to-many relationship as a Doctor may treat more than one patient.
 Similarly, Patient and Medicine have a many-to-many relationship as a patient may buy more than one medicine and vice-versa.
 'Code' is the key attribute for Medicine which is unique for every medicine.
 The Patient has many attributes: Patient_id, DOB, Age, etc. 'Age' is the derived attribute here.
 Also, it has a composite attribute 'Address' which can further be divided into two attributes 'Locality' and 'Town'.
Enhanced ER Model
EER Model
EER is a high-level data model that incorporates the extensions to the original ER model.
It is a diagrammatic technique for displaying the following concepts
• Sub Class and Super Class
• Specialization and Generalization
• Union or Category
• Aggregation
These concepts are used in an EER schema, and the resulting schema diagrams are called EER diagrams.
Features of EER Model
• EER creates a design that is more accurate to database schemas.
• It reflects the data properties and constraints more precisely.
• It includes all modeling concepts of the ER model.
• The diagrammatic technique helps in displaying the EER schema.
• It includes the concept of specialization and generalization.
• It is used to represent a collection of objects that is a union of objects of different entity types.
Enhanced ER Model
A. Sub Class and Super Class

•The sub class and super class relationship leads to the concept of inheritance.
•The relationship between a sub class and its super class is denoted with a symbol in the diagram.
1. Super Class
•A super class is an entity type that has a relationship with one or more subtypes.
•An entity cannot exist in the database merely by being a member of a sub class; it must also be a member of the super class.
For example: the Shape super class has the sub groups Square, Circle and Triangle.
2. Sub Class
•A sub class is a group of entities with unique attributes.
•A sub class inherits properties and attributes from its super class.
For example: Square, Circle and Triangle are sub classes of the Shape super class.
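The Shape example maps naturally onto class inheritance in an object-oriented language. The following is a minimal sketch; the attribute names (`side`, `radius`) and the methods are assumptions made only for illustration.

```python
import math


class Shape:
    """Super class: holds what all shapes have in common."""

    def __init__(self, name):
        self.name = name

    def area(self):
        # Each sub class supplies its own formula.
        raise NotImplementedError

    def describe(self):
        return f"{self.name} with area {self.area():.2f}"


class Square(Shape):
    """Sub class: inherits name and describe(), adds its own attribute."""

    def __init__(self, side):
        super().__init__("Square")
        self.side = side

    def area(self):
        return self.side ** 2


class Circle(Shape):
    def __init__(self, radius):
        super().__init__("Circle")
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


shapes = [Square(2), Circle(1)]
descriptions = [s.describe() for s in shapes]
```

Every `Square` is also a `Shape`, which mirrors the rule that a sub class member is always a member of its super class.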
Enhanced ER Model
Generalization –
Generalization is the process of extracting the properties common to several entities into a single, more general entity.
Ø It is a bottom-up approach, in which two or more lower level entities combine to form a higher level entity.
Ø Generalization is the reverse process of Specialization.
Ø It defines a general entity type from a set of specialized entity types.
Ø It minimizes the difference between the entities by identifying their common features.
In the given example, Tiger, Lion and Elephant can all be generalized as Animals.
Enhanced ER Model
Specialization –
Specialization is a process that divides a group of entities into sub groups based on their characteristics.
Ø It is a top-down approach, in which one higher level entity is broken down into two or more lower level entities.
Ø It maximizes the difference between the members of an entity by identifying the unique characteristics or attributes of each member.
Ø It defines one or more sub classes for the super class and also forms the superclass/subclass relationship.
In the given example, Employee can be specialized as Developer or Tester, based on the role they play in an organization.
Enhanced ER Model
Aggregation –
Aggregation is a process that represents a relationship between a whole object and its component parts.
Ø It abstracts a relationship between objects and views that relationship as an object.
Ø It is the process in which a relationship between two entities is treated as a single entity.
In the given example, the relationship between College and Course acts as an entity that is in a relationship with Student.
Enhanced ER Model
Category or Union
Ø A category represents a single superclass/subclass relationship with more than one super class.
Ø Its participation can be total or partial.
For example, in car booking the car owner can be a person, a bank (which holds possession of a car) or a company. The category (sub class) Owner is a subset of the union of the three super classes Company, Bank and Person. A category member must exist in at least one of its super classes.
Overview of Unified Modeling Language (UML).
UML
➢ UML stands for Unified Modeling Language. It is a standardized, general-purpose visual modeling language in the field of Software Engineering.
➢ It is used for specifying, visualizing, constructing, and documenting the primary artifacts of a software system.
➢ It helps in designing and characterizing software systems, especially those that incorporate the concept of object orientation.
➢ It describes the working of both software and hardware systems.
Goals of UML
•Since it is a general-purpose modeling language, it can be used by all modelers.
•UML came into existence after the introduction of object-oriented concepts, to systematize and consolidate object-oriented development, which lacked standard methods at that time.
•UML diagrams are made for business users, developers, ordinary people, or anyone who wants to understand a system, whether that system is software or non-software.
•Thus UML is a simple modeling approach that can be used to model all practical systems.
Characteristics of UML
The UML has the following features:
➢ It is a generalized modeling language.
➢ It is not a programming language and is distinct from languages such as C++ and Python.
➢ It is interrelated to object-oriented analysis and design.
➢ It is used to visualize the workflow of the system.
➢ It is a pictorial language, used to generate powerful modeling artifacts.
Conceptual Modeling
Before moving ahead with the concept of UML, we should first understand the basics of the conceptual model.
A conceptual model is composed of several interrelated concepts. It makes it easy to understand the objects and how
they interact with each other. This is the first step before drawing UML diagrams.
Following are some object-oriented concepts that are needed to begin with UML:
➢ Object: An object is a real world entity. There are many objects present within a single system. It is a fundamental
building block of UML.
➢ Class: A class is a software blueprint for objects, which means that it defines the variables and methods common to all
the objects of a particular type.
➢ Abstraction: Abstraction is the process of portraying the essential characteristics of an object to the users while hiding
the irrelevant information. Basically, it is used to envision the functioning of an object.
➢ Inheritance: Inheritance is the process of deriving a new class from the existing ones.
➢ Polymorphism: It is a mechanism of representing objects having multiple forms used for different purposes.
➢ Encapsulation: It binds the data and the methods that operate on that data together as a single unit, so that the data is accessed only through the object.
OO Analysis and Design
OO analysis means identifying objects, and design means combining those identified objects. So, the main purpose of
OO analysis is to identify the objects needed to design a system. The analysis can also be done for an existing
system. The analysis is more efficient when the objects can be identified clearly. Once the objects are identified,
their relationships are identified, and then the design is produced.
The purpose of OO is given below:
•To identify the objects of a system.
•To identify their relationships.
•To make a design that is executable when the concepts of OO are employed.
Step 1: OO Analysis
The main purpose of OO analysis is identifying the objects and describing them correctly. After the objects
are identified, the design step is easily carried out. The objects must be identified together with their responsibilities.
Here a responsibility refers to the functions performed by an object. Each individual object has its own
functions to perform. The purpose of the system is fulfilled through the collaboration of these responsibilities.
Step 2: OO Design
This phase mainly emphasizes meeting the requirements. In this phase, the objects are joined together as
per the intended associations. Once the associations are in place, the design phase is complete.
Step 3: OO Implementation
This is the last phase that comes after the designing is done. It implements the design using any OO
languages like C++, Java, etc.
Role of UML in OO design

➢ UML is a modeling language used to model software as well as non-software systems, but here it
focuses on modeling OO software applications. It is essential to understand the relation between OO
design and UML. An OO design can be converted into UML as and when required. OO
languages influence the programming world as they model real world objects.
➢ UML itself is an amalgamation of object-oriented notations such as Object-Oriented Design (OOD),
Object Modeling Technique (OMT), and Object-Oriented Software Engineering (OOSE). UML combines the
strengths of these three approaches to achieve greater consistency.
Some of the parts of UML are:
• Class diagram. A class diagram is similar to an E-R diagram. Later in this
section we illustrate a few features of class diagrams and how they relate
to E-R diagrams.
• Use case diagram. Use case diagrams show the interaction between
users and the system, in particular the steps of tasks that users perform (such
as withdrawing money or registering for a course).
• Activity diagram. Activity diagrams depict the flow of tasks between
various components of a system.
• Implementation diagram. Implementation diagrams show the system
components and their interconnections, both at the software component
level and the hardware component level.
Alternative ER Notations
ER Diagram Design Issues
 Use of Entity Sets v/s Attributes
 Use of Entity Sets v/s Relationship Sets
 Binary v/s n-ary relationships
 Placement of Relationship Attributes
Use of Entity Sets v/s Attributes
 Consider the entity set instructor with the additional attribute phone number. (Fig a)
 It can easily be argued that a phone is an entity in its own right, with attributes phone number and location; the location may be the office or home where the phone is located, with mobile phones perhaps represented by the value "mobile".
 Under this view, phone number is not kept as an attribute of instructor. Instead we create:
 A phone entity set with attributes phone number and location.
 A relationship set inst_phone, denoting the association between instructors and the phones that they have. (Fig b)
Use of Entity Sets v/s Relationship Sets
 The takes relationship set models the situation where a student takes a (section of a) course.
 An alternative is to imagine that there is a course-registration record for each course that each student takes.
 Then, we have an entity set to represent the course-registration record.
 Let us call that entity set registration.
 Each registration entity is related to exactly one student and to exactly one section, so we have two relationship sets: one relating course-registration records to students and one relating course-registration records to sections.
Binary v/s n-ary relationships
Ternary relationship versus three binary relationships
Placement of Relationship Attributes
 The design decision of where to place descriptive attributes for a
relationship or for an entity should reflect the characteristics of the
enterprise being modeled.
Removing Redundant Attributes
Database design using E-R Model
 Identify the entity sets
 Choose appropriate attributes
 Form the relationship sets
 Eliminate redundant attributes that may:
➢ Exist in multiple entity sets
E.g. instructor_id, instructor_dept_id and instructor_dept_name are repeated in both the instructor and student entity sets. The instructor_dept_id and instructor_dept_name attributes may be eliminated from the student entity set.
➢ Exist even when not actually required
Fundamental Operations
 Select
 Project
 Union
 Set difference
 Cartesian product
 Rename
Note: select, project and rename are called ‘unary’ operators; the remaining operators are ‘binary’ operators.
Additional Operations
 Set intersection
 Natural join
 Outer joins
▪ Left outer join
▪ Right outer join
▪ Full outer join
 Assignment
Notations used
Select Operation
 The select operation returns the tuples that satisfy a given predicate (condition).
 We use the lowercase Greek letter sigma (σ) to denote selection.
 It produces a “horizontal” subset.
 Syntax: σ_C(R)
 where C is a selection condition built from comparison operators (=, <, >, <=, >=, <>)
 and R is the relation over which the selection takes place
Example of Select
Student
sid   name   addr
123   Fred   3 Oxford
345   John   6 Hope Rd.
567   Ann    5 Garden

Find all students whose id is above 300.

Query: σ_{sid > 300}(Student)
Resulting Relation:
sid   name   addr
345   John   6 Hope Rd.
567   Ann    5 Garden
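The same selection can be sketched in Python, assuming a relation is represented as a list of dictionaries (one dictionary per tuple). The helper name `select` is hypothetical.

```python
student = [
    {"sid": 123, "name": "Fred", "addr": "3 Oxford"},
    {"sid": 345, "name": "John", "addr": "6 Hope Rd."},
    {"sid": 567, "name": "Ann", "addr": "5 Garden"},
]


def select(relation, predicate):
    """Return the tuples of `relation` for which `predicate` is true."""
    return [row for row in relation if predicate(row)]


# sigma_{sid > 300}(Student)
result = select(student, lambda row: row["sid"] > 300)
```

All attributes of the matching tuples are kept; only the number of rows shrinks, which is why the result is called a horizontal subset.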
Project Operation
 The project operation returns only the specified attributes (columns) of a relation.
 We use the uppercase Greek letter pi (Π) to denote projection.
 It produces a “vertical” subset.
 It eliminates duplicate tuples from the result.
 Syntax: Π_A(R)
 where A is a set of attributes of R
 and R is the relation over which the projection takes place
Example of Project
Enrollment
sid   cid     grade
123   CS51T   76
234   CS52S   50
345   CS52S   55

Display all course id values.

Query: Π_{cid}(Enrollment)
cid
CS51T
CS52S
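A possible Python sketch of projection, again assuming relations are lists of dictionaries; the helper name `project` is hypothetical. Note the explicit duplicate removal, which is what reduces the two CS52S rows to one.

```python
enrollment = [
    {"sid": 123, "cid": "CS51T", "grade": 76},
    {"sid": 234, "cid": "CS52S", "grade": 50},
    {"sid": 345, "cid": "CS52S", "grade": 55},
]


def project(relation, attributes):
    """Keep only `attributes` of each tuple and drop duplicate results."""
    seen = set()
    result = []
    for row in relation:
        projected = tuple(row[a] for a in attributes)
        if projected not in seen:  # relations are sets: no duplicates
            seen.add(projected)
            result.append(dict(zip(attributes, projected)))
    return result


# Pi_{cid}(Enrollment)
cids = project(enrollment, ["cid"])
```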
SELECTION & PROJECTION Example
Person
Id     Name   Address     Hobby
1123   John   123 Main    stamps
1123   John   123 Main    coins
5556   Mary   7 Lake Dr   hiking
9876   Bart   5 Pine St   stamps

Π_{Name, Hobby}(Person)
Name   Hobby
John   stamps
John   coins
Mary   hiking
Bart   stamps

σ_{Hobby=‘stamps’}(Person)
Id     Name   Address     Hobby
1123   John   123 Main    stamps
9876   Bart   5 Pine St   stamps
Examples
 σ_{Id>3000 OR Hobby=‘hiking’}(Person)
 σ_{Id>3000 AND Id<3999}(Person)
 σ_{NOT(Hobby=‘hiking’)}(Person)
 σ_{Hobby≠‘hiking’}(Person)
 Π_{Id}(Person)
 Π_{Hobby}(Person)
Relational Algebra Expressions
Union Operation
 Creates a relation that contains all the tuples that appear in either or both of the relations.
 Two relations are union compatible if:
 Both have the same number of columns
 Corresponding attributes are the same in both (same names and compatible domains)
 Represented using the symbol ‘∪’.
 Duplicates are eliminated.
 Syntax: S1 ∪ S2
 where S1, S2 are separate relations
Example
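As a small sketch in Python, assume two hypothetical union-compatible relations, each stored as a set of (sid, name) tuples. The data below is made up for illustration.

```python
# Two union-compatible relations: same number of columns, same attributes.
s1 = {(123, "Fred"), (345, "John")}
s2 = {(345, "John"), (567, "Ann")}

# S1 ∪ S2: a Python set drops the duplicate (345, "John") automatically.
union = s1 | s2
```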
The Set-Difference Operation
 Allows us to find tuples that are in one relation but are not in another.
 Denoted by the symbol ‘−’.
 Syntax: S1 − S2
 where S1, S2 are separate relations
Example
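A minimal Python sketch of set difference, assuming hypothetical relations stored as sets of tuples. The order of the operands matters, unlike union.

```python
s1 = {(123, "Fred"), (345, "John")}
s2 = {(345, "John"), (567, "Ann")}

# S1 − S2: tuples that are in S1 but not in S2.
only_in_s1 = s1 - s2
# S2 − S1 gives a different result.
only_in_s2 = s2 - s1
```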
Cartesian Product
 The Cartesian product of two tables combines each row in one table with each row in the other table.
 A relation is by definition a subset of a Cartesian product of a set of domains.
 Denoted by the symbol ‘×’.
 Syntax: S1 × R1
 where S1, R1 are separate relations
Example
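The pairing of every row with every row can be sketched with `itertools.product` from the Python standard library. The two small relations are hypothetical.

```python
from itertools import product

s1 = [(123, "Fred"), (567, "Ann")]   # (sid, name)
r1 = [("CS51T",), ("CS52S",)]        # (cid,)

# S1 × R1: concatenate each tuple of S1 with each tuple of R1.
cartesian = [s + r for s, r in product(s1, r1)]
```

With 2 tuples in each relation the product has 2 × 2 = 4 tuples, each with 2 + 1 = 3 attributes.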
Rename
 The results of relational-algebra expressions do not have a name that we can use to refer to them.
 It is useful to be able to give them names using the rename operator.
 Denoted by the lowercase Greek letter rho (ρ)
 Syntax: ρ_x(E)
 where x is the new name for the relational algebra expression E.
Assignment Operator
 It is convenient at times to write a relational-algebra expression by assigning parts of it to temporary relation variables.
 The assignment operation, denoted by ←, works like assignment in a programming language.
 Example: r1 ← s1 × s2
 Here s1 and s2 are two separate relations, whereas r1 is a temporary relation variable.
Set Intersection
 Creates a relation containing only the tuples that are common to both relations.
 Represented using the symbol ‘∩’.
 Syntax: S1 ∩ S2
 where S1, S2 are separate relations
Example
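A short Python sketch with hypothetical relations stored as sets of tuples. It also checks the standard identity S1 ∩ S2 = S1 − (S1 − S2), which is why intersection counts as an additional rather than a fundamental operation.

```python
s1 = {(123, "Fred"), (345, "John")}
s2 = {(345, "John"), (567, "Ann")}

# S1 ∩ S2: tuples present in both relations.
common = s1 & s2

# The same result expressed with set difference only.
common_via_difference = s1 - (s1 - s2)
```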
Natural Join
 A natural join combines tuples from two relations that have equal values in all the attributes the relations have in common; each common attribute appears only once in the result.
 The natural join is denoted as R ⋈ S,
 where R and S are two different relations.
Example
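One way to sketch a natural join in Python, assuming relations are non-empty lists of dictionaries whose keys are the attribute names. The helper `natural_join` and the sample data are hypothetical.

```python
def natural_join(r, s):
    """Join tuples of r and s that agree on all common attributes."""
    if not r or not s:
        return []
    # Attributes that appear in both relations.
    common = [a for a in r[0] if a in s[0]]
    result = []
    for row_r in r:
        for row_s in s:
            if all(row_r[a] == row_s[a] for a in common):
                # Merging the dicts keeps each common attribute once.
                result.append({**row_r, **row_s})
    return result


student = [{"sid": 123, "name": "Fred"}, {"sid": 345, "name": "John"}]
enrollment = [
    {"sid": 123, "cid": "CS51T"},
    {"sid": 345, "cid": "CS52S"},
    {"sid": 234, "cid": "CS52S"},
]

joined = natural_join(student, enrollment)
```

The enrollment row with sid 234 has no matching student, so it does not appear in the result.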
Left Outer Join
 The left outer join (the LEFT JOIN keyword in SQL) returns all rows from the left table (table1), together with the matching rows from the right table (table2).
 The result is NULL on the right side when there is no match.
 In some databases LEFT JOIN is called LEFT OUTER JOIN.
 Denoted using ⟕
Example
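A rough Python sketch of a left outer join on a single key attribute, with `None` standing in for NULL. The helper name, its parameters and the data are assumptions for illustration only.

```python
def left_outer_join(left, right, key, right_attrs):
    """Keep every left row; pad with None when no right row matches."""
    result = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[key] == l_row[key]]
        if matches:
            result.extend({**l_row, **m} for m in matches)
        else:
            # No match: right-side attributes become None (NULL).
            result.append({**l_row, **{a: None for a in right_attrs}})
    return result


student = [{"sid": 123, "name": "Fred"}, {"sid": 567, "name": "Ann"}]
enrollment = [{"sid": 123, "cid": "CS51T"}]

rows = left_outer_join(student, enrollment, "sid", ["cid"])
```

A right outer join is the same operation with the roles of the two relations swapped.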
Right Outer Join
 The right outer join (the RIGHT JOIN keyword in SQL) returns all rows from the right table (table2), together with the matching rows from the left table (table1).
 The result is NULL on the left side when there is no match.
 In some databases RIGHT JOIN is called RIGHT OUTER JOIN.
 Denoted using ⟖
Example
Full Outer Join
 The full outer join (the FULL OUTER JOIN keyword in SQL) returns all rows from the left table (table1) and from the right table (table2).
 It combines the results of both the LEFT and RIGHT joins.
 Denoted using ⟗
Example
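A possible Python sketch of a full outer join on one key attribute, with `None` standing in for NULL. The helper name, the parameter lists of non-key attributes and the data are all hypothetical.

```python
def full_outer_join(left, right, key, left_attrs, right_attrs):
    """Keep every row of both relations, padding the missing side with None."""
    result = []
    matched_right = set()
    for l_row in left:
        found = False
        for i, r_row in enumerate(right):
            if l_row[key] == r_row[key]:
                result.append({**l_row, **r_row})
                matched_right.add(i)
                found = True
        if not found:
            # Left row without a partner: right attributes are None.
            result.append({**l_row, **{a: None for a in right_attrs}})
    for i, r_row in enumerate(right):
        if i not in matched_right:
            # Right row without a partner: left attributes are None.
            result.append({**{a: None for a in left_attrs}, **r_row})
    return result


student = [{"sid": 123, "name": "Fred"}, {"sid": 567, "name": "Ann"}]
enrollment = [{"sid": 123, "cid": "CS51T"}, {"sid": 234, "cid": "CS52S"}]

rows = full_outer_join(student, enrollment, "sid", ["name"], ["cid"])
```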
Theta Join (θ-Join)
 R ⋈_F S
 Defines a relation that contains the tuples satisfying the predicate F from the Cartesian product of R and S.
• A theta join can be rewritten using the basic selection and Cartesian product operations:
R ⋈_F S = σ_F(R × S)
Contd..
 The join condition F can use any of the comparison operators (=, <, >, <=, >=, <>).
 When the join condition uses only the operator ‘=’, the join is called an equijoin.
 Note: The common attributes are repeated in an equijoin, whereas this redundancy is eliminated in a natural join.
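The rewrite of a theta join as a selection over a Cartesian product can be sketched directly in Python. The relations, attribute names and the helper `theta_join` are hypothetical, and the two relations are assumed to have no attribute names in common.

```python
from itertools import product


def theta_join(r, s, predicate):
    """sigma_F(R × S): pair every tuple, keep pairs satisfying the predicate."""
    return [{**a, **b} for a, b in product(r, s) if predicate(a, b)]


emp = [{"name": "Asha", "salary": 3000}, {"name": "Ben", "salary": 5000}]
band = [
    {"band": "junior", "min_salary": 2000},
    {"band": "senior", "min_salary": 4000},
]

# Theta join with the condition salary >= min_salary.
qualifies = theta_join(emp, band, lambda e, b: e["salary"] >= b["min_salary"])

# An equijoin is the special case where the condition uses only '='.
exact = theta_join(emp, band, lambda e, b: e["salary"] == b["min_salary"])
```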
Aggregate Functions
 Aggregate functions perform a calculation on a set of values and return a single value.
 Denoted using the symbol ‘𝒢’ (calligraphic G).
 Examples:
➢ SUM
➢ MINIMUM (MIN)
➢ MAXIMUM (MAX)
➢ AVERAGE (AVG)
➢ COUNT
Example
 Assume the relation EMP has the following tuples:
 Find the minimum salary: 𝒢_{MIN(salary)}(EMP)
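The aggregate functions listed above correspond to Python built-ins. The EMP tuples below are hypothetical sample data chosen only to make the sketch runnable; they are not the tuples of the original example.

```python
# Hypothetical EMP relation (assumed data for illustration).
emp = [
    {"name": "Asha", "salary": 3000},
    {"name": "Ben", "salary": 5000},
    {"name": "Chen", "salary": 4000},
]

salaries = [row["salary"] for row in emp]

min_salary = min(salaries)                   # MIN(salary)
max_salary = max(salaries)                   # MAX(salary)
total_salary = sum(salaries)                 # SUM(salary)
avg_salary = total_salary / len(salaries)    # AVG(salary)
emp_count = len(emp)                         # COUNT
```

Each aggregate collapses the whole column of values into a single value.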
