Module 1: DBMS

CSE3156 - Database
Management Systems
Presidency University, Bengaluru

Definitions

Data: Raw, unprocessed facts; that is, known facts that can be recorded
and have an implicit meaning.
Ex: 25, Suresh, Bangalore


Structured: numbers, text, dates

Unstructured: images, video, documents

Information: Processed Data
Ex: The age of Suresh is 25
Basic Definitions
• Database: A collection of related data.
• Data: Known facts that can be recorded and have an
implicit meaning.
• Mini-world: Some part of the real world about which
data is stored in a database. For example, student
grades and transcripts at a university.
• Database Management System (DBMS): A software
package/ system to facilitate the creation and
maintenance of a computerized database.
• Database System: The DBMS software together with
the data itself. Sometimes, the applications are also
included.
Definitions
Database:

Def 1: Database is an organized collection of logically related data

Def 2: A database is a shared collection of logically related data that is stored to
meet the requirements of different users of an organization

Def 3: A database is a self-describing collection of integrated records

Def 4: A database models a particular real world system in the computer in the
form of data
Ex: Online banking system, Library Management
What is a DBMS?
A DBMS provides users and programmers with a systematic way to create, retrieve, update, and manage a database. Equivalently, it is:
A software package/ system to facilitate the creation and maintenance of a computerized database.
Database System:
The DBMS software together with the data itself. Sometimes, the applications are also included.
Advantages
• Multiple user interfaces
• Redundancy control
• Backup and recovery
• Authorized access
Main Characteristics of the Database
Approach
• Self-describing nature of a database system
• Insulation between programs and data
• Data Abstraction
• Support of multiple views of the data
• Sharing of data and multiuser transaction processing
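The self-describing nature listed above can be observed directly: a DBMS keeps the schema (metadata) in a catalog next to the data. The following is a minimal sketch using Python's built-in sqlite3 module. The `student` table and its columns are illustrative assumptions; `sqlite_master` is SQLite's own catalog table, and other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE student (sid TEXT PRIMARY KEY, name TEXT, age INTEGER)")

# The catalog is itself queryable: it records each table's name and definition.
catalog = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()
table_names = [name for name, _ in catalog]

# PRAGMA table_info lists the columns of one table; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]
print(table_names, columns)
```

A program can therefore discover the structure of the database at run time instead of hard-coding it, which is what insulates programs from the data definition.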

Types of Databases and Database
Applications
Numeric and Textual Databases
Multimedia Databases
Geographic Information Systems (GIS)
Data Warehouses
Real-time and Active Databases

Database Users
Users may be divided into those who actually use and control the
content (called “Actors on the Scene”) and those who enable the
database to be developed and the DBMS software to be designed and
implemented (called “Workers Behind the Scene”).

Actors on the Scene
• Database administrators
• Database designers
• End users
  Categories of end users:
  • Casual
  • Naïve or parametric
  • Sophisticated
  • Stand-alone
Workers Behind the Scene
• DBMS system designers and implementers
• Tool developers
• Operators and maintenance personnel

Advantages of Using the
Database Approach
• Controlling redundancy in data storage and in development and maintenance efforts.
• Sharing of data among multiple users.
• Restricting unauthorized access to data.
• Providing persistent storage for program objects.
• Providing storage structures for efficient query processing.
• Providing backup and recovery services.
• Providing multiple interfaces to different classes of users.
• Representing complex relationships among data.
• Enforcing integrity constraints on the database.
• Drawing inferences and taking actions using rules.
Additional Implications of Using the Database
Approach
• Potential for enforcing standards
• Reduced application development time
• Flexibility to change data structures
• Availability of up-to-date information
• Economies of scale
Data Models
Data Model: A set of concepts to describe the structure of a
database, and certain constraints that the database should obey.
Data Model Operations: Operations for specifying database
retrievals and updates by referring to the concepts of the
data model. Operations on the data model may include basic
operations and user-defined operations.
Data Models
A data model gives an idea of how the final system will look after its
complete implementation.
It defines the data elements and the relationships between them.
Data models show how data is stored, connected, accessed, and updated in
the database management system.
They use a set of symbols and text to represent the information so that
members of the organization can communicate and understand it.
Categories of data models
Conceptual (high-level, semantic) data models: Provide concepts that are close to the way
many users perceive data. (Also called entity-based or object-based data models.)
Physical (low-level, internal) data models: Provide concepts that describe details of how data
is stored in the computer.
Implementation (representational) data models: Provide concepts that fall between the above
two, balancing user views with some computer storage details.
Some of the Data Models in DBMS are:
• Hierarchical Model
• Network Model
• Entity-Relationship Model
• Relational Model
• Object-Oriented Data Model
• Object-Relational Data Model
• Flat Data Model
• Semi-Structured Data Model
• Associative Data Model
• Context Data Model

Relational Model
The relational model represents
the database as a collection of
relations.
A relation is nothing but a table of
values.
Every row in the table represents
a collection of related data values.
These rows in the table denote
real-world entities or relationships.
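As a rough illustration of "a relation is a table of values", a relation can be sketched in Python as a set of tuples that all share the same attributes. The `Student` attributes and the sample rows below are assumptions made up for this sketch.

```python
from typing import NamedTuple

class Student(NamedTuple):
    """One row (tuple) of the relation; the fields are its attributes."""
    sid: str
    name: str
    age: int

# A relation is a set of rows, so an exact duplicate row collapses into one.
students = {
    Student("S1", "Suresh", 25),
    Student("S2", "Anita", 23),
    Student("S1", "Suresh", 25),
}

names = sorted(row.name for row in students)
print(len(students), names)
```

Using a set rather than a list mirrors the fact that a relation has no duplicate rows and no inherent row order.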
Network model
The network model is a database model
designed as a flexible approach to representing
objects and their relationships.
A distinguishing feature of the network
model is its schema, which is viewed
as a graph in which object types are nodes
and relationship types are arcs.
Network Model
ADVANTAGES:
The network model can represent complex relationships and the
semantics of add/delete operations on those relationships.
It can handle most modeling situations using record types and
relationship types.
DISADVANTAGES:
Processing is navigational and procedural.
The database contains a complex array of pointers that thread through a set
of records, which leaves little scope for automated “query optimization”.
Hierarchical database model
The hierarchical database model is a data model in which the
data are organized into a tree-like structure. The data are
stored as records which are connected to one another through
links. A record is a collection of fields, with each field
containing only one value. The type of a record defines which
fields the record contains.
The hierarchical database model mandates that each child
record has only one parent, whereas each parent record can
have one or more child records
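The one-parent rule can be sketched with a plain mapping from each child record to its single parent. The record names below are hypothetical, and a real hierarchical DBMS would store links between records rather than a Python dictionary.

```python
# Each key (child) maps to exactly one parent, so a child cannot have two parents.
parent_of = {
    "CSE": "University",
    "ECE": "University",
    "DBMS": "CSE",
    "Networks": "CSE",
}

def path_to_root(record):
    """Follow parent links upward; this is the 'hierarchical path' of a record."""
    path = [record]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path

def children_of(record):
    """A parent may have any number of children (a 1:M relationship)."""
    return sorted(child for child, parent in parent_of.items() if parent == record)

print(path_to_root("DBMS"), children_of("CSE"))
```

Reaching a record always means walking such a path from the root, which is why applications need to know the hierarchy.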

Hierarchical Model
ADVANTAGES:
• It promotes data sharing.
• Parent/child relationship promotes conceptual simplicity.
• Database security is provided and enforced by DBMS.
• Parent/child relationship promotes data integrity.
• It is efficient with 1:M relationships.

Hierarchical Model
DISADVANTAGES:
Complex implementation requires knowledge of physical data storage
characteristics.
Navigational system yields complex application development, management, and use;
requires knowledge of hierarchical path.
Changes in structure require changes in all application programs.
There are implementation limitations (no multiparent or M:N relationships).
There is no data definition or data manipulation language in the DBMS.
Object Oriented (OO) Data Model
Increasingly complex real-world problems
demonstrated a need for a data model that more
closely represented the real world. In the object
oriented data model (OODM), both data and their
relationships are contained in a single structure
known as an object.
Object relational model
The object-relational model combines the object-oriented database model and
the relational database model. It supports objects, classes, and inheritance,
like object-oriented models, and it supports data types and tabular structures,
like the relational data model.
One of the major goals of the object-relational data model is to close the gap
between relational databases and the object-oriented practices frequently
used in programming languages such as C++, C#, and Java.
Both relational and object-oriented data models are very useful, but each was
felt to lack some characteristics of the other, so work was started on a model
that combines them.
DBMS Languages
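Types of DBMS languages:
• Data Definition Language (DDL)
• Data Manipulation Language (DML)
• Data Control Language (DCL)
• Transaction Control Language (TCL)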
Data Definition
Language (DDL)
• DDL is used for specifying the database schema.
• It is used for creating tables, schema, indexes, constraints etc. in
database.
• CREATE
• ALTER
• DROP
• TRUNCATE
• RENAME
• COMMENT
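A few of the DDL commands above can be tried with Python's built-in sqlite3 module. This is only a sketch: the `course` table is an assumed example, and SQLite does not provide TRUNCATE or COMMENT statements, so those two are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define a new table (part of the schema).
conn.execute("CREATE TABLE course (cid TEXT PRIMARY KEY, cname TEXT)")

# ALTER: change the definition of an existing table.
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")

# RENAME: in SQLite this is written as a form of ALTER TABLE.
conn.execute("ALTER TABLE course RENAME TO courses")
cols = [row[1] for row in conn.execute("PRAGMA table_info(courses)")]

# DROP: remove the table definition together with its data.
conn.execute("DROP TABLE courses")
remaining = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(cols, remaining)
```

Every statement here changes the schema rather than the rows, which is the distinction between DDL and DML.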
Data Manipulation
Language (DML)
DML is used for accessing and manipulating data in a database. The following
operations come under DML:
• SELECT
• INSERT
• UPDATE
• DELETE
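The four DML operations can be sketched with Python's built-in sqlite3 module. The `students` table and its rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (sid TEXT PRIMARY KEY, name TEXT, age INTEGER)")

# INSERT: add new rows.
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("S1", "Suresh", 25), ("S2", "Anita", 23)],
)

# UPDATE: change values in existing rows.
conn.execute("UPDATE students SET age = age + 1 WHERE sid = ?", ("S1",))

# DELETE: remove rows that match a condition.
conn.execute("DELETE FROM students WHERE sid = ?", ("S2",))

# SELECT: retrieve rows.
rows = conn.execute("SELECT sid, name, age FROM students ORDER BY sid").fetchall()
print(rows)
```

None of these statements alters the table definition; they only change or read the database state.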
Data Control Language
(DCL)
DCL is used for granting and revoking user access to a database:
• GRANT
• REVOKE
Transaction Control
Language (TCL)
Changes made to the database with DML commands are either made
permanent (committed) or undone (rolled back) using TCL.
• COMMIT
• ROLLBACK
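COMMIT and ROLLBACK can be sketched with Python's built-in sqlite3 module, whose connection object exposes `commit()` and `rollback()`. The `account` table and the amounts are assumptions, and the sketch relies on sqlite3's default behaviour of opening a transaction implicitly before a DML statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no TEXT PRIMARY KEY, balance INTEGER)")

# This change is made permanent by COMMIT.
conn.execute("INSERT INTO account VALUES ('A1', 100)")
conn.commit()

# This change is undone by ROLLBACK, so the stored balance is unaffected.
conn.execute("UPDATE account SET balance = balance - 500 WHERE acc_no = 'A1'")
conn.rollback()

balance = conn.execute(
    "SELECT balance FROM account WHERE acc_no = 'A1'"
).fetchone()[0]
print(balance)
```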
1. Schemas, Instances, and Database State
Database Schema
A database schema is the logical representation of a database, which
shows how the data is organized logically in the entire database.
Employee Schema
Department Schema
Dept_Location Schema
Instances
An instance of the database is the set of values held by the schema's
attributes at a given point in time.
Employee Schema
Instances
Database State
The data in the database at a particular moment in
time is called a database state or snapshot.

Schemas versus Instances
• Database Schema
• Schema Diagram
• Schema Construct
• Database Instance
Database Schema Vs. Database State
• Database State: Refers to the content of a database at a moment in time.
• Initial Database State: Refers to the database state when it is initially loaded into the system.
• Valid State: A state that satisfies the structure and constraints of the database.
• Distinction
• The database schema changes very infrequently. The database state changes
every time the database is updated.
• Schema is also called intension, whereas state is called extension.
Example: University
Database
• Conceptual schema:
  • Students(sid: string, name: string, login: string, age: integer, gpa: real)
  • Courses(cid: string, cname: string, credits: integer)
  • Enrolled(sid: string, cid: string, grade: string)
• Physical schema:
  • Relations stored as unordered files.
  • Index on first column of Students.
• External schema (view):
  • Course_info(cid: string, enrollment: integer)
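The external schema can be sketched as a view defined over the conceptual schema, here with Python's built-in sqlite3 module. The view definition (counting the rows of Enrolled per course) is an assumption about how `enrollment` would be computed, and the inserted rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema
    CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, login TEXT,
                           age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- External schema: users of this view never see individual students.
    CREATE VIEW Course_info AS
        SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;
""")

conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [("S1", "C1", "A"), ("S2", "C1", "B"), ("S1", "C2", "A")],
)

info = conn.execute("SELECT cid, enrollment FROM Course_info ORDER BY cid").fetchall()
print(info)
```

The view stores no data of its own; it is recomputed from the base tables each time it is queried.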
[Figure: Example of a database schema]
[Figure: Example of a database state]
2. Three-Schema Architecture
Three-Schema Architecture
Defines DBMS schemas at three levels:
1. Internal schema
2. Conceptual schema
3. External schemas
Type of schema and its implementation:
External schema
• View 1: Course_info(cid: int, cname: string)
• View 2: Student_info(id: int, name: string)
Conceptual schema
• Students(id: int, name: string, login: string, age: integer)
• Courses(id: int, cname: string, credits: integer)
• Enrolled(id: int, grade: string)
Physical schema
• Relations stored as unordered files.
• Index on the first column of Students.
Three-Schema Architecture
The three-schema architecture was proposed to support the following DBMS
characteristics:
• Program-data independence.
• Support of multiple views of the data.

3. Data Independence
When a schema at a lower level is changed, only the mappings between this
schema and higher-level schemas need to be changed in a DBMS that fully supports
data independence.
The higher-level schemas themselves are unchanged. Hence, the application
programs need not be changed since they refer to the external schemas.
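Data Independence:
We can define two types of data independence:
1. Logical data independence: the capacity to change the conceptual schema without having to change the external schemas or application programs.
2. Physical data independence: the capacity to change the internal schema without having to change the conceptual schema.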
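Physical data independence can be illustrated with a small experiment using Python's built-in sqlite3 module: adding an index changes how the data is accessed internally, yet the same query text returns the same result. The table, rows, and index name are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [("S1", "Suresh", 25), ("S2", "Anita", 23)],
)

# The application's query, written against the conceptual schema only.
query = "SELECT name FROM Students WHERE age > 23 ORDER BY name"
before = conn.execute(query).fetchall()

# A physical-level change: a new access path on the age column.
conn.execute("CREATE INDEX idx_students_age ON Students(age)")
after = conn.execute(query).fetchall()

print(before == after, after)
```

The application never mentions the index, so it does not need to change when the index is added or dropped.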

4. What is a File System?
• A file system is a technique of arranging the files in a storage
medium like a hard disk, pen drive, DVD, etc.
• It mostly consists of different types of files like mp3, mp4, txt,
doc, etc. that are grouped into directories.

Drawbacks of File system
• Data redundancy: Data redundancy is the duplication of data. Suppose we manage the
data of a college where a student is enrolled in two courses: the same student details
are then stored twice, which takes more storage than needed. Data redundancy often
leads to higher storage costs and poor access time.
• Data inconsistency: Data redundancy leads to data inconsistency. In the same example,
the student's address is stored twice. If the student asks to change the address and it
is changed in one place but not in all the records, the data becomes inconsistent.
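The path from redundancy to inconsistency can be sketched in a few lines of Python. The two lists stand in for two separate course files, and the student record and the new address are made-up values.

```python
# The same student's details are stored once per course file (redundancy).
course_a_file = [{"sid": "S1", "name": "Suresh", "address": "Bangalore"}]
course_b_file = [{"sid": "S1", "name": "Suresh", "address": "Bangalore"}]

# The address change is applied to one file only.
course_a_file[0]["address"] = "Mysore"

# Collect every address on record for student S1 across both files.
addresses = {
    record["address"]
    for file in (course_a_file, course_b_file)
    for record in file
    if record["sid"] == "S1"
}
inconsistent = len(addresses) > 1
print(sorted(addresses), inconsistent)
```

Storing the address once and referring to it from both courses would make this kind of disagreement impossible.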
Cont..
• Data Isolation: Because data are scattered in
various files, and files may be in different formats,
writing new application programs to retrieve the
appropriate data is difficult.
• Dependency on application programs: Changing the
structure of a file requires changing the application
programs that use it.
Cont..
• Atomicity issues: Atomicity of a transaction means “all or
nothing”: either all the operations in a transaction execute
or none of them do.
• It is difficult to achieve atomicity in file processing
systems.
• Data security: Data should be secured from unauthorised
access. For example, a student in a college should not be
able to see the payroll details of the teachers. Such
security constraints are difficult to apply in file processing
systems.
5. File Systems vs. Database
System
• A file system is software that manages and organizes the files in a storage medium, whereas
a DBMS is a software application used for creating, accessing, and managing databases.
• A file system has no crash recovery mechanism, whereas a DBMS provides one.
• Data inconsistency is higher in a file system and lower in a database management system.
• A file system does not support complicated transactions, whereas in a DBMS it is easy to
implement complicated transactions using SQL.
• A file system does not offer concurrency, whereas a DBMS provides a concurrency facility.
Data Modelling using Entities and
Relationships
Chapter-2
Outline
Using High-Level Conceptual Data Models for Database Design
A Sample Database Application
Entity Types, Entity Sets, Attributes, and Keys
Relationship Types, Relationship Sets, Roles, and Structural Constraints
Weak Entity Types
Refining the ER Design for the COMPANY Database
ER Diagrams, Naming Conventions, and Design Issues
Example of Other Notation: UML Class Diagrams
Relationship Types of Degree Higher than Two
1. Data Modeling Using the Entity-Relationship
(ER) Model
• Entity-Relationship (ER) model
• Popular high-level conceptual data model
• ER diagrams
• Diagrammatic notation associated with the ER model

Using High-Level Conceptual Data Models for
Database Design
• Requirements collection and analysis
• Database designers interview prospective database users to understand and document data
requirements
• Result: data requirements
• Functional requirements of the application

Using High-Level Conceptual Data Models
(cont’d.)
• Conceptual schema
• Conceptual design
• Description of data requirements
• Includes detailed descriptions of the entity types, relationships, and constraints
• Transformed from high-level data model into implementation data model

Using High-Level Conceptual Data Models
(cont’d.)
• Logical design or data model mapping
• Result is a database schema in implementation data model of DBMS
• Physical design phase
• Internal storage structures, file organizations, indexes, access paths, and physical design
parameters for the database files are specified
Entity Types, Entity Sets, Attributes, and Keys
• ER model describes data as:
• Entities
• Relationships
• Attributes
ER Diagrams, Naming Conventions, and Design Issues
Components of the ER Diagram
• This model is based on three basic concepts: entities, attributes, and relationships.
Entities and Attributes
• Entity
• Thing in real world with independent existence
• Attributes
• Particular properties that describe entity
• Types of attributes:
• Composite versus simple (atomic) attributes
• Single-valued versus multivalued attributes
• Stored versus derived attributes
• NULL values
• Complex attributes
Entities and Attributes (cont’d.)
Entity Types, Entity Sets, Keys, and Value Sets
• Entity type
• Collection (or set) of entities that have the same attributes
Types of Attributes
Entity Types, Entity Sets, Keys, and Value
Sets (cont’d.)
• Key or uniqueness constraint
• Attributes whose values are distinct for each individual entity in entity set
• Key attribute
• Uniqueness property must hold for every entity set of the entity type
• Value sets (or domain of values)
• Specifies set of values that may be assigned to that attribute for each individual entity
What are Keys in DBMS?
• A key in a DBMS is an attribute or set of attributes that helps
you identify a row (tuple) in a relation (table).
• Keys allow you to find the relationship between two tables, and
they help you uniquely identify a row in a table by a combination
of one or more columns in that table.
• A key is also helpful for finding a unique record or row in the
table.
Types of Keys
1. Super Key
2. Primary Key
3. Candidate Key
4. Alternate Key
5. Foreign Key
6. Compound Key
7. Composite Key
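The difference between a super key and a candidate key can be sketched in Python: a super key is any attribute set whose values are unique across rows, and a candidate key is a minimal super key. The sample rows are assumptions. Note also that a key is a constraint on every valid state of the relation, so checking one sample instance can only rule keys out, not prove them.

```python
from itertools import combinations

rows = [
    {"roll_no": 1, "email": "a@x.edu", "name": "Anita", "dept": "CSE"},
    {"roll_no": 2, "email": "b@x.edu", "name": "Suresh", "dept": "CSE"},
    {"roll_no": 3, "email": "c@x.edu", "name": "Anita", "dept": "CSE"},
]

def is_superkey(attrs, rows):
    """True if no two rows agree on all of the given attributes."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Minimal super keys of this instance, found smallest-first."""
    attrs = sorted(rows[0])
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in keys)
            if not contains_smaller_key and is_superkey(combo, rows):
                keys.append(combo)
    return keys

print(candidate_keys(rows))
```

One candidate key would be chosen as the primary key; the remaining candidate keys are the alternate keys.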
A Sample Database Application
• COMPANY
• Employees, departments, and projects
• Company is organized into departments
• Department controls a number of projects
• Employee: store each employee’s name, Social Security number, address, salary, sex
(gender), and birth date
• Keep track of the dependents of each employee
Initial Conceptual Design of the COMPANY
Database
Relationship Types, Relationship Sets, Roles, and
Structural Constraints
• Relationship
• When an attribute of one entity type refers to another entity type
• Represent references as relationships not attributes

Relationship Types, Sets, and Instances
• Relationship type R among n entity types E1, E2, ..., En
• Defines a set of associations among entities from these entity types
• A relationship type is a type of association that can exist between
two different (or the same) entity types.
• Relationship instances ri
• Each ri associates n individual entities (e1, e2, ..., en)
• Each entity ej in ri is a member of entity set Ej

Relationship Degree
• Degree of a relationship type
• Number of participating entity types
• Unary, binary, ternary
Role Names and Recursive Relationships
• Role names
• Role name signifies role that a participating entity plays in each relationship instance
• Recursive relationships
• Same entity type participates more than once in a relationship type in different roles
• Must specify role name

Constraints on Binary Relationship Types
• Cardinality ratio for a binary relationship
• Specifies maximum number of relationship instances that entity can participate in
• Participation constraint
• Specifies whether existence of entity depends on its being related to another entity
• Types: total and partial

Attributes of Relationship Types
• Attributes of 1:1 or 1:N relationship types can be migrated to one entity type
• For a 1:N relationship type

• Relationship attribute can be migrated only to entity type on N-side of relationship
• For M:N relationship types

• Some attributes may be determined by combination of participating entities
• Must be specified as relationship attributes

Weak Entity Types
• Do not have key attributes of their own
• Identified by being related to specific entities from another entity type
• Identifying relationship
• Relates a weak entity type to its owner
• Always has a total participation constraint

Refining the ER Design for the
COMPANY Database
• Change attributes that represent relationships into relationship types
• Determine cardinality ratio and participation constraint of each relationship type

Proper Naming of Schema Constructs
• Choose names that convey meanings attached to different constructs in schema
• Nouns give rise to entity type names
• Verbs indicate names of relationship types
• Choose binary relationship names to make ER diagram readable from left to right
and from top to bottom
Alternative Notations for ER Diagrams
• Specify structural constraints on relationships
• Replaces cardinality ratio (1:1, 1:N, M:N) and single/double line notation for participation
constraints
• Associate a pair of integer numbers (min, max) with each participation of an entity type E in
a relationship type R, where 0 ≤ min ≤ max and max ≥ 1
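The (min, max) constraint can be checked mechanically against a set of relationship instances. The sketch below is an illustration only: the function name, the WORKS_FOR sample data, and the use of `None` to stand for N are all assumptions.

```python
from collections import Counter

def satisfies_min_max(entities, instances, position, lo, hi=None):
    """True if every entity takes part in at least lo and at most hi
    relationship instances; hi=None stands for N (no upper bound)."""
    counts = Counter(instance[position] for instance in instances)
    return all(
        counts[e] >= lo and (hi is None or counts[e] <= hi)
        for e in entities
    )

employees = ["e1", "e2", "e3"]
departments = ["d1", "d2"]
works_for = [("e1", "d1"), ("e2", "d1"), ("e3", "d2")]

# EMPLOYEE participates with (1,1): exactly one department each.
employee_ok = satisfies_min_max(employees, works_for, 0, 1, 1)
# DEPARTMENT participates with (1,N): at least one employee each.
department_ok = satisfies_min_max(departments, works_for, 1, 1)
# An employee with no department violates min = 1.
with_unassigned_ok = satisfies_min_max(employees + ["e4"], works_for, 0, 1, 1)
print(employee_ok, department_ok, with_unassigned_ok)
```

Here min = 1 corresponds to total participation and min = 0 to partial participation, while max plays the role of the cardinality ratio.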

Relationship Types of Degree Higher than Two
• Degree of a relationship type
• Number of participating entity types
• Binary
• Relationship type of degree two
• Ternary
• Relationship type of degree three
Choosing between Binary and Ternary (or
Higher-Degree) Relationships
• Some database design tools permit only binary relationships
• Ternary relationship must be represented as a weak entity type
• No partial key and three identifying relationships
• Represent ternary relationship as a regular entity type

• By introducing an artificial or surrogate key
Constraints on Ternary (or Higher-Degree)
Relationships
• Notations for specifying structural constraints on n-ary relationships
• Should both be used if it is important to fully specify structural constraints

Summary
• Basic ER model concepts of entities and their attributes
• Different types of attributes
• Structural constraints on relationships
• ER diagrams represent E-R schemas
• UML class diagrams relate to ER modeling concepts

Components of ER Diagram
An ER diagram is based on three basic concepts:
• Entities
  • Weak Entity
• Attributes
  • Key Attribute
  • Composite Attribute
  • Multivalued Attribute
  • Derived Attribute
• Relationships
  • One-to-One Relationships
  • One-to-Many Relationships
  • Many-to-One Relationships
  • Many-to-Many Relationships
Entities
• An entity can be a living or non-living thing.
• An entity is shown as a rectangle in an ER diagram.
• For example, when a student studies a course, both the student and the course are entities.
Weak Entity
• An entity that depends on another entity for its identification is called a weak entity.
• A weak entity is shown as a double rectangle in an ER diagram.
• For example, a school is a strong entity because it has a primary key attribute, the school
number. A classroom is a weak entity because it does not have a primary key of its own; its
room number acts only as a discriminator.
• Identifying relationship: it links the strong and weak entity and is represented by a double
diamond.
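One common way to realise a weak entity in a relational database (a mapping step that goes beyond the ER notation described here) is to combine the owner's key with the discriminator. The sketch below uses Python's built-in sqlite3 module; the table and column names follow the school/classroom example, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE school (school_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE classroom (
        school_no INTEGER NOT NULL REFERENCES school(school_no),
        room_no   INTEGER NOT NULL,           -- discriminator (partial key)
        PRIMARY KEY (school_no, room_no)      -- owner key + discriminator
    )
""")
conn.executemany("INSERT INTO school VALUES (?, ?)", [(1, "North"), (2, "South")])

# The same room number may appear in two different schools.
conn.executemany("INSERT INTO classroom VALUES (?, ?)", [(1, 101), (2, 101)])

try:  # The same room number twice in one school is rejected.
    conn.execute("INSERT INTO classroom VALUES (1, 101)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

try:  # A classroom cannot exist without its owner school (total participation).
    conn.execute("INSERT INTO classroom VALUES (3, 101)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(duplicate_rejected, orphan_rejected)
```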
Attribute
• An attribute describes a property of an entity.
• An attribute is shown as an oval in an ER diagram.
Key Attribute
• A key attribute uniquely identifies an entity within an entity set.
• The name of a key attribute is underlined.
• For example, for a student entity, the roll number can uniquely identify a student from a set of
students.
Composite Attribute
• An attribute that is composed of several other attributes is known as a composite attribute.
• A composite attribute is shown as an oval that is further connected to the ovals of its
component attributes.
Multivalued Attribute
• Attributes that can hold more than one value are called multivalued attributes.
• A double oval represents a multivalued attribute.
Derived Attribute
• An attribute that can be derived from other attributes of the entity is known as a derived
attribute.
• In the ER diagram, a dashed oval represents a derived attribute.
Relationship
• A diamond shape represents a relationship in the ER diagram.
• It depicts the relationship between two entities.
• For example, if a student studies a course, both the student and the course are entities, and
study is the relationship between them.
Many-to-One Relationship
• When more than one instance of an entity is related to a single instance of another entity, the
relationship is called many-to-one.
• For example, each student opts for a single course, but a course can have many students.
Many-to-Many Relationship
• When more than one instance of an entity is associated with more than one instance of another
entity, the relationship is called many-to-many.
• For example, an employee can be assigned to many projects, and a project can have many
employees.
Participation Constraints
• Total participation − Every entity in the entity set is involved in the relationship. Total
participation is represented by double lines.
• Partial participation − Not all entities in the entity set are involved in the relationship.
Partial participation is represented by single lines.
The (min, max) notation for relationship constraints
[Figure: COMPANY ER schema diagram using (min, max) notation]
Data Modeling Tools
A number of popular tools cover conceptual modeling and mapping into relational schema
design. Examples:
• ERWin
• S-Designer (Enterprise Application Suite)
• ER-Studio, etc.
POSITIVES: serve as documentation of application requirements; easy user interface, mostly
with graphics editor support
Problems with Current Modeling Tools
• DIAGRAMMING
• Poor conceptual meaningful notation.
• To avoid the problem of layout algorithms and aesthetics of diagrams, they prefer
boxes and lines and do nothing more than represent (primary-foreign key) relationships
among resulting tables (with a few exceptions).
• METHODOLOGY
• Lack of built-in methodology support.
• Poor tradeoff analysis or user-driven design preferences.
• Poor design verification and suggestions for improvement.

PROBLEM with ER notation
• The entity relationship model in its original form did not
support the specialization/ generalization abstractions

Extended E-R Features
Generalization & Specialization
• Shows inheritance of attributes.
• A lower-level entity set inherits all the attributes and relationship
participation of the higher-level entity set to which it is linked.
• A lower-level entity set may have additional attributes and
participate in additional relationships.
• Also known as a superclass-subclass relationship.

Specialization
Generalization
More E-R Diagram Examples
Koushik De- - CSE, UEMK

Problem
Draw an ER model of the Banking database application considering
the following constraints −
• A bank has many entities.
• Each customer has multiple accounts.
• Multiple customers belong to a single branch.
• A single customer can take multiple loans.
• A branch has multiple employees.
Solution
• Follow the steps given below to draw an ER model of the Banking
database application −
• Step 1 − Identify the entity sets
• The entity set has multiple instances in a given business scenario.
• As per the given constraints the entity sets are as follows
• Customer
• Account
• Branch
• Loan
• Employee
Step 2 − Identify the attributes for the
given entities
• Customer − The relevant attributes are CustomerID, customerName, address.
• Account − The relevant attributes are AccountNo, balance.
• Branch − The relevant attributes are BranchID, branchName, address.
• Loan − The relevant attributes are LoanNo, paymentMode, dateOfLoan, and amount.
• Employee − The relevant attributes are EmpID, empName, dateOfJoin, experience,
qualification.
Step 3 − Identify the Key
attributes
• CustomerID is the key attribute for a customer.
• AccountNo is the key attribute for Account entities.
• BranchID is the key attribute for branch entities.
• LoanNo is the key attribute for a loan entity.
• EmpID is the key attribute for an Employee entity.
Step 4 − Identify the relationship
between entity sets
• One customer can hold multiple accounts, and one account can be held by
multiple customers. Hence, the relationship between customer and account
is many to many.
• Many customers belong to one branch, and one branch has many
customers. Hence, the relationship between customer and
branch is many to one.
• One customer can take multiple loans, but each loan is taken by a
single customer. Hence, the relationship between customer and loan
is one to many.
• One branch has many employees, and each employee works in a single
branch. Hence, the relationship between branch and employee is one to many.
Step 5 − Complete ER diagram
• The complete ER diagram is as follows −
Assignment
ER DIAGRAM FOR A BANK
DATABASE
© The Benjamin/Cummings Publishing Company, Inc. 1994, Elmasri/Navathe, Fundamentals of Database Systems, Second Edition
2. Relational Algebra
• The relational algebra is a procedural query language. Relational algebra is the basic set of
operations for the relational model.
• It consists of a set of operations that take one or two relations as input and produce a new
relation as their result.
• Six basic operators
– select: σ
– project: π
– union: ∪
– set difference: –
– Cartesian product: ×
– rename: ρ
Relational Algebra Overview
• Relational Algebra consists of several groups of operations
• Unary Relational Operations
• SELECT (symbol: σ (sigma))
• PROJECT (symbol: π (pi))
• RENAME (symbol: ρ (rho))
• Relational Algebra Operations From Set Theory
• UNION (∪), INTERSECTION (∩), DIFFERENCE (or MINUS, –)
• CARTESIAN PRODUCT (×)
• Binary Relational Operations
• JOIN (several variations of JOIN exist)
• DIVISION
• Additional Relational Operations
• OUTER JOINS, OUTER UNION
• AGGREGATE FUNCTIONS (These compute summary of information: for example, SUM, COUNT, AVG, MIN, MAX)
Project Operation
• Project is used to display the required attributes from a relation.
• Notation: π_{A1, A2, …, Ak}(r)
where A1, A2, …, Ak are attribute names and r is a relation name.
• The result is defined as the relation of k columns obtained by erasing the columns
that are not listed.
• Duplicate rows are removed from the result, since relations are sets.
Example:
• To list the ID, name, and salary attributes of all instructors:
π_{ID, name, salary}(instructor)
• To find the names of all instructors in the Physics department:
π_{name}(σ_{dept_name = “Physics”}(instructor))
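The project operation can be sketched in Python with a relation represented as a list of dictionaries. The `project` helper and the sample `instructor` rows are assumptions made for illustration, not part of any library.

```python
def project(relation, attrs):
    """π_attrs(relation): keep only the listed attributes and drop duplicate rows."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:          # relations are sets, so duplicates are removed
            seen.add(row)
            result.append(dict(zip(attrs, row)))
    return result

instructor = [
    {"ID": 1, "name": "Einstein", "dept_name": "Physics", "salary": 95000},
    {"ID": 2, "name": "Gold", "dept_name": "Physics", "salary": 87000},
    {"ID": 3, "name": "Katz", "dept_name": "Comp. Sci.", "salary": 75000},
]

depts = project(instructor, ["dept_name"])
id_name_salary = project(instructor, ["ID", "name", "salary"])
print(depts)
```

Projecting on `dept_name` yields two rows rather than three because the two Physics rows become identical once the other columns are erased.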
Select Operation
• The select operation selects tuples that satisfy a given predicate.
• Notation: σ_{p}(r)
• p is called the selection predicate.
• The selection condition acts as a filter.
• Comparisons are done using =, ≠, <, ≤, >, and ≥ in the selection
predicate.
• We can combine several predicates into a larger predicate by using the connectives and (∧), or
(∨), and not (¬).
Example of selection:
1. To select the tuples of instructors who are in the “Physics” department:
σ_{dept_name = “Physics”}(instructor)
2. To find all instructors with a salary greater than 90,000:
σ_{salary > 90000}(instructor)
3. To find the instructors in Physics with a salary greater than $90,000:
σ_{dept_name = “Physics” ∧ salary > 90000}(instructor)
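The select operation can be sketched in Python as a filter over a list of dictionaries, with the selection predicate passed in as a function. The `select` helper and the sample `instructor` rows are assumptions made for illustration.

```python
def select(relation, predicate):
    """σ_predicate(relation): keep the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]

instructor = [
    {"ID": 1, "name": "Einstein", "dept_name": "Physics", "salary": 95000},
    {"ID": 2, "name": "Gold", "dept_name": "Physics", "salary": 87000},
    {"ID": 3, "name": "Katz", "dept_name": "Comp. Sci.", "salary": 75000},
]

# dept_name = "Physics"
physics = select(instructor, lambda t: t["dept_name"] == "Physics")

# dept_name = "Physics" AND salary > 90000
physics_high = select(
    instructor,
    lambda t: t["dept_name"] == "Physics" and t["salary"] > 90000,
)
print([t["name"] for t in physics], [t["name"] for t in physics_high])
```

Selection never changes the columns of a relation; it only reduces the number of rows.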
RENAME
• The RENAME operator is denoted by ρ (rho).
• We may rename the attributes of a relation, the relation name, or both.
• The general RENAME operation ρ can be expressed in any of the following forms:
• ρ_{S(B1, B2, …, Bn)}(R) changes both:
  • the relation name to S, and
  • the column (attribute) names to B1, B2, …, Bn
• ρ_{S}(R) changes:
  • the relation name only to S
• ρ_{(B1, B2, …, Bn)}(R) changes:
  • the column (attribute) names only to B1, B2, …, Bn
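The attribute-renaming form can be sketched in Python, with a relation represented as a list of dictionaries whose key order gives the column order. The helper name and the sample data are assumptions; renaming the relation itself corresponds simply to binding the result to a new variable name.

```python
def rename_attributes(relation, new_names):
    """ρ_(B1, ..., Bn)(R): rename the attributes positionally, keeping the values."""
    renamed = []
    for t in relation:
        if len(new_names) != len(t):
            raise ValueError("need exactly one new name per attribute")
        # dicts preserve insertion order, so values line up with the new names
        renamed.append(dict(zip(new_names, t.values())))
    return renamed

instructor = [{"ID": 1, "name": "Einstein"}, {"ID": 3, "name": "Katz"}]

# Binding the result to `teacher` plays the role of renaming the relation to S.
teacher = rename_attributes(instructor, ["teacher_id", "teacher_name"])
print(teacher)
```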
Cartesian-Product Operation
• The Cartesian-product operation, denoted by a cross (×), allows us to combine information
from any two relations.
• Cartesian product of relations r1 and r2 is written as r1 × r2.
• Ex: to find the names of all instructors in the Physics department together with the course id of
all courses they taught.
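The example above can be sketched in Python: form the Cartesian product, keep only the pairs that belong together, and then pick the wanted columns. The `teaches` relation, the sample rows, and the helper name are assumptions, since the query expression itself is not written out here.

```python
from itertools import product

def cartesian_product(r1, name1, r2, name2):
    """Pair every tuple of r1 with every tuple of r2, prefixing attribute
    names with the relation name so that clashes such as ID stay distinct."""
    return [
        {**{f"{name1}.{a}": v for a, v in t1.items()},
         **{f"{name2}.{a}": v for a, v in t2.items()}}
        for t1, t2 in product(r1, r2)
    ]

instructor = [
    {"ID": 1, "name": "Einstein", "dept_name": "Physics"},
    {"ID": 3, "name": "Katz", "dept_name": "Comp. Sci."},
]
teaches = [
    {"ID": 1, "course_id": "PHY-101"},
    {"ID": 3, "course_id": "CS-101"},
]

pairs = cartesian_product(instructor, "instructor", teaches, "teaches")

# Keep pairs where the instructor is in Physics and both IDs match,
# then keep only the name and the course_id.
physics_courses = [
    (t["instructor.name"], t["teaches.course_id"])
    for t in pairs
    if t["instructor.dept_name"] == "Physics"
    and t["instructor.ID"] == t["teaches.ID"]
]
print(len(pairs), physics_courses)
```

The product alone pairs every instructor with every course taught by anyone, so the ID comparison is what makes the result meaningful.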
Union Operation
• Notation: r ∪ s
• Defined as:
r ∪ s = {t | t ∈ r or t ∈ s}
• The result of this operation, denoted by r ∪ s, is a relation that includes all tuples that are either in r or in s or in both r and s. Duplicate tuples are eliminated.
• For r ∪ s to be valid, r and s must be union compatible:
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (example: 2nd column
of r deals with the same type of values as does the 2nd
column of s)
• Example: to find all courses taught in the Fall 2009 semester, or in the Spring 2010 semester, or in both:
Π course_id (σ semester = “Fall” ∧ year = 2009 (section)) ∪
Π course_id (σ semester = “Spring” ∧ year = 2010 (section))

Intersection Operation
• Notation: r ∩ s
• Defined as:
• r ∩ s = {t | t ∈ r and t ∈ s}
• The result of this operation, denoted by r ∩ s, is a relation that includes all tuples that are in both r and s.
• Assume:
– r, s have the same arity
– attributes of r and s are compatible
• Example: to find all courses taught in both the Fall 2009 semester and the Spring 2010 semester:
Π course_id (σ semester = “Fall” ∧ year = 2009 (section)) ∩
Π course_id (σ semester = “Spring” ∧ year = 2010 (section))
Set Difference Operation
• Notation: r – s
• Defined as:
• r – s = {t | t ∈ r and t ∉ s}
• Set differences must be taken between compatible relations.

• r and s must have the same arity
• attribute domains of r and s must be compatible
• Example: to find all courses taught in the Fall 2009 semester, but not in the Spring 2010 semester:
Π course_id (σ semester = “Fall” ∧ year = 2009 (section)) −
Π course_id (σ semester = “Spring” ∧ year = 2010 (section))

Set operations example
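As a small illustration of union, intersection, and set difference, the three course_id queries can be sketched with Python sets. The rows of section below are made-up sample data (an assumption for this sketch); only the attribute names come from the slides.

```python
# Hypothetical rows of section: (course_id, semester, year).
section = [
    ("CS-101", "Fall", 2009),
    ("CS-347", "Fall", 2009),
    ("PHY-101", "Fall", 2009),
    ("CS-101", "Spring", 2010),
    ("CS-315", "Spring", 2010),
]

# pi course_id (sigma semester = "Fall" AND year = 2009 (section))
fall_2009 = {c for (c, sem, yr) in section if sem == "Fall" and yr == 2009}

# pi course_id (sigma semester = "Spring" AND year = 2010 (section))
spring_2010 = {c for (c, sem, yr) in section if sem == "Spring" and yr == 2010}

either = fall_2009 | spring_2010      # union: duplicates are eliminated
both = fall_2009 & spring_2010        # intersection
only_fall = fall_2009 - spring_2010   # set difference
```

Both operands have a single course_id column, so they satisfy the compatibility conditions (same arity, compatible domains).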
Join / Cartesian Product
• Binary operation between two relations A and B
• The operator generates all possible combinations of the tuples of A with the tuples of B
• Denoted by ‘×’
Properties of the Cartesian product (cross join)
Given two relations A and B with:
degree(A) = m, degree(B) = n
cardinality(A) = c1, cardinality(B) = c2
Then
degree(A × B) = degree(A) + degree(B) = m + n
cardinality(A × B) = cardinality(A) * cardinality(B) = c1 * c2

A
P   Q
p1  q1
p2  q2

B
M   N
m1  n1
m2  n2
m3  n3

A × B
P   Q   M   N
p1  q1  m1  n1
p1  q1  m2  n2
p1  q1  m3  n3
p2  q2  m1  n1
p2  q2  m2  n2
p2  q2  m3  n3
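The degree and cardinality properties can be checked with a short Python sketch that uses the relations A and B from the tables above. Representing tuples as Python tuples and using itertools.product are choices made for this illustration.

```python
from itertools import product

# Relations A(P, Q) and B(M, N) from the slide, one Python tuple per row.
A = [("p1", "q1"), ("p2", "q2")]
B = [("m1", "n1"), ("m2", "n2"), ("m3", "n3")]

# A x B: concatenate every tuple of A with every tuple of B.
AxB = [a + b for a, b in product(A, B)]

degree = len(AxB[0])    # m + n attributes per result tuple
cardinality = len(AxB)  # c1 * c2 result tuples

for row in AxB:
    print(row)
```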
Join in relational Algebra
Join is a combination of a Cartesian product followed by a selection process.
A Join operation pairs two tuples from different relations, if and only if a given join condition is satisfied.
Various forms of join operation are:
Inner Joins:
Theta join
EQUI join
Natural join
Outer join:
Left Outer Join
Right Outer Join
Full Outer Join
Inner Join:
In an inner join, only those tuples that satisfy the matching criteria are included, while the rest are
excluded.
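A minimal sketch of the statement that a join is a Cartesian product followed by a selection. The relations instructor(ID, name) and teaches(ID, course_id) and their rows are hypothetical sample data, and theta_join is a name chosen for this illustration.

```python
from itertools import product

# Hypothetical sample relations.
instructor = [(1, "Einstein"), (2, "Gold"), (3, "Katz")]   # (ID, name)
teaches = [(1, "PHY-101"), (3, "CS-101"), (3, "CS-315")]   # (ID, course_id)


def theta_join(r, s, condition):
    """Cartesian product of r and s, keeping only pairs that satisfy condition."""
    return [a + b for a, b in product(r, s) if condition(a, b)]


# Equi join: the condition is an equality test on the ID columns.
equi = theta_join(instructor, teaches, lambda a, b: a[0] == b[0])
```

The instructor with ID 2 has no matching row in teaches, so this inner join drops it. An equi join is a theta join whose condition uses only equality; a natural join would additionally keep a single copy of the shared ID column.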
LEFT JOIN (⟕)
• This join returns all the rows of the table on the left side of the join and the matching rows of the table on the right side of the join.
• For rows with no matching row on the right side, the result set contains NULL.
• LEFT JOIN is also known as LEFT OUTER JOIN
ID    Name   Marks
10    Jay    NULL
20    Veer   18
30    John   14
RIGHT JOIN (⟖)
• RIGHT JOIN is similar to LEFT JOIN.
• This join returns all the rows of the table on the right side of the join and the matching rows of the table on the left side of the join.
• For rows with no matching row on the left side, the result set contains NULL.
• RIGHT JOIN is also known as RIGHT OUTER JOIN
ID    Name   Marks
NULL  Rohan  20
20    Veer   18
30    John   14
NULL  Sam    13
FULL JOIN (⟗)
• FULL OUTER JOIN creates the result set by combining the results of both LEFT JOIN and RIGHT JOIN.
• The result set contains all the rows from both tables.
• For rows with no match, the result set contains NULL values.
ID    Name   Marks
10    Jay    NULL
20    Veer   18
30    John   14
NULL  Rohan  20
NULL  Sam    13
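The three outer joins can be sketched in Python. The input tables are not shown in the text, so A(ID, Name) and B(Name, Marks) below are reconstructed from the result tables and should be read as an assumption, as should joining on Name. None stands in for NULL.

```python
# Assumed inputs, inferred from the result tables on the slides.
A = [(10, "Jay"), (20, "Veer"), (30, "John")]                   # (ID, Name)
B = [("Rohan", 20), ("Veer", 18), ("John", 14), ("Sam", 13)]    # (Name, Marks)


def left_join(a_rows, b_rows):
    """Keep every row of A; pad with None when B has no matching Name."""
    result = []
    for ident, name in a_rows:
        matches = [marks for b_name, marks in b_rows if b_name == name]
        if matches:
            result.extend((ident, name, marks) for marks in matches)
        else:
            result.append((ident, name, None))
    return result


def right_join(a_rows, b_rows):
    """Keep every row of B; pad with None when A has no matching Name."""
    result = []
    for name, marks in b_rows:
        matches = [ident for ident, a_name in a_rows if a_name == name]
        if matches:
            result.extend((ident, name, marks) for ident in matches)
        else:
            result.append((None, name, marks))
    return result


def full_join(a_rows, b_rows):
    """Left join plus the rows of B that found no partner in A.

    Assumes IDs in A are never None, so a None ID marks an unmatched B row.
    """
    unmatched = [row for row in right_join(a_rows, b_rows) if row[0] is None]
    return left_join(a_rows, b_rows) + unmatched


left = left_join(A, B)
right = right_join(A, B)
full = full_join(A, B)
```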
OUTER UNION Operations
• The outer union operation was developed to take the union of
tuples from two relations if the relations are not type compatible.
• This operation will take the union of tuples in two relations R(X, Y)
and S(X, Z) that are partially compatible, meaning that only
some of their attributes, say X, are type compatible.
• The attributes that are type compatible are represented only once
in the result, and those attributes that are not type compatible
from either relation are also kept in the result relation T(X, Y, Z).
Division ÷
• Binary operation between two relations C and B
• Implicitly, C is A × B, where A is any relation
• C ÷ B => (A × B) ÷ B
• The operator ‘÷’ splits B from C and produces A
• e.g. C ÷ B => A

C = A × B
P   Q   M   N
p1  q1  m1  n1
p1  q1  m2  n2
p1  q1  m3  n3
p2  q2  m1  n1
p2  q2  m2  n2
p2  q2  m3  n3

B
M   N
m1  n1
m2  n2
m3  n3

A
P   Q
p1  q1
p2  q2

• The division operation returns the tuples of A that are associated with every tuple in B.
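A minimal Python sketch of division, using the definition above: keep each candidate that is paired with every tuple of the divisor. The function name divide and the encoding of C as a set of (a, b) pairs are choices made for this illustration.

```python
def divide(c, b):
    """Return every x such that (x, y) is in c for every y in b."""
    candidates = {x for x, _ in c}
    return {x for x in candidates if all((x, y) in c for y in b)}


# A and B from the slide; C = A x B, stored as pairs (tuple of A, tuple of B).
A = {("p1", "q1"), ("p2", "q2")}
B = {("m1", "n1"), ("m2", "n2"), ("m3", "n3")}
C = {(a, b) for a in A for b in B}

quotient = divide(C, B)   # recovers A

# If one combination is missing, that candidate no longer qualifies.
C_missing = C - {(("p2", "q2"), ("m3", "n3"))}
partial = divide(C_missing, B)
```

The second call shows that C does not have to be a full Cartesian product: division simply discards any candidate that is not associated with every tuple of B.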
Division ÷
• Student = {1, 2, …, 60} and Subject = {DBMS, COA}
• TestQP = Student × Subject contains 60 × 2 = 120 combinations: (1, DBMS), (1, COA), (2, DBMS), (2, COA), …, (60, DBMS), (60, COA)
• TestQP ÷ Student = Subject: dividing TestQP by the students {1, 2, 3, …, 60} returns the subjects {DBMS, COA}
Division ÷
• TestQP ÷ Subject = Student: dividing TestQP = {(1, DBMS), (1, COA), (2, DBMS), (2, COA), …, (60, DBMS), (60, COA)} by Subject = {DBMS, COA} returns the students {1, 2, 3, …, 60}
Division ÷
Formal examples:
Cases:
Case 1:
Given relations A, B, C and attributes X, Y:
C(X, Y) ÷ A(X) => B(Y)
C(X, Y) ÷ A(Y) => B(X)
Case 2:
X   Y     ÷    Y    =    X
X1  Y1         Y1        X1
X2  Y2         Y2
X1  Y2
X4  Y4
Division ÷
Formal examples:
Case 3:
X   Y     ÷    X    =    Y
X1  Y1         X1        Y1
X2  Y2                   Y2
X1  Y2
X4  Y4
Case 4:
X   Y     ÷    Y    =    X
X1  Y1         Y1        (empty: no X value is paired with every Y in the divisor)
X2  Y2         Y2
X1  Y2         Y3
X4  Y4         Y4
Division ÷
Formal examples:
Case 5:
X   Y     ÷    Y    =    X
X1  Y1         Y1        X1
X2  Y1                   X2
X3  Y1                   X3
X4  Y1                   X4
Case 6:
X   Y     ÷    Y    =    X
X1  Y1         Y1        X1
X2  Y1         Y2        X2
X2  Y2
X1  Y2
Recap of Relational Algebra Operations
Aggregate Function Operation
• We can define an AGGREGATE FUNCTION operation, using the symbol ℑ (pronounced script F), to specify these types of requests as follows:
<grouping attributes> ℑ <function list> (R)
• where <grouping attributes> is a list of attributes of the relation specified in R, and <function list> is a list of (<function> <attribute>) pairs.
• In each such pair, <function> is one of the allowed functions, such as SUM, AVERAGE, MAXIMUM, MINIMUM, COUNT, and <attribute> is an attribute of the relation specified by R
Aggregate Function Operation
• Use of the Aggregate Function operation ℱ
 – ℱ MAX Salary (EMPLOYEE) retrieves the maximum Salary value from the EMPLOYEE relation
 – ℱ MIN Salary (EMPLOYEE) retrieves the minimum Salary value from the EMPLOYEE relation
 – ℱ SUM Salary (EMPLOYEE) retrieves the sum of the Salary values from the EMPLOYEE relation
 – ℱ COUNT SSN, AVERAGE Salary (EMPLOYEE) computes the count (number) of employees and their average salary
 – Note: COUNT just counts the number of rows, without removing duplicates
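The aggregate function operation, with and without grouping attributes, can be sketched in Python. The EMPLOYEE rows, the Dno grouping column, the aggregate helper, and the output column names such as COUNT_SSN are all assumptions made for this illustration.

```python
from collections import defaultdict

# Hypothetical EMPLOYEE rows; Dno is an assumed department-number column.
EMPLOYEE = [
    {"SSN": "111", "Dno": 5, "Salary": 30000},
    {"SSN": "222", "Dno": 5, "Salary": 40000},
    {"SSN": "333", "Dno": 4, "Salary": 25000},
    {"SSN": "444", "Dno": 4, "Salary": 43000},
    {"SSN": "555", "Dno": 1, "Salary": 55000},
]

FUNCTIONS = {
    "MAX": max,
    "MIN": min,
    "SUM": sum,
    "COUNT": len,  # counts rows, duplicates included
    "AVERAGE": lambda values: sum(values) / len(values),
}


def aggregate(relation, function_list, grouping=()):
    """Apply each (function, attribute) pair to every group of rows."""
    groups = defaultdict(list)
    for row in relation:
        groups[tuple(row[g] for g in grouping)].append(row)
    result = []
    for key, rows in groups.items():
        out = dict(zip(grouping, key))
        for func, attr in function_list:
            out[f"{func}_{attr}"] = FUNCTIONS[func]([r[attr] for r in rows])
        result.append(out)
    return result


# COUNT SSN, AVERAGE Salary over the whole relation (no grouping attributes)
overall = aggregate(EMPLOYEE, [("COUNT", "SSN"), ("AVERAGE", "Salary")])

# The same function list, grouped by the assumed Dno attribute
by_dept = aggregate(
    EMPLOYEE, [("COUNT", "SSN"), ("AVERAGE", "Salary")], grouping=("Dno",)
)
```

With no grouping attributes the whole relation forms one group, so the result is a single row; with grouping attributes there is one result row per distinct combination of their values.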
Examples of applying aggregate functions and
grouping
Examples of Queries in Relational
Algebra
• Query 1. Retrieve the name and address of all employees who work for the
‘Research’ department.
• Query 2. For every project located in ‘Stafford’, list the project number, the
controlling department number, and the department manager’s last name,
address, and birth date.
Query 3. List the names of managers who have at least
one dependent.
Query 4. Find the names of employees who work on all the
projects controlled by department number 5.
Query 5. Make a list of project numbers for projects that involve
an employee whose last name is ‘Smith’, either as a worker or as
a manager of the department that controls the project.
Query 6. List the names of all employees with two or
more dependents.
Query 7. Retrieve the names of employees who have no
dependents.
Thank YOU