Module 1: DBMS
Management Systems
Information: Processed Data
Ex: The age of Suresh is 25
Basic Definitions
• Database: A collection of related data.
• Data: Known facts that can be recorded and have an
implicit meaning.
• Mini-world: Some part of the real world about which
data is stored in a database. For example, student
grades and transcripts at a university.
• Database Management System (DBMS): A software
package/ system to facilitate the creation and
maintenance of a computerized database.
• Database System: The DBMS software together with
the data itself. Sometimes, the applications are also
included.
Definitions
Database:
Def 1: Database is an organized collection of logically related data
Def 2: A database is a shared collection of logically related data that is stored to
meet the requirements of different users of an organization
Def 3: A database is a self-describing collection of integrated records
Def 4: A database models a particular real world system in the computer in the
form of data
Ex: Online banking system, Library Management
What is a DBMS?
A DBMS provides users and programmers with a systematic way to create, retrieve, update, and manage a database. Equivalently, it is a software package/system that facilitates the creation and maintenance of a computerized database.
Database System:
The DBMS software together with the data itself. Sometimes, the applications are also included.
Advantages
Multiple user interfaces
Redundancy control
Authorized Access
Main Characteristics of the Database
Approach
• Data Abstraction
Main Characteristics of the Database
Approach
Multimedia Databases
Data Warehouses
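Users may be divided into those who actually use and control the
content (called “Actors on the Scene”) and those who enable the
database to be developed and the DBMS software to be designed and
implemented (called “Workers Behind the Scene”).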
Database administrators
Database Designers
End-users
Categories of End-users
Casual
Naïve or Parametric
Sophisticated
Stand-alone
Workers behind the Scene
Tool developers
• Economies of scale
Data Models
Data Model: A set of concepts to describe the structure of a
database, and certain constraints that the database should obey.
Data Model Operations: Operations for specifying database
retrievals and updates by referring to the concepts of the
data model. Operations on the data model may include basic
operations and user-defined operations.
Data Models
A data model gives an idea of what the final system will look like after
it is completely implemented.
It defines the data elements and the relationships between them.
Data models are used to show how data is stored, connected, accessed,
and updated in the database management system.
Physical (low-level, internal) data models: Provide concepts that describe details of how data
is stored in the computer.
Implementation (representational) data models: Provide concepts that fall between the conceptual (high-level)
and physical (low-level) extremes, balancing user views with some computer storage details.
Some of the Data Models in DBMS
are:
• Hierarchical Model
• Flat Data Model
DISADVANTAGES:
Navigational and procedural nature of processing
contains.
Parent/child relationship: each child record has only one parent, whereas each parent record can
have many child records.
known as an object.
Object relational model
The object-relational model is a combination of an object-oriented database model and a relational
database model. It supports objects, classes, and inheritance, just like
object-oriented models, and it supports data types and tabular structures,
like the relational data model.
One of the major goals of the object-relational data model is to close the gap
between relational databases and the object-oriented practices frequently
used in programming languages such as C++, C#, and Java.
Both relational and object-oriented data models are very
useful, but each was felt to lack some characteristics of the other, so
work was started on a model that combines them.
DBMS Languages
Types of DBMS
languages:
Data Definition Language (DDL)
• DDL is used for specifying the database schema.
• It is used for creating tables, schemas, indexes, constraints, etc. in a
database.
• CREATE
• ALTER
• DROP
• TRUNCATE
• RENAME
• COMMENT
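A minimal sketch of a few DDL statements, run through Python's built-in sqlite3 module. The table and column names are hypothetical, and SQLite is only one possible engine: it accepts CREATE, ALTER, and DROP, renames a table through ALTER TABLE, and does not implement TRUNCATE or COMMENT, so those two are left out.

```python
import sqlite3

# In-memory SQLite database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE: define a table and an index on it.
cur.execute("CREATE TABLE student (sid TEXT PRIMARY KEY, name TEXT NOT NULL)")
cur.execute("CREATE INDEX idx_student_name ON student (name)")

# ALTER: add a column, then rename the table (SQLite's form of RENAME).
cur.execute("ALTER TABLE student ADD COLUMN gpa REAL")
cur.execute("ALTER TABLE student RENAME TO students")

# Read the resulting column names back from SQLite's catalog.
columns = [row[1] for row in cur.execute("PRAGMA table_info(students)")]

# DROP: remove the table; its indexes are removed with it.
cur.execute("DROP TABLE students")
remaining = cur.execute(
    "SELECT name FROM sqlite_master WHERE type IN ('table', 'index')"
).fetchall()
conn.close()
```

None of these statements touches the stored rows directly; they only change the schema that describes them.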
Data Manipulation Language (DML)
DML is used for accessing and manipulating data in a database. The following
operations on a database come under DML:
• SELECT
• INSERT
• UPDATE
• DELETE
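The four DML operations can be sketched with Python's built-in sqlite3 module. The student table and its rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Hypothetical table used only for this illustration.
cur.execute("CREATE TABLE student (sid TEXT PRIMARY KEY, name TEXT, age INTEGER)")

# INSERT: add rows (the ? placeholders keep values separate from the SQL text).
cur.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("s1", "Suresh", 25), ("s2", "Veer", 22), ("s3", "John", 23)],
)

# UPDATE: change a value in an existing row.
cur.execute("UPDATE student SET age = age + 1 WHERE sid = ?", ("s2",))

# DELETE: remove a row.
cur.execute("DELETE FROM student WHERE sid = ?", ("s3",))

# SELECT: retrieve what is left.
rows = cur.execute("SELECT sid, name, age FROM student ORDER BY sid").fetchall()
conn.close()
```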
Data Control Language (DCL)
DCL is used for granting and revoking user access on a database:
• GRANT
• REVOKE
Transaction Control Language (TCL)
The changes made to the database using DML commands are
either made permanent or rolled back using TCL.
• COMMIT
• ROLLBACK
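A small sketch of COMMIT and ROLLBACK, assuming SQLite through Python's sqlite3 module. The account table and the simulated failure are hypothetical. The connection is opened with isolation_level=None so that the BEGIN, COMMIT, and ROLLBACK statements are issued explicitly rather than by the driver.

```python
import sqlite3

# isolation_level=None: the driver starts no transactions on its own.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")

# First transaction: insert two accounts and make the change permanent.
conn.execute("BEGIN")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("INSERT INTO account VALUES (2, 50)")
conn.execute("COMMIT")

# Second transaction: a transfer that fails after the debit, before the credit.
conn.execute("BEGIN")
try:
    conn.execute("UPDATE account SET balance = balance - 70 WHERE id = 1")
    raise RuntimeError("simulated failure before crediting account 2")
except RuntimeError:
    conn.execute("ROLLBACK")  # undo the partial transfer

balances = conn.execute("SELECT balance FROM account ORDER BY id").fetchall()
conn.close()
```

After the rollback, both balances are back at their committed values, so the half-finished transfer leaves no trace.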
1.Schemas, Instances, and Database State
Database Schema
Employee Schema
Department Schema
Dept_Location Schema
Schemas, Instances, and Database State
Instances
An instance of the database is the set of values held by these variables or
attributes at a given time.
Employee Schema
Instances
Schemas, Instances, and Database State
Database State
• Database Schema
• Schema Diagram
• Schema Construct
• Database Instance
Database Schema Vs. Database State
• Valid State: A state that satisfies the structure and constraints of the database.
• Distinction
• The database schema changes very infrequently. The database state changes
every time the database is updated.
• Schema is also called intension, whereas state is called extension.
Example: University
Database
• Conceptual schema:
• Students(sid: string, name: string, login: string, age: integer, gpa:real)
• Courses(cid: string, cname:string, credits:integer)
• Enrolled(sid:string, cid:string, grade:string)
• Physical schema:
• Relations stored as unordered files.
• Index on first column of Students.
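To make the schema/state distinction concrete, here is a sketch that declares the University conceptual schema in SQLite (via Python's sqlite3) and then changes only the state. The inserted row and the index name are hypothetical, and the index statement is the only physical-level choice this sketch can express; how the engine lays out files is not visible from SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);
    CREATE INDEX idx_students_sid ON Students (sid);  -- index on the first column
""")

def schema(connection):
    """The schema: the stored definitions of all tables and indexes."""
    return connection.execute("SELECT sql FROM sqlite_master ORDER BY name").fetchall()

def state(connection):
    """The state of Students: the rows it holds right now."""
    return connection.execute("SELECT * FROM Students ORDER BY sid").fetchall()

schema_before, state_before = schema(conn), state(conn)
conn.execute("INSERT INTO Students VALUES ('s1', 'Suresh', 'suresh@cs', 25, 3.4)")
schema_after, state_after = schema(conn), state(conn)
```

The update changes the state (one more row) while the schema is identical before and after.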
Example of a database state
2.Three-Schema Architecture
Three-Schema Architecture
Defines DBMS schemas at three levels:
1.Internal schema
2.Conceptual schema
3.External schemas
External schema (implementation):
• View 1: Courseinfo(cid: int, cname: string)
• View 2: Studentinfo(id: int, name: string)
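One way to picture external schemas is as SQL views defined over the conceptual schema. The sketch below assumes SQLite through Python's sqlite3, spells the two views Courseinfo and Studentinfo, and maps Studentinfo.id to Students.sid; the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual level: the base tables.
    CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);

    -- External level: each view exposes only part of a base table.
    CREATE VIEW Courseinfo  AS SELECT cid, cname FROM Courses;
    CREATE VIEW Studentinfo AS SELECT sid AS id, name FROM Students;

    INSERT INTO Courses  VALUES ('c1', 'DBMS', 4);
    INSERT INTO Students VALUES ('s1', 'Suresh', 'suresh@cs', 25, 3.4);
""")

course_view = conn.execute("SELECT * FROM Courseinfo").fetchall()
student_view = conn.execute("SELECT * FROM Studentinfo").fetchall()
conn.close()
```

A user of Studentinfo never sees login, age, or gpa, even though they are stored in the same base table.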
characteristics of:
Program-data independence.
• Data redundancy: Data redundancy refers to the duplication of data. Suppose we are
managing the data of a college where a student is enrolled in two courses; the same
student details are then stored twice, which takes more storage than
needed. Data redundancy often leads to higher storage costs and poor access time.
• Data inconsistency: Data redundancy leads to data inconsistency. In the same
example, a student is enrolled in two courses and the
student's address is stored twice. If the student asks to change the address and
it is changed in one record but not in all of them, the result is
data inconsistency.
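The college example can be played out in a few lines of Python. The names, courses, and addresses are hypothetical; the point is only that a duplicated value can be updated in one copy and missed in the other.

```python
# One record per enrollment, so the student's address is stored twice.
enrollments = [
    {"student": "Suresh", "course": "DBMS", "address": "12 Park Street"},
    {"student": "Suresh", "course": "COA", "address": "12 Park Street"},
]

# The address change reaches only the first record.
enrollments[0]["address"] = "7 Lake Road"

def inconsistent_students(records):
    """Return the students whose stored addresses disagree across records."""
    addresses = {}
    for rec in records:
        addresses.setdefault(rec["student"], set()).add(rec["address"])
    return sorted(name for name, addrs in addresses.items() if len(addrs) > 1)

conflicts = inconsistent_students(enrollments)

# Storing the address once, keyed by student, leaves nothing to get out of sync.
students = {"Suresh": {"address": "7 Lake Road"}}
enrolled = [("Suresh", "DBMS"), ("Suresh", "COA")]
```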
• Data Isolation: Because data are scattered in
various files, and files may be in different formats,
writing new application programs to retrieve the
appropriate data is difficult.
• Dependency on application programs: Changing
files would lead to change in application programs.
• Atomicity issues: Atomicity of a transaction means “all or
nothing”: either all the operations in a
transaction execute or none do.
• It is difficult to achieve atomicity in file processing
systems.
• Data security: Data should be secured from unauthorised
access. For example, a student in a college should not be
able to see the payroll details of the teachers. Such
security constraints are difficult to apply in file processing
systems.
5.File Systems Vs Database
System
• A file system is software that manages and organizes the files in a storage medium, whereas a
DBMS is a software application used for creating, accessing, and managing databases.
• A file system does not have a crash recovery mechanism, whereas a DBMS provides
one.
• Data inconsistency is higher in a file system and lower in a
database management system.
• A file system does not support complicated transactions, while in a DBMS it
is easy to implement complicated transactions using SQL.
• A file system does not offer concurrency, whereas a DBMS provides a concurrency facility.
Data Modelling using Entities and
Relationships
Chapter-2
Outline
Using High-Level Conceptual Data Models for Database Design
A Sample Database Application
Entity Types, Entity Sets, Attributes, and Keys
Relationship Types, Relationship Sets, Roles, and Structural Constraints
Weak Entity Types
Refining the ER Design for the COMPANY Database
ER Diagrams, Naming Conventions, and Design Issues
Example of Other Notation: UML Class Diagrams
Relationship Types of Degree Higher than Two
1.Data Modeling Using the Entity-Relationship
(ER) Model
• Entity-Relationship (ER) model
• ER diagrams
• Entities
• Relationships
• Attributes
ER Diagrams, Naming Conventions, and Design Issues
Components of the ER Diagram
• This model is based on three basic concepts: entities, attributes, and relationships.
Entities and Attributes
• Entity
• Thing in real world with independent existence
• Attributes
• Particular properties that describe entity
• Types of attributes:
• Composite versus simple (atomic) attributes
• NULL values
• Complex attributes
Entities and Attributes (cont’d.)
Entity Types, Entity Sets, Keys, and Value Sets
• Entity type
• Collection (or set) of entities that have the same attributes
Types of Attributes
Entity Types, Entity Sets, Keys, and Value
Sets (cont’d.)
• Key or uniqueness constraint
• Attributes whose values are distinct for each individual entity in entity set
• Key attribute
• Uniqueness property must hold for every entity set of the entity type
• Keys let you relate one table to another. A key
uniquely identifies a row in a table by a combination of
one or more columns in that table.
• A key is also helpful for finding a unique record or row in the
table.
Types of Keys
1. Super Key
2. Primary Key
3. Candidate Key
4. Alternate Key
5. Foreign Key
6. Compound Key
7. Composite Key
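As a rough illustration of super keys and candidate keys, the sketch below tests attribute sets against sample rows. The function names and the student data are hypothetical. One caveat applies: a key is a constraint on every valid state of a relation, so a check against a single sample can rule an attribute set out but cannot prove that it is a key.

```python
from itertools import combinations

def is_super_key(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, attributes):
    """Minimal super keys, found by trying attribute sets in increasing size."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Skip any set that contains a key already found: it is not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_super_key(rows, attrs):
                keys.append(attrs)
    return keys

# Hypothetical sample data: two students share the name "Jay".
students = [
    {"roll_no": 1, "login": "jay@cs", "name": "Jay"},
    {"roll_no": 2, "login": "veer@cs", "name": "Veer"},
    {"roll_no": 3, "login": "jay2@cs", "name": "Jay"},
]
keys = candidate_keys(students, ["roll_no", "login", "name"])
```

Here both roll_no and login come out as candidate keys; whichever the designer picks becomes the primary key and the other is an alternate key. The pair (roll_no, name) is a super key but not a candidate key, because roll_no alone already suffices.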
A Sample Database Application
• COMPANY
• Employees, departments, and projects
• Employee: store each employee’s name, Social Security number, address, salary, sex
(gender), and birth date
• Keep track of the dependents of each employee
Initial Conceptual Design of the COMPANY
Database
Relationship Types, Relationship Sets, Roles, and
Structural Constraints
• Relationship
• When an attribute of one entity type refers to another entity type
• Binary, ternary, unary
Role Names and Recursive Relationships
• Recursive relationships
• Same entity type participates more than once in a relationship type in different roles
• Participation constraint
• Specifies whether existence of entity depends on its being related to another entity
• Identifying relationship
• Relates a weak entity type to its owner
• Choose binary relationship names to make ER diagram readable from left to right
and from top to bottom
Alternative Notations for ER Diagrams
• Specify structural constraints on relationships
• Replaces cardinality ratio (1:1, 1:N, M:N) and single/double line notation for participation
constraints
• Associate a pair of integer numbers (min, max) with each participation of an entity type E in a relationship type R
• Binary
• Relationship type of degree two
• Ternary
• Relationship type of degree three
Choosing between Binary and Ternary (or
Higher-Degree) Relationships
• Some database design tools permit only binary relationships
• Ternary relationship must be represented as a weak entity type
It links the strong and weak entity and is represented by a double diamond sign.
Components of ER Diagram
Attribute
•An attribute describes a property of an entity.
•An attribute is drawn as an oval in an ER diagram.
Components of ER Diagram
Key Attribute
•A key attribute uniquely identifies an entity within an entity set.
•The name of a key attribute is underlined in the ER diagram.
•For example: For a student entity, the roll number can uniquely identify a student within a set of
students.
Components of ER Diagram
Composite Attribute
•An attribute that is composed of several other attributes is known as a composite attribute.
•An oval represents the composite attribute, and that oval is further
connected to the ovals of its component attributes.
Components of ER Diagram
Multivalued Attribute
•Some attributes can hold more than one value; these are called multivalued attributes.
•A double oval represents a multivalued attribute.
Components of ER Diagram
Derived Attribute
•An attribute that can be derived from other attributes of the entity is known as a derived
attribute.
•In the ER diagram, the dashed oval represents the derived attribute.
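The attribute kinds described on these slides can be mirrored in a small Python data structure. This is only an analogy under assumed names (Name, Student, phones, and age are hypothetical): a nested record stands in for a composite attribute, a list for a multivalued attribute, and a method for a derived attribute.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Name:
    """Composite attribute: made up of several component attributes."""
    first: str
    last: str

@dataclass
class Student:
    roll_no: int                                  # key attribute
    name: Name                                    # composite attribute
    birth_date: date                              # simple (atomic) attribute
    phones: list = field(default_factory=list)    # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from birth_date, never stored."""
        had_birthday = (today.month, today.day) >= (self.birth_date.month, self.birth_date.day)
        return today.year - self.birth_date.year - (0 if had_birthday else 1)

s = Student(1, Name("Jay", "Shah"), date(2000, 6, 15), ["555-0101", "555-0102"])
```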
Components of ER Diagram
Relationship
•A diamond represents a relationship in the ER diagram.
•It depicts the relationship between two entities.
•For example, if student and course are both entities, study is the relationship
between them.
Components of ER Diagram
Many-to-One Relationship
•When more than one element of an entity is related to a single element of another entity, then it
is called a many-to-one relationship.
•For example, each student opts for a single course, but a course can have many students.
Components of ER Diagram
Many-to-Many Relationship
•When more than one element of an entity is associated with more than one element of another
entity, this is called a many-to-many relationship.
•For example, you can assign an employee to many projects and a project can have many
employees.
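The difference between the two cardinalities shows up in how the relationship instances can be stored. In this sketch (all names hypothetical), a many-to-one relationship fits in a mapping from each student to a single course, while a many-to-many relationship needs a set of (employee, project) pairs.

```python
# Many-to-one: each student maps to exactly one course,
# but several students may map to the same course.
opts_for = {"Jay": "DBMS", "Veer": "DBMS", "John": "COA"}

def students_of(course):
    return sorted(s for s, c in opts_for.items() if c == course)

# Many-to-many: neither side is limited to one partner,
# so the relationship is stored as a set of pairs.
works_on = {("Jay", "P1"), ("Jay", "P2"), ("Veer", "P1")}

def projects_of(employee):
    return sorted(p for e, p in works_on if e == employee)

def employees_on(project):
    return sorted(e for e, p in works_on if p == project)
```

When such a design is mapped to relational tables, the set of pairs typically becomes a separate table whose rows reference both employee and project.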
Components of ER Diagram
Participation Constraints
•Total Participation − Each entity is involved in the relationship. Total participation is
represented by double lines.
•Partial participation − Not all entities are involved in the relationship. Partial participation is
represented by single lines.
The (min, max) notation for relationship constraints
[Figure: two example relationships whose participations are labeled (0,1) and (1,1), and (1,1) and (1,N).]
COMPANY ER Schema Diagram
using (min, max) notation
Data Modeling Tools
A number of popular tools cover conceptual modeling and mapping into relational schema
design. Examples:
• ERWin
• To avoid the problem of layout algorithms and aesthetics of diagrams, they prefer
boxes and lines and do nothing more than represent (primary-foreign key) relationships
among the resulting tables (with a few exceptions).
• METHODOLOGY
Generalization
More E-R Diagram Examples
Assignment
ER DIAGRAM FOR A BANK
DATABASE
© The Benjamin/Cummings Publishing Company, Inc. 1994, Elmasri/Navathe, Fundamentals of Database Systems, Second Edition
Koushik De- - CSE, UEMK
2.Relational Algebra
• The relational algebra is a procedural query language. Relational algebra is the basic set of
operations for the relational model
• It consists of a set of operations that take one or two relations as input and produce a new
relation as their result.
• Six basic operators
– select: σ
– project: π
– union: ∪
– set difference: –
– Cartesian product: ×
– rename: ρ
Relational Algebra Overview
• Relational Algebra consists of several groups of operations
• Unary Relational Operations
• SELECT (symbol: σ (sigma))
• PROJECT (symbol: π (pi))
• RENAME (symbol: ρ (rho))
• Relational Algebra Operations From Set Theory
• UNION (∪), INTERSECTION (∩), DIFFERENCE (or MINUS, –)
• CARTESIAN PRODUCT (×)
• Binary Relational Operations
• JOIN (several variations of JOIN exist)
• DIVISION
• Additional Relational Operations
• OUTER JOINS, OUTER UNION
• AGGREGATE FUNCTIONS (These compute summary of information: for example, SUM, COUNT, AVG, MIN, MAX)
Project Operation
• Project is used to display the required attributes from a relation.
• Notation: π_{A1, A2, …, Ak}(r)
where A1, A2, …, Ak are attribute names and r is a relation name.
• The result is defined as the relation of k columns obtained by erasing the columns
that are not listed.
• Duplicate rows are removed from the result, since relations are sets.
Example:
• To list all instructors’ ID, name, and salary attributes of instructor:
π_{ID, name, salary}(instructor)
• Find the names of all instructors in the Physics department:
π_{name}(σ_{dept_name = “Physics”}(instructor))
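The two examples above can be sketched in Python by treating a relation as a list of dictionaries. The helper names select and project and the instructor rows are hypothetical; select is included because the second example needs it, and it simply keeps the tuples that satisfy a predicate.

```python
def select(relation, predicate):
    """σ: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(relation, attrs):
    """π: keep only the listed attributes and drop duplicate rows."""
    seen, result = set(), []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:
            seen.add(row)
            result.append(dict(zip(attrs, row)))
    return result

# Hypothetical sample relation.
instructor = [
    {"ID": 1, "name": "Einstein", "dept_name": "Physics", "salary": 95000},
    {"ID": 2, "name": "Gold", "dept_name": "Physics", "salary": 87000},
    {"ID": 3, "name": "Katz", "dept_name": "Comp. Sci.", "salary": 75000},
]

# π_{name}(σ_{dept_name = "Physics"}(instructor))
physics_names = project(select(instructor, lambda t: t["dept_name"] == "Physics"), ["name"])

# Projecting on dept_name alone shows the duplicate removal.
depts = project(instructor, ["dept_name"])
```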
Select Operation
• The select operation selects tuples that satisfy a given predicate.
• Notation: σ_{p}(r), where p is the selection predicate.
Rename Operation
• ρ_{S}(R) changes:
the relation name only to S
• ρ_{(B1, B2, …, Bn)}(R) changes:
the column (attribute) names only to B1, B2, …, Bn
Cartesian-Product Operation
• The Cartesian-product operation, denoted by a cross (×), allows us to combine information
from any two relations.
• Cartesian product of relations r1 and r2 is written as r1 × r2.
• Ex: to find the names of all instructors in the Physics department together with the course id of
all courses they taught.
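A sketch of the Physics example, assuming a second relation teaches(ID, course_id) that the slide does not show; the helper name cartesian and all rows are hypothetical. Attribute names are prefixed with the relation name so that the two ID columns stay distinct in the product.

```python
from itertools import product

def cartesian(r1, name1, r2, name2):
    """r1 × r2, with every attribute prefixed by the name of its relation."""
    return [
        {**{f"{name1}.{k}": v for k, v in t1.items()},
         **{f"{name2}.{k}": v for k, v in t2.items()}}
        for t1, t2 in product(r1, r2)
    ]

# Hypothetical sample relations.
instructor = [
    {"ID": 1, "name": "Einstein", "dept_name": "Physics"},
    {"ID": 2, "name": "Katz", "dept_name": "Comp. Sci."},
]
teaches = [
    {"ID": 1, "course_id": "PHY-101"},
    {"ID": 2, "course_id": "CS-101"},
]

pairs = cartesian(instructor, "instructor", teaches, "teaches")

# The product pairs every instructor with every teaches tuple, so a selection
# must keep only the pairs that belong together and are in Physics.
physics_courses = [
    (t["instructor.name"], t["teaches.course_id"])
    for t in pairs
    if t["instructor.ID"] == t["teaches.ID"] and t["instructor.dept_name"] == "Physics"
]
```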
Union Operation
• Notation: r ∪ s
• Defined as:
r ∪ s = {t | t ∈ r or t ∈ s}
• The result of this operation, denoted by r ∪ s, is a relation that includes all tuples that are either in r or
in s or in both r and s. Duplicate tuples are eliminated.
• For r ∪ s to be valid, r and s must be union compatible:
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (example: 2nd column
of r deals with the same type of values as does the 2nd
column of s)
• Example: to find all courses taught in the Fall 2009 semester, or in the Spring 2010 semester, or in both
Set-Intersection Operation
• Notation: r ∩ s
• Defined as:
• r ∩ s = {t | t ∈ r and t ∈ s}
• The result of this operation, denoted by r ∩ s, is a relation that includes all tuples that are in
both r and s.
• Assume:
– r, s have the same arity
– attributes of r and s are compatible
• Example: to find all courses taught in both the Fall 2009 semester and the Spring 2010 semester
π_{course_id}(σ_{semester = “Fall” ∧ year = 2009}(section)) ∩
π_{course_id}(σ_{semester = “Spring” ∧ year = 2010}(section))
Set Difference Operation
• Notation r – s
• Defined as:
• r – s = {t | t ∈ r and t ∉ s}
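Union, intersection, and set difference map directly onto Python's set operators. The section rows below are hypothetical sample data for the Fall 2009 and Spring 2010 examples.

```python
# Hypothetical section relation: (course_id, semester, year).
section = [
    ("CS-101", "Fall", 2009), ("CS-347", "Fall", 2009), ("PHY-101", "Fall", 2009),
    ("CS-101", "Spring", 2010), ("CS-315", "Spring", 2010),
]

def courses_in(semester, year):
    """π_{course_id}(σ_{semester, year}(section)) as a Python set."""
    return {cid for cid, sem, yr in section if sem == semester and yr == year}

fall_2009 = courses_in("Fall", 2009)
spring_2010 = courses_in("Spring", 2010)

either = fall_2009 | spring_2010      # union: Fall 2009, Spring 2010, or both
both = fall_2009 & spring_2010        # intersection: taught in both semesters
only_fall = fall_2009 - spring_2010   # difference: Fall 2009 but not Spring 2010
```

Both operands are sets of single course_id values, so they are union compatible: same arity and the same attribute domain.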
id Name Marks
10 Jay NULL
20 Veer 18
30 John 14
RIGHT JOIN (⟖)
• RIGHT JOIN is similar to LEFT JOIN.
• This join returns all the rows of the table on the right side of the join and the matching rows of
the table on the left side of the join.
• For rows that have no matching row on the left side, the result set contains NULL.
• RIGHT JOIN is also known as RIGHT OUTER JOIN.
id    name   marks
NULL  Rohan  20
20    Veer   18
30    John   14
NULL  Sam    13
FULL JOIN (⟗)
• FULL OUTER JOIN creates the result set by combining the results of both LEFT JOIN and RIGHT
JOIN.
• The result set contains all the rows from both tables.
• For rows that have no match, the result set contains NULL values.
ID    Name   Marks
10    Jay    NULL
20    Veer   18
30    John   14
NULL  Rohan  20
NULL  Sam    13
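The three result tables can be reproduced with a short sketch. The input tables are not shown on the slides, so the contents below are inferred from the results and should be read as an assumption: Table A holds (id, name), Table B holds (name, marks), and the join is on name. Python's None plays the role of NULL, and names are assumed to be unique within each table.

```python
# Assumed inputs, inferred from the result tables on the slides.
table_a = [(10, "Jay"), (20, "Veer"), (30, "John")]                  # (id, name)
table_b = [("Rohan", 20), ("Veer", 18), ("John", 14), ("Sam", 13)]   # (name, marks)

def left_join(a, b):
    """Every row of a, with marks from b, or None where b has no match."""
    marks = dict(b)
    return [(i, n, marks.get(n)) for i, n in a]

def right_join(a, b):
    """Every row of b, with id from a, or None where a has no match."""
    ids = {n: i for i, n in a}
    return [(ids.get(n), n, m) for n, m in b]

def full_join(a, b):
    """All rows of the left join, plus the right-side rows that found no match."""
    names_in_a = {n for _, n in a}
    unmatched = [row for row in right_join(a, b) if row[1] not in names_in_a]
    return left_join(a, b) + unmatched

left = left_join(table_a, table_b)
right = right_join(table_a, table_b)
full = full_join(table_a, table_b)
```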
OUTER UNION Operations
• The outer union operation was developed to take the union of
tuples from two relations if the relations are not type compatible.
• This operation will take the union of tuples in two relations R(X, Y)
and S(X, Z) that are partially compatible, meaning that only
some of their attributes, say X, are type compatible.
• The attributes that are type compatible are represented only once
in the result, and those attributes that are not type compatible
from either relation are also kept in the result relation T(X, Y, Z).
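A minimal sketch of outer union for R(X, Y) and S(X, Z), under one common reading of the operation: tuples that agree on the shared attributes X are merged into a single tuple, and all other tuples are padded with NULL (None here). The function name and the sample relations are hypothetical.

```python
def outer_union(r, s, shared):
    """Outer union of r(X, Y) and s(X, Z); `shared` lists the attributes X."""
    # Result attributes T(X, Y, Z): shared ones first, then the rest in order seen.
    attrs = list(shared)
    for t in r + s:
        attrs += [a for a in t if a not in attrs]

    merged = {}
    for t in r + s:
        key = tuple(t[a] for a in shared)
        # Start from an all-None tuple, then fill in whatever this tuple provides.
        merged.setdefault(key, dict.fromkeys(attrs)).update(t)
    return list(merged.values())

# Hypothetical relations that share only name and dept.
student = [
    {"name": "Jay", "dept": "CS", "advisor": "Rao"},
    {"name": "Veer", "dept": "EE", "advisor": "Das"},
]
instructor = [
    {"name": "Jay", "dept": "CS", "rank": "TA"},
    {"name": "Sam", "dept": "CS", "rank": "Prof"},
]
result = outer_union(student, instructor, ["name", "dept"])
```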
Division ÷
• Binary operation between two relations C and B
• Implicitly, C is A × B, where A is any relation
• C ÷ B => (A × B) ÷ B
• The operator ‘÷’ splits B from C and produces A
• e.g. C ÷ B => A

C = A × B
P   Q   M   N
p1  q1  m1  n1
p1  q1  m2  n2
p1  q1  m3  n3
p2  q2  m1  n1
p2  q2  m2  n2
p2  q2  m3  n3

B
M   N
m1  n1
m2  n2
m3  n3

A
P   Q
p1  q1
p2  q2

• Division returns the tuples of A that are
associated in C with every tuple in B.
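Division can be sketched as: group the tuples of C by their non-B attributes, then keep the groups that contain every tuple of B. The function name divide is hypothetical; the data is the P, Q, M, N example from this slide.

```python
def divide(c, b, b_attrs):
    """C ÷ B: the tuples over C's other attributes that occur in C
    together with every tuple of B."""
    if not c:
        return []
    a_attrs = [k for k in c[0] if k not in b_attrs]
    required = {tuple(t[k] for k in b_attrs) for t in b}

    # For each candidate A-tuple, collect the B-tuples it appears with in C.
    partners = {}
    for t in c:
        a_row = tuple(t[k] for k in a_attrs)
        partners.setdefault(a_row, set()).add(tuple(t[k] for k in b_attrs))

    return [dict(zip(a_attrs, a_row))
            for a_row, rows in partners.items() if required <= rows]

a = [{"P": "p1", "Q": "q1"}, {"P": "p2", "Q": "q2"}]
b = [{"M": "m1", "N": "n1"}, {"M": "m2", "N": "n2"}, {"M": "m3", "N": "n3"}]
c = [{**x, **y} for x in a for y in b]        # C = A × B

quotient = divide(c, b, ["M", "N"])

# Remove (p2, q2, m3, n3): p2 q2 no longer pairs with every tuple of B.
partial = divide(c[:-1], b, ["M", "N"])
```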
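Division ÷
Example: Student has 60 tuples (1, 2, …, 60) and Subject has 2 tuples (DBMS, COA), so
TestQP = Student × Subject has 120 combinations:
(1, DBMS), (1, COA), (2, DBMS), (2, COA), …, (60, DBMS), (60, COA)
TestQP ÷ Subject returns the students 1, 2, 3, …, 60.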
Division ÷
Formal examples:
Cases :
Case 1:
Given A,B,C are relations and X,Y are attributes
C(X,Y) ÷ A(X) => B(Y)
C(X,Y) ÷ A(Y) => B(X)
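Case 2: (X, Y) ÷ (Y) = (X)
(X, Y): (X1, Y1), (X2, Y2), (X1, Y2), (X4, Y4)
(Y): Y1, Y2
Result (X): X1

Case 3: (X, Y) ÷ (X) = (Y)
(X, Y): (X1, Y1), (X2, Y2), (X1, Y2), (X4, Y4)
(X): X1
Result (Y): Y1, Y2

Case 4: (X, Y) ÷ (Y) = (X)
(X, Y): (X1, Y1), (X2, Y2), (X1, Y2), (X4, Y4)
(Y): Y1, Y2, Y3, Y4
Result (X): Null (no X value is paired with every Y value)

Case 5: (X, Y) ÷ (Y) = (X)
(X, Y): (X1, Y1), (X2, Y1), (X3, Y1), (X4, Y1)
(Y): Y1
Result (X): X1, X2, X3, X4

Case 6: (X, Y) ÷ (Y) = (X)
(X, Y): (X1, Y1), (X2, Y1), (X2, Y2), (X1, Y2)
(Y): Y1, Y2
Result (X): X1, X2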
Recap of Relational Algebra Operations
Aggregate Function Operation
• We can define an AGGREGATE FUNCTION operation, using the symbol ℑ (pronounced
script F), to specify these types of requests as follows:
<grouping attributes> ℑ <function list> (R)
• where <grouping attributes> is a list of attributes of the relation specified in R, and <function
list> is a list of (<function> <attribute>) pairs.
• In each such pair, <function> is one of the allowed functions (such as SUM, AVERAGE,
MAXIMUM, MINIMUM, COUNT) and <attribute> is an attribute of the relation specified by R.
Aggregate Function Operation
• Use of the aggregate function operation ℱ
• ℱ_{COUNT SSN, AVERAGE Salary}(EMPLOYEE) computes the count (number) of employees and their average salary
• Note: COUNT just counts the number of rows, without removing duplicates
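A sketch of the aggregate operation with optional grouping attributes. The function name aggregate, the FUNCS table, the result column names, and the EMPLOYEE rows (including the Dno attribute) are hypothetical. Unlike a real DBMS, this sketch returns no row at all for an empty relation.

```python
from collections import defaultdict

# The allowed functions, keyed by the names used on the slide.
FUNCS = {
    "COUNT": len,
    "SUM": sum,
    "AVERAGE": lambda values: sum(values) / len(values),
    "MAXIMUM": max,
    "MINIMUM": min,
}

def aggregate(relation, grouping, functions):
    """<grouping> ℱ <function list> (relation); functions is a list of
    (function name, attribute) pairs."""
    groups = defaultdict(list)
    for t in relation:
        groups[tuple(t[g] for g in grouping)].append(t)

    result = []
    for key, rows in groups.items():
        out = dict(zip(grouping, key))
        for func, attr in functions:
            out[f"{func}_{attr}"] = FUNCS[func]([r[attr] for r in rows])
        result.append(out)
    return result

# Hypothetical EMPLOYEE relation.
employee = [
    {"SSN": "111", "Salary": 30000, "Dno": 5},
    {"SSN": "222", "Salary": 40000, "Dno": 5},
    {"SSN": "333", "Salary": 50000, "Dno": 4},
]

# ℱ_{COUNT SSN, AVERAGE Salary}(EMPLOYEE): no grouping, one result row.
overall = aggregate(employee, [], [("COUNT", "SSN"), ("AVERAGE", "Salary")])

# The same functions with Dno as the grouping attribute: one row per department.
per_dept = aggregate(employee, ["Dno"], [("COUNT", "SSN"), ("AVERAGE", "Salary")])
```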
Examples of applying aggregate functions and
grouping
Examples of Queries in Relational
Algebra
• Query 1. Retrieve the name and address of all employees who work for the
‘Research’ department.
• Query 2. For every project located in ‘Stafford’, list the project number, the
controlling department number, and the department manager’s last name,
address, and birth date.
• Query 3. List the names of managers who have at least
one dependent.
• Query 4. Find the names of employees who work on all the
projects controlled by department number 5.
• Query 5. Make a list of project numbers for projects that involve
an employee whose last name is ‘Smith’, either as a worker or as
a manager of the department that controls the project.
• Query 6. List the names of all employees with two or
more dependents.
• Query 7. Retrieve the names of employees who have no
dependents.
Thank YOU