EE - DBMS - Unit 1 With Normalization

Download as pdf or txt
Download as pdf or txt
You are on page 1of 99

Database Management Systems

Examination scheme: Marks-50 [Continuous Assessment]


Course Objectives:

1. Understand and successfully apply logical database design principles, including E-R diagrams and database normalization.

2. Learn Database Programming languages and apply in DBMS applications.

3. Understand transaction processing and concurrency control in DBMS.

4. Learn database architectures, DBMS advancements and its usage in advance applications.

Course Outcomes: Upon completion of the course, the students will be able to:

1. Design ER-models to represent simple database application scenarios and Improve the database design by normalization.

2. Design Database Relational Model and apply SQL , PLSQL concepts for database programming.

3. Describe Transaction Processing and Concurrency Control techniques for databases.

4. Identify appropriate database architecture for the real world database applications.

DATABASE MANAGEMENT SYSTEMS 1


Module 1- Introduction to Database Management Systems and Data
Modeling

DBMS Vs File Systems, Database System Architecture, Data Abstraction, Data


Independence, Data Definition and Data Manipulation Languages, Database
System Internals-Components of a database system, Data Models , E-R diagram:
Components of E-R Model, Conventions, Keys, EER diagram Components, E-R
diagram into tables, Relational Model, Relational Integrity, Referential Integrities,
Enterprise Constraints, Schema Diagram, Relational Algebra- Basic Operations,
Normalization, Armstrong Axiom’s, Functional Dependency, Normal Forms
(1 NF—5 NF)

DATABASE MANAGEMENT SYSTEMS 2


Database Management System Basics

 It contains information about a particular enterprise. Database Applications


• Banking: Transactions
 It provides :
• Airlines: Reservations, Schedules
Collection of interrelated data. • Universities : Student Registration.
• Sales: Customers, Orders, Products
Set of programs to access the collected data. • Online retailers : Order tracking,
An environment that is both convenient and recommendations
• Manufacturing :Production,
efficient to use. Inventory, Supply Chain
Databases touch all aspects of our lives. • Human Resources :Employee
payroll.

DATABASE MANAGEMENT SYSTEMS 3


Motivation for Database Management Systems

Traditional file Systems have following drawbacks to store data :


 Data Redundancy and Inconsistency
 Multiple file formats, duplication of information in different files A
program that controls the execution of application programs
 Difficulty in accessing data
 Need to write a new program to carry out each new task.
 Data Isolation
 Multiple files and formats.
Question : What can be other drawbacks related to usage of file systems ?

DATABASE MANAGEMENT SYSTEMS 4


Motivation for Database Management Systems

Answer : Traditional file Systems have following drawbacks to store data:


 Integrity Problems
 Integrity constraints (e.g., account balance > 0) become “buried” in
program code rather than being stated explicitly
 Difficult to add new constraints or change existing ones
 Atomicity of Updates
 Failures may leave database in an inconsistent state with partial updates
carried out.

DATABASE MANAGEMENT SYSTEMS 5


Motivation for Database Management Systems

Traditional file Systems have following drawbacks to store data :


 Concurrent Access by Multiple users
 Concurrent access needed for performance.
 Uncontrolled concurrent accesses can lead to inconsistencies.
 Security Problems
 Hard to provide user access to some, but not all, data
Database Management Systems Offers solutions to all the above
problems/limitations of traditional file systems.

DATABASE MANAGEMENT SYSTEMS 6


Levels of Data Abstraction in Database System

 Physical level: describes how a record (e.g.,


customer) is stored.
 Logical level: describes data stored in database, and
the relationships among the data.
type instructor = record
ID : string;
name : string;
dept_name : string;
salary : integer;
end;
View level: application programs hide details of data
types. Views can also hide information (such as an
Levels of Abstraction in a Database System employee’s salary) for security purposes.

DATABASE MANAGEMENT SYSTEMS 7


Instances and Schema

 Schema – the logical structure of the database


◦ Example: The database consists of information about a set
of customers and accounts and the relationship between
them
 Physical schema: database design at the physical level
 Logical schema: database design at the logical level

 Instance – the actual content of the database at a


particular point in time
 Analogous to the value of a variable.

DATABASE MANAGEMENT SYSTEMS 8


Data Independence
Types of Data Independence :
 Physical Data Independence : the ability to modify the physical schema without
changing the logical schema
 Applications depend on the logical schema
 In general, the interfaces between the various levels and components should be
well defined so that changes in some parts do not seriously influence others.
 Logical Data Independence : the ability to change the conceptual scheme without
changing
 External views
 External API or programs
 Any change made will be absorbed by the mapping between external and
conceptual levels.
 When compared to Physical Data independence, it is challenging to achieve
logical data independence.

DATABASE MANAGEMENT SYSTEMS 9


Data Models
A collection of tools for describing :
 Data
 Data relationships
 Data semantics
 Data constraints

DATABASE MANAGEMENT SYSTEMS 10


Database System Languages
Data Definition Language(DDL) Data Manipulation Language(DML)
Specification notation for defining the database schema Language for accessing and manipulating the data
Example: create table instructor ( organized by the appropriate data model
ID char(5), ◦ DML also known as query language
name varchar(20),
dept_name varchar(20), Two classes of languages
salary numeric(8,2))
◦ Procedural – user specifies what data is required
DDL compiler generates a set of table templates stored in a and how to get those data
data dictionary ◦ Declarative (nonprocedural) – user specifies
Data dictionary contains metadata (i.e., data about what data is required without specifying how to
data) get those data
◦ Database schema
SQL is the most widely used query language
◦ Integrity constraints
◦ Primary key (ID uniquely identifies instructors)
◦ Referential integrity (references constraint in SQL)
◦ e.g. dept_name value in any instructor tuple must
appear in department relation

DATABASE MANAGEMENT SYSTEMS 11


Database System Internals

Important Components of
Database System :
Database Users
Query Processing
Storage Management
Transaction Management

DATABASE MANAGEMENT SYSTEMS 12


Database System Components : Database Users

Types of Database Users :

Naive Users
Application Programmers
Sophisticated Users
Database Administrators

DATABASE MANAGEMENT SYSTEMS 13


Database System Components : Query Processing

Query Processing Steps :

1.Parsing and Translation


2.Optimization
3.Evaluation

DATABASE MANAGEMENT SYSTEMS 14


Database System Components :Storage
Management
Storage manager : is a program module Issues:
that provides the interface between the
low-level data stored in the database and
the application programs and queries  Storage access
submitted to the system.  File organization
It is responsible for the following tasks:  Indexing and hashing
 Interaction with the file manager
 Efficient storing, retrieving and
updating of data

DATABASE MANAGEMENT SYSTEMS 15


Database System Components :Transaction
Management
What if the system fails?  Transaction-management component
What if more than one user is concurrently ensures that the database remains in a
updating the same data? consistent (correct) state despite system
failures (e.g., power failures and operating
A transaction is a collection of operations that system crashes) and transaction failures.
performs a single logical function in a database
application.
 Concurrency-control manager controls
Two Important Components related to Transactions: the interaction among the concurrent
transactions, to ensure the consistency of
 Transaction Manager
the database.
 Concurrency Control Manager

DATABASE MANAGEMENT SYSTEMS 16


Entity Relationship Model
 A database can be modeled as:
◦ a collection of entities,
◦ relationship among entities.
 An entity is an object that exists and is distinguishable
from other objects.
◦ Example: specific person, company, event, plant
 Entities have attributes
◦ Example: people have names and addresses
 An entity set is a set of entities of the same type that
share the same properties.
◦ Example: set of all persons, companies, trees,
holidays

DATABASE MANAGEMENT SYSTEMS 17


Relationship Sets
A relationship is an association among several entities
Example:
44553 (Peltier) advisor 22222
(Einstein)
student entity relationship set instructor entity
A relationship set is a mathematical relation among n  2
entities, each taken from entity sets
{(e1, e2, … en) | e1  E1, e2  E2, …, en  En}

where (e1, e2, …, en) is a relationship


◦ Example:
(44553,22222)  advisor

DATABASE MANAGEMENT SYSTEMS 18


Relationship Sets

 An attribute can also be property of a


relationship set.
For instance, the advisor relationship set
between entity sets instructor and student
may have the attribute date which tracks
when the student started being associated
with the advisor

DATABASE MANAGEMENT SYSTEMS 19


Degree of a Relationship Set
 Binary relationship  Ternary relationship
involve two entity sets (or degree two).  Example: students work on research
most relationship sets in a database system projects under the guidance of an
are binary. instructor.
 relationship proj_guide is a ternary
Relationships between more than two entity
relationship between instructor, student,
sets are rare. Most relationships are binary.
and project

DATABASE MANAGEMENT SYSTEMS 20


Attributes
 Attribute types
 An entity is represented by a set of attributes,  Simple and composite attributes.
that is descriptive properties possessed by all  Single-valued and multivalued attributes.
members of an entity set. Example: multivalued attribute:
◦ Example: phone_numbers
 Derived attributes
instructor = (ID, name, street, city, Can be computed from other attributes
salary )
Example: age, given date_of_birth
course= (course_id, title, credits)
 Domain – the set of permitted values for each
attribute

DATABASE MANAGEMENT SYSTEMS 21


Mapping Cardinality Constraints

Express the number of entities to which


another entity can be associated via a
relationship set.
Most useful in describing binary relationship
sets.
One to one One to many
For a binary relationship set the mapping
cardinality must be one of the following types:
 One to one
 One to many
 Many to one
 Many to many
Many to one Many to many
DATABASE MANAGEMENT SYSTEMS 22
Keys for Relationship Sets
The combination of primary keys of the participating entity sets forms a
super key of a relationship set.
◦ (s_id, i_id) is the super key of advisor
◦ NOTE: this means a pair of entity sets can have at most one relationship in
a particular relationship set.
◦ Example: if we wish to track multiple meeting dates between a student and
her advisor, we cannot assume a relationship for each meeting. We can use a
multivalued attribute though
Must consider the mapping cardinality of the relationship set when deciding
what are the candidate keys

DATABASE MANAGEMENT SYSTEMS 23


Entity Relationship Diagram

DATABASE MANAGEMENT SYSTEMS 24


Entity-Relationship Diagram
It is a graphical Representation of ER Model
Basic Notations :

 Rectangles represent entity sets.


 Diamonds represent relationship sets.
 Attributes listed inside entity rectangle
 Underline indicates primary key attributes

DATABASE MANAGEMENT SYSTEMS 25


Entity Relationship Diagram Continued
 Entity With Composite,  Relationship Sets with Attributes
Multivalued, and Derived
Attributes

DATABASE MANAGEMENT SYSTEMS 26


Entity Relationship Diagram Continued

Roles
 Entity sets of a relationship need not be distinct.
 Each occurrence of an entity set plays a “role” in the relationship.
The labels “course_id” and “prereq_id” are called roles.

DATABASE MANAGEMENT SYSTEMS 27


Entity Relationship Diagram Continued
Cardinality Constraints
We express cardinality constraints by drawing either a directed line (), signifying “one,” or
an undirected line (—), signifying “many,” between the relationship set and the entity set.

1. One-to-One Relationship
one-to-one relationship between an
instructor and a student
◦ an instructor is associated with at
most one student via advisor
◦ and a student is associated with at
most one instructor via advisor

DATABASE MANAGEMENT SYSTEMS 28


Entity Relationship Diagram Continued
b. One-to-Many Relationship
A one-to-many relationship between an instructor and a student
a an instructor is associated with several (including 0) students via advisor
student is associated with at most one instructor via advisor
c. Many-to-One Relationship
In a many-to-one relationship between an instructor and a student,
an instructor is associated with at most one student via advisor,
and a student is associated with several (including 0) instructors via advisor
d. Many-to Many Relationship
An instructor is associated with several (possibly 0) students via advisor
A student is associated with several (possibly 0) instructors via advisor

DATABASE MANAGEMENT SYSTEMS 29


Entity Relationship Diagram Continued
Participation of an Entity Set in a Relationship
Set
 Total participation (indicated by double line):
every entity in the entity set participates in at least
one relationship in the relationship set
• E.g., participation of section in sec_course is
total
 every section must have an associated course
 Partial participation: some entities may not
participate in any relationship in the relationship
set
 Example: participation of instructor in advisor
is partial

DATABASE MANAGEMENT SYSTEMS 30


Entity Relationship Diagram : Weak Entity Sets

An entity set that does not have a primary key is referred to as a weak
entity set.
The existence of a weak entity set depends on the existence of a
identifying entity set
◦ It must relate to the identifying entity set via a total, one-to-many
relationship set from the identifying to the weak entity set
◦ Identifying relationship depicted using a double diamond

The discriminator (or partial key) of a weak entity set is the  We underline the discriminator of a
set of attributes that distinguishes among all the entities of a weak entity set with a dashed line.
weak entity set.  We put the identifying relationship
of a weak entity in a double
The primary key of a weak entity set is formed by the primary
key of the strong entity set on which the weak entity set is diamond.
existence dependent, plus the weak entity set’s discriminator.  Primary key for section –
(course_id, sec_id, semester, year)

DATABASE MANAGEMENT SYSTEMS 31


E-R Diagram Example

Question : Design a ER Diagram for University Database based on scenario given below :
A university includes information about the course timetable of an academic institution. For each
Course the following information is recorded an identification number, name of the course, name
of the teacher assigned to teach the course, the number of periods each week it will be taught, the
number of

For each teacher the following information is recorded: SSN, name, name of the department
he/she works with, skills, the yearly salary. For each class period the following information is
recorded: period number, starting time, ending time. For each room the following information is
recorded: room number, room type (classroom, office, auditorium, and computer lab), capacity.
The above information is used in order to make the right assignment of a teacher that teaches a
particular course to a time period and a classroom.

DATABASE MANAGEMENT SYSTEMS 32


E-R Diagram for University Database

DATABASE MANAGEMENT SYSTEMS 33


Extended ER Features :
 Specialization
 Generalization
 Aggregation

DATABASE MANAGEMENT SYSTEMS 34


Extended ER Features : Specialization
Top-down design process; we designate subgroupings
within an entity set that are distinctive from other entities
in the set.
These subgroupings become lower-level entity sets that
have attributes or participate in relationships that do not
apply to the higher-level entity set.
Depicted by a triangle component labeled ISA (E.g.,
instructor “is a” person).
Attribute inheritance – a lower-level entity set inherits
all the attributes and relationship participation of the
higher-level entity set to which it is linked.

DATABASE MANAGEMENT SYSTEMS 35


Extended ER Features : Generalization
A bottom-up design process – combine a
number of entity sets that share the same
features into a higher-level entity set.
Specialization and generalization are simple
inversions of each other; they are represented
in an E-R diagram in the same way.
The terms specialization and generalization
are used interchangeably.

DATABASE MANAGEMENT SYSTEMS 36


Extended ER Features : Aggregation
 Consider the ternary relationship proj_guide, which
we saw earlier
 Suppose we want to record evaluations of a student
by a guide on a project.
 Relationship sets eval_for and proj_guide represent
overlapping information
 Every eval_for relationship corresponds to a
proj_guide relationship
 However, some proj_guide relationships may not
correspond to any eval_for relationships
◦ So we can’t discard the proj_guide relationship

DATABASE MANAGEMENT SYSTEMS 37


Extended ER Features : Aggregation
Eliminate this redundancy via aggregation
 Treat relationship as an abstract entity
 Allows relationships between relationships
 Abstraction of relationship into new entity

Without introducing redundancy, the following


diagram represents:
 A student is guided by a particular instructor on a
particular project
 A student, instructor, project combination may have
an associated evaluation

DATABASE MANAGEMENT SYSTEMS 38


Reduction of ER Diagram to
Relation Schemas

DATABASE MANAGEMENT SYSTEMS 39


Reduction to Relation Schemas

 Entity sets and relationship sets can be expressed uniformly as relation


schemas that represent the contents of the database.
 A database which conforms to an E-R diagram can be represented by a
collection of schemas.
 For each entity set and relationship set there is a unique schema that is
assigned the name of the corresponding entity set or relationship set.
 Each schema has a number of columns (generally corresponding to
attributes), which have unique names

DATABASE MANAGEMENT SYSTEMS 40


Reduction to Relation Schemas

Representing Entity Sets with Simple Attributes Representing Relationship Sets :

A strong entity set reduces to a schema with the same


attributes
course(course_id,title,credits)
course_id title credits A many-to-many relationship set is represented as
a schema with attributes for the primary keys of
A weak entity set becomes a table that includes a the two participating entity sets, and any
column for the primary key of the identifying strong descriptive attributes of the relationship set.
entity set
section ( course_id, sec_id, sem, year ) Example: schema for relationship set advisor
Course_id Sec_id sem year advisor = (s_id, i_id) S_id i_id

DATABASE MANAGEMENT SYSTEMS 41


Reduction to Relation Schema
Redundancy of schemas:
 Many-to-one and one-to-many relationship sets that are total on the many-side can be
represented by adding an extra attribute to the “many” side, containing the primary key
of the “one” side.

Example: Instead of creating a schema for relationship set inst_dept, add an attribute
dept_name to the schema arising from entity set instructor

DATABASE MANAGEMENT SYSTEMS 42


Reduction to Relation Schema
Representing Composite and Multi-valued
Attributes
 Composite attributes are flattened out by creating a separate
attribute for each component attribute
◦ Example: given entity set instructor with composite attribute
name with component attributes first_name and last_name the
schema corresponding to the entity set has two attributes
name_first_name and name_last_name
◦ Prefix omitted if there is no ambiguity
 Ignoring multivalued attributes, extended instructor schema is :

ID First_nam Middle_initial Last_name Street_ Street_ Apt_num city state zip Date_of
e number name ber _birth

DATABASE MANAGEMENT SYSTEMS 43


Reduction to Relation Schema
Representing Composite and Multi-valued Attributes
A multivalued attribute M of an entity E is represented by a separate
schema EM
◦ Schema EM has attributes corresponding to the primary key of E and
an attribute corresponding to multivalued attribute M
◦ Example: Multivalued attribute phone_number of instructor is
represented by a schema:
inst_phone= ( ID, phone_number)
◦ Each value of the multivalued attribute maps to a separate tuple of the
relation on schema EM
◦ For example, an instructor entity with primary key 22222 and
phone numbers 456-7890 and 123-4567 maps to two tuples:
(22222, 456-7890) and (22222, 123-4567) ID Phone_number
22222 456-7890
22222 123-4567

DATABASE MANAGEMENT SYSTEMS 44


Reduction to Relation Schema
Representing Composite and Multi-valued
Attributes

Special case : entity time_slot has only one attribute other


than the primary-key attribute, and that attribute is
multivalued
◦ Optimization: Don’t create the relation corresponding to
the entity, just create the one corresponding to the
multivalued attribute
◦ time_slot(time_slot_id, day, start_time, end_time)
◦ time_slot attribute of section (from sec_time_slot) cannot
be a foreign key due to this optimization

DATABASE MANAGEMENT SYSTEMS 45


Extended ER Features Reduction to Relation Schemas

DATABASE MANAGEMENT SYSTEMS 46


Relational Model

DATABASE MANAGEMENT SYSTEMS 47


Relational Model: Basic Structure
Formally, given sets D1, D2, …. Dn a relation r is a subset
of
D1 x D2 x … x Dn
Thus, a relation is a set of n-tuples (a1, a2, …, an) where
each ai  Di
Example: customer_name = {Jones, Smith, Curry,
Lindsay}
customer_street = {Main, North, Park}
customer_city = {Harrison, Rye, Pittsfield}
Then r = { (Jones, Main, Harrison),
(Smith, North, Rye),
(Curry, North, Rye),
(Lindsay, Park, Pittsfield) }
is a relation over
customer_name , customer_street, customer_city

DATABASE MANAGEMENT SYSTEMS 48


Relational Model: Attribute types
Each attribute of a relation has a name
The set of allowed values for each attribute is called the domain of the attribute
Attribute values are (normally) required to be atomic; that is, indivisible
Note: multivalued attribute values are not atomic ({secretary. clerk}) is example
of multivalued attribute position
Note: composite attribute values are not atomic
The special value null is a member of every domain
The null value causes complications in the definition of many operations

DATABASE MANAGEMENT SYSTEMS 49


Relational Model: Relation Schema
A1, A2, …, An are attributes
R = (A1, A2, …, An ) is a relation schema
Example:
Customer
Customer_schema = (customer_name, Customer_name Customer_street Customer_city
customer_street, customer_city)
r(R) is a relation on the relation schema R
Example:
customer (Customer_schema)

DATABASE MANAGEMENT SYSTEMS 50


Relational Model: Relation Instance
The current values (relation instance) of a relation are specified by a table
An element t of r is a tuple, represented by a row in a table

attributes
(or columns)
customer_name customer_street customer_city

Jones Main Harrison


Smith North Rye tuples
Curry North Rye (or rows)
Lindsay Park Pittsfield

Customer

DATABASE MANAGEMENT SYSTEMS 51


Relational Model: Database
 A database consists of multiple relations
 Information about an enterprise is broken up into parts,
with each relation storing one part of the information
account : stores information about accounts
depositor : stores information about which customer
owns which account
customer : stores information about customers

 Storing all information as a single relation such as


bank (account_number, balance, customer_name, ..)
results in repetition of information (e.g., two customers
own an account) and the need for null values (e.g.,
represent a customer without an account)

DATABASE MANAGEMENT SYSTEMS 52


Relational Model: Keys
Let K  R
 K is a superkey of R if values for K are sufficient to identify a
unique tuple of each possible relation r(R)
◦ by “possible r ” we mean a relation r that could exist in the Superkeys
enterprise we are modeling.
◦ Example: {customer_name, customer_street} and Candidate
{customer_name} keys K
are both superkeys of Customer, if no two customers can
possibly have the same name.

 K is a candidate key if K is minimal


Example: {customer_name} is a candidate key for.
Primary key
 Primary Key : Any candidate key

DATABASE MANAGEMENT SYSTEMS 53


Relational Integrity Constraints

DATABASE MANAGEMENT SYSTEMS 54


Integrity Constraints
Integrity constraints guard against accidental
damage to the database, by ensuring that
authorized changes to the database do not result
in a loss of data consistency.
Examples :
 Domain Constraints
 Referential Integrity Constraints
 Entity Integrity Constraint
 Key Constraints
 Triggers
 Functional Dependencies

DATABASE MANAGEMENT SYSTEMS 55


Integrity Constraints : Domain Constraints
 They define valid values for attributes
 They are the most elementary form of integrity constraint.
 They test values inserted in the database, and test queries to ensure that the comparisons make sense.
 The check clause in SQL-92 permits domains to be restricted(Not supported in MySQL)
use check clause to ensure that an hourly-wage domain allows only values greater than a specified
value.
create domain hourly-wage numeric(5,2)
constraint value-test check (value>=4.00)
The domain hourly-wage is declared to be a decimal number with 5 digits, 2 of which are after the
decimal point
The domain has a constraint that ensures that the hourly-wage is greater than 4.00.
constraint value-test is optional; useful to indicate which constraint an update violated.

DATABASE MANAGEMENT SYSTEMS 56


Integrity Constraints : Referential Integrity
Ensures that a value that appears in one relation for a given set of attributes
also appears for a certain set of attribute in another relation.
◦ If an account exists in the database with branch name “Perryridge”, then the branch “Perryridge” must
actually exist in the database.

account ( account-no, branch-name, balance )


A-123 Perryridge 5000

Foreign Key(branch-name)
branch (branch-name, branch-city, asset )
Perryridge Brooklyn 500,000

A set of attributes X in R is a foreign key if it is not a primary key of R but


it is a primary key of some relation S.

DATABASE MANAGEMENT SYSTEMS 57


Referential Integrity : Formal Definition
 Let r1(R1) and r2(R2) be relations with primary keys K1 and K2 respectively.

 The subset  of R2 is a foreign key referencing K1 in relation r1, if for every t2 in r2 there must be a tuple t1 in
r1 such that t1[K1]=t2[].

 Referential integrity constraint: (r2)  K1 (r1)

R1 ( K1, …, … ) R2 ( K2, …, , … )
t2
x
t1
x

DATABASE MANAGEMENT SYSTEMS 58


Referential Integrity for Insertion and Deletion

The following tests must be made in order to preserve the following referential integrity constraint:

 Insert. If a tuple t2 is inserted into r2. The system must ensure that there is a tuple t1 in r1 such that
t1[K] = t2[]. That is
t2[] K(r1)
 Delete. If a tuple t1 is deleted from r1, the system must compute the set of tuples in r2 that reference
t1:
=t1[K](r2)

if this set is not empty, either the delete command is rejected as an error, or the tuples that reference
t1 must themselves be deleted (cascading deletions are possible)

DBMS_VASUNDHARA GHATE,MIT COE 59


Referential Integrity for Update

if a tuple t2 is updated in relation r2 and the update modifies values for the foreign key , then a test
similar to the insert case is made. Let t2’ denote the new value of tuple t2. The system must ensure
that
new foreign key value must
t2’[] K(r1)
exist

if a tuple t1 is updated in r1, and the update modifies values for primary key(K), then a test similar to
the delete case is made. The system must compute
=t1[K](r2)
no foreign keys contain the old
primary key

using the old value of t1 (the value before the update is applied). If this set is not empty, the update
may be rejected as an error, or the update may be applied to the tuples in the set (cascade update), or
the tuples in the set may be deleted.

DBMS_VASUNDHARA GHATE,MIT COE 60


Integrity Constraints : Entity Integrity Constraints

DBMS_VASUNDHARA GHATE,MIT COE 61


Integrity Constraints :Key Constraints

Key Constraints :
 Consider Works_in Relationship: An
employee can work in many departments ; a
department can have many employees.
In contrast, each department has at most one
manager ,according to the key constraint on
Manages Relationship

DBMS_VASUNDHARA GHATE,MIT COE 62


Triggers

Example :
A trigger is a statement that is executed Suppose that instead of allowing negative
automatically by the system as a side account balances, the bank deals with
effect of a modification to the database. overdrafts by
 setting the account balance to zero
To design a trigger mechanism, we must:  creating a loan in the amount of the overdraft
◦ Specify the conditions under which the  giving this loan a loan number which is identical to
trigger is to be executed. the account number of the overdrawn account.
◦ Specify the actions to be taken when the
trigger executes. The condition for executing the trigger is an
update to the account relation that results in
a negative balance value.

DBMS_VASUNDHARA GHATE,MIT COE 63


Example of Translating Conceptual Model into Relation Schemas

Consider the following ‘simple’ conceptual data model:

M M 1 M
Course Staff Student

Staff(Staff-ID, Name, Address, ScalePoint, RateOfPay, DOB,


...)
Student(Enrol-No, Name, Address, ..)
Course(CourseCode, Name, Duration, ...)

Task : To find the number of relations formed from above model

DATABASE MANAGEMENT SYSTEMS 64


The ‘Translation’ Process

 Entities become Relations


 Attributes become Attributes(?)
 Key Attribute(s) become Primary Key(s)
 Relationships are represented by additional Foreign Key Attributes:
◦ for those Relations that are at the ‘M’ end of each 1:M Relationship

DATABASE MANAGEMENT SYSTEMS 65


The ‘Staff’ & ‘Student’ Relations

Staff(Staff-ID, Name, Address, ScalePoint, RateOfPay, DOB, ...)


becomes:
 Staff
Staff-ID Name Address ScalePoint RateOfPay DOB

Student(Enrol-No, Name, Address, ...)


becomes:
 Student
Enrol-No Name Address OLevelPoints Tutor

NB. Foreign Key Tutor references Staff. Staff-ID

DATABASE MANAGEMENT SYSTEMS 66


The ‘Staff’ & ‘Course’ Relations

Staff Staff-ID Name Address ScalePoint RateOfPay DOB

Course(CourseCode, Name, Duration, ...)


becomes:

Course
CourseCode Name Duration

Reflection Spot 2
Q. Based on the relationship between Staff and Course can we add foreign
key in any of above relations in the translation process?

DATABASE MANAGEMENT SYSTEMS 67


‘Staff’, ‘Course’ & ‘Team’ Relations

Staff Staff-ID Name Address ScalePoint RateOfPay DOB

Team CourseCode Staff-ID Answer : We can’t add a Foreign Key; as Both Relations
have a ‘M’ end.
Rule : We must create an ‘artificial’ linking Relation.
Course CourseCode Name Duration

The ‘artificial’ Relation (i.e. Team):


◦ The Primary Key is a composite of CourseCode & Staff-ID
◦ Foreign Key CourseCode references Course.CourseCode
◦ Foreign Key Staff-ID references Staff.Staff-ID

DATABASE MANAGEMENT SYSTEMS 68


4 Relations from 3 Entities?

Student Enrol-No Name Address OLevelPoints Tutor

Staff Staff-ID Name Address ScalePoint RateOfPay DOB

Team CourseCode Staff-ID

Course CourseCode Name Duration

BUT - are they anomaly free?

DATABASE MANAGEMENT SYSTEMS 69


5 Relations from 3 Entities
Student Enrol-No Name Address OLevelPoints Tutor

Staff
Staff-ID Name Address ScalePoint DOB

Course CourseCode Name Duration

Pay a ‘split-off’ Relation


ScalePoint RateOfPay
- to ‘solve’ a Dependency
‘problem’
Team CourseCode Staff-ID an ‘artificial’ Relation
- to ‘solve’ a M:M ‘problem’

DATABASE MANAGEMENT SYSTEMS 70


Relational Algebra

DATABASE MANAGEMENT SYSTEMS 71


Relational Algebra

 Basic Relational Algebra Operations:


 What is “algebra ? Select 
Project 
 Mathematical model consisting of:
 Operands --- Variables or values; Union 
 Operators --- Symbols denoting procedures Set Difference (or Subtract or minus) –
that construct new values from a given values
Cartesian Product X
 Relational Algebra : is an algebra whose
Natural Join
operands are relations and operators are
designed to do the most commons things that
we need to do with relations

DBMS_VASUNDHARA GHATE,MIT COE 72


Relational Algebra: Select Operation

 Notation:  p(r) Example of selection:


Account(account_number, branch_name,balance)
p is called the selection predicate
 branch-name=“Perryridge”(account)
Defined as:
p(r) = {t | t  r and p(t)} A B C D A B C D
  1 7   1 7
 Where p is a formula in
propositional calculus consisting of terms
connected by :  (and),  (or),  (not)   5 7   23 10
Each term is one of:
  12 3
<attribute> op <attribute> or <constant> A=B ^ D > 5 (r)
where op is one of: =, , >, . <.    23 10

Relation r

DBMS_VASUNDHARA GHATE,MIT COE 73


Relational Algebra: Project Operation

 Notation: A1, A2, …, Ak (r) Example of Project Operation :


where A1, A2 are attribute names and r is a relation. A B C A C
The result is defined as the relation of k columns 10 1
obtained by erasing the columns that are not listed   1
A C
Duplicate rows removed from result, since relations 20 1
are sets   1  1

E.g. to eliminate the branch-name attribute of =


 30 1  1  1
account
account-number, balance (account)
 40 2  2
2

If relation Account contains 50 tuples, how many
tuples contains account-number, balance (account) ?
Relation r A,C (r)
That is, the projection of a relation on a set of attributes is
a set of tuples

DBMS_VASUNDHARA GHATE,MIT COE 74


Relational Algebra: Union Operation

 Notation: r  s Example of Union:


Consider relational schemas: A B
A B
Depositor(customer_name, account_number)  1
 1
A B
Borrower(customer_name, loan_number) 2
2 
For r  s to be valid.   2
 1
1. r, s must have the same number of attributes  1  3
 3
2. The attribute domains must be compatible (e.g., 2nd
column of r deals with the same type of values as does
Relation r Relation s
the 2nd column of s) rs
 Find all customers with either an account or a loan
customer-name (depositor)  customer-name (borrower)

DBMS_VASUNDHARA GHATE,MIT COE 75


Relational Algebra: Set Difference Operation

 Notation : r – s Example of Set Difference:


Set differences must be taken between A B
compatible relations.
 1
 r and s must have the same number of A B A B
attributes
 2  2  1
 attribute domains of r and s must be
compatible
 1  3  1

Relation r Relation s r-s

DBMS_VASUNDHARA GHATE,MIT COE 76


Relational Algebra: Cartesian Product Operation

 Notation : r x s A B C D E
C D E
 1  10 a
 10 a
Assume that attributes of r(R) and s(S) are disjoint.  1  10 a
(That is, R  S =  ).  10 a
If attributes of r(R) and s(S) are not disjoint, then
A B rⅹs  1  20 b
renaming must be used.  1  20 b
 1  10 b
 2  10 b
 2  10 a

 2  10 a
Relation r Relation s  2  20 b
 2  10 b

DBMS_VASUNDHARA GHATE,MIT COE 77


DATABASE MANAGEMENT SYSTEMS 78
DATABASE MANAGEMENT SYSTEMS 79
DATABASE MANAGEMENT SYSTEMS 80
Example

DATABASE MANAGEMENT SYSTEMS 81


DATABASE MANAGEMENT SYSTEMS 82
Functional Dependency

DATABASE MANAGEMENT SYSTEMS 83


Functional Dependency
 Redundancy in relational databases is often caused by a functional dependency
 A functional dependency (FD) : a link between two sets of attributes in a relation
 We can normalize a relation by removing undesirable FD
 A set of attributes, A, functionally determines another set, B, or: there exists a
functional dependency between A and B (A ->B)
 If whenever two rows of the relation have the same values for all the attributes in A,
then they also have the same values for all the attributes in B.

DATABASE MANAGEMENT SYSTEMS 84


Functional Dependencies Continued
Example
Set of FDs :
1. {ID} - >{First,Last}
2. {ID,modCode}->{First,Last,modName}
3. {modCode}->{modName}

Represented by an arrow sign (→) that is, X→Y,


where X functionally determines Y.
The left-hand side attributes determine the values of
attributes on the right-hand side.

DATABASE MANAGEMENT SYSTEMS 85


Armstrong’s Axioms
 Closure : If F is a set of functional dependencies then the closure of F, denoted as F+ , is the set of all functional
dependencies logically implied by F.
 Set of rules, that when applied repeatedly, generates a closure of functional dependencies.
 Rules are as follows:
Reflexive rule : if Y ⊆ X then XY
Augmentation rule : if XY then XZ YZ for any Z
Transitivity rule : if XY and YZ then XZ
Union : if XY and XZ then XYZ
Decomposition : if XYZ then XY and XZ
Pseudo Transitivity : if XY and YZW then XZW

DATABASE MANAGEMENT SYSTEMS 86


Database Normalization

DATABASE MANAGEMENT SYSTEMS 87


Database Normalization : Need
What is an Anomaly?
Anything we try to do with a database that leads to unexpected and/or unpredictable
results.
Three types of Anomaly to guard against:
 insert
 delete
 update

Need to check your database design carefully:


 the only good database is an anomaly free database.

DATABASE MANAGEMENT SYSTEMS 88


Reflection Spot 1

Question. Consider the database given below. Suppose we have built a new
room B112 but it has not been added to the time-table what can be the
problem for inserting room information?

CoNo Tutor Room RSize EnLimit

351 Clark A401 45 40


354 Turner C502 100 60

434 Smith B101 430 320

677 Stella H607 400 45

VASUNDHARA GHATE,MIT COE,IT DEPT 89


1.Insert Anomaly
Answer : The attempt to insert room details will be prevented/not allowed unless
and until it has been added in the time table and has associated Course and tutor
assigned.
When we want to enter a value into a data cell but the attempt is prevented, as
another value is not known it leads to Insert Anomaly

DATABASE MANAGEMENT SYSTEMS 90


2. Delete Anomaly
 When a value we want to delete also means we will delete values we wish to keep.

 e.g. If CoNo 351 is deleted, but Room A401will be used elsewhere.

CoNo Tutor Room RSize EnLimit


351 Clark A401 45 40
354 Turner C502 100 60
434 Smith B101 430 320
677 Stella H607 400 45

DATABASE MANAGEMENT SYSTEMS 91


3.Update Anomaly
 When we want to change a single data item value, but must update multiple
entries
e.g. Room H940 has been improved, it is now of RSize = 500. (Note:- Here in this
table is incomplete to see imformation of H940.)

DATABASE MANAGEMENT SYSTEMS 92


Database Normalization Forms

Normal Forms :
 1 NF (Atomicity)
 2 NF (Remove Partial Dependency)
 3NF (Remove Transitive Dependency)
 Boyce Codd NF (Super key)
 4 NF (Multi-valued Dependencies)
 5 NF (Join Dependency)

DATABASE MANAGEMENT SYSTEMS 93


1 NF : First Normal Form

A method to remove all these anomalies and bring the Conversion to 1NF
database to a consistent state.
Consider the relation Course_info

Rules :
All the attributes in a relation must have atomic
domains.  Each attribute must contain only a single
The values in an atomic domain must be value from its pre-defined domain.
indivisible units.

DATABASE MANAGEMENT SYSTEMS 94


2 NF : Second Normal Form

2 NF Conversion to 2 NF
Prime attribute − An attribute, which is a part From example , we find that Stu_Name can be identified
of the prime-key, is known as a prime attribute. by Stu_ID and Proj_Name can be identified by Proj_ID
independently. This is called partial dependency, which
Non-prime attribute − An attribute, which is is not allowed in Second Normal Form.
not a part of the prime-key, is said to be a non-
prime attribute. Therefore, we can convert the relation into as shown
below
Consider the relation student_project
Student {Stud_id,Stud-name} and
{Stud_id, Proj_id, Stud_name, Project_Title} Stud_id Stud_name
Rule : Every non-prime attribute should be Project {Proj_id,Project_Title}
fully functionally dependent on prime key
attribute. That is, if X → A holds, then there Prod_id Project_Title
should not be any proper subset Y of X, for
which Y → A also holds true.
DATABASE MANAGEMENT SYSTEMS 95
3 NF : Third Normal Form

Rules :
 For a relation to be in Third Normal Form, it must be in Second Normal form and the
following must satisfy −
 No non-prime attribute is transitively dependent on prime key attribute.
 For any non-trivial functional dependency, X → A, then either
◦ X is a super key or,
◦ A is prime attribute.
Consider relation Stud_info
Stud_info {rno,name,marks,zip,city,dob}
rno name marks zip city dob
Reflection Spot 3
Question. Does the above relation satisfy the 3 NF criteria??If no convert it to 3 NF.

DATABASE MANAGEMENT SYSTEMS 96


3 NF : Third Normal Form

Ans : No. The given relation does not satisfy the 3NF as it contains following transitive
dependency:
We have : Rno->zip but zip->city
Therefore Rno->city (Transitive Dependency)

Conversion to 3NF :
Stud_info {rno,name,marks,dob,zip}

rno name marks dob zip

Zip_city {zip,city}
zip city

DATABASE MANAGEMENT SYSTEMS 97


References
1.Ramakrishnan, R. and Gherke, J., “Database Management Systems”, 3rd Ed., McGraw-
Hill.
2. Connally T, Begg C.,”Database Systems”,Pearson Education

DATABASE MANAGEMENT SYSTEMS 98


End of Unit 1

DATABASE MANAGEMENT SYSTEMS 99

You might also like