Schema Refinement (Normalization) in DBMS
Outline
Informal Design Guidelines For Relation Schemas
2. Redundant Information in Tuples and Update Anomalies
• Update anomalies:
• Insertion anomalies
• Deletion anomalies.
• Modification anomalies.
EMP_PROJ
SSN PNumber Hours EName PName PLocation
• Insertion Anomalies:
• Occurs when it is impossible to store a fact until another fact is
known.
• Example:
1. Cannot insert a project unless an employee is assigned.
2. Cannot insert an employee unless he/she is assigned to a
project.
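A minimal sketch of the insertion anomaly, using Python's built-in sqlite3 module. The composite primary key (SSN, PNumber), the NOT NULL constraints, and the sample values are assumptions made for illustration; they are not given on the slide.

```python
import sqlite3

# Hypothetical single-table design for EMP_PROJ.
# Assumption: the primary key is (SSN, PNumber), so neither part may be NULL.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMP_PROJ (
        SSN       TEXT NOT NULL,
        PNumber   TEXT NOT NULL,
        Hours     REAL,
        EName     TEXT,
        PName     TEXT,
        PLocation TEXT,
        PRIMARY KEY (SSN, PNumber)
    )
""")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO EMP_PROJ VALUES (?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# A project with no employee yet: SSN is unknown, so the key is incomplete.
project_only = try_insert((None, "P1", None, None, "ProductX", "Bellaire"))
# An employee with no project yet: PNumber is unknown.
employee_only = try_insert(("123456789", None, None, "Smith", None, None))
# A complete assignment is accepted.
assignment = try_insert(("123456789", "P1", 32.5, "Smith", "ProductX", "Bellaire"))
```

Under these assumptions, only the complete assignment can be stored; the project alone and the employee alone are both rejected.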
• Deletion Anomalies:
• Occurs when the deletion of a fact causes other facts to be
deleted.
• Example:
1. When a project is deleted, it will result in deleting all the
employees who work on that project.
2. If an employee is the sole employee on a project, deleting
that employee would result in deleting the corresponding
project.
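The deletion anomaly can be sketched with a small in-memory EMP_PROJ instance. The rows below are made-up sample data, and the helper names are hypothetical.

```python
# Hypothetical EMP_PROJ rows: (SSN, PNumber, Hours, EName, PName, PLocation)
emp_proj = [
    ("111", "P1", 20.0, "Smith", "ProductX", "Bellaire"),
    ("222", "P1", 10.0, "Wong", "ProductX", "Bellaire"),
    ("333", "P2", 40.0, "Zelaya", "ProductY", "Sugarland"),
]

def known_projects(rows):
    """Project facts (PNumber, PName, PLocation) still recorded in the rows."""
    return {(r[1], r[4], r[5]) for r in rows}

def known_employees(rows):
    """Employee facts (SSN, EName) still recorded in the rows."""
    return {(r[0], r[3]) for r in rows}

# Case 1: delete project P1; its employees disappear with it.
without_p1 = [r for r in emp_proj if r[1] != "P1"]
lost_employees = known_employees(emp_proj) - known_employees(without_p1)

# Case 2: delete employee 333, the sole employee on P2; the project disappears.
without_333 = [r for r in emp_proj if r[0] != "333"]
lost_projects = known_projects(emp_proj) - known_projects(without_333)
```

In both cases a fact that was never meant to be deleted is lost, because it was stored only inside the deleted tuples.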
• Modification Anomalies:
• Occurs when a change in a fact causes multiple
modifications to be necessary.
• Example: Changing the name of project number P1 (for
example) requires the same update in the tuple of every
employee working on that project.
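A minimal sketch of the modification anomaly on made-up EMP_PROJ rows (only three attributes are kept for brevity). The `limit` parameter is a hypothetical device that simulates an update that stops before reaching every tuple.

```python
# Hypothetical sample rows; PName is stored once per (employee, project) tuple.
emp_proj = [
    {"SSN": "111", "PNumber": "P1", "PName": "ProductX"},
    {"SSN": "222", "PNumber": "P1", "PName": "ProductX"},
    {"SSN": "333", "PNumber": "P2", "PName": "ProductY"},
]

def rename_project(rows, pnumber, new_name, limit=None):
    """Rename a project in place; `limit` simulates an update that stops early."""
    changed = 0
    for row in rows:
        if row["PNumber"] == pnumber:
            if limit is not None and changed >= limit:
                break
            row["PName"] = new_name
            changed += 1
    return changed

def project_names(rows, pnumber):
    """All names currently recorded for one project number."""
    return {row["PName"] for row in rows if row["PNumber"] == pnumber}

# Only one of the two P1 tuples is updated.
rows_touched = rename_project(emp_proj, "P1", "ProductZ", limit=1)
names_for_p1 = project_names(emp_proj, "P1")
```

After the partial update, project P1 carries two different names, which is exactly the inconsistency the anomaly describes.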
If any anomalies are present, note them clearly and make sure that
the programs that update the database will operate correctly.
Functional Dependencies (FDs)
FDs are constraints that are derived from the meaning and
interrelationships of the data attributes.
EMPLOYEE
Emp_ID Dept_ID Dept_Name
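A minimal sketch of checking an FD against a relation instance. The function name `fd_holds` and the sample EMPLOYEE rows are hypothetical. Note that an instance can only refute an FD; because FDs come from the meaning of the data, a passing check does not prove the FD holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this particular set of rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; any later mismatch
        # means two rows agree on lhs but disagree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample data for EMPLOYEE(Emp_ID, Dept_ID, Dept_Name).
employee = [
    {"Emp_ID": 1, "Dept_ID": 10, "Dept_Name": "Research"},
    {"Emp_ID": 2, "Dept_ID": 10, "Dept_Name": "Research"},
    {"Emp_ID": 3, "Dept_ID": 20, "Dept_Name": "Sales"},
]

emp_to_dept = fd_holds(employee, ["Emp_ID"], ["Dept_ID"])
dept_to_name = fd_holds(employee, ["Dept_ID"], ["Dept_Name"])
dept_to_emp = fd_holds(employee, ["Dept_ID"], ["Emp_ID"])
```

Here Dept_ID → Emp_ID is refuted because department 10 appears with two different employees.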
Transitive Dependency
A transitive dependency is a functional dependency that holds
by virtue of transitivity: if X → Y and Y → Z hold, then X → Z
also holds. A transitive dependency can occur only in
a relation that has three or more attributes.
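Transitivity can be sketched with the standard attribute-closure computation. The FDs Emp_ID → Dept_ID and Dept_ID → Dept_Name are assumed here for the EMPLOYEE relation; they are not stated explicitly on the slide.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is covered and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Assumed FDs: Emp_ID -> Dept_ID and Dept_ID -> Dept_Name.
fds = [({"Emp_ID"}, {"Dept_ID"}), ({"Dept_ID"}, {"Dept_Name"})]

emp_closure = closure({"Emp_ID"}, fds)
dept_closure = closure({"Dept_ID"}, fds)
```

Dept_Name appears in the closure of Emp_ID only through Dept_ID, so Emp_ID → Dept_Name is a transitive dependency.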
First Normal Form (1NF)
A relation is said to be in 1NF if:
– The attribute values are atomic:
An attribute value is said to be atomic if it contains only a
single value of data for any given row and column
intersection
– There are no repeating groups in any
row
A relation in 1NF disallows:
– Multivalued attributes
– Composite or nested attributes
– Repeating groups
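A rough sketch of the atomicity test, treating any Python collection stored in a cell as a non-atomic value. The function name and the sample DEPARTMENT rows are hypothetical.

```python
def non_atomic_cells(rows):
    """Return (row index, attribute) pairs whose value is a collection."""
    return [
        (i, attr)
        for i, row in enumerate(rows)
        for attr, value in row.items()
        if isinstance(value, (list, tuple, set, dict))
    ]

# Hypothetical rows: DLocations holds several values in a single cell.
department = [
    {"DNumber": 5, "DName": "Research", "DLocations": ["Bellaire", "Houston"]},
    {"DNumber": 4, "DName": "Administration", "DLocations": ["Stafford"]},
]

violations = non_atomic_cells(department)
```

Any reported cell means the relation is not in 1NF as it stands.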
There are three main techniques to achieve first normal
form for multivalued attributes:
1. Expand the key so that there will be a separate tuple in the
original DEPARTMENT relation for each location of a
DEPARTMENT. Drawback: redundancy, since the department's
other values repeat in every tuple.
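The key-expansion technique can be sketched as follows. The attribute names DNumber, DName, DLocations, and DLocation, and the sample rows, are assumptions chosen for illustration.

```python
def expand_key(rows, multivalued, single):
    """Emit one tuple per value of the multivalued attribute."""
    flat = []
    for row in rows:
        # Everything except the multivalued attribute is copied into each tuple.
        base = {k: v for k, v in row.items() if k != multivalued}
        for value in row[multivalued]:
            flat.append({**base, single: value})
    return flat

# Hypothetical DEPARTMENT instance with a multivalued DLocations attribute.
department = [
    {"DNumber": 5, "DName": "Research",
     "DLocations": ["Bellaire", "Sugarland", "Houston"]},
    {"DNumber": 4, "DName": "Administration", "DLocations": ["Stafford"]},
]

flat_department = expand_key(department, "DLocations", "DLocation")
research_rows = [r for r in flat_department if r["DNumber"] == 5]
```

The key of the flattened relation becomes (DNumber, DLocation), and DName is now stored once per location, which is the redundancy this technique introduces.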
Second Normal Form (2NF)
• A relation is in 2NF if:
– it is in 1NF
– every nonprime attribute is fully functionally dependent
on the primary key
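A minimal sketch of a 2NF test that looks for partial dependencies, that is, nonprime attributes determined by a proper subset of the primary key. It assumes the primary key is the only candidate key, and the FDs listed for EMP_PROJ are assumed for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(key, attributes, fds):
    """Map each nonprime attribute to a proper key subset that determines it."""
    nonprime = set(attributes) - set(key)
    found = {}
    for size in range(1, len(key)):  # proper, non-empty subsets only
        for subset in combinations(sorted(key), size):
            for attr in closure(subset, fds) & nonprime:
                found.setdefault(attr, set(subset))
    return found

# Assumed FDs for EMP_PROJ(SSN, PNumber, Hours, EName, PName, PLocation).
attributes = {"SSN", "PNumber", "Hours", "EName", "PName", "PLocation"}
key = {"SSN", "PNumber"}
fds = [
    ({"SSN", "PNumber"}, {"Hours"}),
    ({"SSN"}, {"EName"}),
    ({"PNumber"}, {"PName", "PLocation"}),
]

partials = partial_dependencies(key, attributes, fds)
```

An empty result would mean no partial dependency was found; here EName, PName, and PLocation each depend on only part of the key, so EMP_PROJ is not in 2NF under these assumptions.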
Second Normal Form: Examples
Solution:
Order(Order_No, Prod_ID)
Prod(Prod_ID, Description)
Third Normal Form (3NF)
Based on the concept of transitive dependency
Examples:
– SSN → DMGRSSN is a transitive FD since
SSN → DNUMBER and DNUMBER → DMGRSSN hold
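A minimal sketch of a 3NF test that reports each FD X → A where X is not a superkey and A is nonprime. It assumes SSN is the only candidate key and restricts the relation to the three attributes named in the example.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def third_nf_violations(attributes, key, fds):
    """List (X, A) pairs where X is not a superkey and A is nonprime."""
    all_attrs = set(attributes)
    violations = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= all_attrs
        for attr in sorted(set(rhs) - set(lhs)):
            if not is_superkey and attr not in key:
                violations.append((tuple(sorted(lhs)), attr))
    return violations

# The FDs from the example, with SSN assumed to be the only key.
attributes = {"SSN", "DNUMBER", "DMGRSSN"}
fds = [({"SSN"}, {"DNUMBER"}), ({"DNUMBER"}, {"DMGRSSN"})]

violations = third_nf_violations(attributes, {"SSN"}, fds)
```

The reported pair DNUMBER → DMGRSSN is the second link in the transitive chain from SSN to DMGRSSN.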
Example: Determine NF
BOOK
ISBN → Title
ISBN → Publisher
Publisher → Address
In your solution you will write the following justification:
1. No M/V attributes, therefore at least 1NF
2. No partial dependencies, therefore at least 2NF
3. There is a transitive dependency (Publisher → Address), therefore,
not 3NF
Solution:
Book(ISBN, Title, Publisher)
Pub_Add(Publisher, Address)
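A minimal sketch of the BOOK decomposition on made-up sample rows. It projects BOOK onto the two solution schemas and joins them back, to show that no information is lost. The helper names `project` and `natural_join` are hypothetical.

```python
def project(rows, attrs):
    """Project rows onto attrs, removing duplicate tuples."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(unique)]

def natural_join(left, right):
    """Join two relations on all attribute names they share."""
    joined = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                joined.append({**l, **r})
    return joined

# Hypothetical BOOK instance; the publisher address repeats per book.
book = [
    {"ISBN": "111", "Title": "Databases", "Publisher": "Acme", "Address": "1 Main St"},
    {"ISBN": "222", "Title": "Networks", "Publisher": "Acme", "Address": "1 Main St"},
    {"ISBN": "333", "Title": "Compilers", "Publisher": "Zenith", "Address": "9 Oak Ave"},
]

book_part = project(book, ["ISBN", "Title", "Publisher"])
pub_add = project(book, ["Publisher", "Address"])
rejoined = natural_join(book_part, pub_add)
```

Each publisher address is now stored once in Pub_Add, and joining on Publisher reproduces the original BOOK rows.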
Steps in Data Normalization
UNNORMALISED ENTITY
Advantages of Normalization
Disadvantages of Normalization