Normalization


Functional Dependencies

A functional dependency (FD) is a relationship between two attributes, typically between the
PK and other non-key attributes within a table. For any relation R, attribute Y is functionally
dependent on attribute X (usually the PK) if, for every valid instance of X, that value of X
uniquely determines the value of Y. This relationship is written as follows:

X → Y

The left side of the above FD diagram is called the determinant, and the right side is the
dependent. Here are a few examples.

In the first example, below, SIN determines Name, Address and Birthdate. Given SIN, we can
determine any of the other attributes within the table.

SIN → Name, Address, Birthdate

For the second example, SIN and Course together determine the date completed (DateCompleted).
The determinant here is composite, which shows that an FD also works for a composite PK.

SIN, Course → DateCompleted

The third example indicates that ISBN determines Title.

ISBN → Title
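As a minimal sketch of the definition above, the following Python function checks whether a candidate FD is contradicted by a set of sample rows. The function name `fd_holds` and the `books` data (including the `Copy` column) are hypothetical and only serve as illustration. Note that sample data can only refute an FD; an FD is a rule about every valid instance, so a passing check does not prove it.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if no two rows agree on `determinant` but differ on `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in determinant)
        value = tuple(row[attr] for attr in dependent)
        # setdefault stores the first value seen for this key and returns it,
        # so a later row with a different value exposes a violation.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows, not taken from the text.
books = [
    {"ISBN": "111", "Title": "Database Design", "Copy": 1},
    {"ISBN": "111", "Title": "Database Design", "Copy": 2},
    {"ISBN": "222", "Title": "SQL Basics", "Copy": 1},
]

print(fd_holds(books, ["ISBN"], ["Title"]))  # True
print(fd_holds(books, ["Title"], ["Copy"]))  # False
```

The determinant is passed as a list so that a composite determinant such as SIN, Course can be checked in the same way.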

functional dependency (FD): a relationship between two attributes, typically between the PK
and other non-key attributes within a table

non-normalized table: a table that has data redundancy in it


Normalization
Normalization should be part of the database design process. However, it is difficult to separate the normalization
process from the ER modelling process so the two techniques should be used concurrently.

Use an entity relationship diagram (ERD) to provide the big picture, or macro view, of an organization’s data
requirements and operations. This is created through an iterative process that involves identifying relevant entities,
their attributes and their relationships.

The normalization procedure focuses on the characteristics of specific entities and represents the micro view of entities
within the ERD.

What Is Normalization?
Normalization is the branch of relational theory that provides design insights. It is the process of determining how
much redundancy exists in a table. The goals of normalization are to:

• Be able to characterize the level of redundancy in a relational schema

• Provide mechanisms for transforming schemas in order to remove redundancy

Normalization theory draws heavily on the theory of functional dependencies. Normalization theory defines six
normal forms (NF). Each normal form involves a set of dependency properties that a schema must satisfy and each
normal form gives guarantees about the presence and/or absence of update anomalies. This means that higher normal
forms have less redundancy, and as a result, fewer update problems.

Normal Forms
Every table in a database can be classified by the highest normal form it satisfies. Ideally, the only redundancy we
want is the repetition of PK values as FKs in related tables; everything else should be derived from other tables.
There are six normal forms, but we will only look at the first four, which are:

• First normal form (1NF)

• Second normal form (2NF)

• Third normal form (3NF)

• Boyce-Codd normal form (BCNF)

BCNF is rarely needed in practice, because most tables that are in 3NF already satisfy it.


First Normal Form (1NF)
In the first normal form, only single values are permitted at the intersection of each row and column; hence, there are
no repeating groups.

To normalize a relation that contains a repeating group, remove the repeating group and form two new relations.

The PK of the new relation is a combination of the PK of the original relation plus an attribute from the newly created
relation for unique identification.

Process for 1NF


We will use the Student_Grade_Report table below, from a School database, as our example to explain the process
for 1NF.

Student_Grade_Report (StudentNo, StudentName, Major, CourseNo, CourseName, InstructorNo, InstructorName, InstructorLocation, Grade)

• In the Student_Grade_Report table, the repeating group is the course information. A student can take many
courses.

• Remove the repeating group. In this case, it is the course information for each student.

• Identify the PK for your new table.

• The PK must uniquely identify each row; here it is the combination of StudentNo and CourseNo.

• The attributes removed from the original table, together with StudentNo, form the student course table
(StudentCourse).

• The Student table (Student) is now in first normal form with the repeating group removed.

• The two new tables are shown below.

Student (StudentNo, StudentName, Major)

StudentCourse (StudentNo, CourseNo, CourseName, InstructorNo, InstructorName, InstructorLocation, Grade)
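The process above can be sketched in Python. This is an illustration under stated assumptions, not the only way to do it: the sample records, the `Courses` key that holds the repeating group, and the function name `to_1nf` are all hypothetical.

```python
# Hypothetical sample data: one record per student, with the courses
# stored as a repeating group under the made-up key "Courses".
report = [
    {
        "StudentNo": 1, "StudentName": "Ann", "Major": "Math",
        "Courses": [
            {"CourseNo": "C1", "CourseName": "Databases", "InstructorNo": 10,
             "InstructorName": "Lee", "InstructorLocation": "B101", "Grade": "A"},
            {"CourseNo": "C2", "CourseName": "Networks", "InstructorNo": 11,
             "InstructorName": "Kim", "InstructorLocation": "B202", "Grade": "B"},
        ],
    },
    {
        "StudentNo": 2, "StudentName": "Bob", "Major": "Physics",
        "Courses": [
            {"CourseNo": "C1", "CourseName": "Databases", "InstructorNo": 10,
             "InstructorName": "Lee", "InstructorLocation": "B101", "Grade": "C"},
        ],
    },
]


def to_1nf(records):
    """Split records with a repeating 'Courses' group into two flat tables."""
    students, student_courses = [], []
    for record in records:
        # Student keeps only the single-valued attributes.
        students.append({k: v for k, v in record.items() if k != "Courses"})
        # Each course becomes its own row, identified by (StudentNo, CourseNo).
        for course in record["Courses"]:
            student_courses.append({"StudentNo": record["StudentNo"], **course})
    return students, student_courses


students, student_courses = to_1nf(report)
print(len(students), len(student_courses))  # 2 3
```

Every value in both resulting tables is a single value, and each StudentCourse row is identified by the combination of StudentNo and CourseNo.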

Update anomalies in 1NF


StudentCourse (StudentNo, CourseNo, CourseName, InstructorNo, InstructorName, InstructorLocation, Grade)

• To add a new course, we need a student.

• When course information needs to be updated, we may have inconsistencies.

• To delete a student, we might also delete critical information about a course.
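Two of these anomalies can be reproduced on a small sample. The sketch below uses hypothetical rows and omits the instructor columns for brevity; the helper `course_names` is likewise made up for this illustration.

```python
# Hypothetical StudentCourse rows (instructor columns omitted for brevity).
student_course = [
    {"StudentNo": 1, "CourseNo": "C1", "CourseName": "Databases", "Grade": "A"},
    {"StudentNo": 2, "CourseNo": "C1", "CourseName": "Databases", "Grade": "B"},
    {"StudentNo": 2, "CourseNo": "C2", "CourseName": "Networks", "Grade": "C"},
]


def course_names(rows):
    """Map each CourseNo to the set of course names recorded for it."""
    names = {}
    for row in rows:
        names.setdefault(row["CourseNo"], set()).add(row["CourseName"])
    return names


# Update anomaly: the course name is changed in only one of the two C1 rows.
updated = [dict(row) for row in student_course]
updated[0]["CourseName"] = "Database Systems"
inconsistent = {no for no, names in course_names(updated).items() if len(names) > 1}
print(inconsistent)  # {'C1'}

# Deletion anomaly: student 2 is the only student in C2, so removing
# that student's rows also removes every record of the course.
remaining = [row for row in student_course if row["StudentNo"] != 2]
lost = set(course_names(student_course)) - set(course_names(remaining))
print(lost)  # {'C2'}
```

The insertion anomaly follows from the PK: a row for a new course with no student would have no StudentNo, so its PK would be incomplete.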

Second Normal Form (2NF)


For the second normal form, the relation must first be in 1NF. A relation in 1NF is automatically in 2NF if its PK
comprises a single attribute, because no partial dependency is then possible.

If the relation has a composite PK, then each non-key attribute must be fully dependent on the entire PK and not on a
subset of the PK (i.e., there must be no partial dependency).

Process for 2NF


To move to 2NF, a table must first be in 1NF.

• The Student table is already in 2NF because it has a single-column PK.

• When examining the StudentCourse table, we see that not all the attributes are fully dependent on the PK:
the course and instructor information depends only on CourseNo. The only attribute that is fully dependent on the whole PK is Grade.

• Identify the new table that contains the course information.

• Identify the PK for the new table.

• The three new tables are shown below.

Student (StudentNo, StudentName, Major)

CourseGrade (StudentNo, CourseNo, Grade)

CourseInstructor (CourseNo, CourseName, InstructorNo, InstructorName, InstructorLocation)
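The partial dependency that drives this decomposition can be found mechanically once the FDs are written down. The sketch below is one possible approach, using attribute closure. The FD list is an assumption inferred from the tables in this chapter, the function names are hypothetical, and the code treats the PK as the only candidate key.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(pk, attributes, fds):
    """Map each proper subset of the PK to the non-key attributes it determines."""
    non_key = set(attributes) - set(pk)
    found = {}
    for size in range(1, len(pk)):
        for subset in combinations(sorted(pk), size):
            determined = closure(subset, fds) & non_key
            if determined:
                found[subset] = determined
    return found


# Assumed FDs for StudentCourse, inferred from the decomposition in the text.
fds = [
    ({"StudentNo", "CourseNo"}, {"Grade"}),
    ({"CourseNo"}, {"CourseName", "InstructorNo"}),
    ({"InstructorNo"}, {"InstructorName", "InstructorLocation"}),
]
attributes = ["StudentNo", "CourseNo", "CourseName", "InstructorNo",
              "InstructorName", "InstructorLocation", "Grade"]

partial = partial_dependencies(["StudentNo", "CourseNo"], attributes, fds)
for subset, attrs in partial.items():
    print(subset, "determines", sorted(attrs))
```

Under these assumed FDs, only CourseNo on its own determines non-key attributes (the course and instructor columns), which is why they move to CourseInstructor while Grade stays with the full composite PK.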

Update anomalies in 2NF


• When adding a new instructor, we need a course.

• Updating course information could lead to inconsistencies for instructor information.

• Deleting a course may also delete instructor information.


Third Normal Form (3NF)
To be in third normal form, the relation must be in second normal form. In addition, all transitive dependencies must be
removed; a non-key attribute may not be functionally dependent on another non-key attribute.
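A simple check for this condition is sketched below for the CourseInstructor table from the 2NF step. The FD list is an assumption inferred from the tables in this chapter, and `transitive_dependencies` is a hypothetical helper that only inspects the FDs it is given, flagging those whose determinant contains no PK attribute.

```python
def transitive_dependencies(pk, fds):
    """Return the FDs in which non-key attributes determine other non-key attributes."""
    pk = set(pk)
    return [
        (lhs, rhs - pk)
        for lhs, rhs in fds
        # Determinant has no PK attribute, and something non-key depends on it.
        if lhs and not (lhs & pk) and (rhs - pk)
    ]


# Assumed FDs for CourseInstructor (CourseNo is the PK).
course_instructor_fds = [
    ({"CourseNo"}, {"CourseName", "InstructorNo"}),
    ({"InstructorNo"}, {"InstructorName", "InstructorLocation"}),
]

violations = transitive_dependencies(["CourseNo"], course_instructor_fds)
for lhs, rhs in violations:
    print(sorted(lhs), "determines", sorted(rhs))
```

Under these assumptions the only violation is InstructorNo determining InstructorName and InstructorLocation, so those two attributes are the ones to move into a table of their own.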

Process for 3NF


• From each table that has a transitive dependency, remove the attributes that are transitively dependent.

• Create new table(s) for the removed attributes, with their determinant as the PK.

• Check new table(s) as well as table(s) modified to make sure that each table has a determinant and that no
table contains inappropriate dependencies.

• See the four new tables below.

Student (StudentNo, StudentName, Major)

CourseGrade (StudentNo, CourseNo, Grade)

Course (CourseNo, CourseName, InstructorNo)

Instructor (InstructorNo, InstructorName, InstructorLocation)
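The four tables can be tried out with Python's built-in `sqlite3` module. The sketch below is illustrative only: the column types, the NOT NULL choices and the sample rows are assumptions that do not come from the text. It shows that a join rebuilds the original report row and that deleting a student no longer deletes the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Student (
    StudentNo   INTEGER PRIMARY KEY,
    StudentName TEXT NOT NULL,
    Major       TEXT
);
CREATE TABLE Instructor (
    InstructorNo       INTEGER PRIMARY KEY,
    InstructorName     TEXT NOT NULL,
    InstructorLocation TEXT
);
CREATE TABLE Course (
    CourseNo     TEXT PRIMARY KEY,
    CourseName   TEXT NOT NULL,
    InstructorNo INTEGER REFERENCES Instructor(InstructorNo)
);
CREATE TABLE CourseGrade (
    StudentNo INTEGER REFERENCES Student(StudentNo),
    CourseNo  TEXT REFERENCES Course(CourseNo),
    Grade     TEXT,
    PRIMARY KEY (StudentNo, CourseNo)
);
""")

# Hypothetical sample rows. The instructor and course are added before any student exists.
conn.execute("INSERT INTO Instructor VALUES (10, 'Lee', 'B101')")
conn.execute("INSERT INTO Course VALUES ('C1', 'Databases', 10)")
conn.execute("INSERT INTO Student VALUES (1, 'Ann', 'Math')")
conn.execute("INSERT INTO CourseGrade VALUES (1, 'C1', 'A')")

# Joining the four tables rebuilds the original report row.
report = conn.execute("""
    SELECT s.StudentNo, s.StudentName, s.Major, c.CourseNo, c.CourseName,
           i.InstructorNo, i.InstructorName, i.InstructorLocation, g.Grade
    FROM CourseGrade AS g
    JOIN Student AS s ON s.StudentNo = g.StudentNo
    JOIN Course AS c ON c.CourseNo = g.CourseNo
    JOIN Instructor AS i ON i.InstructorNo = c.InstructorNo
""").fetchall()
print(report)

# Deleting the only student leaves the course and instructor in place.
conn.execute("DELETE FROM CourseGrade WHERE StudentNo = 1")
conn.execute("DELETE FROM Student WHERE StudentNo = 1")
courses_left = conn.execute("SELECT COUNT(*) FROM Course").fetchone()[0]
print(courses_left)  # 1
```

The only values repeated across tables are the PK values used as FKs (StudentNo, CourseNo and InstructorNo).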

At this stage, there should be no anomalies in third normal form. Let’s look at the dependency diagram (Figure 12.1)
for this example. The first step is to remove repeating groups, as discussed above.

Student (StudentNo, StudentName, Major)

StudentCourse (StudentNo, CourseNo, CourseName, InstructorNo, InstructorName, InstructorLocation, Grade)


Normalization and Database Design
During the normalization process of database design, make sure that proposed entities
meet the required normal form before table structures are created. Many real-world
databases were improperly designed, or have become burdened with anomalies through
improper modification over time. You may be asked to redesign and modify existing
databases. This can be a large undertaking if the tables are not properly normalized.

first normal form (1NF): only single values are permitted at the intersection of each
row and column so there are no repeating groups

normalization: the process of determining how much redundancy exists in a table

second normal form (2NF): the relation must be in 1NF and every non-key attribute must
be fully dependent on the entire PK; this holds automatically when the PK comprises a
single attribute

third normal form (3NF): the relation must be in 2NF and all transitive dependencies
must be removed; a non-key attribute may not be functionally dependent on another
non-key attribute
