1NF, 2NF
If a table is not properly normalized and contains redundant data, it not only takes up extra
storage space but also makes the database difficult to maintain and update without risking
data loss. Insertion, updation and deletion anomalies are very frequent if a database is not
normalized. To understand these anomalies, let us take the example of a Student table.
In the table above, we have data for 4 Computer Science students. As we can see, the data in the
fields branch, hod (Head of Department) and office_tel is repeated for students who are in the
same branch of the college. This is Data Redundancy.
Insertion Anomaly
Suppose a student is newly admitted. Until the student opts for a branch, the student's data
cannot be inserted, or else we will have to set the branch information to NULL.
Also, if we have to insert the data of 100 students of the same branch, the branch information
will be repeated for all 100 students.
These scenarios are nothing but Insertion anomalies.
Updation Anomaly
What if Mr. X leaves the college, or is no longer the HOD of the computer science department? In
that case the records of all students in that branch will have to be updated, and if we miss any
record by mistake, it will lead to data inconsistency. This is the Updation anomaly.
Deletion Anomaly
In our Student table, two different kinds of information are kept together: student information
and branch information. Hence, at the end of the academic year, if the student records are
deleted, we will also lose the branch information. This is the Deletion anomaly.
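The updation and deletion anomalies described above can be reproduced in a few lines. The following is a minimal sketch using Python's built-in sqlite3 module; the column name rollno and all sample rows are assumptions made up for illustration, not the tutorial's original data.

```python
import sqlite3

# Hypothetical unnormalized Student table: branch details are stored
# alongside every student row. All sample values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " rollno INTEGER PRIMARY KEY, name TEXT,"
    " branch TEXT, hod TEXT, office_tel TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        (401, "Akon", "CSE", "Mr. X", "53337"),
        (402, "Bkon", "CSE", "Mr. X", "53337"),
        (403, "Ckon", "CSE", "Mr. X", "53337"),
        (404, "Dkon", "CSE", "Mr. X", "53337"),
    ],
)

# Updation anomaly: the HOD changes, but the update misses one record,
# so the same branch now reports two different HODs.
conn.execute("UPDATE student SET hod = 'Mr. Y' WHERE rollno < 404")
hods = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT hod FROM student WHERE branch = 'CSE' ORDER BY hod"
    )
]

# Deletion anomaly: deleting every student also erases the branch details,
# because they were never stored anywhere else.
conn.execute("DELETE FROM student")
branches_left = conn.execute(
    "SELECT COUNT(DISTINCT branch) FROM student"
).fetchone()[0]
```

After the partial update, hods contains two names for one branch, and after the delete no branch information remains in the database at all.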
Normalization Rule
Normalization rules are divided into the following normal forms:
In the next tutorial, we will discuss the First Normal Form in detail.
To understand what Partial Dependency is and how to normalize a table to the Second Normal
Form, jump to the Second Normal Form tutorial.
Here is the Third Normal Form tutorial. But we suggest that you first study the Second Normal
Form and then head over to the Third Normal Form.
To learn about BCNF in detail with a very easy to understand example, head to the Boyce-Codd
Normal Form tutorial.
Our table already satisfies 3 of the 4 rules: all our column names are unique, we have stored
the data in the order we wanted, and we have not mixed different types of data in a single
column.
But of the 3 students in our table, 2 have opted for more than 1 subject, and we have stored
the subject names in a single column. As per the First Normal Form, each column must contain
atomic values.
101 Akon OS
101 Akon CN
102 Bkon C
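The rows above show the idea: each subject gets its own row, so every value is atomic. As a rough sketch of that transformation in Python, assuming the original table stored the subjects as a comma-separated string (the input rows below, including the third student and the second subject of student 102, are assumptions for illustration):

```python
# Hypothetical unnormalized rows: the subject column holds several values.
unnormalized = [
    (101, "Akon", "OS, CN"),
    (102, "Bkon", "C, C++"),   # second subject is assumed
    (103, "Ckon", "Java"),     # third student is assumed
]


def to_atomic_rows(rows):
    """Return one (roll_no, name, subject) row per subject."""
    result = []
    for roll_no, name, subjects in rows:
        for subject in subjects.split(","):
            # strip() removes the space left after each comma
            result.append((roll_no, name, subject.strip()))
    return result


atomic_rows = to_atomic_rows(unnormalized)
```

The student's id and name are repeated on each resulting row, but no single column holds more than one value any longer.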
What is Partial Dependency? Do not worry about it yet. First, let's understand what
Dependency in a table is.
What is Dependency?
Let's take an example of a Student table with the columns student_id, name, reg_no (registration
number), branch and address (student's home address).
In this table, student_id is the primary key and will be unique for every row, hence we can
use student_id to fetch any row of data from this table.
Even in a case where student names are the same, if we know the student_id we can easily fetch
the correct record.
Hence we can say that a Primary Key for a table is the column or group of columns (composite
key) which can uniquely identify each record in the table.
I can ask for the branch name of the student with student_id 10, and I can get it. Similarly, if
I ask for the name of the student with student_id 10 or 11, I will get it. So all I need is
student_id, and every other column depends on it, or can be fetched using it.
This is Dependency, and we also call it Functional Dependency.
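One way to make this concrete is to check a dependency against sample rows: student_id determines branch if no two rows share a student_id but differ in branch. The helper below is a hypothetical sketch (the function name holds and all row values are assumptions). Note that sample data can only disprove a dependency, not prove that it holds in general.

```python
def holds(rows, determinant, dependent):
    """Check whether determinant -> dependent holds in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        # setdefault stores the first value seen for this key; a later
        # row with the same key but a different value breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Illustrative rows: two students who happen to share a name.
students = [
    {"student_id": 10, "name": "Akon", "reg_no": "07-WY",
     "branch": "CSE", "address": "Kerala"},
    {"student_id": 11, "name": "Akon", "reg_no": "08-WY",
     "branch": "IT", "address": "Gujarat"},
]

id_determines_branch = holds(students, ["student_id"], ["branch"])
name_determines_branch = holds(students, ["name"], ["branch"])
```

Here student_id determines branch, while name does not, because the two students named Akon are in different branches.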
subject_id subject_name
1 Java
2 C++
3 Php
Now we have a Student table with student information and another table Subject for storing
subject information.
Let's create another table, Score, to store the marks obtained by students in the respective
subjects. We will also save the name of the teacher who teaches each subject along with the marks.
score_id student_id subject_id marks teacher
1 10 1 70 Java Teacher
2 10 2 75 C++ Teacher
3 11 1 80 Java Teacher
In the Score table we are saving the student_id to know which student's marks these are,
and the subject_id to know which subject the marks are for.
Together, student_id + subject_id forms a Candidate Key (learn about Database Keys) for this
table, which can be the Primary Key.
Confused about how this combination can be a primary key?
See, if I ask you to get me the marks of the student with student_id 10, can you get them from
this table? No, because you don't know for which subject. And if I give you only the subject_id,
you would not know for which student. Hence we need student_id + subject_id to uniquely identify
any row.
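The same argument can be tried out with Python's built-in sqlite3 module. This is only a sketch: it assumes the composite key is declared directly as the primary key and leaves out the score_id column for brevity.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Assumed schema: the candidate key (student_id, subject_id) is used
# as the primary key of the Score table.
db.execute(
    "CREATE TABLE score ("
    " student_id INTEGER, subject_id INTEGER, marks INTEGER, teacher TEXT,"
    " PRIMARY KEY (student_id, subject_id))"
)
db.executemany(
    "INSERT INTO score VALUES (?, ?, ?, ?)",
    [
        (10, 1, 70, "Java Teacher"),
        (10, 2, 75, "C++ Teacher"),
        (11, 1, 80, "Java Teacher"),
    ],
)

# student_id alone matches more than one row ...
rows_for_student = db.execute(
    "SELECT marks FROM score WHERE student_id = 10 ORDER BY subject_id"
).fetchall()

# ... but student_id + subject_id identifies exactly one row.
one_row = db.execute(
    "SELECT marks FROM score WHERE student_id = 10 AND subject_id = 2"
).fetchall()

# The composite key also rejects a second row for the same pair.
try:
    db.execute("INSERT INTO score VALUES (10, 1, 99, 'Java Teacher')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Querying by student_id alone returns two marks, while the full pair returns one, and the database refuses to store a second row for an existing pair.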
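In the Score table, however, the teacher's name depends only on subject_id and has nothing to do
with student_id, so it depends on just a part of the primary key. Once the teacher column is
moved out of the Score table, the table is in the Second Normal Form, with no partial dependency.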
score_id student_id subject_id marks
1 10 1 70
2 10 2 75
3 11 1 80
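A minimal sketch of the resulting design, using Python's built-in sqlite3 module. It assumes the teacher column is moved into the Subject table, which is one reasonable place for it since it depends only on subject_id. The teacher for subject 3 is made up for illustration.

```python
import sqlite3

norm = sqlite3.connect(":memory:")
norm.executescript("""
CREATE TABLE subject (
    subject_id   INTEGER PRIMARY KEY,
    subject_name TEXT,
    teacher      TEXT   -- assumed new home: depends only on subject_id
);
CREATE TABLE score (
    score_id   INTEGER PRIMARY KEY,
    student_id INTEGER,
    subject_id INTEGER REFERENCES subject(subject_id),
    marks      INTEGER,
    UNIQUE (student_id, subject_id)
);
INSERT INTO subject VALUES
    (1, 'Java', 'Java Teacher'),
    (2, 'C++',  'C++ Teacher'),
    (3, 'Php',  'Php Teacher');  -- teacher name assumed
INSERT INTO score VALUES (1, 10, 1, 70), (2, 10, 2, 75), (3, 11, 1, 80);
""")

# A change of teacher is now a single-row update ...
norm.execute("UPDATE subject SET teacher = 'New Java Teacher' WHERE subject_id = 1")

# ... and a join brings the teacher back next to the marks when needed.
report = norm.execute(
    "SELECT sc.student_id, sc.marks, su.teacher"
    " FROM score sc JOIN subject su ON su.subject_id = sc.subject_id"
    " ORDER BY sc.score_id"
).fetchall()
```

Because each teacher is stored once per subject, every score row for that subject picks up the change automatically.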
Quick Recap
1. For a table to be in the Second Normal Form, it should be in the First Normal Form and it
should not have any Partial Dependency.
2. Partial Dependency exists when, for a composite primary key, any attribute in the table
depends only on a part of the primary key and not on the complete primary key.
3. To remove Partial Dependency, we can divide the table, remove the attribute which is
causing the partial dependency, and move it to some other table where it fits well.