The document discusses data normalization, which is the process of organizing data in a database to eliminate redundancy and anomalies like insertion, update, and deletion anomalies. It covers the different normal forms including first normal form (1NF), second normal form (2NF), and third normal form (3NF) which aim to eliminate repeating groups, non-key attributes that are partially dependent on primary keys, and transitive dependencies, respectively, to optimize the database structure.
DATA NORMALISATION
Normalization was first developed by E. F. Codd. It is the process of organizing the data in a database so as to avoid data redundancy and the insertion, update and deletion anomalies that redundancy causes.

ANOMALIES
• Three types of anomaly can occur when a database is not normalized: insertion, update and deletion anomalies. The examples below assume an employee table that stores each employee's address and department (emp_dept) in the same row.
• Update anomaly: There are two rows for employee Rick because he belongs to two departments of the company. To update Rick's address we have to update it in both rows. If the address is corrected in one row but not in the other, the database shows two different addresses for Rick and the data is inconsistent.
• Insert anomaly: Suppose a new employee joins the company who is under training and not yet assigned to any department. If the emp_dept field does not allow nulls, we cannot insert this employee into the table.
• Delete anomaly: Suppose the company closes department D890. Deleting the rows that have emp_dept equal to D890 also deletes all information about employee Maggie, since she is assigned only to this department.
• To overcome these anomalies we need to normalize the data.

NORMAL FORMS
Redundant data can pose a huge problem in databases. First, someone has to enter the same data repeatedly. Second, if a change is made to one piece of data, the change has to be made in many places. For example, if customer Starks changes his name to Starks Johnson, you would have to find each of his rows in INVOICE and make the change there. The redundancy may also lead to anomalies.

FIRST NORMAL FORM (1NF)
A relation is in first normal form if the intersection of each row and column contains one and only one value.
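The "one and only one value" rule can be checked mechanically. The sketch below is only an illustration: the helper name `non_atomic_cells` and all sample values are hypothetical, and it treats any Python collection stored in a cell as a multi-valued entry.

```python
def non_atomic_cells(rows):
    """Return (row_index, column) for every cell that holds a collection
    instead of a single value."""
    offenders = []
    for i, row in enumerate(rows):
        for column, value in row.items():
            if isinstance(value, (list, tuple, set, dict)):
                offenders.append((i, column))
    return offenders


# Hypothetical sample data, not taken from the slides.
unnormalised = [
    {"InvNo": 1001, "CustName": "Starks", "ItemNo": ["A1", "B2"]},
    {"InvNo": 1002, "CustName": "Lee", "ItemNo": ["A1"]},
]
first_normal_form = [
    {"InvNo": 1001, "CustName": "Starks", "ItemNo": "A1"},
    {"InvNo": 1001, "CustName": "Starks", "ItemNo": "B2"},
    {"InvNo": 1002, "CustName": "Lee", "ItemNo": "A1"},
]

print(non_atomic_cells(unnormalised))       # cells that break the rule
print(non_atomic_cells(first_normal_form))  # empty list: every cell is single-valued
```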
To transform an unnormalised table (a table that contains one or more repeating groups) to first normal form, we identify and remove the repeating groups within the table. A repeating group is a set of columns that store similar information that repeats in the same table.
A table is said to be in first normal form, or 1NF, if the following conditions hold:
• The primary key is defined. This includes a composite key if a single column cannot be used as a primary key. In our INVOICE table, InvNo and ItemNo are defined as the composite primary key components.
• All non-key columns are functionally dependent on the primary key components. If you know the invoice number and the item number, you can find out the invoice date, customer number and name, item name and price, and quantity ordered.
• The table contains no multi-valued columns. In a single-valued column, the intersection of a row and a column returns only one value.
Columns of the INVOICE table: InvNo, InvDate, CustNo, ItemNo, CustName, ItemName, ItemPrice, Qty
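As a minimal sketch of the conversion described above, the following Python removes a repeating group and then checks that the composite key identifies each row. The column names follow the INVOICE table, but the "Items" field, the helper names and every data value are assumptions made for illustration.

```python
def to_first_normal_form(invoices):
    """Emit one flat row per (invoice, item) pair, removing the repeating group."""
    flat = []
    for inv in invoices:
        header = {k: v for k, v in inv.items() if k != "Items"}
        for item in inv["Items"]:
            flat.append({**header, **item})
    return flat


def is_key(rows, key_columns):
    """True if no two rows share the same values in key_columns."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        if key in seen:
            return False
        seen.add(key)
    return True


# Hypothetical unnormalised invoices: "Items" is the repeating group.
invoices = [
    {"InvNo": 1001, "InvDate": "2024-01-05", "CustNo": "C1", "CustName": "Starks",
     "Items": [
         {"ItemNo": "A1", "ItemName": "Bolt", "ItemPrice": 0.50, "Qty": 10},
         {"ItemNo": "B2", "ItemName": "Nut", "ItemPrice": 0.25, "Qty": 20},
     ]},
    {"InvNo": 1002, "InvDate": "2024-01-06", "CustNo": "C2", "CustName": "Lee",
     "Items": [
         {"ItemNo": "A1", "ItemName": "Bolt", "ItemPrice": 0.50, "Qty": 5},
     ]},
]

rows = to_first_normal_form(invoices)
print(len(rows))                          # one row per invoice line
print(is_key(rows, ["InvNo", "ItemNo"]))  # composite key identifies each row
print(is_key(rows, ["InvNo"]))            # InvNo alone repeats, so it is not a key
```

Flattening makes every cell single-valued, but it also repeats the invoice and customer details on every line. That redundancy is what the later normal forms address.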
SECOND NORMAL FORM (2NF)
Second normal form is based on the concept of full functional dependency. If A and B are attributes (or sets of attributes) of a relation, B is fully functionally dependent on A if B is functionally dependent on A but not on any proper subset of A. A functional dependency A → B is a partial dependency if some attribute can be removed from A and the dependency still holds.
A relation is in second normal form if it is in first normal form and every non-primary-key attribute is fully functionally dependent on the primary key. Thus no non-key attribute is functionally dependent on only part of the primary key.

THIRD NORMAL FORM (3NF)
A table is said to be in third normal form, or 3NF, if the following requirements are satisfied:
• All 2NF requirements are fulfilled.
• There is no transitive dependency, that is, no non-key attribute depends on the primary key only through another non-key attribute.
A table that has a transitive dependency is not in 3NF and must be decomposed further to achieve 3NF. A table in 2NF that does not contain any transitive dependency needs no further decomposition and is automatically in 3NF.
Other, higher normal forms are defined in some database texts. Boyce-Codd normal form (BCNF), fourth normal form (4NF), fifth normal form (5NF) and domain-key normal form (DKNF) are not covered in this text.
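To make the 2NF and 3NF steps concrete, here is a hedged sketch that looks for partial dependencies in a 1NF INVOICE table and then decomposes it. The helper names (`holds`, `project`), the resulting table names and all data values are hypothetical. Note that sample rows can only refute a functional dependency; whether one truly holds is a design decision.

```python
def holds(rows, determinant, dependent):
    """True if the sample rows do not contradict determinant -> dependent."""
    seen = {}
    for row in rows:
        lhs = tuple(row[c] for c in determinant)
        if seen.setdefault(lhs, row[dependent]) != row[dependent]:
            return False
    return True


def project(rows, columns):
    """Distinct rows restricted to the given columns, in first-seen order."""
    out, seen = [], set()
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(columns, values)))
    return out


cols = ["InvNo", "InvDate", "CustNo", "CustName", "ItemNo", "ItemName", "ItemPrice", "Qty"]
# Hypothetical 1NF data with composite key (InvNo, ItemNo).
rows = [dict(zip(cols, r)) for r in [
    (1001, "2024-01-05", "C1", "Starks", "A1", "Bolt", 0.50, 10),
    (1001, "2024-01-05", "C1", "Starks", "B2", "Nut", 0.25, 20),
    (1002, "2024-01-06", "C1", "Starks", "A1", "Bolt", 0.50, 5),
    (1003, "2024-01-07", "C2", "Lee", "B2", "Nut", 0.25, 7),
]]

key = ["InvNo", "ItemNo"]
non_key = [c for c in cols if c not in key]

# 2NF: find non-key attributes determined by only one part of the key.
partial = {attr: part for attr in non_key for part in key if holds(rows, [part], attr)}
print(partial)

# Move each partially dependent group into its own table.
invoice_2nf = project(rows, ["InvNo", "InvDate", "CustNo", "CustName"])
item = project(rows, ["ItemNo", "ItemName", "ItemPrice"])
invoice_item = project(rows, ["InvNo", "ItemNo", "Qty"])  # Qty needs the full key

# 3NF: CustName depends on InvNo only through CustNo (transitive dependency).
print(holds(invoice_2nf, ["CustNo"], "CustName"))
invoice = project(invoice_2nf, ["InvNo", "InvDate", "CustNo"])
customer = project(invoice_2nf, ["CustNo", "CustName"])
print(len(invoice), len(customer), len(item), len(invoice_item))
```

After the split, a customer's name is stored once in the customer table, so a name change touches a single row.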