Normalization Is a Method for Organizing Data Elements in a Database into Tables
Stacy Kovar
Normalization Avoids
Duplication of Data – The same data is listed in multiple rows of the database.
Insert Anomaly – A record about an entity cannot be inserted into the table without first
inserting information about another entity. Example: a customer cannot be entered without a
sales order.
Delete Anomaly – A record cannot be deleted without also deleting a record about a related
entity. Example: a sales order cannot be deleted without deleting all of the customer’s information.
Update Anomaly – Information cannot be updated without changing it in many places.
Example: to update customer information, it must be updated on each sales order the
customer has placed.
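The update anomaly can be made concrete with a small sketch. The rows, customer numbers, and addresses below are invented for illustration; only the field names (SalesOrderNo, CustomerNo, CustomerAdd) come from this handout.

```python
# Hypothetical flat (unnormalized) sales-order rows; all values are invented.
flat_orders = [
    {"SalesOrderNo": 1, "CustomerNo": 10, "CustomerAdd": "1 Elm St"},
    {"SalesOrderNo": 2, "CustomerNo": 10, "CustomerAdd": "1 Elm St"},
    {"SalesOrderNo": 3, "CustomerNo": 11, "CustomerAdd": "9 Oak Ave"},
]


def addresses_for(rows, customer_no):
    """Return the set of distinct addresses stored for one customer."""
    return {r["CustomerAdd"] for r in rows if r["CustomerNo"] == customer_no}


# Update anomaly: changing the address on only one order leaves the
# customer with two conflicting addresses.
flat_orders[0]["CustomerAdd"] = "5 Pine Rd"
conflicting = addresses_for(flat_orders, 10)

# Normalized layout: the address is stored once, keyed by CustomerNo.
customers = {10: "1 Elm St", 11: "9 Oak Ave"}
orders = [(1, 10), (2, 10), (3, 11)]  # (SalesOrderNo, CustomerNo)
customers[10] = "5 Pine Rd"           # one update, seen by every order
addresses_seen = {customers[cust] for _, cust in orders if cust == 10}
```

In the flat layout the customer ends up with two addresses after a partial update; in the split layout a single change is enough.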
Normalization is a three-stage process. After the first stage, the data is said to be in first normal
form; after the second, it is in second normal form; after the third, it is in third normal form.
Before Normalization
1. Begin with a list of all of the fields that must appear in the database. Think of this as one big
table.
2. Do not include computed fields (values that can be calculated from other fields).
3. One place to begin getting this information is from a printed document used by the system.
4. Additional attributes besides those for the entities described on the document can be added to
the database.
Fiction Company
202 N. Main
Manhattan, KS 66502
The repeating fields will be removed from the original data table, leaving the following.
SalesOrderNo, Date, CustomerNo, CustomerName, CustomerAdd, ClerkNo, ClerkName
All of these fields except the primary key will be removed from the original table. The primary
key will be left in the original table to allow linking of data:
SalesOrderNo, ItemNo, Qty, UnitPrice
Never treat price as dependent on item. Price may be different for different sales orders
(discounts, special customers, etc.)
Along with the unchanged table below, these tables make up a database in second normal form:
SalesOrderNo, Date, CustomerNo, CustomerName, CustomerAdd, ClerkNo, ClerkName
What if we did not Normalize the Database to Second Normal Form?
Repetition of Data – The description would appear every time there was an order for the item.
Delete Anomalies – All information about inventory items is stored in the
SalesOrderDetail table, so deleting a sales order also deletes the item.
Insert Anomalies – To insert an inventory item, a sales order must be inserted as well.
Update Anomalies – To change the description, it must be changed on every sales order.
All of these fields except the primary key will be removed from the original table. The primary
key will be left in the original table to allow linking of data as follows:
SalesOrderNo, Date, CustomerNo, ClerkNo
Together with the unchanged tables below, these tables make up the database in third normal
form.
ItemNo, Description
SalesOrderNo, ItemNo, Qty, UnitPrice
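The third-normal-form tables can be sketched as a schema using Python's built-in sqlite3 module. The Customer and Clerk tables are an assumption: they are inferred from the fields removed from the sales order table (CustomerName, CustomerAdd, ClerkName) and do not appear explicitly in the text above. Table names and all sample values are likewise illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Customer and Clerk are assumed tables (inferred from the removed fields).
CREATE TABLE Customer (CustomerNo INTEGER PRIMARY KEY, CustomerName TEXT, CustomerAdd TEXT);
CREATE TABLE Clerk (ClerkNo INTEGER PRIMARY KEY, ClerkName TEXT);
CREATE TABLE Item (ItemNo INTEGER PRIMARY KEY, Description TEXT);
CREATE TABLE SalesOrder (
    SalesOrderNo INTEGER PRIMARY KEY,
    Date TEXT,
    CustomerNo INTEGER REFERENCES Customer(CustomerNo),
    ClerkNo INTEGER REFERENCES Clerk(ClerkNo)
);
CREATE TABLE SalesOrderDetail (
    SalesOrderNo INTEGER REFERENCES SalesOrder(SalesOrderNo),
    ItemNo INTEGER REFERENCES Item(ItemNo),
    Qty INTEGER,
    UnitPrice REAL,  -- kept on the order line: price may differ per order
    PRIMARY KEY (SalesOrderNo, ItemNo)
);
""")

# Invented sample data: one order with two lines.
conn.execute("INSERT INTO Customer VALUES (?, ?, ?)", (10, "Ann", "1 Elm St"))
conn.execute("INSERT INTO Clerk VALUES (?, ?)", (7, "Bob"))
conn.executemany("INSERT INTO Item VALUES (?, ?)", [(100, "Widget"), (200, "Gadget")])
conn.execute("INSERT INTO SalesOrder VALUES (?, ?, ?, ?)", (1, "2024-01-15", 10, 7))
conn.executemany(
    "INSERT INTO SalesOrderDetail VALUES (?, ?, ?, ?)",
    [(1, 100, 4, 2.5), (1, 200, 1, 8.0)],
)

# Joins reassemble the original "one big table" view; the extended price
# is computed on the fly instead of being stored.
rows = conn.execute("""
    SELECT so.SalesOrderNo, c.CustomerName, i.Description, d.Qty * d.UnitPrice
    FROM SalesOrderDetail d
    JOIN SalesOrder so ON so.SalesOrderNo = d.SalesOrderNo
    JOIN Customer c ON c.CustomerNo = so.CustomerNo
    JOIN Item i ON i.ItemNo = d.ItemNo
    ORDER BY so.SalesOrderNo, i.ItemNo
""").fetchall()
customer_rows = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]
```

The customer name is stored in one row yet appears on every joined order line, which is the point of the decomposition.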
Answer:
Key is:
Student_ID, Course#, Semester#
Dependency is:
Student_ID, Course#, Semester# -> Grade
Q2. Choose a key and write the dependencies for the LINE_ITEMS relation:
LINE_ITEMS (PO_Number, ItemNum, PartNum, Description, Price, Qty)
Answer:
Key can be: PO_Number, ItemNum
Dependencies are:
PO_Number, ItemNum -> PartNum, Description, Price, Qty
PartNum -> Description, Price
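One way to check a proposed key is to compute the attribute closure under the listed dependencies. The sketch below uses the standard closure algorithm; the function and variable names are illustrative and not part of the exercise.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is already covered.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


line_items = {"PO_Number", "ItemNum", "PartNum", "Description", "Price", "Qty"}
fds = [
    (frozenset({"PO_Number", "ItemNum"}),
     frozenset({"PartNum", "Description", "Price", "Qty"})),
    (frozenset({"PartNum"}), frozenset({"Description", "Price"})),
]

key = {"PO_Number", "ItemNum"}
# A key must determine every attribute of the relation...
is_superkey = closure(key, fds) == line_items
# ...and no proper subset of it may do the same.
is_minimal = all(closure(key - {a}, fds) != line_items for a in key)
# PartNum alone determines only part of the relation, so it is not a key.
partnum_closure = closure({"PartNum"}, fds)
```

Here (PO_Number, ItemNum) passes both checks, while PartNum is a determinant that is not a key.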
Answer:
LINE_ITEMS cannot be in BCNF because not all determinants are keys
(PartNum is a determinant but not a key).
It also cannot be in 3NF because there is a transitive dependency:
PO_Number, ItemNum -> PartNum -> Description, Price
Answer:
STORE_ITEM is in 1NF: a non-key attribute (Vendor) is dependent on only part of the key.
Q5: Normalize the above (Q4) relation into the next higher normal form.
Answer:
STORE_ITEM (SKU, PromotionID, Price)
VENDOR_ITEM (SKU, Vendor, Style)
Q6: Choose a key and write the dependencies for the following SOFTWARE relation (assume all of the
vendor’s products have the same warranty).
SOFTWARE (SoftwareVendor, Product, Release, SystemReq, Price, Warranty)
SoftwareVendor, Product, Release -> SystemReq, Price, Warranty
Answer:
Key is: SoftwareVendor, Product, Release
SoftwareVendor, Product, Release -> SystemReq, Price, Warranty
SoftwareVendor -> Warranty
Therefore SOFTWARE is in 1NF (Warranty depends on only part of the key).
Answer:
SOFTWARE (SoftwareVendor, Product, Release, SystemReq, Price)
WARRANTY (SoftwareVendor, Warranty)
Answer:
2NF (Transitive dependencies exist)
Question 9: What normal form is the following relation in?
STUFF2 (D, O, N, T, C, R, Y)
D, O -> N, T, C, R, Y
C, R -> D
D -> N
Answer:
1NF (a partial key dependency exists: D -> N)
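The partial-dependency test behind this answer can be sketched in a few lines. This is a simplified check that assumes (D, O) is the key being tested; a full 2NF test would consider every candidate key. The function name is illustrative.

```python
def partial_dependencies(key, fds):
    """List FDs whose left side is a proper subset of the key and whose
    right side contains at least one attribute outside the key."""
    return [(lhs, rhs - key) for lhs, rhs in fds if lhs < key and rhs - key]


# Dependencies of STUFF2 as given in the question.
key = {"D", "O"}
fds = [
    ({"D", "O"}, {"N", "T", "C", "R", "Y"}),
    ({"C", "R"}, {"D"}),
    ({"D"}, {"N"}),
]

partial = partial_dependencies(key, fds)
# Any partial dependency means the relation is not in 2NF.
in_2nf = not partial
```

Only D -> N is reported: D is part of the key and N is outside it, so the relation stays in 1NF.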
Invoice Relation
Inv#  date   custID  Name  Part#  Desc  Price  #Used  Ext Price  Tax rate  Tax   Total
14    12/63  42      Lee   A38    Nut   0.32   10     3.20       0.10      1.22  13.42
14    12/63  42      Lee   A40    Saw   4.50   2      9.00       0.10      1.22  13.42
15    1/64   44      Pat   A38    Nut   0.32   20     6.40       0.10      0.64  7.04
Inv#  date   custID  Name  Part#  Desc  Price  #Used  Tax rate
14    12/63  42      Lee   A38    Nut   0.32   10     0.10
14    12/63  42      Lee   A40    Saw   4.50   2      0.10
15    1/64   44      Pat   A38    Nut   0.32   20     0.10
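The second version of the invoice relation drops Ext Price, Tax, and Total. A short sketch shows that these computed fields can be recovered from the stored ones, which is why they need not be kept. The formulas (tax applied to the invoice subtotal) are inferred from the sample figures, and the variable names are illustrative.

```python
from collections import defaultdict
from decimal import Decimal

# Stored (non-computed) fields from the invoice relation:
# (Inv#, Part#, Price, #Used, Tax rate)
lines = [
    (14, "A38", Decimal("0.32"), 10, Decimal("0.10")),
    (14, "A40", Decimal("4.50"), 2, Decimal("0.10")),
    (15, "A38", Decimal("0.32"), 20, Decimal("0.10")),
]

# Ext Price = Price * #Used, one value per invoice line.
ext_prices = [price * used for _, _, price, used, _ in lines]

# Tax and Total are per-invoice values computed from the line subtotal.
subtotal = defaultdict(Decimal)
rate = {}
for inv, _, price, used, tax_rate in lines:
    subtotal[inv] += price * used
    rate[inv] = tax_rate

tax = {inv: subtotal[inv] * rate[inv] for inv in subtotal}
total = {inv: subtotal[inv] + tax[inv] for inv in subtotal}
```

The results match the Ext Price, Tax, and Total columns of the original table (for example, 13.42 for invoice 14 and 7.04 for invoice 15).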
To get 2NF
- Remove partial dependencies
- Partial FDs with key attributes.
- Inv# -> Date, CustID, Name, Tax Rate
- Part# -> Desc, Price
Column groups: K1 = Inv#; D1 = date, custID, Name, Tax rate; K2 = Part#; D2 = Desc, Price
Inv#  date   custID  Name  Tax rate  Part#  Desc  Price  #Used
14    12/63  42      Lee   0.10      A38    Nut   0.32   10
14    12/63  42      Lee   0.10      A40    Saw   4.50   2
15    1/64   44      Pat   0.10      A38    Nut   0.32   20
Remove transitive FD (Inv# -> custID -> Name):
Inv#  date   custID  Tax rate
14    12/63  42      0.10
15    1/64   44      0.10
+
custID  Name
42      Lee
44      Pat
This page shows a basic database normalization example: transforming a BCNF table into 4NF
tables.
The next table is in BCNF; convert it to fourth normal form (4NF).
Jones X A
Jones Y B
So this data may be already in the table, which means that it’s repeated.
Jones B X
Jones A Y
To transform this into the 4th normal form (4NF) we must separate the original table into two
tables like this:
Employee Skill
Jones electrical
Jones mechanical
Smith plumbing
And
Employee Language
Jones French
Jones German
Smith Spanish
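A quick sketch can show that the two 4NF tables lose no information. It assumes the original single table paired every skill of an employee with every language of that employee, which is the usual situation with independent multi-valued facts; the original table itself is not reproduced in this text.

```python
# The two 4NF tables from the example.
employee_skill = [
    ("Jones", "electrical"),
    ("Jones", "mechanical"),
    ("Smith", "plumbing"),
]
employee_language = [
    ("Jones", "French"),
    ("Jones", "German"),
    ("Smith", "Spanish"),
]

# Natural join on Employee: every skill is paired with every language
# of the same employee (assumed shape of the original combined table).
combined = sorted(
    (emp, skill, lang)
    for emp, skill in employee_skill
    for emp2, lang in employee_language
    if emp == emp2
)

# Projecting the joined table back gives exactly the two 4NF tables,
# so the decomposition is lossless.
skills_back = sorted({(emp, skill) for emp, skill, _ in combined})
languages_back = sorted({(emp, lang) for emp, _, lang in combined})
```

The combined table needs five rows to hold what the two smaller tables hold in six short rows without any repeated skill/language pairings, and adding a new language for Jones would require two new rows in the combined table but only one in the 4NF layout.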
To normalize databases, there are certain rules to keep in mind. These pages will illustrate the
basics of normalization in a simplified way, followed by some examples.
Database normalization Rule 1: Eliminate Repeating Groups. Make a separate table for each set of
related attributes, and give each table a primary key.
In the original list of data, each puppy description is followed by a list of tricks the puppy has
learned. Some might know 10 tricks, some might not know any. To answer the question “Can
Fifi roll over?” we first need to find Fifi’s puppy record, then scan the list of tricks associated
with the record. This is awkward, inefficient, and extremely untidy.
Moving the tricks into a separate table helps considerably. Separating the repeating groups of
tricks from the puppy information results in first normal form. The puppy number in the trick
table matches the primary key in the puppy table, providing a foreign key for relating the two
tables with a join operation. Now we can answer our question with a direct retrieval: look to see if
Fifi’s puppy number and the trick ID for “roll over” appear together in the trick table.
First Normal Form:
Puppy Table
puppy number — primary key
puppy name
kennel name
kennel location
Trick Table
puppy number
trick ID
trick name
trick where learned
skill level
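The first-normal-form tables and the "Can Fifi roll over?" lookup can be sketched with Python's built-in sqlite3 module. The column names follow the lists above; the composite key (puppy number, trick ID) on the trick table, the data types, and every sample value (puppy number 1, trick ID 27, kennel names) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE puppy (
    puppy_number INTEGER PRIMARY KEY,
    puppy_name TEXT,
    kennel_name TEXT,
    kennel_location TEXT
);
CREATE TABLE trick (
    puppy_number INTEGER REFERENCES puppy(puppy_number),  -- foreign key
    trick_id INTEGER,
    trick_name TEXT,
    trick_where_learned TEXT,
    skill_level INTEGER,
    PRIMARY KEY (puppy_number, trick_id)  -- assumed composite key
);
-- Invented sample rows.
INSERT INTO puppy VALUES (1, 'Fifi', 'Happy Paws', 'Topeka');
INSERT INTO trick VALUES (1, 27, 'roll over', 'Happy Paws', 9);
INSERT INTO trick VALUES (1, 16, 'sit', 'Happy Paws', 7);
""")


def has_trick(conn, puppy_number, trick_id):
    """Direct retrieval: does this (puppy number, trick ID) pair exist?"""
    row = conn.execute(
        "SELECT 1 FROM trick WHERE puppy_number = ? AND trick_id = ?",
        (puppy_number, trick_id),
    ).fetchone()
    return row is not None


fifi_rolls_over = has_trick(conn, 1, 27)   # Fifi + "roll over"
fifi_knows_99 = has_trick(conn, 1, 99)     # a trick ID with no row
```

The lookup touches a single keyed row in the trick table instead of scanning a list of tricks embedded in each puppy record.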