Normalization in RDBMS

Normalization is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics like Insertion, Update and Deletion Anamolies.

Update anomalies − If data items are scattered and are not linked to each other properly, then it could lead to strange situations. For example, when we try to update one data item having its copies scattered over several places, a few instances get updated properly while a few others are left with old values. Such instances leave the database in an inconsistent state.

Deletion anomalies – When you tried to delete a record, but parts of it was not deleted because of unawareness, the data is also saved somewhere else.

Insert anomalies – When you tried to insert data in a record that does not exist at all.

Normalization is a method to remove all these anomalies and bring the database to a consistent state.

What are different types of Normal Forms?

  • 1st NF (1st normal form)
  • 2nd NF (2nd normal form)
  • 3rd NF (3rd normal form)
  • BCNF (BOYCE CODD NF)
  • 4th NF
  • 5th NF

1 NF (First Normal Form):

For First Normal form following rules must be followed:

  • Every column in the table must be unique
  • Separate tables must be created for each set of related data
  • Each table must be identified with a unique column or concatenated columns called the primary key
  • No rows may be duplicated
  • no columns may be duplicated
  • no row/column intersections contain a null value
  • no row/column intersections contain multivalued fields

2NF (Second Normal Form):

Second normal form states that it should meet all the rules for 1NF and there must be no partial dependences of any of the columns on the primary key.

A database is in second normal form if it satisfies the following conditions:

• It is in first normal form.

• All non-key attributes are fully functional dependent on the primary key.

Consider a table with following attributes:

 

This table is in first normal form, in that it obeys all the rules of first normal form. In this table, the primary key consists of ST_ID and BOOK_ID.

 

However, the table is not in second normal form because there are partial dependencies of primary keys and columns. ST_NAME is dependent on ST_ID, and there’s no real link between a Student’s name and what book he issued. Book name and author are also dependent on BOOK_ID, but they are not dependent on ST_ID, because there’s no link between a ST_ID and an AUTHOR_NAME or their ISSUE_DATE.

To make this table comply with second normal form, you need to separate the columns into three tables.

First, create a table to store the STUDENT details as follows:

 

Next, create a table to store details of each BOOK:

 

 

Finally, create a third table storing just ST_ID and BOOK_ID to keep track of all the books issue to a student:

 

 

3NF (Third Normal Form) 

 

First and foremost thing is that a table has to be in 2NF to be in 3NF. Next the rule is: remove to a new table any non-key attributes that are more dependent on other non-key attributes than the table key. Ignore tables with zero or only one non-key attribute (these go straight to 3NF with no conversion). 

The process is as follows:

If a non-key attribute is more dependent on another non-key attribute than the table key:

  • Move the dependent attribute, together with a copy of the non-key attribute upon which it is dependent, to a new table.
  • Make the non-key attribute, upon which it is dependent, the key in the new table. Underline the key in this new table.
  • Leave the non-key attribute, upon which it is dependent, in the original table and mark it a foreign key .

Thus a table is in third normal form if:

  • A table is in 2nd normal form.
  • It contains only columns that are non-transitively dependent on the primary key.