In this post, we will learn about normalization in dbms with examples. We will see different forms of normalization – 1 NF , 2 NF and 3 NF.
The normalization process takes a relation schema through a series of tests to clarify whether or not it belongs to a certain normal form.
The normalization process was first proposed by EF CODD and he introduced 3 normal forms, first, second and third normal form.
A stronger definition of 3 NF was proposed later by Boyce and Codd and is known as Boyce-Codd normal form (BCNF).
All three normal forms are based on the functional dependencies among the attributes of a relation. Normalization of data can be considered as a process during which unsatisfactory relation schemas are decomposed by breaking up their attributes into smaller relation schemas those posses desirable properties.
Normal forms provide database designers with:
- A formal framework for analyzing relation schemas based on their keys and on the functional dependencies among their attributes.
- A series of tests that can be carried out on individual relation schemas so that the relational database can be normalized to any degree. When a test fails, the relation violating that test must be decomposed into relations that individually meet the normalization tests.
The process of normalization in dbms through decomposition must also confirm the existence of additional properties that the relational schemas, taken together, should posses.
The properties are:
- Loss less join property.
- Dependency preservation property.
Unnormalized Relation
An unnormalized relation contains non-atomic values.
Now, we will see different forms on normalization in dbms – 1 NF, 2 NF and 3 NF
1. First Normal Form (1 NF)
A relation schema is said to be 1 NF, if its values in the domain of each attribute of the relation are atomic. In other words, only one value is associated with each attribute. And there should not be any repeating groups.
Example
Let’s say we have a product table which has a list of products and the color of the product
ProductID | ProductColor | ProductPrice | ProductDesc |
1 | Green | 500 | Shirt |
2 | White, Red, Black | 300 | Tshirt |
3 | Purple | 200 | Cap |
The above table is not in first normal form (1NF) because we can see there are multiple colors associated with a single product. For ProductID=1 we have 3 colors.
In order to make the above table in first normal form (1NF) we need to split the table into two.
ProductID | ProductPrice | ProductDesc |
1 | 500 | Shirt |
2 | 300 | Tshirt |
3 | 200 | Cap |
ProductID | ProductColor |
1 | Green |
2 | White, Red, Black |
3 | Purple |
2. Second Normal Form (2 NF)
For the database to be in second normal form it should meet two criteria’s. First criteria are that it should be in first normal form and second criteria are that all the non-key attributes should be functionally dependent on the primary key.
Example
Let’s say we have a table that contains information of teachers along with some other information and the subject they teach.
TeacherID | SubjectName | TeacherQualification | TeacherAge |
1001 | Database concepts | B.Tech | 42 |
1002 | C++ | B.Tech | 45 |
1003 | Java | Masters in Computer Science | 35 |
Let’s say TeacherID and SubjectName is Candidate Key and TeacherQualification and TeacherAge are non key attributes. Here if we want to retrieve TeacherAge then with the help of TeacherID we can retrieve. Hence TeacherAge is dependent on TeacherID and not on the other candidate key (SubjectName). Since the rule says all non-key attributes should be dependent on the primary key and not on the proper subset of any candidate key of the table. So we can spilt the tables as below.
TeacherID | SubjectName |
1001 | Database concepts |
1002 | C++ |
1003 | Java |
TeacherID | TeacherQualification | TeacherAge |
1001 | B.Tech | 42 |
1002 | B.Tech | 45 |
1003 | Masters in Computer Science | 35 |
3. Third Normal Form (3 NF)
A table is said to be in 3NF if it is complies with all the rules of Second Normal form (2NF) and it does not have transitive functional dependency.
Now, what is transitive functional dependency? Let’s say if A is functionally dependent on B & B is functionally dependent on C, then C is transitively dependent on A via B.
Example
Let say we have a customer table
CustID | CustName | CustPhone | CustCity | CustZipCode |
111 | Vikas | 9855555555 | Pimpri | 33344 |
222 | Dhruv | 8888855445 | Goregaon | 44433 |
333 | Akash | 8885565215 | Dadar | 44422 |
Here the CustCity is dependent on CustZipCode & CustZipCode depends on CustID.
In order to make this table in 3NF we break the table into 2.
CustID | CustName | CustPhone | CustZipCode |
111 | Vikas | 9855555555 | 33344 |
222 | Dhruv | 8888855445 | 44433 |
333 | Akash | 8885565215 | 44422 |
CustZipCode | CustCity |
33344 | Pimpri |
44433 | Goregaon |
44422 | Dadar |
After the dividing the Customer table into two, each column in the customer table is dependent on primary key.
InterView Questions on Normalization
1. What is First Normal Form?
Answer – A table is said to be in first normal form if the columns contain only atomic values and there are no repeating groups.
2. What is Second Normal Form?
Answer – A table is said to be in second normal form if it satisfies all rules of the first normal form and all non key attributes are functionally dependent on the primary key.
3. What is Third Normal Form?
A table is said to be in third normal form if it satisfies all the rules of second normal form and no transitive dependency exists.
That’s end of tutorial on normalization in dbms.