Normalization
Normalization is a systematic approach in relational database design that organizes data to minimize redundancy and dependency.
The primary goal is to divide large tables into smaller, manageable ones without losing data integrity, thus eliminating undesirable characteristics like insertion, update, and deletion anomalies.
First Normal Form (1NF)
A table is in the First Normal Form if:
- Atomicity: Each column contains atomic, indivisible values — no repeating groups or arrays.
- Uniqueness: Each record (row) is unique, identified by a primary key.
By ensuring 1NF, the table structure adheres to the basic relational model, making data retrieval and manipulation more efficient.
Second Normal Form (2NF)
A table is in the Second Normal Form if:
- It is already in 1NF.
- Full Functional Dependency: All non-key attributes are fully functionally dependent on the entire primary key.
This means that for tables with composite primary keys, every non-key attribute must depend on all components of the primary key, not just a part of it.
2NF eliminates partial dependencies, reducing data redundancy.
Third Normal Form (3NF)
A table is in the Third Normal Form if:
- It is already in 2NF.
- No Transitive Dependency: Non-key attributes are not dependent on other non-key attributes.
In other words, every non-key attribute should depend only on the primary key.
3NF removes transitive dependencies, further reducing redundancy and ensuring data integrity.
Example
Examples of achieving different normal forms in an Excel file from lecture.