🗃️

Database Normalization Overview

Jul 10, 2025

Overview

This lecture introduces normalization in databases, focusing on how it reduces data redundancy and prevents issues such as insertion, deletion, and update anomalies.

Problems with Unnormalized Databases

  • Data redundancy is the repetition of similar data in multiple places in a database.
  • Redundant data increases database size and wastes storage space.
  • Redundancy leads to three main problems: insertion anomaly, deletion anomaly, and update (modification) anomaly.

Anomalies Due to Redundancy

  • Insertion anomaly: Repeating branch information for each new student increases data repetition.
  • Deletion anomaly: Deleting the last student entry also deletes important branch information unintentionally.
  • Update anomaly: Changing the head of department (HOD) requires updating every student record, risking inconsistent data if some updates are missed.

How Normalization Solves These Problems

  • Normalization organizes data into multiple related tables to minimize redundancy.
  • Student and branch information are split into separate tables, connected by a common key (branch name).
  • Only branch name is repeated in the student table; other branch details are stored once in the branch table.
  • Updates to branch information are made in one place, reducing inconsistencies.
  • Inserting or deleting student records does not affect stored branch information.

Normal Forms in Normalization

  • Three basic normal forms exist: First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF).
  • An advanced form, BCNF (Boyce-Codd Normal Form), offers further normalization.
  • Each normal form addresses specific types of data redundancy and relationship issues.

Key Terms & Definitions

  • Normalization — Organizing data into related tables to minimize redundancy.
  • Data redundancy — Unnecessary repetition of data within a database.
  • Insertion anomaly — Problems caused by having to repeat data when inserting new records.
  • Deletion anomaly — Loss of important data when related data is deleted.
  • Update anomaly — Risks of inconsistent data when updating repeated information.
  • Normal forms — Structured steps of normalization: 1NF, 2NF, 3NF, and BCNF.

Action Items / Next Steps

  • Study the details of First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), and BCNF in upcoming lessons.