🗃️

Database Normalization and Keys

Sep 8, 2025

Overview

This lecture covers text-based relational notation in databases, the importance of referential integrity, how to test for relations, and the principles behind the first, second, and third normal forms in normalization.

Text-Based Relational Notation

  • Use underlining for primary keys and italics for foreign keys in notation examples.
  • Foreign keys connect related tables, e.g., employee to department.
  • Consistent naming for keys is recommended for clarity.

Referential Integrity

  • Referential integrity ensures foreign key values match existing primary key values in referenced tables.
  • Prevents data entry errors (e.g., misspelling department names).
  • Maintains database accuracy and prevents invalid data.

Testing for Relations

  • The "one and only one" rule helps determine if an attribute is valid for a relation.
  • Say it out loud: e.g., does a student have one and only one first name? Yes; one and only one class? No.
  • If not "one and only one," the design may need adjustment.

Normalization & Normal Forms

  • Normalization's main goal is to reduce or eliminate modification anomalies (inconsistencies after updates).
  • Modification anomalies occur when updates in one place do not propagate everywhere.
  • Duplication in tables signals potential design issues.

First Normal Form (1NF)

  • Table is a relation: each field contains atomic values, and every attribute depends on the primary key.
  • Violations occur if rows contain repeating groups or multiple values.
  • Each record should relate to one and only one key value.

Second Normal Form (2NF)

  • Table must first be in 1NF.
  • All non-key attributes must depend on the entire (compound) primary key.
  • Violations occur when non-key fields depend on part of a compound primary key.
  • Usually only applies to tables with compound keys.

Third Normal Form (3NF)

  • Table must first be in 2NF.
  • No transitive dependencies: non-key attributes depend only on the primary key, not on other non-key attributes.
  • Example: if "building fee" depends on "building," and "building" depends on "student," that's a 3NF violation.

Key Terms & Definitions

  • Primary Key — a unique identifier for table records, underlined in notation.
  • Foreign Key — an attribute in one table that refers to the primary key in another, italicized in notation.
  • Referential Integrity — enforcing valid links between foreign keys and referenced primary keys.
  • Modification Anomaly — inconsistencies from partial updates in redundant data.
  • Normalization — structuring tables to reduce redundancy and anomalies.
  • First Normal Form (1NF) — all fields contain atomic values and directly relate to the key.
  • Second Normal Form (2NF) — non-key attributes depend on the full compound primary key.
  • Third Normal Form (3NF) — non-key attributes depend only on the primary key, not other non-key attributes.
  • Transitive Dependency — a non-key attribute depending on another non-key attribute.

Action Items / Next Steps

  • Review examples on pages 166–171 for primary/foreign keys and normal form violations.
  • Practice applying the "one and only one" test to sample tables.
  • Read about functional and transitive dependencies for next class.