Normalization

Here’s a detailed guide based on the YouTube video from Decomplexify that explains Database Normalization from 1NF to 5NF in a simple, practical, and example-driven way.


📘 Database Normalization: A Complete Guide (1NF to 5NF)

🔍 What Is Database Normalization?

Normalization is the process of organizing data in a database so that it:

  • Avoids redundancy (duplicate data)

  • Maintains data integrity

  • Prevents contradictions (e.g., two birthdates for one customer)

A normalized database ensures:

  • Only one version of truth

  • Easier data management

  • Fewer anomalies (insertion, update, deletion)


🛠 Why Normalize?

Without normalization:

  • Contradictory data can exist (e.g., one customer with two birthdates)

  • It becomes hard to trust or query data accurately

  • It leads to inefficient, error-prone database design


✅ Normal Forms Explained

Each normal form is a level of "safety" against bad design — like safety levels for a bridge. Each form builds upon the previous.


1️⃣ First Normal Form (1NF)

📌 Rules:

  • No use of row order to store meaning

  • No mixing of data types in a column

  • Every table must have a primary key

  • No repeating groups or multivalued columns

🧠 Examples:

  • Don’t use row position to imply rank — store height/rank in a column.

  • All data in a column must be of the same type (e.g., only integers in a height column).

  • Include a primary key to uniquely identify each row.

  • Repeating groups like "arrows, shields, coins" should be broken into multiple rows.


2️⃣ Second Normal Form (2NF)

📌 Rule:

Each non-key attribute must depend on the whole primary key, not just part of it.

🧠 Example:

If a table’s key is (Player, Item_Type):

  • Item_Quantity depends on both → ✅ OK

  • Player_Rating depends only on Player → ❌ Breaks 2NF

🛠 Solution:

Move partial dependencies to separate tables. For example:

  • Create a Players table to store Player_ID, Player_Rating, etc.

  • Keep Player_Inventory focused only on items and quantities.


3️⃣ Third Normal Form (3NF)

📌 Rule:

Every non-key attribute must depend only on the primary key, no transitive dependencies.

🧠 Example:

  • Player_IDSkill_Level

  • Skill_LevelPlayer_Rating

This transitive chain means Player_Rating indirectly depends on Player_ID, violating 3NF.

🛠 Solution:

  • Remove Player_Rating from Player table

  • Create a new table: Skill_LevelsRating

Summary of 3NF:
Every attribute should depend on the key, the whole key, and nothing but the key.


⚖️ Boyce-Codd Normal Form (BCNF)

📌 Rule:

A stronger version of 3NF — removes anomalies caused by overlapping candidate keys.

In practice, 3NF ≈ BCNF for most systems.


4️⃣ Fourth Normal Form (4NF)

📌 Rule:

Only multivalued dependencies on the primary key are allowed.

🧠 Example:

In a birdhouse store:

  • Model → multiple available Colors

  • Model → multiple available Styles

Combining them in one table allows inconsistent combos (e.g., green allowed only with bungalow). That violates 4NF.

🛠 Solution:

  • Split into two tables:

    • Model_Colors_Available

    • Model_Styles_Available


5️⃣ Fifth Normal Form (5NF)

📌 Rule:

A table must not be representable as a join of multiple other tables unless all rows logically follow from those joins.

🧠 Example:

Ice cream preferences:

  • Who likes which brands

  • Who likes which flavors

  • Which brands offer which flavors

Trying to store all valid combinations in one table risks missed updates or inconsistent logic.

🛠 Solution:

Split into three tables:

  • Person_Brands_Liked

  • Person_Flavors_Liked

  • Brand_Flavors_Offered

Let SQL JOINs infer valid combinations dynamically.


🚨 Anomalies Addressed by Normalization

Anomaly

What It Means

Caused By

Insertion

Can’t add data unless related data exists

Violates 2NF

Update

Partial updates cause inconsistencies

Violates 2NF/3NF

Deletion

Deleting data removes unrelated needed info

Violates 2NF

Redundancy

Repeated data increases storage and error risk

Violates 1NF

Contradictions

Same entity has different data in different places

Violates 3NF+


🔁 Final Summary of Rules

Normal Form

Rule

1NF

No repeating groups, no mixed types, use a primary key

2NF

Every non-key attribute depends on the entire primary key

3NF

No transitive dependencies on non-key attributes

BCNF

Every determinant must be a candidate key

4NF

No multivalued dependencies except on the key

5NF

Cannot be decomposed into smaller tables with the same join result


✅ Key Takeaways

  • Normalization prevents bad data and anomalies.

  • Each normal form addresses a specific type of design issue.

  • Most real-world systems normalize up to 3NF or BCNF.

  • Higher forms (4NF and 5NF) are important in complex, multi-value systems.


Let me know if you want this guide as a PDF, Notion doc, or part of a technical documentation set!

Updated on