Signing you in…

Data Normalization: 1NF, 2NF, 3NF

Normalization: Bringing order to data

Imagine a university table where one cell lists all of a student’s courses separated by commas, and phone and address are mixed together. If someone changes their last name or moves, you would have to find and fix that chain in many places. Normalization is the deliberate splitting of data into tables and relationships to remove duplication, prevent contradictions, and simplify database maintenance.

Why bother? Three classes of anomalies
Open a card — what breaks without normalization
Insertion anomaly
🗑️
Deletion anomaly
✏️
Update anomaly
The normalization path: 1NF → 2NF → 3NF
Click the steps — decomposition order
1️⃣
1NF
Atomicity — remove lists inside cells
2️⃣
2NF
Whole key — remove partial dependencies
3️⃣
3NF
Only from the key — remove transitive dependencies
First normal form (1NF)

The 1NF rule is simple: each cell holds one indivisible value from a domain. No “comma-separated lists” in one field — use separate rows or a separate table for that.

1NF fix — “before” version
sql
1
-- Bad: 1NF violation — multiple phones in one cell
2
CREATE TABLE students_bad (
3
    id          SERIAL PRIMARY KEY,
4
    full_name   TEXT NOT NULL,
5
    phones      TEXT  -- '+7999..., +7911...'
6
);
1NF fix — “after” version
sql
1
-- Good: 1NF — one phone per row
2
CREATE TABLE students (
3
    id        SERIAL PRIMARY KEY,
4
    full_name TEXT NOT NULL
5
);
6
CREATE TABLE student_phones (
7
    id         SERIAL PRIMARY KEY,
8
    student_id INT NOT NULL REFERENCES students(id),
9
    phone      TEXT NOT NULL
10
);
Second normal form (2NF)

2NF matters when the primary key is composite. Every non-key attribute must depend on the entire key, not part of it. If the key is the pair “student + course,” then course duration depends only on the course: storing it on the enrollment row means copying the same value and risking drift.

2NF fix — “before” version
sql
1
-- Bad: partial dependency on a composite key
2
CREATE TABLE enrollments_bad (
3
    student_id   INT NOT NULL,
4
    course_id    INT NOT NULL,
5
    enrolled_at  DATE NOT NULL,
6
    course_duration_hours INT NOT NULL, -- duplicate per student
7
    PRIMARY KEY (student_id, course_id)
8
);
2NF fix — “after” version
sql
1
-- Good: duration stored once, in the course catalog
2
CREATE TABLE courses (
3
    id          SERIAL PRIMARY KEY,
4
    title       TEXT NOT NULL,
5
    duration_hours INT NOT NULL
6
);
7
CREATE TABLE enrollments (
8
    student_id  INT NOT NULL,
9
    course_id   INT NOT NULL REFERENCES courses(id),
10
    enrolled_at DATE NOT NULL,
11
    PRIMARY KEY (student_id, course_id)
12
);
Third normal form (3NF)

3NF forbids transitive dependencies: a non-key column must not depend on another non-key column. If a student has `city_id` and `city_name` is duplicated next to it, the city name depends on the city identifier, not on the student’s primary key. The name should live in a separate lookup table.

3NF fix — “before” version
sql
1
-- Bad: city_name depends transitively on city_id, not on student id
2
CREATE TABLE students_bad_city (
3
    id         SERIAL PRIMARY KEY,
4
    full_name  TEXT NOT NULL,
5
    city_id    INT NOT NULL,
6
    city_name  TEXT NOT NULL  -- duplicate lookup
7
);
3NF fix — “after” version
sql
1
-- Good: city lookup; student has only city_id
2
CREATE TABLE cities (
3
    id        SERIAL PRIMARY KEY,
4
    name      TEXT NOT NULL
5
);
6
CREATE TABLE students (
7
    id        SERIAL PRIMARY KEY,
8
    full_name TEXT NOT NULL,
9
    city_id   INT NOT NULL REFERENCES cities(id)
10
);
Cheat sheet

1NF • 2NF • 3NF at a glance

Normalization is neither dogma nor endless splitting: it removes redundancy until insert, delete, and update anomalies stop threatening the data. For teaching OLTP schemas, 3NF is often a reasonable starting point.
1NF: indivisible value in each cell
2NF: non-key attributes depend on the whole primary key
3NF: non-key attributes depend only on the key, not on each other
Fewer redundant copies — less drift and easier maintenance