Imagine a university table where one cell lists all of a student’s courses separated by commas, and phone and address are mixed together. If someone changes their last name or moves, you would have to find and fix that chain in many places. Normalization is the deliberate splitting of data into tables and relationships to remove duplication, prevent contradictions, and simplify database maintenance.
The 1NF rule is simple: each cell holds one indivisible value from a domain. No “comma-separated lists” in one field — use separate rows or a separate table for that.
-- Bad: 1NF violation — multiple phones in one cell
CREATE TABLE students_bad (
id SERIAL PRIMARY KEY,
full_name TEXT NOT NULL,
phones TEXT -- '+7999..., +7911...'
);
-- Good: 1NF — one phone per row
CREATE TABLE students (
id SERIAL PRIMARY KEY,
full_name TEXT NOT NULL
);
CREATE TABLE student_phones (
id SERIAL PRIMARY KEY,
student_id INT NOT NULL REFERENCES students(id),
phone TEXT NOT NULL
);
2NF matters when the primary key is composite. Every non-key attribute must depend on the entire key, not part of it. If the key is the pair “student + course,” then course duration depends only on the course: storing it on the enrollment row means copying the same value and risking drift.
-- Bad: partial dependency on a composite key
CREATE TABLE enrollments_bad (
student_id INT NOT NULL,
course_id INT NOT NULL,
enrolled_at DATE NOT NULL,
course_duration_hours INT NOT NULL, -- duplicate per student
PRIMARY KEY (student_id, course_id)
);
-- Good: duration stored once, in the course catalog
CREATE TABLE courses (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
duration_hours INT NOT NULL
);
CREATE TABLE enrollments (
student_id INT NOT NULL,
course_id INT NOT NULL REFERENCES courses(id),
enrolled_at DATE NOT NULL,
PRIMARY KEY (student_id, course_id)
);
3NF forbids transitive dependencies: a non-key column must not depend on another non-key column. If a student has `city_id` and `city_name` is duplicated next to it, the city name depends on the city identifier, not on the student’s primary key. The name should live in a separate lookup table.
-- Bad: city_name depends transitively on city_id, not on student id
CREATE TABLE students_bad_city (
id SERIAL PRIMARY KEY,
full_name TEXT NOT NULL,
city_id INT NOT NULL,
city_name TEXT NOT NULL -- duplicate lookup
);
-- Good: city lookup; student has only city_id
CREATE TABLE cities (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE students (
id SERIAL PRIMARY KEY,
full_name TEXT NOT NULL,
city_id INT NOT NULL REFERENCES cities(id)
);