Data Modeling
Normalization (1NF–3NF, BCNF), denormalization, schema evolution
Normalization reduces data redundancy by decomposing relations into progressively stricter normal forms: 1NF (atomic values, no repeating groups), 2NF (no non-key attribute depends on only part of a composite key), 3NF (no non-key attribute depends transitively on a key), and BCNF (the determinant of every non-trivial functional dependency is a superkey). In practice, normalized schemas (3NF/BCNF) suit OLTP write workloads because each fact is stored exactly once. Denormalization intentionally duplicates data to reduce JOIN count on read-heavy paths; the cost is that the copies must be kept in sync, either through application-level logic or by accepting eventual consistency.
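The BCNF condition can be checked mechanically with the standard attribute-closure algorithm. The following is a minimal sketch in Python; the function names (closure, bcnf_violations) and the Student/Course/Teacher relation are hypothetical and chosen only for illustration. Testing only the listed dependencies is enough for the relation as given, but it is not enough for relations produced by a decomposition, where dependencies must first be projected.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever the whole left side is already known.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)


def bcnf_violations(relation, fds):
    """List the non-trivial FDs in fds whose determinant is not a superkey."""
    relation = frozenset(relation)
    violations = []
    for lhs, rhs in fds:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        if rhs <= lhs:
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) >= relation:
            continue  # determinant is a superkey, allowed in BCNF
        violations.append((lhs, rhs))
    return violations


# Employee(EmpID, DeptID, DeptName): DeptID -> DeptName breaks 3NF and BCNF.
employee = {"EmpID", "DeptID", "DeptName"}
employee_fds = [({"EmpID"}, {"DeptID"}), ({"DeptID"}, {"DeptName"})]
employee_violations = bcnf_violations(employee, employee_fds)

# Hypothetical relation that is in 3NF but not BCNF: Course is part of a
# candidate key, yet Teacher (not a superkey) determines it.
enrollment = {"Student", "Course", "Teacher"}
enrollment_fds = [({"Student", "Course"}, {"Teacher"}), ({"Teacher"}, {"Course"})]
enrollment_violations = bcnf_violations(enrollment, enrollment_fds)

# After extracting Departments(DeptID, DeptName), no violation remains.
departments_violations = bcnf_violations(
    {"DeptID", "DeptName"}, [({"DeptID"}, {"DeptName"})]
)
```

Each reported pair is a dependency to split into its own relation; an empty list means the relation satisfies BCNF with respect to the dependencies supplied.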
Key Points
- 1NF violation example: storing "phone1, phone2" in a single column; fix by extracting a phones table with a foreign key.
- 2NF can only be violated when the key is composite (a 1NF table with a single-column key is automatically in 2NF): if OrderID+ProductID is the PK but ProductName depends only on ProductID, that is a partial dependency; extract a Products table keyed by ProductID.
- 3NF violation: in Employee(EmpID, DeptID, DeptName), DeptName depends on EmpID only transitively, via DeptID; extract a Departments(DeptID, DeptName) table.
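- BCNF is stricter than 3NF: a table that is in 3NF but not BCNF has a non-trivial functional dependency whose determinant is not a superkey and whose dependent attribute is part of some candidate key. This typically arises with overlapping composite candidate keys.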
- Denormalization patterns: embedding (sub-documents in MongoDB), materialized views (PostgreSQL MATERIALIZED VIEW), pre-computed aggregates, and summary tables.
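- Schema evolution strategies: distinguish backward-compatible changes (add a nullable column, add an index) from breaking changes (rename a column, change a type). For breaking changes, use the expand-contract pattern for zero-downtime migrations: add the new structure, write to both, backfill, switch reads, then drop the old structure.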
- Temporal modeling: valid-time tables record when a fact was true in reality, transaction-time tables record when the DB stored it, and bitemporal tables track both dimensions; bitemporal designs are used in finance and healthcare.
- Star schema: fact table at center (Orders, PageViews) surrounded by dimension tables (Date, Customer, Product); optimizes GROUP BY / aggregate queries in data warehouses.
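The 3NF key point above can be made concrete with a small sketch using Python's standard-library sqlite3 module and an in-memory database. The table names (employee_flat, employees, departments) and the sample rows are hypothetical. The sketch shows the update anomaly: renaming a department touches one row in the normalized schema, but one row per employee in the flat table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    -- Flat table: dept_name depends on dept_id, not directly on emp_id.
    CREATE TABLE employee_flat (
        emp_id    INTEGER PRIMARY KEY,
        emp_name  TEXT NOT NULL,
        dept_id   INTEGER NOT NULL,
        dept_name TEXT NOT NULL
    );
    INSERT INTO employee_flat VALUES
        (1, 'Ada',   10, 'Engineering'),
        (2, 'Grace', 10, 'Engineering'),
        (3, 'Linus', 20, 'Support');

    -- 3NF decomposition: each department name is stored exactly once.
    CREATE TABLE departments (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employees (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES departments(dept_id)
    );
    INSERT INTO departments SELECT DISTINCT dept_id, dept_name FROM employee_flat;
    INSERT INTO employees   SELECT emp_id, emp_name, dept_id FROM employee_flat;
""")

# Rename department 10 in both designs and count the rows each UPDATE touched.
normalized_rows_touched = conn.execute(
    "UPDATE departments SET dept_name = 'Platform' WHERE dept_id = 10"
).rowcount
flat_rows_touched = conn.execute(
    "UPDATE employee_flat SET dept_name = 'Platform' WHERE dept_id = 10"
).rowcount

# The read path now needs a JOIN to reassemble the original shape.
joined = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employees e
    JOIN departments d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()
```

The extra JOIN on the read path is the cost that denormalization patterns such as summary tables try to avoid.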
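The star schema key point above can be illustrated with a minimal sketch, again using Python's sqlite3 module in memory. The table names (fact_orders, dim_product, dim_date), the column layout, and the sample rows are all hypothetical; a real warehouse would have more dimensions and far larger fact tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables hold descriptive attributes used for grouping.
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        category   TEXT NOT NULL
    );
    CREATE TABLE dim_date (
        date_id INTEGER PRIMARY KEY,  -- surrogate key such as 20240115
        year    INTEGER NOT NULL,
        month   INTEGER NOT NULL
    );
    -- The fact table holds measures plus one foreign key per dimension.
    CREATE TABLE fact_orders (
        order_id     INTEGER PRIMARY KEY,
        product_id   INTEGER NOT NULL REFERENCES dim_product(product_id),
        date_id      INTEGER NOT NULL REFERENCES dim_date(date_id),
        quantity     INTEGER NOT NULL,
        amount_cents INTEGER NOT NULL
    );
    INSERT INTO dim_product VALUES
        (1, 'Keyboard', 'Hardware'),
        (2, 'Mouse', 'Hardware'),
        (3, 'Editor license', 'Software');
    INSERT INTO dim_date VALUES (20240115, 2024, 1), (20240210, 2024, 2);
    INSERT INTO fact_orders VALUES
        (100, 1, 20240115, 2, 10000),
        (101, 2, 20240115, 1,  2500),
        (102, 3, 20240210, 5, 25000),
        (103, 1, 20240210, 1,  5000);
""")

# Typical warehouse query: join the fact table to its dimensions, then aggregate.
revenue_by_category_month = conn.execute("""
    SELECT p.category, d.year, d.month, SUM(f.amount_cents) AS revenue_cents
    FROM fact_orders f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date    d ON d.date_id    = f.date_id
    GROUP BY p.category, d.year, d.month
    ORDER BY p.category, d.year, d.month
""").fetchall()
```

Every join goes from the fact table directly to one dimension, so queries stay one hop deep regardless of which attributes they group by.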
Real-World Example
Booking.com runs a highly normalized OLTP PostgreSQL schema for reservation data, then asynchronously materializes denormalized summary rows into a separate analytics schema. Shopify uses Rails migrations with strict expand-contract discipline — every migration must be backward-compatible for the zero-downtime deploy window.
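The expand-contract discipline mentioned above can be sketched as a sequence of small, individually safe steps. This is an illustrative sketch using Python's sqlite3 module, not Shopify's actual tooling; the users table and the rename of fullname to display_name are hypothetical. The final DROP COLUMN is guarded because SQLite only supports it from version 3.35.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT NOT NULL)")
conn.execute("INSERT INTO users (fullname) VALUES ('Ada Lovelace'), ('Grace Hopper')")

OLD_APP_READ = "SELECT fullname FROM users ORDER BY id"      # previous release
NEW_APP_READ = "SELECT display_name FROM users ORDER BY id"  # next release

# Step 1 (expand): add the new column as nullable so existing writers keep working.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
old_reads_after_expand = conn.execute(OLD_APP_READ).fetchall()

# Step 2 (dual write): the application writes both columns during the transition.
conn.execute(
    "INSERT INTO users (fullname, display_name) VALUES (?, ?)",
    ("Linus Torvalds", "Linus Torvalds"),
)

# Step 3 (backfill): copy existing values into the new column.
conn.execute("UPDATE users SET display_name = fullname WHERE display_name IS NULL")

# Step 4 (switch reads): the new release reads only the new column.
new_reads = conn.execute(NEW_APP_READ).fetchall()

# Step 5 (contract): drop the old column once no deployed code references it.
if sqlite3.sqlite_version_info >= (3, 35, 0):
    conn.execute("ALTER TABLE users DROP COLUMN fullname")

final_columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
```

Between steps 1 and 4 both the old and the new read query succeed, which is what lets two application versions run side by side during a deploy.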