
5 Schema Design Patterns Every Developer Should Know

TalkingSchema Team · 5 min read
Building the future of database design

Schema design isn't just about getting data to persist. It's about modeling the shape of your domain — the things that exist, the relationships between them, and the constraints that keep it all consistent. A good schema makes queries fast, business logic obvious, and migrations safe. A bad one makes all three miserable.

These are five patterns that appear in almost every non-trivial system, with notes on when to reach for each one.

1. Polymorphic associations (done right)

The naive version of polymorphism — a commentable_type + commentable_id column pair — works until you need a foreign key or an index. Instead, give each parent type its own join table:

-- Each commentable entity gets its own join table
CREATE TABLE post_comments (
  id BIGSERIAL PRIMARY KEY,
  post_id BIGINT NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
  comment_id BIGINT NOT NULL REFERENCES comments(id) ON DELETE CASCADE,
  UNIQUE (post_id, comment_id)
);

CREATE TABLE photo_comments (
  id BIGSERIAL PRIMARY KEY,
  photo_id BIGINT NOT NULL REFERENCES photos(id) ON DELETE CASCADE,
  comment_id BIGINT NOT NULL REFERENCES comments(id) ON DELETE CASCADE,
  UNIQUE (photo_id, comment_id)
);

You get real referential integrity, proper indexes, and queries that the query planner can reason about. The cost is more tables — which is almost always worth it.
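Reads go through ordinary joins, one branch per parent type. As a sketch, resolving which parent a given comment belongs to (with $1 standing in for the comment's id) could look like this:

```sql
-- Sketch: resolve a comment's parent across all commentable types,
-- one UNION ALL branch per join table.
SELECT 'post' AS parent_type, post_id AS parent_id
FROM post_comments
WHERE comment_id = $1
UNION ALL
SELECT 'photo' AS parent_type, photo_id AS parent_id
FROM photo_comments
WHERE comment_id = $1;
```

Each branch is an indexed lookup against its own join table, which is exactly what the commentable_type + commentable_id pair could never give the planner.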

2. Hierarchical data with closure tables

Self-referential parent_id columns are easy to write and brutal to query. Getting all descendants of a node requires a recursive CTE or application-level recursion. The closure table pattern pre-computes every ancestor-descendant relationship:

CREATE TABLE categories (
  id BIGSERIAL PRIMARY KEY,
  name TEXT NOT NULL
);

-- Every (ancestor, descendant) pair, including self-references (depth = 0)
CREATE TABLE category_tree (
  ancestor_id BIGINT NOT NULL REFERENCES categories(id),
  descendant_id BIGINT NOT NULL REFERENCES categories(id),
  depth INT NOT NULL DEFAULT 0,
  PRIMARY KEY (ancestor_id, descendant_id)
);

Getting all descendants of category 42 becomes a simple join:

SELECT c.*
FROM categories c
JOIN category_tree t ON c.id = t.descendant_id
WHERE t.ancestor_id = 42 AND t.depth > 0;

The trade-off: writes are more expensive (inserting a node adds one row per ancestor, plus its own self-reference), but reads stay a single indexed join no matter how deep the tree gets.
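That write path can be sketched in a single statement. Here, 99 and 7 are placeholder ids for the new category and its parent:

```sql
-- Sketch: insert new category 99 as a child of existing category 7.
-- Copy every path that ends at the parent, extended by one level,
-- then add the new node's self-reference (depth = 0).
INSERT INTO category_tree (ancestor_id, descendant_id, depth)
SELECT t.ancestor_id, 99, t.depth + 1
FROM category_tree t
WHERE t.descendant_id = 7
UNION ALL
SELECT 99, 99, 0;
```

Moving a subtree is the expensive operation in this pattern: its old paths must be deleted and rebuilt, which is why closure tables suit trees that grow more often than they reorganize.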

3. Audit logs with event sourcing

Instead of updating rows in place, append immutable events and reconstruct state from history. This is particularly useful for financial systems, compliance requirements, or anything where you need to ask "what did this look like on March 3rd?"

CREATE TABLE account_events (
  id BIGSERIAL PRIMARY KEY,
  account_id BIGINT NOT NULL REFERENCES accounts(id),
  event_type TEXT NOT NULL, -- 'created', 'credited', 'debited', 'closed'
  amount_cents BIGINT,      -- NULL for non-monetary events
  occurred_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  metadata JSONB
);

-- Current balance is a derived value, not stored
CREATE VIEW account_balances AS
SELECT
  account_id,
  SUM(CASE event_type
        WHEN 'credited' THEN amount_cents
        WHEN 'debited' THEN -amount_cents
        ELSE 0
      END) AS balance_cents
FROM account_events
GROUP BY account_id;

The event log is the source of truth. The balance is a projection. You can replay history, add new projections later, and audit every state transition for free.
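The "what did this look like on March 3rd?" question becomes the same aggregation as the view, restricted to events before a cutoff. A sketch, with the date literal and $1 as placeholders:

```sql
-- Sketch: balance of account $1 as of the end of March 3rd
SELECT SUM(CASE event_type
             WHEN 'credited' THEN amount_cents
             WHEN 'debited' THEN -amount_cents
             ELSE 0
           END) AS balance_cents
FROM account_events
WHERE account_id = $1
  AND occurred_at < '2024-03-04';
```

The only difference from the live balance is the occurred_at predicate, which is what makes point-in-time queries essentially free in this pattern.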

4. Soft deletes with deleted_at

Rather than DELETEing rows, set a deleted_at timestamp. Queries filter it out; the data stays recoverable.
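In practice both the "delete" and its undo are plain UPDATEs, sketched here against a hypothetical users table with a deleted_at column:

```sql
-- "Delete": mark the row instead of removing it
UPDATE users SET deleted_at = now() WHERE id = $1;

-- Recover: clear the marker
UPDATE users SET deleted_at = NULL WHERE id = $1;
```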

Approach        Recovery                      FK integrity                  Query simplicity
Hard delete     None                          Cascades automatically        Simple
Soft delete     Trivial (clear deleted_at)    Must filter in every query    Requires discipline
Archive table   Restorable (copy back)        Maintained                    Moderate

The catch with soft deletes is that every query must include the filter — easy to forget, especially in ORM-generated queries. A database view or row-level security policy can enforce it:

-- All queries against this view automatically exclude soft-deleted rows
CREATE VIEW active_users AS
SELECT * FROM users WHERE deleted_at IS NULL;
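The row-level security alternative mentioned above might be sketched like this (it assumes the application connects as a role that does not bypass RLS, e.g. not the table owner):

```sql
-- Hide soft-deleted rows from every query, not just view-based ones
ALTER TABLE users ENABLE ROW LEVEL SECURITY;

CREATE POLICY hide_soft_deleted ON users
  USING (deleted_at IS NULL);
```

Unlike the view, a policy also applies to queries that name the table directly, so ORM-generated SQL cannot forget the filter.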

Reserve soft deletes for entities that users expect to be recoverable (posts, documents, records). Don't use them for join tables or events — those should be hard deleted or append-only, respectively.

5. Slowly changing dimensions (SCD Type 2)

When historical accuracy matters — for analytics, audits, or reporting — you need to track how data changed over time, not just what it is now. Type 2 SCDs do this by adding validity windows to every version of a record:

CREATE TABLE product_prices (
  id BIGSERIAL PRIMARY KEY,
  product_id BIGINT NOT NULL REFERENCES products(id),
  price_cents INT NOT NULL,
  currency CHAR(3) NOT NULL DEFAULT 'USD',
  valid_from DATE NOT NULL,
  valid_until DATE, -- NULL means "current"
  is_current BOOLEAN NOT NULL DEFAULT TRUE,
  -- Requires: CREATE EXTENSION IF NOT EXISTS btree_gist;
  -- (GIST has no built-in equality operator class for plain columns)
  CONSTRAINT one_current_price_per_product
    EXCLUDE USING GIST (product_id WITH =) WHERE (is_current = TRUE)
);

To find the price on a specific date:

SELECT price_cents
FROM product_prices
WHERE product_id = $1
  AND valid_from <= $2
  AND (valid_until IS NULL OR valid_until > $2);

The EXCLUDE USING GIST constraint ensures only one price is marked is_current per product, enforced at the database level. (Since this particular constraint only tests equality, a partial unique index on product_id WHERE is_current would work too; the exclusion form is what you would extend, with range types, to enforce that no two validity windows for the same product overlap.)
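A price change then touches two rows: close the current version, open a new one. A sketch, assuming the two statements run inside a single transaction:

```sql
BEGIN;

-- Close the current version
UPDATE product_prices
SET valid_until = CURRENT_DATE, is_current = FALSE
WHERE product_id = $1 AND is_current;

-- Open the new version; is_current defaults to TRUE
INSERT INTO product_prices (product_id, price_cents, valid_from)
VALUES ($1, $2, CURRENT_DATE);

COMMIT;
```

Old versions are never updated after being closed, which is what makes historical queries against this table trustworthy.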


A note on schema evolution

These patterns are a starting point, not a destination. Real schemas evolve — columns get added, tables split, constraints tighten. The patterns above are designed to make evolution safer: append-only events don't need migrations; closure tables can be rebuilt; SCD rows are never updated, only closed.

If you're designing a new schema from scratch, try describing the domain in plain language first. What are the things? What are the relationships? What constraints are always true? The code follows from the answers — not the other way around.