Partial Indexes in PostgreSQL: Indexing Only the Hot Rows

Q: What a partial index is

A partial index is an ordinary index with a WHERE predicate. Only rows that satisfy the condition make it into the tree. If the table holds 50M orders but only a few thousand are pending, the second index is orders of magnitude smaller. Inserting a completed order never touches it — the row fails the predicate. That saves both disk space and write time. For the planner to use it, the query condition must logically match the index predicate (or be a subset of it):

Q: When it doesn't help, and engine differences

A partial index is not a silver bullet. It's useless when: - queries scan the whole range of a column rather than one subset; - the predicate covers most of the table (the "partial" benefit evaporates); - the query predicate doesn't match the index predicate — the planner just ignores it. Differences across engines: - PostgreSQL — full support for partial indexes and partial UNIQUE via CREATE INDEX ... WHERE. The reference implementation. - SQLite — supports the same CREATE INDEX ... WHERE syntax, including partial UNIQUE. - MySQL/InnoDB — no partial indexes. Workarounds: a generated column plus an index on it, or a separate "hot rows" table. Note: KEY (col(10)) in MySQL is a prefix index (p

Most indexes cover the whole table: every row, every value. But queries are rarely spread evenly across your data. You constantly hit pending orders, active users, open tasks — while a historical tail of millions of finished rows just sits there as dead weight. A partial index indexes only the subset you actually care about. The payoff: a smaller index that updates faster and is far more likely to stay cached.

What a partial index is

A partial index is an ordinary index with a WHERE predicate. Only rows that satisfy the condition make it into the tree.

-- Full index: every row of orders
CREATE INDEX idx_orders_status ON orders (status);

-- Partial index: only the "hot" orders
CREATE INDEX idx_orders_pending
  ON orders (created_at)
  WHERE status = 'pending';

If the table holds 50M orders but only a few thousand are pending, the second index is orders of magnitude smaller. Inserting a completed order never touches it — the row fails the predicate. That saves both disk space and write time.

For the planner to use it, the query condition must logically match the index predicate (or be a subset of it):

-- Uses idx_orders_pending
SELECT id, created_at
FROM orders
WHERE status = 'pending'
ORDER BY created_at;

Why it's faster and cheaper

The wins come from plain arithmetic — the index physically holds fewer entries:

Smaller size. Fewer pages on disk, higher odds the whole index fits in shared_buffers.
Faster writes. INSERT/UPDATE of rows outside the predicate don't update the index at all.
Cleaner statistics. The planner estimates cardinality over a narrow subset and misjudges less often.
Cheaper maintenance. VACUUM and rebuilds work over a much smaller dataset.

The classic case is a job queue. Workers only read unfinished rows:

CREATE INDEX idx_jobs_queue
  ON jobs (priority DESC, created_at)
  WHERE state IN ('queued', 'running');

Even with hundreds of millions of completed jobs in the table, the queue index stays tiny and hot.

Gotcha: the predicate must be deterministic and immutable. You cannot write WHERE created_at > now() - interval '7 days' — now() changes, so yesterday's index would cover the wrong rows tomorrow. Use comparisons against static values: WHERE status = 'pending', WHERE deleted_at IS NULL.

Partial UNIQUE for soft deletes

The most powerful use is a partial UNIQUE constraint. Say users must have a unique email, but you never hard-delete rows — you stamp deleted_at instead. A plain UNIQUE (email) would forbid registering a new account on a deleted user's email.

-- Won't work: an old deleted row blocks the email forever
CREATE UNIQUE INDEX ON users (email);

-- Uniqueness only among live rows
CREATE UNIQUE INDEX idx_users_email_active
  ON users (email)
  WHERE deleted_at IS NULL;

Now you can keep any number of deleted rows sharing an email, while among active rows (deleted_at IS NULL) the email stays unique. The same trick enforces "one default per user":

-- At most one primary address per user
CREATE UNIQUE INDEX idx_one_primary_address
  ON addresses (user_id)
  WHERE is_primary = true;

That's an integrity rule that's nearly impossible to express with a regular UNIQUE constraint without triggers.

When it doesn't help, and engine differences

A partial index is not a silver bullet. It's useless when:

queries scan the whole range of a column rather than one subset;
the predicate covers most of the table (the "partial" benefit evaporates);
the query predicate doesn't match the index predicate — the planner just ignores it.

Differences across engines:

PostgreSQL — full support for partial indexes and partial UNIQUE via CREATE INDEX ... WHERE. The reference implementation.
SQLite — supports the same CREATE INDEX ... WHERE syntax, including partial UNIQUE.
MySQL/InnoDB — no partial indexes. Workarounds: a generated column plus an index on it, or a separate "hot rows" table. Note: KEY (col(10)) in MySQL is a prefix index (part of the value), not a partial one.
SQL Server — has an equivalent called a filtered index: CREATE INDEX ... WHERE.
ClickHouse — a different breed of engine; instead of predicate indexes you use partitioning (PARTITION BY) and data-skipping indexes.

A practical recipe: find a query that filters over and over on the same narrow condition (status, deleted_at, is_active), and lift that condition into the index WHERE. Confirm with EXPLAIN (ANALYZE) that the planner picks the index and that on-disk size dropped. It's often the cheapest speedup available.

Partial Indexes in PostgreSQL: Indexing Only the Hot Rows

What a partial index is

Why it's faster and cheaper

Partial UNIQUE for soft deletes

When it doesn't help, and engine differences

Related articles

Practice on real tasks