sqlpostgresqlgrouping-setsaggregation

SQL GROUPING SETS: Subtotals and Grand Totals in One Query

Use GROUPING SETS to compute several grouping levels in a single pass and tell subtotal rows apart from real NULLs.

2 min readReferencesql · postgresql · grouping-sets · aggregation · group-by

GROUPING SETS lets a single query compute several aggregation levels at once — per status, per user, and a grand total — where you used to glue multiple GROUP BY blocks together with UNION ALL. It is not only shorter but faster: the table is scanned once.

The problem: a stack of UNIONs

Say that over orders(id, user_id, amount, status, created_at) you want three slices: totals by status, totals by user, and a grand total. The naive answer concatenates three queries.

SELECT status, NULL::int AS user_id, SUM(amount) AS total
FROM orders GROUP BY status
UNION ALL
SELECT NULL, user_id, SUM(amount)
FROM orders GROUP BY user_id
UNION ALL
SELECT NULL, NULL, SUM(amount)
FROM orders;

The downsides are obvious: the table is scanned three times, the query is long, and adding a fourth slice is easy to forget.

The fix: GROUPING SETS

The same result in one pass. Each tuple in parentheses is a separate grouping level; empty parentheses () produce the grand total.

SELECT status, user_id, SUM(amount) AS total
FROM orders
GROUP BY GROUPING SETS ((status), (user_id), ());
  • (status) — totals by status;
  • (user_id) — totals by user;
  • () — a single grand-total row.

For columns not part of a given set, the engine fills in NULL. That is exactly why the status-subtotal rows carry a NULL user_id.

GROUPING(): subtotal vs real NULL

Here is the classic trap: what if status genuinely contains NULL values (an order without a status)? Then you cannot eyeball the difference between a subtotal row and a row of real data. That is what GROUPING(col) is for: it returns 1 when the column is rolled up in the current set, and 0 when the value is real.

SELECT
  CASE WHEN GROUPING(status) = 1 THEN 'ALL STATUSES' ELSE status END AS status,
  CASE WHEN GROUPING(user_id) = 1 THEN 'ALL USERS' ELSE user_id::text END AS usr,
  SUM(amount) AS total
FROM orders
GROUP BY GROUPING SETS ((status), (user_id), ());

Gotcha: GROUPING(col) = 1 means "this column is not in the current set", not "the value is NULL". Do not confuse it with col IS NULL — they are different checks.

ROLLUP and CUBE as shorthands

GROUPING SETS is the primitive; ROLLUP and CUBE just generate common sets. ROLLUP (dept, manager_id) produces a hierarchy: by (dept, manager_id), then (dept), then the grand total — handy for reports with per-department subtotals.

SELECT dept, manager_id, SUM(salary) AS payroll, GROUPING(dept, manager_id) AS g
FROM employees
GROUP BY ROLLUP (dept, manager_id)
ORDER BY dept NULLS LAST, manager_id NULLS LAST;

CUBE (a, b) builds every combination: (a,b), (a), (b), () — a full cross-tab. GROUPING(dept, manager_id) returns a bitmask (for example 2 = binary 10 — only the second column is rolled up).

Compatibility

  • PostgreSQL — fully supports GROUPING SETS, ROLLUP, CUBE and GROUPING() since 9.5.
  • MySQL — only WITH ROLLUP (GROUP BY status, user_id WITH ROLLUP) and a GROUPING() function since 8.0; no general GROUPING SETS/CUBE.
  • ClickHouse — supports GROUP BY GROUPING SETS, ROLLUP, CUBE and grouping(), but row order is not guaranteed without an explicit ORDER BY.

Practical takeaway: when you need several slices at once, reach for GROUPING SETS instead of a stack of UNION ALL, and always tag subtotal rows with GROUPING() so you never mistake them for real NULLs.

Practice on real tasks

Solve tasks in the SQL trainer with instant grading and hints.

Open trainer