GROUPING SETS lets a single query compute several aggregation levels at once — per status, per user, and a grand total — where you used to glue multiple GROUP BY blocks together with UNION ALL. It is not only shorter but faster: the table is scanned once.
The problem: a stack of UNIONs
Say that over orders(id, user_id, amount, status, created_at) you want three slices: totals by status, totals by user, and a grand total. The naive answer concatenates three queries.
SELECT status, NULL::int AS user_id, SUM(amount) AS total
FROM orders GROUP BY status
UNION ALL
SELECT NULL, user_id, SUM(amount)
FROM orders GROUP BY user_id
UNION ALL
SELECT NULL, NULL, SUM(amount)
FROM orders;
The downsides are obvious: the table is scanned three times, the query is long, and adding a fourth slice is easy to forget.
The fix: GROUPING SETS
The same result in one pass. Each tuple in parentheses is a separate grouping level; empty parentheses () produce the grand total.
SELECT status, user_id, SUM(amount) AS total
FROM orders
GROUP BY GROUPING SETS ((status), (user_id), ());
(status) — totals by status;
(user_id) — totals by user;
() — a single grand-total row.
For columns not part of a given set, the engine fills in NULL. That is exactly why the status-subtotal rows carry a NULL user_id.
GROUPING(): subtotal vs real NULL
Here is the classic trap: what if status genuinely contains NULL values (an order without a status)? Then you cannot eyeball the difference between a subtotal row and a row of real data. That is what GROUPING(col) is for: it returns 1 when the column is rolled up in the current set, and 0 when the value is real.
SELECT
CASE WHEN GROUPING(status) = 1 THEN 'ALL STATUSES' ELSE status END AS status,
CASE WHEN GROUPING(user_id) = 1 THEN 'ALL USERS' ELSE user_id::text END AS usr,
SUM(amount) AS total
FROM orders
GROUP BY GROUPING SETS ((status), (user_id), ());
Gotcha: GROUPING(col) = 1 means "this column is not in the current set", not "the value is NULL". Do not confuse it with col IS NULL — they are different checks.
ROLLUP and CUBE as shorthands
GROUPING SETS is the primitive; ROLLUP and CUBE just generate common sets. ROLLUP (dept, manager_id) produces a hierarchy: by (dept, manager_id), then (dept), then the grand total — handy for reports with per-department subtotals.
SELECT dept, manager_id, SUM(salary) AS payroll, GROUPING(dept, manager_id) AS g
FROM employees
GROUP BY ROLLUP (dept, manager_id)
ORDER BY dept NULLS LAST, manager_id NULLS LAST;
CUBE (a, b) builds every combination: (a,b), (a), (b), () — a full cross-tab. GROUPING(dept, manager_id) returns a bitmask (for example 2 = binary 10 — only the second column is rolled up).
Compatibility
- PostgreSQL — fully supports
GROUPING SETS, ROLLUP, CUBE and GROUPING() since 9.5.
- MySQL — only
WITH ROLLUP (GROUP BY status, user_id WITH ROLLUP) and a GROUPING() function since 8.0; no general GROUPING SETS/CUBE.
- ClickHouse — supports
GROUP BY GROUPING SETS, ROLLUP, CUBE and grouping(), but row order is not guaranteed without an explicit ORDER BY.
Practical takeaway: when you need several slices at once, reach for GROUPING SETS instead of a stack of UNION ALL, and always tag subtotal rows with GROUPING() so you never mistake them for real NULLs.
GROUPING SETSlets a single query compute several aggregation levels at once — per status, per user, and a grand total — where you used to glue multipleGROUP BYblocks together withUNION ALL. It is not only shorter but faster: the table is scanned once.The problem: a stack of UNIONs
Say that over
orders(id, user_id, amount, status, created_at)you want three slices: totals by status, totals by user, and a grand total. The naive answer concatenates three queries.SELECT status, NULL::int AS user_id, SUM(amount) AS total FROM orders GROUP BY status UNION ALL SELECT NULL, user_id, SUM(amount) FROM orders GROUP BY user_id UNION ALL SELECT NULL, NULL, SUM(amount) FROM orders;The downsides are obvious: the table is scanned three times, the query is long, and adding a fourth slice is easy to forget.
The fix: GROUPING SETS
The same result in one pass. Each tuple in parentheses is a separate grouping level; empty parentheses
()produce the grand total.SELECT status, user_id, SUM(amount) AS total FROM orders GROUP BY GROUPING SETS ((status), (user_id), ());(status)— totals by status;(user_id)— totals by user;()— a single grand-total row.For columns not part of a given set, the engine fills in
NULL. That is exactly why the status-subtotal rows carry aNULLuser_id.GROUPING(): subtotal vs real NULL
Here is the classic trap: what if
statusgenuinely containsNULLvalues (an order without a status)? Then you cannot eyeball the difference between a subtotal row and a row of real data. That is whatGROUPING(col)is for: it returns1when the column is rolled up in the current set, and0when the value is real.SELECT CASE WHEN GROUPING(status) = 1 THEN 'ALL STATUSES' ELSE status END AS status, CASE WHEN GROUPING(user_id) = 1 THEN 'ALL USERS' ELSE user_id::text END AS usr, SUM(amount) AS total FROM orders GROUP BY GROUPING SETS ((status), (user_id), ());ROLLUP and CUBE as shorthands
GROUPING SETSis the primitive;ROLLUPandCUBEjust generate common sets.ROLLUP (dept, manager_id)produces a hierarchy: by(dept, manager_id), then(dept), then the grand total — handy for reports with per-department subtotals.SELECT dept, manager_id, SUM(salary) AS payroll, GROUPING(dept, manager_id) AS g FROM employees GROUP BY ROLLUP (dept, manager_id) ORDER BY dept NULLS LAST, manager_id NULLS LAST;CUBE (a, b)builds every combination:(a,b),(a),(b),()— a full cross-tab.GROUPING(dept, manager_id)returns a bitmask (for example2= binary10— only the second column is rolled up).Compatibility
GROUPING SETS,ROLLUP,CUBEandGROUPING()since 9.5.WITH ROLLUP(GROUP BY status, user_id WITH ROLLUP) and aGROUPING()function since 8.0; no generalGROUPING SETS/CUBE.GROUP BY GROUPING SETS,ROLLUP,CUBEandgrouping(), but row order is not guaranteed without an explicitORDER BY.Practical takeaway: when you need several slices at once, reach for
GROUPING SETSinstead of a stack ofUNION ALL, and always tag subtotal rows withGROUPING()so you never mistake them for realNULLs.