UNION and UNION ALL stack the results of two or more SELECT statements vertically: one query's rows are appended below another's. The difference is a single keyword, but it drives both correctness and performance.
UNION vs UNION ALL
UNION ALL simply concatenates the row sets and checks nothing for repeats. UNION additionally removes duplicates, and to find them the engine has to sort or hash the entire result.
SELECT country FROM users
UNION ALL
SELECT country FROM users;
SELECT country FROM users
UNION
SELECT country FROM users;
Rule of thumb: if you already know the inputs don't overlap (say, you're stacking orders from different months or rows from separate shards), reach for UNION ALL. The extra dedup over millions of rows costs real CPU seconds and memory for the sort, sometimes spilling to disk. Reach for UNION only when repeats are genuinely possible and they bother you.
UNION ALL — no dedup, row order isn't guaranteed, but it's usually cheaper and steadier on memory.
UNION — dedup across the whole column set, not just the first column, so the cost grows with row width.
- A duplicate is a row that matches entirely; here
NULL equals NULL (unlike a normal = comparison).
- Parentheses let you combine branches:
(a UNION ALL b) UNION c first concatenates the two parts, then dedups everything together.
Column compatibility rules
You can only combine queries with the same number of columns and compatible types. Column names come from the first SELECT; the rest are ignored.
SELECT id, email FROM users
UNION ALL
SELECT user_id, status FROM orders;
SELECT id, email FROM users
UNION ALL
SELECT id, user_id, amount FROM orders;
If the types differ, line them up explicitly with CAST rather than relying on implicit coercion. To make a heterogeneous stack meaningful, add a source column:
SELECT id, name, CAST(NULL AS numeric) AS amount, 'user' AS source
FROM users
UNION ALL
SELECT id, CAST(NULL AS text), amount, 'order'
FROM orders;
ORDER BY and LIMIT over the union
ORDER BY applies to the whole result and is written once, at the very end. Sorting individual branches is pointless: UNION may reshuffle the rows anyway.
SELECT id, name, salary FROM employees WHERE dept = 'eng'
UNION ALL
SELECT id, name, salary FROM employees WHERE dept = 'sales'
ORDER BY salary DESC
LIMIT 10;
If you need to sort or limit a specific branch, wrap it in parentheses and a subquery:
SELECT * FROM (
SELECT id, amount FROM orders WHERE status = 'paid'
ORDER BY amount DESC LIMIT 5
) AS top_paid
UNION ALL
SELECT id, amount FROM orders WHERE status = 'refunded';
Gotcha: the final ORDER BY sorts by column position or by the name from the first SELECT. If branches name columns differently, refer to the first branch's name or to the position number (ORDER BY 3).
Engine differences
- PostgreSQL — strict type checking;
UNION uses a Sort or Hash Aggregate, visible in EXPLAIN.
- MySQL — allows more implicit coercion and may silently truncate values; put a branch's
ORDER BY/LIMIT in a subquery or it gets ignored.
- ClickHouse — by default
UNION without ALL may require explicit configuration; here you almost always write UNION ALL and dedup via DISTINCT or aggregates.
Bottom line: UNION ALL is the default choice for speed, and you reach for UNION deliberately, only when repeats genuinely need to collapse.
UNIONandUNION ALLstack the results of two or moreSELECTstatements vertically: one query's rows are appended below another's. The difference is a single keyword, but it drives both correctness and performance.UNION vs UNION ALL
UNION ALLsimply concatenates the row sets and checks nothing for repeats.UNIONadditionally removes duplicates, and to find them the engine has to sort or hash the entire result.-- Keeps every row, including duplicates. Fast. SELECT country FROM users UNION ALL SELECT country FROM users; -- Collapses identical rows into one. Extra dedup pass. SELECT country FROM users UNION SELECT country FROM users;Rule of thumb: if you already know the inputs don't overlap (say, you're stacking orders from different months or rows from separate shards), reach for
UNION ALL. The extra dedup over millions of rows costs real CPU seconds and memory for the sort, sometimes spilling to disk. Reach forUNIONonly when repeats are genuinely possible and they bother you.UNION ALL— no dedup, row order isn't guaranteed, but it's usually cheaper and steadier on memory.UNION— dedup across the whole column set, not just the first column, so the cost grows with row width.NULLequalsNULL(unlike a normal=comparison).(a UNION ALL b) UNION cfirst concatenates the two parts, then dedups everything together.Column compatibility rules
You can only combine queries with the same number of columns and compatible types. Column names come from the first
SELECT; the rest are ignored.-- OK: same arity, compatible types SELECT id, email FROM users UNION ALL SELECT user_id, status FROM orders; -- Fails: 2 columns vs 3 columns SELECT id, email FROM users UNION ALL SELECT id, user_id, amount FROM orders;If the types differ, line them up explicitly with
CASTrather than relying on implicit coercion. To make a heterogeneous stack meaningful, add a source column:SELECT id, name, CAST(NULL AS numeric) AS amount, 'user' AS source FROM users UNION ALL SELECT id, CAST(NULL AS text), amount, 'order' FROM orders;ORDER BY and LIMIT over the union
ORDER BYapplies to the whole result and is written once, at the very end. Sorting individual branches is pointless:UNIONmay reshuffle the rows anyway.SELECT id, name, salary FROM employees WHERE dept = 'eng' UNION ALL SELECT id, name, salary FROM employees WHERE dept = 'sales' ORDER BY salary DESC LIMIT 10;If you need to sort or limit a specific branch, wrap it in parentheses and a subquery:
SELECT * FROM ( SELECT id, amount FROM orders WHERE status = 'paid' ORDER BY amount DESC LIMIT 5 ) AS top_paid UNION ALL SELECT id, amount FROM orders WHERE status = 'refunded';Engine differences
UNIONuses a Sort or Hash Aggregate, visible inEXPLAIN.ORDER BY/LIMITin a subquery or it gets ignored.UNIONwithoutALLmay require explicit configuration; here you almost always writeUNION ALLand dedup viaDISTINCTor aggregates.Bottom line:
UNION ALLis the default choice for speed, and you reach forUNIONdeliberately, only when repeats genuinely need to collapse.