Moving rows from one table to another looks trivial: INSERT ... SELECT, then DELETE. But between those two statements there is a window where data is either duplicated or lost. Data-modifying CTEs in PostgreSQL collapse the whole thing into one atomic statement.
The two-statement problem
The classic "archive then purge" snippet looks like this:
INSERT INTO orders_archive
SELECT * FROM orders WHERE status = 'cancelled';
DELETE FROM orders WHERE status = 'cancelled';
Two things can go wrong. First, between the INSERT and the DELETE another session can modify or remove a row, so you archive one thing and delete another. Second, if a crash lands between the statements, rows end up in both tables at once. Wrapping them in a transaction fixes the crash case, but the logic is still split across two statements and scans the table twice.
DELETE ... RETURNING inside WITH
PostgreSQL lets you put INSERT, UPDATE, and DELETE directly in a WITH clause. The deleted rows come back through RETURNING and feed straight into the next step:
WITH moved AS (
DELETE FROM orders
WHERE status = 'cancelled'
RETURNING *
)
INSERT INTO orders_archive
SELECT * FROM moved;
This is a single statement. Rows are deleted and archived in one transaction, with no in-between window and no second scan of the table. List the columns explicitly when the schemas differ:
WITH moved AS (
DELETE FROM orders
WHERE created_at < now() - interval '1 year'
RETURNING id, user_id, amount, status, created_at
)
INSERT INTO orders_archive (id, user_id, amount, status, archived_at)
SELECT id, user_id, amount, status, now()
FROM moved;
The single-snapshot rule
The key subtlety: every sub-statement in a data-modifying CTE sees the same snapshot, taken when the statement starts. One CTE's changes are not visible to the others through a plain SELECT -- they are visible only through RETURNING.
- If
users references the orders you delete, a SELECT FROM orders in the same statement still "sees" the rows a sibling CTE is deleting.
- The execution order of the sub-statements is not defined; rely only on the data flow through
RETURNING.
- Trying to modify the same row twice in one statement (two
UPDATEs, say) gives an undefined result and is not recommended.
This rule is exactly what makes the move correct: you operate on a fixed set of deleted rows, not on a live, shifting table.
Partitioning rows in one pass
The same trick fans rows out to several targets. Move terminated employees and count how many left each department at the same time:
WITH removed AS (
DELETE FROM employees
WHERE salary IS NULL
RETURNING id, name, manager_id, dept, salary
), archived AS (
INSERT INTO employees_archive
SELECT * FROM removed
RETURNING dept
)
SELECT dept, count(*) AS moved
FROM archived
GROUP BY dept;
The final SELECT returns a report of the work done -- handy for logs and assertions.
Gotchas and engine differences
- Gotcha:
AFTER DELETE triggers on the source table do fire, but data-modifying CTEs do not guarantee ordering relative to them; do not build trigger side effects into your move logic.
RETURNING * pulls every column of the deleted rows; if the source schema changes it silently drifts, so spell out the columns in production code.
- MySQL has no data-modifying CTEs: a
DELETE inside WITH is rejected. There you move rows with a transaction holding INSERT ... SELECT plus DELETE, or INSERT ... SELECT ... FOR UPDATE to lock the rows.
- ClickHouse is an analytical engine:
DELETE is a heavy mutation there and RETURNING does not exist. Moves are done with INSERT INTO ... SELECT plus a separate mutation or partition manipulation.
Data-modifying CTEs are the idiomatic PostgreSQL way to express "move these rows" as one atomic, readable intent.
Moving rows from one table to another looks trivial:
INSERT ... SELECT, thenDELETE. But between those two statements there is a window where data is either duplicated or lost. Data-modifying CTEs in PostgreSQL collapse the whole thing into one atomic statement.The two-statement problem
The classic "archive then purge" snippet looks like this:
INSERT INTO orders_archive SELECT * FROM orders WHERE status = 'cancelled'; DELETE FROM orders WHERE status = 'cancelled';Two things can go wrong. First, between the
INSERTand theDELETEanother session can modify or remove a row, so you archive one thing and delete another. Second, if a crash lands between the statements, rows end up in both tables at once. Wrapping them in a transaction fixes the crash case, but the logic is still split across two statements and scans the table twice.DELETE ... RETURNING inside WITH
PostgreSQL lets you put
INSERT,UPDATE, andDELETEdirectly in aWITHclause. The deleted rows come back throughRETURNINGand feed straight into the next step:WITH moved AS ( DELETE FROM orders WHERE status = 'cancelled' RETURNING * ) INSERT INTO orders_archive SELECT * FROM moved;This is a single statement. Rows are deleted and archived in one transaction, with no in-between window and no second scan of the table. List the columns explicitly when the schemas differ:
WITH moved AS ( DELETE FROM orders WHERE created_at < now() - interval '1 year' RETURNING id, user_id, amount, status, created_at ) INSERT INTO orders_archive (id, user_id, amount, status, archived_at) SELECT id, user_id, amount, status, now() FROM moved;The single-snapshot rule
The key subtlety: every sub-statement in a data-modifying CTE sees the same snapshot, taken when the statement starts. One CTE's changes are not visible to the others through a plain
SELECT-- they are visible only throughRETURNING.usersreferences theordersyou delete, aSELECT FROM ordersin the same statement still "sees" the rows a sibling CTE is deleting.RETURNING.UPDATEs, say) gives an undefined result and is not recommended.This rule is exactly what makes the move correct: you operate on a fixed set of deleted rows, not on a live, shifting table.
Partitioning rows in one pass
The same trick fans rows out to several targets. Move terminated employees and count how many left each department at the same time:
WITH removed AS ( DELETE FROM employees WHERE salary IS NULL RETURNING id, name, manager_id, dept, salary ), archived AS ( INSERT INTO employees_archive SELECT * FROM removed RETURNING dept ) SELECT dept, count(*) AS moved FROM archived GROUP BY dept;The final
SELECTreturns a report of the work done -- handy for logs and assertions.Gotchas and engine differences
AFTER DELETEtriggers on the source table do fire, but data-modifying CTEs do not guarantee ordering relative to them; do not build trigger side effects into your move logic.RETURNING *pulls every column of the deleted rows; if the source schema changes it silently drifts, so spell out the columns in production code.DELETEinsideWITHis rejected. There you move rows with a transaction holdingINSERT ... SELECTplusDELETE, orINSERT ... SELECT ... FOR UPDATEto lock the rows.DELETEis a heavy mutation there andRETURNINGdoes not exist. Moves are done withINSERT INTO ... SELECTplus a separate mutation or partition manipulation.Data-modifying CTEs are the idiomatic PostgreSQL way to express "move these rows" as one atomic, readable intent.