sqlpostgresqlmysqlupdate

UPDATE ... FROM: Bulk Updates Driven by Joins and Subqueries

Update a table from another table or an aggregate, and avoid the missing-WHERE trap that overwrites every row.

2 min readReferencesql · postgresql · mysql · update · joins

A plain UPDATE sets rows to fixed values. The moment the new value depends on another table or an aggregate, you reach for UPDATE ... FROM — the construct that joins the target table to a source right inside the update.

Basic syntax

In PostgreSQL the source goes in FROM, and the row-to-row link goes in WHERE. Let's copy each user's country onto their orders (a denormalization for reporting):

ALTER TABLE orders ADD COLUMN country text;

UPDATE orders o
SET country = u.country
FROM users u
WHERE o.user_id = u.id;

Here FROM users u is the source and WHERE o.user_id = u.id is the join condition. Every orders row inherits its user's country. Drop that condition and the result changes completely — which is exactly the trap in the next section.

The trap: a missing WHERE

The join condition in UPDATE ... FROM lives in the WHERE clause, not in a separate ON. Forget it and the engine pairs every target row with every source row (a Cartesian product), updating every row to the same value:

-- DANGER: no join condition
UPDATE orders o
SET country = u.country
FROM users u;
  • All orders get the country of some arbitrary user.
  • Recovery means a backup or a rollback.

Gotcha: wrap these statements in a transaction and dry-run them as a SELECT with the same FROM/WHERE before committing. In PostgreSQL, start with BEGIN;, run the UPDATE, inspect SELECT count(*), then COMMIT; only when the numbers look right.

Updating from an aggregate

The most common source is a grouped subquery. Let's store each user's order total in a dedicated column:

ALTER TABLE users ADD COLUMN total numeric DEFAULT 0;

UPDATE users u
SET total = s.total
FROM (
    SELECT user_id, SUM(amount) AS total
    FROM orders
    WHERE status = 'paid'
    GROUP BY user_id
) s
WHERE u.id = s.user_id;

Users with no paid orders never appear in the subquery and keep the default 0. If you also need to zero those out explicitly, add a separate UPDATE ... WHERE id NOT IN (...) or drive the whole thing with a LEFT JOIN style source.

Denormalizing computed fields

The same trick keeps hot aggregates close at hand so you don't recompute them on every read. For example, give a department-wide raise based on average pay:

UPDATE employees e
SET salary = salary * 1.10
FROM (
    SELECT dept
    FROM employees
    GROUP BY dept
    HAVING AVG(salary) < 50000
) low
WHERE e.dept = low.dept;

The subquery picks departments with a low average salary, and the outer UPDATE raises everyone in those departments by 10%.

  • Denormalization speeds up reads but needs a recompute whenever the source changes.
  • Keep such updates in the same transaction as the source write.

MySQL: UPDATE ... JOIN

MySQL has no UPDATE ... FROM. The join is written right after UPDATE with JOIN, and the link goes in ON:

UPDATE users u
JOIN (
    SELECT user_id, SUM(amount) AS total
    FROM orders
    WHERE status = 'paid'
    GROUP BY user_id
) s ON u.id = s.user_id
SET u.total = s.total;

The trap is the same in reverse: an INNER JOIN with no ON, or ON 1=1, brings the Cartesian product right back. ClickHouse goes further still — it has no ordinary row-by-row UPDATE; you use the asynchronous ALTER TABLE ... UPDATE, and you typically join tables through engines and SELECT instead.

Bottom line: PostgreSQL puts the source in FROM and the link in WHERE, MySQL puts both in JOIN ... ON. One rule rules them all: without a join condition, you update the whole table.

Practice on real tasks

Solve tasks in the SQL trainer with instant grading and hints.

Open trainer