sqlpostgresqlnot-existsnot-in

NOT EXISTS vs NOT IN: the NULL Trap and Anti-Joins

Why NOT IN silently returns nothing when the subquery has a NULL, and when to pick NOT EXISTS or LEFT JOIN ... IS NULL.

3 min readReferencesql · postgresql · not-exists · not-in · anti-join · null

You can find rows with no match in another table three ways: NOT IN, NOT EXISTS, and LEFT JOIN ... IS NULL. They look interchangeable, but NOT IN hides a nasty NULL trap that quietly wipes out your result one day. Let's compare them over a users, orders, employees schema.

The task: users with no orders

The classic anti-join: who never placed an order. The intuitive NOT IN version:

SELECT u.id, u.email
FROM users u
WHERE u.id NOT IN (SELECT user_id FROM orders);

As long as orders.user_id has no NULL, this is fine. But user_id may be nullable (guest checkout, soft deletes), and then the query breaks -- no error, no warning.

The NULL trap in NOT IN

NOT IN (...) expands to a chain id <> v1 AND id <> v2 AND .... If any value is NULL, the comparison id <> NULL is neither TRUE nor FALSE but UNKNOWN. In three-valued logic TRUE AND UNKNOWN = UNKNOWN, so the row is dropped. The result: a single NULL in the subquery collapses the whole result to empty.

-- orders contains one row with user_id IS NULL.
-- This returns NOTHING, even though some users have no orders:
SELECT u.id, u.email
FROM users u
WHERE u.id NOT IN (SELECT user_id FROM orders);
  • This is standard SQL behavior, not a quirk of one engine.
  • NULL IN (...) and NULL NOT IN (...) never evaluate to TRUE either.
  • PostgreSQL, MySQL, and ClickHouse all behave the same way.

The band-aid is to filter NULL out: WHERE u.id NOT IN (SELECT user_id FROM orders WHERE user_id IS NOT NULL). It works, but it is easy to forget -- and even easier to forget on a column inside a composite key.

NOT EXISTS: safe by default

NOT EXISTS checks whether at least one row exists and uses plain two-valued existence logic, so NULL cannot derail it:

SELECT u.id, u.email
FROM users u
WHERE NOT EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.user_id = u.id
);

A row with o.user_id IS NULL simply fails the o.user_id = u.id predicate and is quietly skipped -- no collapse. The SELECT 1 inside is an idiom: the select list in EXISTS is never evaluated, only the presence of a row matters.

Correlated subqueries also compose nicely with multi-column conditions:

-- Employees with no reports (nobody lists them as manager_id):
SELECT e.id, e.name, e.dept
FROM employees e
WHERE NOT EXISTS (
    SELECT 1
    FROM employees s
    WHERE s.manager_id = e.id
);

LEFT JOIN ... IS NULL

The third option is an outer join filtered on the non-match. It is also NULL-safe as an anti-join condition:

SELECT u.id, u.email
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE o.user_id IS NULL;
  • Test IS NULL on the join key itself (o.user_id), not on some arbitrary nullable column of orders.
  • In a one-to-many relationship the join multiplies rows, so you often need DISTINCT or GROUP BY afterwards -- extra work. NOT EXISTS avoids that entirely.

Performance and anti-join plans

Modern PostgreSQL can run both NOT EXISTS and LEFT JOIN ... IS NULL as a real anti join (Hash Anti Join / Merge Anti Join in EXPLAIN), stopping at the first match. With a nullable column, the planner is not allowed to turn NOT IN into an anti-join because of NULL semantics, and you often get a heavier plan.

  • On PostgreSQL prefer NOT EXISTS for both NULL-safety and speed.
  • MySQL 8 also optimizes NOT EXISTS well; historically NOT IN could be markedly slower.
  • ClickHouse offers a clean LEFT ANTI JOIN -- an explicit anti-join with no subquery.
  • Back the correlated NOT EXISTS predicate with an index (for example on orders(user_id)).

Bottom line: make NOT EXISTS your default for "rows with no match." Keep NOT IN for static, NULL-free lists (status NOT IN ('paid','shipped')), and reach for LEFT JOIN ... IS NULL when you also need columns from the unmatched side.

Practice on real tasks

Solve tasks in the SQL trainer with instant grading and hints.

Open trainer