sqlpostgresqljoinsself-join

SQL Self-Joins: Joining a Table to Itself

Use table aliases to join a table to itself: employee→manager hierarchies, comparing rows within one table, and generating pairs without duplicates.

3 min readReferencesql · postgresql · joins · self-join · query-patterns

A self-join is an ordinary JOIN where both sides reference the same table. There's no special syntax for it: you just list the table twice and give it two different aliases. It sounds odd, but it's the workhorse for hierarchies (employee → manager), comparing rows against each other, and building pairs. Let's work through it on a concrete schema.

Throughout, we'll use this employees table:

CREATE TABLE employees (
    id          bigint PRIMARY KEY,
    name        text NOT NULL,
    manager_id  bigint REFERENCES employees(id),  -- points back to this table
    department  text,
    salary      numeric(10, 2),
    hired_at    date
);

Why join a table to itself at all

The core idea: one row of a table needs to "see" another row of the same table. If managers lived in a separate managers table, you'd write a normal JOIN. But manager_id points back into employees, so you have to bring the table in twice.

Aliases aren't optional decoration here — they're required. Without them the engine can't tell the "employee row" from the "manager row," and you'll get an ambiguous-column error.

  • e — the employee's row;
  • m — that employee's manager row;
  • the join condition links e.manager_id to m.id.

The employee → manager hierarchy

The textbook case is showing each employee next to their manager:

SELECT e.name        AS employee,
       m.name        AS manager
FROM employees AS e
JOIN employees AS m ON m.id = e.manager_id;

There's a trap here: the CEO has manager_id IS NULL, and a plain (inner) JOIN silently drops that row — NULL matches nothing. To keep everyone, including the bossless, use a LEFT JOIN:

SELECT e.name                          AS employee,
       COALESCE(m.name, '— none —')    AS manager
FROM employees AS e
LEFT JOIN employees AS m ON m.id = e.manager_id
ORDER BY m.name NULLS FIRST, e.name;

COALESCE substitutes a placeholder for NULL, and NULLS FIRST floats the CEO to the top. One self-join gives you exactly one level of the hierarchy. For arbitrary depth (a manager's manager, and so on) you need a recursive WITH RECURSIVE CTE — but that's a separate topic.

Comparing rows within one table

Self-joins shine when you need to compare rows against each other. For example, find employees who out-earn their own manager:

SELECT e.name   AS employee,
       e.salary AS emp_salary,
       m.name   AS manager,
       m.salary AS mgr_salary
FROM employees AS e
JOIN employees AS m ON m.id = e.manager_id
WHERE e.salary > m.salary;

Or compare each person to a colleague in the same department to see who was hired first:

SELECT a.name AS earlier, b.name AS later, a.department
FROM employees AS a
JOIN employees AS b
  ON a.department = b.department
 AND a.hired_at < b.hired_at;       -- a was hired strictly before b

Notice that a.hired_at < b.hired_at does double duty: it both orders each pair by time and stops a row from joining to itself.

Pairs without duplicates or self-matches

If you want every pair of employees within a department (say, for pairing up code reviews), a naive JOIN on a.department = b.department produces junk: every row matches itself (Alice–Alice) and every pair shows up twice (Alice–Bob and Bob–Alice).

The fix is a single predicate — compare by id:

SELECT a.name AS person_a,
       b.name AS person_b,
       a.department
FROM employees AS a
JOIN employees AS b
  ON a.department = b.department
 AND a.id < b.id          -- strict < kills both self-pairs and duplicates
ORDER BY a.department, person_a;
  • a.id <> b.id would remove only the self-match, but duplicates would remain;
  • a.id < b.id removes both: each unordered pair survives exactly once.

This strict-inequality trick is the single most common pattern when generating pairs.

Gotcha. Mind performance: a self-join on an unindexed manager_id or department over a large table can degrade into an expensive Nested Loop. Index your join columns (CREATE INDEX ON employees(manager_id)) and check the plan with EXPLAIN ANALYZE.

Differences across databases

Self-join syntax is standard SQL and behaves the same in PostgreSQL, MySQL, and ClickHouse, but watch for a few wrinkles:

  • PostgreSQL / MySQL — everything above works as written; the LEFT JOIN to preserve rows with a NULL parent is mandatory.
  • MySQL before 8.0 had no WITH RECURSIVE, so deep hierarchies had to be built procedurally; 8.0+ supports recursion.
  • ClickHouseJOIN exists but is more costly and less flexible; hierarchies are often handled with array functions or denormalization, and NULL equality in JOIN behaves differently. On large data in ClickHouse, avoid an inequality self-join (a.id < b.id) — it parallelizes poorly.

Bottom line: a self-join is just a JOIN of a table with itself via aliases. Remember three moves: LEFT JOIN for NULL-parent hierarchies, column comparison for row analysis, and a strict id inequality for duplicate-free pairs.

Practice on real tasks

Solve tasks in the SQL trainer with instant grading and hints.

Open trainer