A self-join is an ordinary JOIN where both sides reference the same table. There's no special syntax for it: you just list the table twice and give it two different aliases. It sounds odd, but it's the workhorse for hierarchies (employee → manager), comparing rows against each other, and building pairs. Let's work through it on a concrete schema.
Throughout, we'll use this employees table:
CREATE TABLE employees (
id bigint PRIMARY KEY,
name text NOT NULL,
manager_id bigint REFERENCES employees(id),
department text,
salary numeric(10, 2),
hired_at date
);
Why join a table to itself at all
The core idea: one row of a table needs to "see" another row of the same table. If managers lived in a separate managers table, you'd write a normal JOIN. But manager_id points back into employees, so you have to bring the table in twice.
Aliases aren't optional decoration here — they're required. Without them the engine can't tell the "employee row" from the "manager row," and you'll get an ambiguous-column error.
e — the employee's row;
m — that employee's manager row;
- the join condition links
e.manager_id to m.id.
The employee → manager hierarchy
The textbook case is showing each employee next to their manager:
SELECT e.name AS employee,
m.name AS manager
FROM employees AS e
JOIN employees AS m ON m.id = e.manager_id;
There's a trap here: the CEO has manager_id IS NULL, and a plain (inner) JOIN silently drops that row — NULL matches nothing. To keep everyone, including the bossless, use a LEFT JOIN:
SELECT e.name AS employee,
COALESCE(m.name, '— none —') AS manager
FROM employees AS e
LEFT JOIN employees AS m ON m.id = e.manager_id
ORDER BY m.name NULLS FIRST, e.name;
COALESCE substitutes a placeholder for NULL, and NULLS FIRST floats the CEO to the top. One self-join gives you exactly one level of the hierarchy. For arbitrary depth (a manager's manager, and so on) you need a recursive WITH RECURSIVE CTE — but that's a separate topic.
Comparing rows within one table
Self-joins shine when you need to compare rows against each other. For example, find employees who out-earn their own manager:
SELECT e.name AS employee,
e.salary AS emp_salary,
m.name AS manager,
m.salary AS mgr_salary
FROM employees AS e
JOIN employees AS m ON m.id = e.manager_id
WHERE e.salary > m.salary;
Or compare each person to a colleague in the same department to see who was hired first:
SELECT a.name AS earlier, b.name AS later, a.department
FROM employees AS a
JOIN employees AS b
ON a.department = b.department
AND a.hired_at < b.hired_at;
Notice that a.hired_at < b.hired_at does double duty: it both orders each pair by time and stops a row from joining to itself.
Pairs without duplicates or self-matches
If you want every pair of employees within a department (say, for pairing up code reviews), a naive JOIN on a.department = b.department produces junk: every row matches itself (Alice–Alice) and every pair shows up twice (Alice–Bob and Bob–Alice).
The fix is a single predicate — compare by id:
SELECT a.name AS person_a,
b.name AS person_b,
a.department
FROM employees AS a
JOIN employees AS b
ON a.department = b.department
AND a.id < b.id
ORDER BY a.department, person_a;
a.id <> b.id would remove only the self-match, but duplicates would remain;
a.id < b.id removes both: each unordered pair survives exactly once.
This strict-inequality trick is the single most common pattern when generating pairs.
Gotcha. Mind performance: a self-join on an unindexed manager_id or department over a large table can degrade into an expensive Nested Loop. Index your join columns (CREATE INDEX ON employees(manager_id)) and check the plan with EXPLAIN ANALYZE.
Differences across databases
Self-join syntax is standard SQL and behaves the same in PostgreSQL, MySQL, and ClickHouse, but watch for a few wrinkles:
- PostgreSQL / MySQL — everything above works as written; the
LEFT JOIN to preserve rows with a NULL parent is mandatory.
- MySQL before 8.0 had no
WITH RECURSIVE, so deep hierarchies had to be built procedurally; 8.0+ supports recursion.
- ClickHouse —
JOIN exists but is more costly and less flexible; hierarchies are often handled with array functions or denormalization, and NULL equality in JOIN behaves differently. On large data in ClickHouse, avoid an inequality self-join (a.id < b.id) — it parallelizes poorly.
Bottom line: a self-join is just a JOIN of a table with itself via aliases. Remember three moves: LEFT JOIN for NULL-parent hierarchies, column comparison for row analysis, and a strict id inequality for duplicate-free pairs.
A self-join is an ordinary
JOINwhere both sides reference the same table. There's no special syntax for it: you just list the table twice and give it two different aliases. It sounds odd, but it's the workhorse for hierarchies (employee → manager), comparing rows against each other, and building pairs. Let's work through it on a concrete schema.Throughout, we'll use this employees table:
CREATE TABLE employees ( id bigint PRIMARY KEY, name text NOT NULL, manager_id bigint REFERENCES employees(id), -- points back to this table department text, salary numeric(10, 2), hired_at date );Why join a table to itself at all
The core idea: one row of a table needs to "see" another row of the same table. If managers lived in a separate
managerstable, you'd write a normalJOIN. Butmanager_idpoints back intoemployees, so you have to bring the table in twice.Aliases aren't optional decoration here — they're required. Without them the engine can't tell the "employee row" from the "manager row," and you'll get an ambiguous-column error.
e— the employee's row;m— that employee's manager row;e.manager_idtom.id.The employee → manager hierarchy
The textbook case is showing each employee next to their manager:
SELECT e.name AS employee, m.name AS manager FROM employees AS e JOIN employees AS m ON m.id = e.manager_id;There's a trap here: the CEO has
manager_id IS NULL, and a plain (inner)JOINsilently drops that row —NULLmatches nothing. To keep everyone, including the bossless, use aLEFT JOIN:SELECT e.name AS employee, COALESCE(m.name, '— none —') AS manager FROM employees AS e LEFT JOIN employees AS m ON m.id = e.manager_id ORDER BY m.name NULLS FIRST, e.name;COALESCEsubstitutes a placeholder forNULL, andNULLS FIRSTfloats the CEO to the top. One self-join gives you exactly one level of the hierarchy. For arbitrary depth (a manager's manager, and so on) you need a recursiveWITH RECURSIVECTE — but that's a separate topic.Comparing rows within one table
Self-joins shine when you need to compare rows against each other. For example, find employees who out-earn their own manager:
SELECT e.name AS employee, e.salary AS emp_salary, m.name AS manager, m.salary AS mgr_salary FROM employees AS e JOIN employees AS m ON m.id = e.manager_id WHERE e.salary > m.salary;Or compare each person to a colleague in the same department to see who was hired first:
SELECT a.name AS earlier, b.name AS later, a.department FROM employees AS a JOIN employees AS b ON a.department = b.department AND a.hired_at < b.hired_at; -- a was hired strictly before bNotice that
a.hired_at < b.hired_atdoes double duty: it both orders each pair by time and stops a row from joining to itself.Pairs without duplicates or self-matches
If you want every pair of employees within a department (say, for pairing up code reviews), a naive
JOINona.department = b.departmentproduces junk: every row matches itself (Alice–Alice) and every pair shows up twice (Alice–Bob and Bob–Alice).The fix is a single predicate — compare by
id:SELECT a.name AS person_a, b.name AS person_b, a.department FROM employees AS a JOIN employees AS b ON a.department = b.department AND a.id < b.id -- strict < kills both self-pairs and duplicates ORDER BY a.department, person_a;a.id <> b.idwould remove only the self-match, but duplicates would remain;a.id < b.idremoves both: each unordered pair survives exactly once.This strict-inequality trick is the single most common pattern when generating pairs.
Gotcha. Mind performance: a self-join on an unindexed
manager_idordepartmentover a large table can degrade into an expensiveNested Loop. Index your join columns (CREATE INDEX ON employees(manager_id)) and check the plan withEXPLAIN ANALYZE.Differences across databases
Self-join syntax is standard SQL and behaves the same in PostgreSQL, MySQL, and ClickHouse, but watch for a few wrinkles:
LEFT JOINto preserve rows with aNULLparent is mandatory.WITH RECURSIVE, so deep hierarchies had to be built procedurally; 8.0+ supports recursion.JOINexists but is more costly and less flexible; hierarchies are often handled with array functions or denormalization, andNULLequality inJOINbehaves differently. On large data in ClickHouse, avoid an inequality self-join (a.id < b.id) — it parallelizes poorly.Bottom line: a self-join is just a
JOINof a table with itself via aliases. Remember three moves:LEFT JOINforNULL-parent hierarchies, column comparison for row analysis, and a strictidinequality for duplicate-free pairs.