sqlpostgresqlarraysaggregation

ARRAY_AGG in PostgreSQL: Collect Grouped Values into an Array with ORDER BY and FILTER

Fold a group's rows into an ordered array with ARRAY_AGG, filter and dedupe it, unnest it back to rows, and learn the MySQL workaround.

2 min readReferencesql · postgresql · arrays · aggregation · mysql

ARRAY_AGG is an aggregate function that gathers the values of every row in a group into a single array. It turns "many rows per key" into "one row holding an array," which is handy for per-entity history, tag lists, or a compact export without a second query.

Syntax and a basic example

The signature matches any aggregate: ARRAY_AGG(expression), usually next to a GROUP BY. Inside the parentheses you can add ORDER BY and FILTER, and that is exactly what separates readable output from a random order.

SELECT
    user_id,
    ARRAY_AGG(amount ORDER BY created_at) AS amounts
FROM orders
GROUP BY user_id;

A few behaviors worth knowing:

  • Element order is not guaranteed without an ORDER BY inside the aggregate. Never rely on table row order.
  • The result type is an array of the expression's type: numeric[], text[], and so on.
  • NULL values land in the array too. If you do not want them, add FILTER (WHERE expr IS NOT NULL).
  • An empty group yields NULL, not an empty array {}.

ORDER BY and FILTER inside the aggregate

The most valuable part is ordering and filtering right in the call. This builds, for instance, a chronological history of order amounts, restricted to paid ones:

SELECT
    user_id,
    ARRAY_AGG(amount ORDER BY created_at)
        FILTER (WHERE status = 'paid') AS paid_history
FROM orders
GROUP BY user_id;

ARRAY_AGG(DISTINCT ...) removes duplicates, which is great for a set of countries or statuses:

SELECT
    country,
    ARRAY_AGG(DISTINCT status) AS seen_statuses
FROM orders o
JOIN users u ON u.id = o.user_id
GROUP BY country;

Gotcha: DISTINCT and ORDER BY coexist in one ARRAY_AGG only if the ORDER BY expression matches the aggregated one. ARRAY_AGG(DISTINCT amount ORDER BY created_at) fails, because you cannot sort by a column that is not in the projection after DISTINCT. Sort by the same column: ARRAY_AGG(DISTINCT amount ORDER BY amount).

Per-row history in one query

A common need is a compact "user, list of their activity" report. ARRAY_AGG assembles that without subqueries or application code:

SELECT
    u.id,
    u.email,
    ARRAY_AGG(o.id ORDER BY o.created_at DESC) AS order_ids,
    ARRAY_AGG(o.amount ORDER BY o.created_at DESC) AS amounts
FROM users u
JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.email;

For an employee hierarchy you can collect every direct report of each manager:

SELECT
    manager_id,
    ARRAY_AGG(name ORDER BY salary DESC) AS reports
FROM employees
WHERE manager_id IS NOT NULL
GROUP BY manager_id;

Unnest an array back into rows

The inverse of ARRAY_AGG is UNNEST, which expands an array into rows. Use it to iterate over elements or to join against another table:

SELECT id, UNNEST(tags) AS tag
FROM articles;

-- with original position via WITH ORDINALITY
SELECT id, t.tag, t.pos
FROM articles a, UNNEST(a.tags) WITH ORDINALITY AS t(tag, pos);

The ARRAY_AGG plus UNNEST pair is a full round trip: fold into an array, process it, and unfold it back.

MySQL: no arrays, only JSON

MySQL has no real array type, so there is no direct ARRAY_AGG either. The closest match is JSON_ARRAYAGG, which collects values into a JSON array:

-- MySQL 8: collect amounts into a JSON array
SELECT user_id, JSON_ARRAYAGG(amount) AS amounts
FROM orders
GROUP BY user_id;

Differences worth remembering:

  • JSON_ARRAYAGG has no built-in ORDER BY. To impose an order, aggregate over a pre-sorted subquery.
  • For simple string joining there is GROUP_CONCAT(amount ORDER BY created_at SEPARATOR ','), which is text, not an array.
  • In ClickHouse the equivalent is groupArray(amount), with groupArraySorted or arraySort(groupArray(...)) for an ordered result.

When you genuinely need an array with predictable order and filtering, PostgreSQL with ARRAY_AGG ... ORDER BY ... FILTER stays the most direct tool; in MySQL, plan for JSON and subquery-level sorting up front.

Practice on real tasks

Solve tasks in the SQL trainer with instant grading and hints.

Open trainer