DATE_BIN in PostgreSQL: Arbitrary-Width Time Buckets for Metrics

Q: Where DATE_TRUNC falls short

DATE TRUNC only knows a fixed set of fields: hour, day, month. It cannot build a 15-minute or a 6-hour bucket. With DATE BIN that is a single line: Counting activity in 6-hour windows is just as easy, signups for instance: Change only the stride and you get any step, with no subqueries and no EXTRACT arithmetic.

Q: Why the origin matters

The origin sets the phase of the grid. For 15 minutes or one hour it rarely matters, but as soon as stride does not divide a day evenly, the origin becomes critical. Here weeks are cut from January 1, not from the ISO Monday. Shift the origin to the weekday you want and you get "weeks" that begin exactly when the business needs them to. - Use the same origin across every query so buckets line up between reports. - For an offset day (a workday starting at 9am) set an origin like TIMESTAMP '2024-01-01 09:00:00'.

DATE_BIN arrived in PostgreSQL 14 and solves what DATE_TRUNC cannot: it snaps a timestamp to the start of a bucket of any width, 15 minutes, 6 hours, 10 days, measured from a chosen origin. It is the right tool for metrics on a non-standard step.

Syntax and intuition

The signature is simple: DATE_BIN(stride, source, origin). The first argument is an interval (the bucket width), the second is the timestamp you want to bucket, and the third is the origin, the point the intervals are counted from.

SELECT DATE_BIN(
  INTERVAL '15 minutes',
  TIMESTAMP '2024-01-01 14:37:09',
  TIMESTAMP '2024-01-01'
);
-- 2024-01-01 14:30:00

The mental model: picture an infinite ruler of stride-wide buckets starting at origin. DATE_BIN finds the bucket that source lands in and returns its left edge.

It always floors, just like DATE_TRUNC.
The result snaps to the origin grid, not to the top of an hour or day.
stride must be a whole number of fixed time units (seconds, minutes, hours, days, weeks).

Where DATE_TRUNC falls short

DATE_TRUNC only knows a fixed set of fields: hour, day, month. It cannot build a 15-minute or a 6-hour bucket. With DATE_BIN that is a single line:

SELECT
  DATE_BIN(INTERVAL '15 minutes', created_at, TIMESTAMP '2024-01-01') AS bucket,
  COUNT(*)    AS orders,
  SUM(amount) AS revenue
FROM orders
WHERE status = 'paid'
GROUP BY bucket
ORDER BY bucket;

Counting activity in 6-hour windows is just as easy, signups for instance:

SELECT
  DATE_BIN(INTERVAL '6 hours', created_at, TIMESTAMP '2024-01-01') AS shift,
  COUNT(*) AS signups
FROM users
GROUP BY shift
ORDER BY shift;

Change only the stride and you get any step, with no subqueries and no EXTRACT arithmetic.

Why the origin matters

The origin sets the phase of the grid. For 15 minutes or one hour it rarely matters, but as soon as stride does not divide a day evenly, the origin becomes critical.

SELECT DATE_BIN(
  INTERVAL '7 days',
  TIMESTAMP '2024-03-20 10:00:00',
  TIMESTAMP '2024-01-01'
);
-- 2024-03-18 00:00:00

Here weeks are cut from January 1, not from the ISO Monday. Shift the origin to the weekday you want and you get "weeks" that begin exactly when the business needs them to.

Use the same origin across every query so buckets line up between reports.
For an offset day (a workday starting at 9am) set an origin like TIMESTAMP '2024-01-01 09:00:00'.

Gotchas: types and time zones

The main trap is type mismatch. If source is timestamptz, then origin must be timestamptz too, or you get an error. And for timestamptz the bins are computed in UTC, so your "6 hours" buckets drift relative to local time.

SELECT
  DATE_BIN(
    INTERVAL '6 hours',
    created_at AT TIME ZONE 'America/New_York',
    TIMESTAMP '2024-01-01'
  ) AS shift_ny,
  COUNT(*)
FROM orders
GROUP BY shift_ny;

A couple more rakes to step on:

Months and years are forbidden: INTERVAL '1 month' raises an error because its length is not constant. Use DATE_TRUNC for months.
DATE_BIN only exists from PostgreSQL 14 onward, it is absent on 13 and below.

Alternatives: TimescaleDB and ClickHouse

DATE_BIN is solid, but the ecosystem offers more flexible options.

TimescaleDB ships time_bucket, the ancestor of DATE_BIN. It has compatible semantics but richer features: month-long intervals, an offset argument, and time zones passed directly in.

SELECT
  time_bucket(INTERVAL '15 minutes', created_at) AS bucket,
  SUM(amount) AS revenue
FROM orders
GROUP BY bucket
ORDER BY bucket;

ClickHouse solves the same problem with toStartOfInterval:

SELECT
  toStartOfInterval(created_at, INTERVAL 15 MINUTE) AS bucket,
  sum(amount) AS revenue
FROM orders
GROUP BY bucket
ORDER BY bucket;

The takeaway: reach for DATE_TRUNC for standard units, DATE_BIN for an arbitrary step on plain PostgreSQL, and time_bucket when you need months, offsets, and zones out of the box.