WIDTH_BUCKET answers a simple question: which histogram bin does a number land in if you slice a range into equal-width pieces? It is the right tool whenever you need to group a continuous value (order amount, salary, score) into uniform buckets and build a distribution.
Signature and logic
The canonical PostgreSQL call takes four arguments:
SELECT WIDTH_BUCKET(score, 0, 100, 10) AS bucket;
This says: take the range from 0 to 100, cut it into 10 equal pieces of width 10 each, and return the bucket number for score. A value in 0..10 lands in bucket 1, 10..20 in bucket 2, and so on. The lower bound is inclusive, the upper bound is exclusive.
The key feature is two extra buckets:
- if a value is below the lower bound, you get
0 (underflow);
- if a value is at or above the upper bound, you get
n + 1 (overflow).
You never silently drop rows: everything outside the range is neatly piled up at the edges.
Distributions with GROUP BY
The most common job is a histogram of order amounts. Slice amount from 0 to 1000 into 10 buckets and count each one:
SELECT
WIDTH_BUCKET(amount, 0, 1000, 10) AS bucket,
COUNT(*) AS orders,
ROUND(MIN(amount), 2) AS lo,
ROUND(MAX(amount), 2) AS hi
FROM orders
WHERE status = 'paid'
GROUP BY bucket
ORDER BY bucket;
Bucket 0 collects refunds and zero amounts, bucket 11 collects whales above 1000. To turn the number into a readable label, compute the edges right in the query:
SELECT
bucket,
(bucket - 1) * 100 AS range_lo,
bucket * 100 AS range_hi,
COUNT(*) AS orders
FROM (
SELECT WIDTH_BUCKET(amount, 0, 1000, 10) AS bucket
FROM orders
WHERE status = 'paid'
) t
GROUP BY bucket
ORDER BY bucket;
The same pattern works for salaries: bucket employees.salary into bands and immediately see where most of the headcount sits.
SELECT
WIDTH_BUCKET(salary, 30000, 150000, 6) AS band,
dept,
COUNT(*) AS headcount
FROM employees
GROUP BY band, dept
ORDER BY band, dept;
WIDTH_BUCKET vs NTILE
These two are easy to confuse, but they split data along different axes:
WIDTH_BUCKET cuts the value axis into equal-width segments. Buckets may be empty or overflowing, and that is fine, because the shape of the distribution is exactly what you want to see.
NTILE(n) cuts the row set into n groups of roughly equal size (quantiles). Interval widths float, but each group holds about the same number of rows.
Reach for WIDTH_BUCKET when fixed thresholds matter ("orders 0-100, 100-200..."). Reach for NTILE when you need percentiles ("top 10% of customers by spend"):
SELECT
user_id,
SUM(amount) AS lifetime,
NTILE(4) OVER (ORDER BY SUM(amount)) AS quartile
FROM orders
GROUP BY user_id;
Gotcha: bounds and NULLs
A few traps that burn people:
- The upper bound is exclusive. Exactly
WIDTH_BUCKET(100, 0, 100, 10) returns 11, not 10. To pull the maximum into the last real bucket, widen the upper limit or use LEAST(amount, 999.99).
- A
NULL input yields a NULL output, so those rows drop out of the distribution. Filter them explicitly or fold them in with COALESCE.
- You can pass the bounds reversed (upper < lower), which flips the numbering so buckets count downward. That is rarely what you want, so double-check the order.
Differences across engines
- PostgreSQL and Oracle support the four-argument
WIDTH_BUCKET directly.
- MySQL has no
WIDTH_BUCKET. Emulate it with arithmetic: FLOOR((amount - 0) / 100) + 1, and handle underflow and overflow by hand.
- ClickHouse offers
roundDown(amount, [0, 100, 200, ...]) and plain floor(amount / 100) for the same effect.
The golden rule: pin min, max, and the bucket count as deliberate business thresholds rather than fitting them to the current data, otherwise two neighboring reports stop being comparable.
WIDTH_BUCKETanswers a simple question: which histogram bin does a number land in if you slice a range into equal-width pieces? It is the right tool whenever you need to group a continuous value (order amount, salary, score) into uniform buckets and build a distribution.Signature and logic
The canonical PostgreSQL call takes four arguments:
SELECT WIDTH_BUCKET(score, 0, 100, 10) AS bucket; -- bucket 1..10This says: take the range from
0to100, cut it into10equal pieces of width10each, and return the bucket number forscore. A value in0..10lands in bucket 1,10..20in bucket 2, and so on. The lower bound is inclusive, the upper bound is exclusive.The key feature is two extra buckets:
0(underflow);n + 1(overflow).You never silently drop rows: everything outside the range is neatly piled up at the edges.
Distributions with GROUP BY
The most common job is a histogram of order amounts. Slice
amountfrom 0 to 1000 into 10 buckets and count each one:SELECT WIDTH_BUCKET(amount, 0, 1000, 10) AS bucket, COUNT(*) AS orders, ROUND(MIN(amount), 2) AS lo, ROUND(MAX(amount), 2) AS hi FROM orders WHERE status = 'paid' GROUP BY bucket ORDER BY bucket;Bucket
0collects refunds and zero amounts, bucket11collects whales above 1000. To turn the number into a readable label, compute the edges right in the query:SELECT bucket, (bucket - 1) * 100 AS range_lo, bucket * 100 AS range_hi, COUNT(*) AS orders FROM ( SELECT WIDTH_BUCKET(amount, 0, 1000, 10) AS bucket FROM orders WHERE status = 'paid' ) t GROUP BY bucket ORDER BY bucket;The same pattern works for salaries: bucket
employees.salaryinto bands and immediately see where most of the headcount sits.SELECT WIDTH_BUCKET(salary, 30000, 150000, 6) AS band, dept, COUNT(*) AS headcount FROM employees GROUP BY band, dept ORDER BY band, dept;WIDTH_BUCKET vs NTILE
These two are easy to confuse, but they split data along different axes:
WIDTH_BUCKETcuts the value axis into equal-width segments. Buckets may be empty or overflowing, and that is fine, because the shape of the distribution is exactly what you want to see.NTILE(n)cuts the row set intongroups of roughly equal size (quantiles). Interval widths float, but each group holds about the same number of rows.Reach for
WIDTH_BUCKETwhen fixed thresholds matter ("orders 0-100, 100-200..."). Reach forNTILEwhen you need percentiles ("top 10% of customers by spend"):SELECT user_id, SUM(amount) AS lifetime, NTILE(4) OVER (ORDER BY SUM(amount)) AS quartile FROM orders GROUP BY user_id;Gotcha: bounds and NULLs
A few traps that burn people:
WIDTH_BUCKET(100, 0, 100, 10)returns11, not10. To pull the maximum into the last real bucket, widen the upper limit or useLEAST(amount, 999.99).NULLinput yields aNULLoutput, so those rows drop out of the distribution. Filter them explicitly or fold them in withCOALESCE.Differences across engines
WIDTH_BUCKETdirectly.WIDTH_BUCKET. Emulate it with arithmetic:FLOOR((amount - 0) / 100) + 1, and handle underflow and overflow by hand.roundDown(amount, [0, 100, 200, ...])and plainfloor(amount / 100)for the same effect.The golden rule: pin
min,max, and the bucket count as deliberate business thresholds rather than fitting them to the current data, otherwise two neighboring reports stop being comparable.