SQLwindow-functionstutorialanalytics

Window functions in SQL: ROW_NUMBER, RANK, LAG/LEAD in practice

Window functions are the analyst's most-used tool in SQL. We'll break down ROW_NUMBER, RANK, LAG/LEAD and PARTITION BY through real cases: top-N per group, day-over-day metrics, cumulative sums.

4 min readSQL · window-functions · tutorial · analytics

If you're already comfortable with SELECT, WHERE and JOIN but get stuck on questions like "find the top 3 products in each category" or "what's the day-over-day revenue change in percent" — welcome to window functions. After JOIN, this is the analyst's most-used tool in SQL.

I'll show everything on a simple orders table:

CREATE TABLE orders (
  id          INT PRIMARY KEY,
  category    VARCHAR(50),
  product     VARCHAR(100),
  amount      NUMERIC(10, 2),
  created_at  DATE
);

Window functions vs GROUP BY

The key difference fits in one word: GROUP BY collapses rows. Window functions don't.

With GROUP BY you lose the detail. A million orders in, GROUP BY category out — five rows, one per category. Great for a report, but you can't see both the individual order and the average ticket for its category at the same time.

With a window function every row stays put and a computed value appears alongside it:

SELECT
  id,
  category,
  amount,
  AVG(amount) OVER (PARTITION BY category) AS category_avg
FROM orders;

Read it out loud: "average amount over the window, partitioned by category." For each order, the average for its category appears next to it — no other rows are dropped.

OVER(): three knobs

Inside OVER() live three settings. For most everyday queries you only need the first two.

1. PARTITION BY — how to slice

Without PARTITION BY the entire table is one window. With it, each distinct value of the column gets its own group:

-- Average across the whole table
AVG(amount) OVER ()

-- Average per category, separately
AVG(amount) OVER (PARTITION BY category)

2. ORDER BY — in which order

Use it when row position inside the window matters: "first", "second", "next door". Without ORDER BY, ROW_NUMBER is meaningless. With it, you get a numbering:

-- Orders inside a category, ranked by amount desc: 1, 2, 3, ...
ROW_NUMBER() OVER (PARTITION BY category ORDER BY amount DESC)

3. Frame clause — the optional third knob

The frame controls "which neighbour rows go into the aggregate." Remember just one rule for now: the moment ORDER BY shows up inside OVER(), an aggregate like SUM switches to running mode — accumulating from the start of the window up to the current row.

-- Running total from the start of each category
SUM(amount) OVER (PARTITION BY category ORDER BY created_at)

This is the most common beginner trap: you add ORDER BY to "make it look nicer" and the aggregate quietly turns into a running total. Tattoo it.

ROW_NUMBER, RANK, DENSE_RANK — three ways to number

They differ in how they handle ties in the ORDER BY column.

FunctionOn values 10, 10, 20 produces
ROW_NUMBER()1, 2, 3 — always unique
RANK()1, 1, 3 — same rank, then a gap
DENSE_RANK()1, 1, 2 — same rank, no gap

Case study: top 3 products per category

This is probably the most common window-function task in SQL interviews. You can't write it cleanly with GROUP BY. With ROW_NUMBER it's a single pass:

WITH ranked AS (
  SELECT
    category,
    product,
    SUM(amount) AS total,
    ROW_NUMBER() OVER (
      PARTITION BY category
      ORDER BY SUM(amount) DESC
    ) AS rn
  FROM orders
  GROUP BY category, product
)
SELECT category, product, total
FROM ranked
WHERE rn <= 3
ORDER BY category, rn;

Step by step:

  1. Group by (category, product) and total the revenue per product. You get a list of products with their revenue.
  2. Inside each category, number the products by revenue descending. The leader gets rn = 1, the runner-up rn = 2, and so on.
  3. The outer query keeps only the rows where rn <= 3.

ROW_NUMBER guarantees a strict 1, 2, 3 even when two products tie on revenue. If you want "everyone in the top three" — including ties on second place — use RANK instead of ROW_NUMBER.

LAG and LEAD — peek at a neighbour row

LAG(col) returns col from the previous row in the window, LEAD(col) from the next one. If there's no neighbour, you get NULL by default.

Case study: day-over-day revenue growth

One of the top three queries an analyst is expected to write at an interview. Given a stream of daily revenue, return the percentage change vs the previous day.

Step 1. Roll orders up into a daily total:

WITH daily AS (
  SELECT
    DATE_TRUNC('day', created_at)::DATE AS day,
    SUM(amount) AS revenue
  FROM orders
  GROUP BY 1
)
SELECT * FROM daily ORDER BY day;

Step 2. Add a column with yesterday's revenue:

SELECT
  day,
  revenue,
  LAG(revenue) OVER (ORDER BY day) AS prev_revenue
FROM daily
ORDER BY day;

The first row's prev_revenue is NULL — that's expected, the first day has no neighbour to its left.

Step 3. Compute the percentage:

SELECT
  day,
  revenue,
  ROUND(
    100.0 * (revenue - LAG(revenue) OVER (ORDER BY day))
          / NULLIF(LAG(revenue) OVER (ORDER BY day), 0),
    2
  ) AS dod_growth_pct
FROM daily
ORDER BY day;

The NULLIF(..., 0) guard saves you from a division-by-zero if there were zero orders on the previous day.

Cumulative sum

A running total is the most common case where the frame matters. With ORDER BY and no explicit frame, an aggregate already runs cumulatively:

SELECT
  day,
  revenue,
  SUM(revenue) OVER (ORDER BY day) AS cumulative_revenue
FROM daily
ORDER BY day;

That's "revenue from the start of history to today". Want month-to-date instead? Add PARTITION BY per month:

SELECT
  day,
  revenue,
  SUM(revenue) OVER (
    PARTITION BY DATE_TRUNC('month', day)
    ORDER BY day
  ) AS month_to_date
FROM daily
ORDER BY day;

On the first of each month the counter resets and starts accumulating again.

7-day moving average

To stop your revenue chart from twitching on weekends — a staple smoothing trick on dashboards:

SELECT
  day,
  revenue,
  AVG(revenue) OVER (
    ORDER BY day
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  ) AS revenue_7d_avg
FROM daily
ORDER BY day;

The frame ROWS BETWEEN 6 PRECEDING AND CURRENT ROW means "current row plus the six before it." For the first six days the window isn't full — the average is computed over whatever rows exist so far.

Where to go from here

Window functions unlock a whole class of queries that without them require nested subqueries or are outright impossible: top-N per group, day-over-day metrics, cumulative sums, moving averages, lag features for ML. If you write SQL for analytics, this is your daily tool.

The fastest way to internalise this is to solve tasks specifically on the topic. There's a dedicated window-functions category on the trainer with step-by-step breakdowns.

Practice on real tasks

Solve tasks in the SQL trainer with instant grading and hints.

Open trainer