Window functions

Andrey Aksenov

Contents

Window expressions overview
Functions supported by window expressions
- General-purpose window functions
Usage examples

Window functions perform calculations across a set of rows related to the current row without reducing the result set. Unlike aggregate functions, which return a single result for each group of rows, window functions return a value for every row, calculated for the set of rows in the row’s window frame. They are often used in SELECT queries for analytical purposes, such as computing moving averages, running totals, or rankings.

Window expressions can appear only in the SELECT list:

SELECT <function_name>( [...] ) OVER ( () | <window_specification> | <window_name> )
FROM <table_name>
[ WINDOW <window_name> AS (<window_specification>) ];

A window expression applies a window function to a set of rows, called a window frame, defined by the OVER clause. The frame specifies which rows are used in each calculation.

After OVER, you can specify one of the following:

() — empty parentheses to pass all rows from the query result to the window function.
<window_specification> — an inline definition of the window.
<window_name> — a reference to a window defined in the WINDOW clause of the query.

NOTE

To filter query results based on window function values, compute the window functions within a CTE or subquery.

A window specification defines how a window function groups, orders, and limits rows for its calculation. Its basic syntax looks as follows:

<function_name>( [expression [, ...]] ) OVER ( () | <window_specification> | <window_name> )

You can filter rows before applying a window calculation by using the following syntax:

<function_name>( [expression [, ...]] ) [ FILTER ( WHERE <filter_clause> ) ] OVER ( () | <window_specification> | <window_name> )

Only window functions that are aggregates accept the FILTER clause. See Functions supported by window expressions.

The general form of <window_specification> is:

[PARTITION BY <partition_column> [, ...]]
[ORDER BY <order_column> [ASC | DESC] [, ...]]
[ <frame_clause> ]

A window specification has the following characteristics:

PARTITION BY — divides the result set into partitions (subsets of rows) to which the window function is applied independently. If omitted, all rows are treated as a single partition.
ORDER BY — defines the order of rows within each partition. This order determines the sequence in which calculations occur. Ranking functions (such as RANK() or ROW_NUMBER()) require ORDER BY to determine rank values. For OLAP-style aggregations, the ORDER BY clause is required to use window frames (<frame_clause>).
<frame_clause> — specifies the type and boundaries of a window frame, determining the subset of rows within the partition that the function operates on. See Window frames: RANGE and ROWS.

NOTE

The ORDER BY clause inside a window specification operates independently of the query’s top-level ORDER BY.

A window frame is a subset of rows within a window partition determined relative to the current row. When specified, the function operates on this moving frame rather than on the entire partition. Frames can be value-based (RANGE) or row-based (ROWS):

RANGE — defines the frame based on the ordering values specified in the ORDER BY clause. This means that rows with the same values in the ORDER BY columns share the same position in the ordering and are included together in the frame.
ROWS — defines the frame by an exact number of rows before or after the current row.

To specify the type and boundaries of a window frame, use <frame_clause> in one of the following forms:

{ RANGE | ROWS } <frame_start>
{ RANGE | ROWS } BETWEEN <frame_start> AND <frame_end>

The <frame_start> and <frame_end> values define the frame boundaries and can be one of:

UNBOUNDED PRECEDING
<value> PRECEDING
CURRENT ROW
<value> FOLLOWING
UNBOUNDED FOLLOWING

The following table describes the boundary values supported by the RANGE frame clause.

Value Applies to Description

Value	Applies to	Description
UNBOUNDED PRECEDING	Frame start	The frame starts at the first row of the partition
<value> PRECEDING	Frame start or end	The frame starts or ends at rows where the `ORDER BY` column value is at most `<value>` less than the current row’s `ORDER BY` value
CURRENT ROW	Frame start or end	When `CURRENT ROW` is used as the frame start, the frame starts at the first row in the current peer group (all rows with the same `ORDER BY` value as the current row). When `CURRENT ROW` is used as the frame end, the frame ends at the last row in the current peer group (all rows with the same `ORDER BY` value as the current row).
<value> FOLLOWING	Frame start or end	The frame starts or ends at rows where the `ORDER BY` column value is at most `<value>` greater than the current row’s `ORDER BY` value
UNBOUNDED FOLLOWING	Frame end	The frame ends at the last row of the partition

UNBOUNDED PRECEDING

Frame start

The frame starts at the first row of the partition

<value> PRECEDING

Frame start or end

The frame starts or ends at rows where the ORDER BY column value is at most <value> less than the current row’s ORDER BY value

CURRENT ROW

Frame start or end

When CURRENT ROW is used as the frame start, the frame starts at the first row in the current peer group (all rows with the same ORDER BY value as the current row).
When CURRENT ROW is used as the frame end, the frame ends at the last row in the current peer group (all rows with the same ORDER BY value as the current row).

<value> FOLLOWING

Frame start or end

The frame starts or ends at rows where the ORDER BY column value is at most <value> greater than the current row’s ORDER BY value

UNBOUNDED FOLLOWING

Frame end

The frame ends at the last row of the partition

This table lists the boundary options available for the ROWS frame clause.

Value	Applies to	Description
UNBOUNDED PRECEDING	Frame start	The frame starts at the first row of the partition
<value> PRECEDING	Frame start or end	The frame starts or ends a specified number of rows before the current row
CURRENT ROW	Frame start or end	The frame starts or ends at the current row
<value> FOLLOWING	Frame start or end	The frame starts or ends a specified number of rows after the current row
UNBOUNDED FOLLOWING	Frame end	The frame ends at the last row of the partition

Value

Applies to

Description

UNBOUNDED PRECEDING

Frame start

The frame starts at the first row of the partition

<value> PRECEDING

Frame start or end

The frame starts or ends a specified number of rows before the current row

CURRENT ROW

Frame start or end

The frame starts or ends at the current row

<value> FOLLOWING

Frame start or end

The frame starts or ends a specified number of rows after the current row

UNBOUNDED FOLLOWING

Frame end

The frame ends at the last row of the partition

Window expressions support two main categories of functions:

Aggregate functions

Aggregate functions such as SUM(), COUNT(), AVG(), MIN(), and MAX() can be used as window functions. This also includes any user-defined aggregate functions. However, ordered-set and hypothetical-set aggregates cannot be used as window functions. Aggregate functions behave as window functions only when followed by the OVER clause; otherwise, they serve as regular aggregates.
General-purpose window functions

These specialized functions operate on a set of rows related to the current row, as defined by the OVER clause. They include:
- Ranking functions, such as RANK(), DENSE_RANK(), or ROW_NUMBER().
- Distribution functions, such as CUME_DIST() or PERCENT_RANK().
- Value functions, such as LAG() or LEAD().

The following tables list built-in general-purpose window functions, grouped by type.

Ranking functions
Function	Return type	Description
RANK()	bigint	Calculates the rank of a row in an ordered group of values. Rows with equal values for the ranking criteria receive the same rank. The number of tied rows is added to the rank number to calculate the next rank value. Ranks may not be consecutive numbers in this case
DENSE_RANK()	bigint	Computes the rank of a row in an ordered group of rows without skipping rank values. Rows with equal values are given the same rank value
ROW_NUMBER()	bigint	Assigns a unique number to each row to which it is applied
NTILE(expr)	bigint	Divides an ordered dataset into a number of buckets (as defined by `expr`) and assigns a bucket number to each row

Distribution functions
Function	Return type	Description
CUME_DIST()	double precision	Calculates the cumulative distribution of a value in a group of values. Rows with equal values always evaluate to the same cumulative distribution value
PERCENT_RANK()	double precision	Calculates the relative rank of the current row as `(rank - 1) / (total rows - 1)` within the window partition

Value functions
Function	Return type	Description
LAG(expr [,offset] [,default])	same as input `expr` type	Provides access to more than one row of the same table without doing a self-join. Given a series of rows returned from a query and a position of the cursor, `LAG()` provides access to a row at a given physical offset prior to that position. The default offset is `1`. `default` sets the value returned if the offset goes beyond the scope of the window. If `default` is not specified, the default value is `NULL`
LEAD(expr [,offset] [,default])	same as input `expr` type	Provides access to more than one row of the same table without doing a self-join. Given a series of rows returned from a query and a position of the cursor, `LEAD()` provides access to a row at a given physical offset after that position. The default offset is `1`. `default` sets the value returned if the offset goes beyond the scope of the window. If `default` is not specified, the default value is `NULL`
FIRST_VALUE(expr)	same as input `expr` type	Returns the first value in an ordered set of values
LAST_VALUE(expr)	same as input `expr` type	Returns the last value in an ordered set of values

Prerequisites

To follow the examples in this topic, connect to the Greengage DB master host using psql as described in Connect to Greengage DB via psql. Create a new database and connect to it:

CREATE DATABASE marketplace;
\c marketplace

Then, create a table and populate it with data:

Create a table

CREATE TABLE sales
(
    id           INT,
    product_name TEXT,
    category     TEXT,
    quantity     INT,
    price        NUMERIC(8, 2),
    sale_date    TIMESTAMP WITHOUT TIME ZONE,
    region       TEXT
)
    WITH (appendoptimized = true)
    DISTRIBUTED BY (id);

INSERT INTO sales (id, product_name, category, quantity, price, sale_date, region)
VALUES (1, 'Laptop', 'Electronics', 1, 1200.00, '2025-01-01 09:15:00', 'East'),
       (2, 'T-Shirt', 'Clothing', 2, 30.00, '2025-01-01 14:45:00', 'East'),
       (3, 'Headphones', 'Electronics', 1, 180.00, '2025-01-02 10:20:00', 'East'),
       (4, 'Coffee Maker', 'Home', 1, 85.00, '2025-01-02 13:40:00', 'East'),
       (5, 'Desk Chair', 'Home', 1, 220.00, '2025-01-02 16:30:00', 'West'),
       (6, 'Laptop', 'Electronics', 1, 1250.00, '2025-01-03 09:50:00', 'East'),
       (7, 'Wireless Mouse', 'Electronics', 1, 65.00, '2025-01-03 14:10:00', 'East'),
       (8, 'T-Shirt', 'Clothing', 3, 35.00, '2025-01-03 15:55:00', 'West'),
       (9, 'Smartphone', 'Electronics', 2, 850.00, '2025-01-04 11:05:00', 'East'),
       (10, 'Gaming Keyboard', 'Electronics', 1, 120.00, '2025-01-04 17:25:00', 'West'),
       (11, 'Bookshelf', 'Home', 1, 150.00, '2025-01-05 10:10:00', 'West'),
       (12, 'Microwave', 'Home', 1, 200.00, '2025-01-05 15:00:00', 'West');

To apply an aggregate function to subsets of rows, use the GROUP BY clause. The following example returns the total sales amount for each region:

SELECT region,
       SUM(quantity * price) AS sales_amount_per_region
FROM sales
GROUP BY region
ORDER BY region;

The SUM() function reduces the number of rows returned by the query:

 region | sales_amount_per_region
--------+-------------------------
 East   |                 4540.00
 West   |                  795.00
(2 rows)

Similar to an aggregate function, a window function operates on a set of rows. However, it does not reduce the number of rows returned by the query. This query calculates the total sales amount for each region and displays it alongside each sale:

SELECT id,
       region,
       sale_date,
       quantity * price                                 AS sales_amount,
       SUM(quantity * price) OVER (PARTITION BY region) AS sales_amount_per_region
FROM sales
ORDER BY region, sale_date;

The result looks like this:

 id | region |      sale_date      | sales_amount | sales_amount_per_region
----+--------+---------------------+--------------+-------------------------
  1 | East   | 2025-01-01 09:15:00 |      1200.00 |                 4540.00
  2 | East   | 2025-01-01 14:45:00 |        60.00 |                 4540.00
  3 | East   | 2025-01-02 10:20:00 |       180.00 |                 4540.00
  4 | East   | 2025-01-02 13:40:00 |        85.00 |                 4540.00
  6 | East   | 2025-01-03 09:50:00 |      1250.00 |                 4540.00
  7 | East   | 2025-01-03 14:10:00 |        65.00 |                 4540.00
  9 | East   | 2025-01-04 11:05:00 |      1700.00 |                 4540.00
  5 | West   | 2025-01-02 16:30:00 |       220.00 |                  795.00
  8 | West   | 2025-01-03 15:55:00 |       105.00 |                  795.00
 10 | West   | 2025-01-04 17:25:00 |       120.00 |                  795.00
 11 | West   | 2025-01-05 10:10:00 |       150.00 |                  795.00
 12 | West   | 2025-01-05 15:00:00 |       200.00 |                  795.00
(12 rows)

OVER()

Example

If you want to pass all rows from a dataset to a window function, use the OVER clause with empty parentheses after it:

SELECT id,
       region,
       sale_date,
       quantity * price              AS sales_amount,
       SUM(quantity * price) OVER () AS sales_amount_all_regions
FROM sales
ORDER BY region, sale_date;

The result shows the total sales amount for the entire dataset:

 id | region |      sale_date      | sales_amount | sales_amount_all_regions
----+--------+---------------------+--------------+--------------------------
  1 | East   | 2025-01-01 09:15:00 |      1200.00 |                  5335.00
  2 | East   | 2025-01-01 14:45:00 |        60.00 |                  5335.00
  3 | East   | 2025-01-02 10:20:00 |       180.00 |                  5335.00
  4 | East   | 2025-01-02 13:40:00 |        85.00 |                  5335.00
  6 | East   | 2025-01-03 09:50:00 |      1250.00 |                  5335.00
  7 | East   | 2025-01-03 14:10:00 |        65.00 |                  5335.00
  9 | East   | 2025-01-04 11:05:00 |      1700.00 |                  5335.00
  5 | West   | 2025-01-02 16:30:00 |       220.00 |                  5335.00
  8 | West   | 2025-01-03 15:55:00 |       105.00 |                  5335.00
 10 | West   | 2025-01-04 17:25:00 |       120.00 |                  5335.00
 11 | West   | 2025-01-05 10:10:00 |       150.00 |                  5335.00
 12 | West   | 2025-01-05 15:00:00 |       200.00 |                  5335.00
(12 rows)

This SELECT query calculates a running total of sales for each region:

SELECT id,
       region,
       sale_date,
       quantity * price        AS sales_amount,
       SUM(quantity * price) OVER (
           PARTITION BY region
           ORDER BY sale_date) AS running_sales_amount_per_region
FROM sales
ORDER BY region, sale_date;

The key difference is using ORDER BY sale_date inside the window function, which makes the sum cumulative in the order of sales within each region. You can see the results in the running_sales_amount_per_region column:

 id | region |      sale_date      | sales_amount | running_sales_amount_per_region
----+--------+---------------------+--------------+---------------------------------
  1 | East   | 2025-01-01 09:15:00 |      1200.00 |                         1200.00
  2 | East   | 2025-01-01 14:45:00 |        60.00 |                         1260.00
  3 | East   | 2025-01-02 10:20:00 |       180.00 |                         1440.00
  4 | East   | 2025-01-02 13:40:00 |        85.00 |                         1525.00
  6 | East   | 2025-01-03 09:50:00 |      1250.00 |                         2775.00
  7 | East   | 2025-01-03 14:10:00 |        65.00 |                         2840.00
  9 | East   | 2025-01-04 11:05:00 |      1700.00 |                         4540.00
  5 | West   | 2025-01-02 16:30:00 |       220.00 |                          220.00
  8 | West   | 2025-01-03 15:55:00 |       105.00 |                          325.00
 10 | West   | 2025-01-04 17:25:00 |       120.00 |                          445.00
 11 | West   | 2025-01-05 10:10:00 |       150.00 |                          595.00
 12 | West   | 2025-01-05 15:00:00 |       200.00 |                          795.00
(12 rows)

This query uses a named window in the WINDOW clause to define the window specification:

SELECT id,
       region,
       sale_date,
       quantity * price    AS sales_amount,
       SUM(quantity * price) OVER
           regional_window AS running_sales_amount_per_region
FROM sales
WINDOW regional_window AS (
        PARTITION BY region
        ORDER BY sale_date
        )
ORDER BY region, sale_date;

The result looks like this:

 id | region |      sale_date      | sales_amount | running_sales_amount_per_region
----+--------+---------------------+--------------+---------------------------------
  1 | East   | 2025-01-01 09:15:00 |      1200.00 |                         1200.00
  2 | East   | 2025-01-01 14:45:00 |        60.00 |                         1260.00
  3 | East   | 2025-01-02 10:20:00 |       180.00 |                         1440.00
  4 | East   | 2025-01-02 13:40:00 |        85.00 |                         1525.00
  6 | East   | 2025-01-03 09:50:00 |      1250.00 |                         2775.00
  7 | East   | 2025-01-03 14:10:00 |        65.00 |                         2840.00
  9 | East   | 2025-01-04 11:05:00 |      1700.00 |                         4540.00
  5 | West   | 2025-01-02 16:30:00 |       220.00 |                          220.00
  8 | West   | 2025-01-03 15:55:00 |       105.00 |                          325.00
 10 | West   | 2025-01-04 17:25:00 |       120.00 |                          445.00
 11 | West   | 2025-01-05 10:10:00 |       150.00 |                          595.00
 12 | West   | 2025-01-05 15:00:00 |       200.00 |                          795.00
(12 rows)

This query implicitly uses the default RANGE frame, which includes all rows in the partition up to the current row:

SELECT id,
       region,
       sale_date,
       quantity * price        AS sales_amount,
       SUM(quantity * price) OVER (
           PARTITION BY region
           ORDER BY sale_date) AS running_sales_amount_per_region
FROM sales
ORDER BY region, sale_date;

The following query specifies the same default frame explicitly with RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW:

SELECT id,
       region,
       sale_date,
       quantity * price                                       AS sales_amount,
       SUM(quantity * price) OVER (
           PARTITION BY region
           ORDER BY sale_date
           RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_sales_amount_per_region
FROM sales
ORDER BY region, sale_date;

To specify a time-based frame, this query uses RANGE to include rows in the same partition from the last 24 hours up to the current row:

SELECT id,
       region,
       sale_date,
       quantity * price                                                 AS sales_amount,
       SUM(quantity * price) OVER (
           PARTITION BY region
           ORDER BY sale_date
           RANGE BETWEEN INTERVAL '24 hours' PRECEDING AND CURRENT ROW) AS sum_last_24_hours
FROM sales
ORDER BY region, sale_date;

The result looks like this:

 id | region |      sale_date      | sales_amount | sum_last_24_hours
----+--------+---------------------+--------------+-------------------
  1 | East   | 2025-01-01 09:15:00 |      1200.00 |           1200.00
  2 | East   | 2025-01-01 14:45:00 |        60.00 |           1260.00
  3 | East   | 2025-01-02 10:20:00 |       180.00 |            240.00
  4 | East   | 2025-01-02 13:40:00 |        85.00 |            325.00
  6 | East   | 2025-01-03 09:50:00 |      1250.00 |           1515.00
  7 | East   | 2025-01-03 14:10:00 |        65.00 |           1315.00
  9 | East   | 2025-01-04 11:05:00 |      1700.00 |           1765.00
  5 | West   | 2025-01-02 16:30:00 |       220.00 |            220.00
  8 | West   | 2025-01-03 15:55:00 |       105.00 |            325.00
 10 | West   | 2025-01-04 17:25:00 |       120.00 |            120.00
 11 | West   | 2025-01-05 10:10:00 |       150.00 |            270.00
 12 | West   | 2025-01-05 15:00:00 |       200.00 |            470.00
(12 rows)

This snippet highlights how the 24-hour frame sums are calculated for rows with IDs 3 and 12, indicating which rows fall inside or outside the frame:

  1 | East   | 2025-01-01 09:15:00 |      1200.00 |            -- outside 24-hour frame
  2 | East   | 2025-01-01 14:45:00 |        60.00 |            -- inside 24-hour frame
  3 | East   | 2025-01-02 10:20:00 |       180.00 |    240.00  -- current row
  ...
  8 | West   | 2025-01-03 15:55:00 |       105.00 |            -- outside 24-hour frame
 10 | West   | 2025-01-04 17:25:00 |       120.00 |            -- inside 24-hour frame
 11 | West   | 2025-01-05 10:10:00 |       150.00 |            -- inside 24-hour frame
 12 | West   | 2025-01-05 15:00:00 |       200.00 |    470.00  -- current row

This query calculates a running total of sales amounts over the current row and the two preceding rows within each region, using ROWS to define the frame:

SELECT id,
       region,
       sale_date,
       quantity * price                              AS sales_amount,
       SUM(quantity * price) OVER (
           PARTITION BY region
           ORDER BY sale_date
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS sum_last_3_sales
FROM sales
ORDER BY region, sale_date;

The result looks as follows:

 id | region |      sale_date      | sales_amount | sum_last_3_sales
----+--------+---------------------+--------------+------------------
  1 | East   | 2025-01-01 09:15:00 |      1200.00 |          1200.00
  2 | East   | 2025-01-01 14:45:00 |        60.00 |          1260.00
  3 | East   | 2025-01-02 10:20:00 |       180.00 |          1440.00
  4 | East   | 2025-01-02 13:40:00 |        85.00 |           325.00
  6 | East   | 2025-01-03 09:50:00 |      1250.00 |          1515.00
  7 | East   | 2025-01-03 14:10:00 |        65.00 |          1400.00
  9 | East   | 2025-01-04 11:05:00 |      1700.00 |          3015.00
  5 | West   | 2025-01-02 16:30:00 |       220.00 |           220.00
  8 | West   | 2025-01-03 15:55:00 |       105.00 |           325.00
 10 | West   | 2025-01-04 17:25:00 |       120.00 |           445.00
 11 | West   | 2025-01-05 10:10:00 |       150.00 |           375.00
 12 | West   | 2025-01-05 15:00:00 |       200.00 |           470.00
(12 rows)

This snippet highlights how the running total over the current row and two preceding rows is calculated for rows with IDs 4 and 12, indicating which rows fall inside or outside the frame:

  1 | East   | 2025-01-01 09:15:00 |      1200.00 |            -- outside 3-row frame
  2 | East   | 2025-01-01 14:45:00 |        60.00 |            -- inside 3-row frame
  3 | East   | 2025-01-02 10:20:00 |       180.00 |            -- inside 3-row frame
  4 | East   | 2025-01-02 13:40:00 |        85.00 |    325.00  -- current row
  ...
  8 | West   | 2025-01-03 15:55:00 |       105.00 |            -- outside 3-row frame
 10 | West   | 2025-01-04 17:25:00 |       120.00 |            -- inside 3-row frame
 11 | West   | 2025-01-05 10:10:00 |       150.00 |            -- inside 3-row frame
 12 | West   | 2025-01-05 15:00:00 |       200.00 |    470.00  -- current row

This example calculates the moving average of the last three sales amounts within each region:

SELECT id,
       region,
       sale_date,
       quantity * price                              AS sales_amount,
       AVG(quantity * price) OVER (
           PARTITION BY region
           ORDER BY sale_date
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_last_3_sales
FROM sales
ORDER BY region, sale_date;

The result looks as follows:

 id | region |      sale_date      | sales_amount | moving_avg_last_3_sales
----+--------+---------------------+--------------+-------------------------
  1 | East   | 2025-01-01 09:15:00 |      1200.00 |   1200.0000000000000000
  2 | East   | 2025-01-01 14:45:00 |        60.00 |    630.0000000000000000
  3 | East   | 2025-01-02 10:20:00 |       180.00 |    480.0000000000000000
  4 | East   | 2025-01-02 13:40:00 |        85.00 |    108.3333333333333333
  6 | East   | 2025-01-03 09:50:00 |      1250.00 |    505.0000000000000000
  7 | East   | 2025-01-03 14:10:00 |        65.00 |    466.6666666666666667
  9 | East   | 2025-01-04 11:05:00 |      1700.00 |   1005.0000000000000000
  5 | West   | 2025-01-02 16:30:00 |       220.00 |    220.0000000000000000
  8 | West   | 2025-01-03 15:55:00 |       105.00 |    162.5000000000000000
 10 | West   | 2025-01-04 17:25:00 |       120.00 |    148.3333333333333333
 11 | West   | 2025-01-05 10:10:00 |       150.00 |    125.0000000000000000
 12 | West   | 2025-01-05 15:00:00 |       200.00 |    156.6666666666666667
(12 rows)

The ROW_NUMBER() and RANK() functions assign a unique integer to each row according to its order in the result set.

This example assigns a sequential row number to each sale within its region using the ROW_NUMBER() window function:

SELECT id,
       region,
       sale_date,
       ROW_NUMBER() OVER (
           PARTITION BY region
           ORDER BY sale_date) AS row_num
FROM sales
ORDER BY region, sale_date;

The result should look like this:

 id | region |      sale_date      | row_num
----+--------+---------------------+---------
  1 | East   | 2025-01-01 09:15:00 |       1
  2 | East   | 2025-01-01 14:45:00 |       2
  3 | East   | 2025-01-02 10:20:00 |       3
  4 | East   | 2025-01-02 13:40:00 |       4
  6 | East   | 2025-01-03 09:50:00 |       5
  7 | East   | 2025-01-03 14:10:00 |       6
  9 | East   | 2025-01-04 11:05:00 |       7
  5 | West   | 2025-01-02 16:30:00 |       1
  8 | West   | 2025-01-03 15:55:00 |       2
 10 | West   | 2025-01-04 17:25:00 |       3
 11 | West   | 2025-01-05 10:10:00 |       4
 12 | West   | 2025-01-05 15:00:00 |       5
(12 rows)

This query ranks rows within each region by sales amount using the RANK() function:

SELECT id,
       region,
       quantity * price                    AS sales_amount,
       RANK() OVER (
           PARTITION BY region
           ORDER BY quantity * price DESC) AS rank_by_sales_amount
FROM sales
ORDER BY region, rank_by_sales_amount;

The result is as follows:

 id | region | sales_amount | rank_by_sales_amount
----+--------+--------------+----------------------
  9 | East   |      1700.00 |                    1
  6 | East   |      1250.00 |                    2
  1 | East   |      1200.00 |                    3
  3 | East   |       180.00 |                    4
  4 | East   |        85.00 |                    5
  7 | East   |        65.00 |                    6
  2 | East   |        60.00 |                    7
  5 | West   |       220.00 |                    1
 12 | West   |       200.00 |                    2
 11 | West   |       150.00 |                    3
 10 | West   |       120.00 |                    4
  8 | West   |       105.00 |                    5
(12 rows)

This example compares the behavior of the RANK() and DENSE_RANK() functions:

RANK() leaves gaps in the ranking sequence after ties.
DENSE_RANK() assigns consecutive ranks without gaps.

The following queries assign ranks based on the region value:

RANK()

SELECT id,
       region,
       quantity * price AS sales_amount,
       RANK() OVER (
           ORDER BY region DESC
           )            AS rank_by_region
FROM sales
ORDER BY region, rank_by_region;

DENSE_RANK()

SELECT id,
       region,
       quantity * price AS sales_amount,
       DENSE_RANK() OVER (
           ORDER BY region DESC
           )            AS dense_rank_by_region
FROM sales
ORDER BY region, dense_rank_by_region;

The outputs show that RANK() assigns the same rank to tied rows but skips to the next available rank (from 1 to 6), while DENSE_RANK() continues the sequence without gaps (from 1 to 2):

RANK()

 id | region | sales_amount | rank_by_region
----+--------+--------------+----------------
  2 | East   |        60.00 |              6
  9 | East   |      1700.00 |              6
  6 | East   |      1250.00 |              6
  4 | East   |        85.00 |              6
  3 | East   |       180.00 |              6
  1 | East   |      1200.00 |              6
  7 | East   |        65.00 |              6
 12 | West   |       200.00 |              1
  5 | West   |       220.00 |              1
 10 | West   |       120.00 |              1
 11 | West   |       150.00 |              1
  8 | West   |       105.00 |              1
(12 rows)

DENSE_RANK()

 id | region | sales_amount | dense_rank_by_region
----+--------+--------------+----------------------
  2 | East   |        60.00 |                    2
  9 | East   |      1700.00 |                    2
  1 | East   |      1200.00 |                    2
  4 | East   |        85.00 |                    2
  3 | East   |       180.00 |                    2
  6 | East   |      1250.00 |                    2
  7 | East   |        65.00 |                    2
  5 | West   |       220.00 |                    1
 10 | West   |       120.00 |                    1
 11 | West   |       150.00 |                    1
 12 | West   |       200.00 |                    1
  8 | West   |       105.00 |                    1
(12 rows)

This query returns the first sale date within each region using FIRST_VALUE():

SELECT id,
       region,
       sale_date,
       FIRST_VALUE(sale_date) OVER (
           PARTITION BY region
           ORDER BY sale_date) AS first_sale_date
FROM sales
ORDER BY region, sale_date;

The result looks like this:

 id | region |      sale_date      |   first_sale_date
----+--------+---------------------+---------------------
  1 | East   | 2025-01-01 09:15:00 | 2025-01-01 09:15:00
  2 | East   | 2025-01-01 14:45:00 | 2025-01-01 09:15:00
  3 | East   | 2025-01-02 10:20:00 | 2025-01-01 09:15:00
  4 | East   | 2025-01-02 13:40:00 | 2025-01-01 09:15:00
  6 | East   | 2025-01-03 09:50:00 | 2025-01-01 09:15:00
  7 | East   | 2025-01-03 14:10:00 | 2025-01-01 09:15:00
  9 | East   | 2025-01-04 11:05:00 | 2025-01-01 09:15:00
  5 | West   | 2025-01-02 16:30:00 | 2025-01-02 16:30:00
  8 | West   | 2025-01-03 15:55:00 | 2025-01-02 16:30:00
 10 | West   | 2025-01-04 17:25:00 | 2025-01-02 16:30:00
 11 | West   | 2025-01-05 10:10:00 | 2025-01-02 16:30:00
 12 | West   | 2025-01-05 15:00:00 | 2025-01-02 16:30:00
(12 rows)

This query returns the last sale date within each region using LAST_VALUE():

SELECT id,
       region,
       sale_date,
       LAST_VALUE(sale_date) OVER (
           PARTITION BY region
           ORDER BY sale_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_sale_date
FROM sales
ORDER BY region, sale_date;

Note that the frame clause ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is specified to ensure LAST_VALUE() considers all rows in the partition, as the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. The output is as follows:

 id | region |      sale_date      |   last_sale_date
----+--------+---------------------+---------------------
  1 | East   | 2025-01-01 09:15:00 | 2025-01-04 11:05:00
  2 | East   | 2025-01-01 14:45:00 | 2025-01-04 11:05:00
  3 | East   | 2025-01-02 10:20:00 | 2025-01-04 11:05:00
  4 | East   | 2025-01-02 13:40:00 | 2025-01-04 11:05:00
  6 | East   | 2025-01-03 09:50:00 | 2025-01-04 11:05:00
  7 | East   | 2025-01-03 14:10:00 | 2025-01-04 11:05:00
  9 | East   | 2025-01-04 11:05:00 | 2025-01-04 11:05:00
  5 | West   | 2025-01-02 16:30:00 | 2025-01-05 15:00:00
  8 | West   | 2025-01-03 15:55:00 | 2025-01-05 15:00:00
 10 | West   | 2025-01-04 17:25:00 | 2025-01-05 15:00:00
 11 | West   | 2025-01-05 10:10:00 | 2025-01-05 15:00:00
 12 | West   | 2025-01-05 15:00:00 | 2025-01-05 15:00:00
(12 rows)

The LAG() function returns the value from a preceding row in the same partition:

SELECT id,
       region,
       sale_date,
       quantity * price        AS sales_amount,
       LAG(quantity * price) OVER (
           PARTITION BY region
           ORDER BY sale_date) AS prev_sales_amount
FROM sales
ORDER BY region, sale_date;

The output looks like this:

 id | region |      sale_date      | sales_amount | prev_sales_amount
----+--------+---------------------+--------------+-------------------
  1 | East   | 2025-01-01 09:15:00 |      1200.00 |
  2 | East   | 2025-01-01 14:45:00 |        60.00 |           1200.00
  3 | East   | 2025-01-02 10:20:00 |       180.00 |             60.00
  4 | East   | 2025-01-02 13:40:00 |        85.00 |            180.00
  6 | East   | 2025-01-03 09:50:00 |      1250.00 |             85.00
  7 | East   | 2025-01-03 14:10:00 |        65.00 |           1250.00
  9 | East   | 2025-01-04 11:05:00 |      1700.00 |             65.00
  5 | West   | 2025-01-02 16:30:00 |       220.00 |
  8 | West   | 2025-01-03 15:55:00 |       105.00 |            220.00
 10 | West   | 2025-01-04 17:25:00 |       120.00 |            105.00
 11 | West   | 2025-01-05 10:10:00 |       150.00 |            120.00
 12 | West   | 2025-01-05 15:00:00 |       200.00 |            150.00
(12 rows)

The LEAD() function returns the value from a following row in the same partition:

SELECT id,
       region,
       sale_date,
       quantity * price        AS sales_amount,
       LEAD(quantity * price) OVER (
           PARTITION BY region
           ORDER BY sale_date) AS next_sales_amount
FROM sales
ORDER BY region, sale_date;

The output should look as follows:

 id | region |      sale_date      | sales_amount | next_sales_amount
----+--------+---------------------+--------------+-------------------
  1 | East   | 2025-01-01 09:15:00 |      1200.00 |             60.00
  2 | East   | 2025-01-01 14:45:00 |        60.00 |            180.00
  3 | East   | 2025-01-02 10:20:00 |       180.00 |             85.00
  4 | East   | 2025-01-02 13:40:00 |        85.00 |           1250.00
  6 | East   | 2025-01-03 09:50:00 |      1250.00 |             65.00
  7 | East   | 2025-01-03 14:10:00 |        65.00 |           1700.00
  9 | East   | 2025-01-04 11:05:00 |      1700.00 |
  5 | West   | 2025-01-02 16:30:00 |       220.00 |            105.00
  8 | West   | 2025-01-03 15:55:00 |       105.00 |            120.00
 10 | West   | 2025-01-04 17:25:00 |       120.00 |            150.00
 11 | West   | 2025-01-05 10:10:00 |       150.00 |            200.00
 12 | West   | 2025-01-05 15:00:00 |       200.00 |
(12 rows)

This example uses a window function inside a CTE to find sales with amounts greater than the regional average:

WITH sales_with_avg AS (SELECT id,
                               quantity * price                                 AS sales_amount,
                               AVG(quantity * price) OVER (PARTITION BY region) AS avg_sales_amount_per_region
                        FROM sales)
SELECT id,
       sales_amount,
       avg_sales_amount_per_region
FROM sales_with_avg
WHERE sales_amount > avg_sales_amount_per_region
ORDER BY id;

The result is as follows:

 id | sales_amount | avg_sales_amount_per_region
----+--------------+-----------------------------
  1 |      1200.00 |        648.5714285714285714
  5 |       220.00 |        159.0000000000000000
  6 |      1250.00 |        648.5714285714285714
  9 |      1700.00 |        648.5714285714285714
 12 |       200.00 |        159.0000000000000000
(5 rows)

Aggregate functions

JSON

Window functions

Window expressions overview

Position in a SELECT query

Window specification syntax

Window frames: RANGE and ROWS

Functions supported by window expressions

General-purpose window functions

Usage examples

Window functions overview

WINDOW clause

RANGE and ROWS

Moving average

Ranking functions

Value functions

Window functions and CTE