Hello, I’m DocuDroid!
Submitting feedback
Thank you for rating our AI Search!
We would be grateful if you could share your thoughts so we can improve our AI Search for you and other readers.
GitHub

Common table expressions

Andrey Aksenov

A common table expression (CTE) is a temporary result set defined within the execution scope of a single SQL statement. CTEs allow you to simplify complex queries by splitting them into smaller, more manageable parts. They can also help optimize query execution when the same dataset needs to be accessed multiple times within the query.

This topic describes how to use CTEs in SELECT queries.

NOTE

For a SELECT command, you can also use the data-modifying commands INSERT, UPDATE, or DELETE within a CTE.

Prerequisites

To follow the examples in this topic, connect to the Greengage DB master host using psql as described in Connect to Greengage DB via psql. Create a new database and connect to it:

CREATE DATABASE marketplace;
\c marketplace

Non-recursive CTEs

Syntax

To define a CTE, use the WITH clause before the main query:

WITH <cte_name> (<column_list>) AS (
    <cte_subquery>
)
<main_query>;

where:

  • <cte_name> — the name used to reference the CTE within the main query.

  • <column_list> — (optional) a list of column names for the CTE result set.

  • <cte_subquery> — the SQL query that defines the CTE result set.

  • <main_query> — the main SQL query that uses the CTE.

You can also define multiple CTEs in a single WITH clause by separating them with commas:

WITH <cte_name1> (<column_list1>) AS (
    <cte_subquery1>
),
<cte_name2> (<column_list2>) AS (
    <cte_subquery2>
),
...
<primary_query>;

One CTE subquery

To follow the examples in this section, create the sales table:

CREATE TABLE sales
(
    id           INT,
    date         DATE,
    product_name TEXT,
    category     TEXT,
    quantity     INT,
    unit_price   NUMERIC(8, 2)
)
    WITH (appendoptimized = true)
    DISTRIBUTED BY (id);

Insert data:

INSERT INTO sales (id, date, product_name, category, quantity, unit_price)
VALUES (1, '2025-01-03', 'Laptop', 'Electronics', 1, 1200.00),
       (2, '2025-01-04', 'Headphones', 'Electronics', 2, 150.00),
       (3, '2025-01-05', 'Coffee Maker', 'Home', 1, 85.00),
       (4, '2025-01-06', 'T-Shirt', 'Clothing', 3, 25.00),
       (5, '2025-01-07', 'Desk Chair', 'Home', 1, 200.00),
       (6, '2025-01-08', 'Smartphone', 'Electronics', 2, 800.00),
       (7, '2025-01-09', 'Headphones', 'Electronics', 1, 150.00),
       (8, '2025-01-10', 'T-Shirt', 'Clothing', 2, 25.00),
       (9, '2025-01-11', 'Laptop', 'Electronics', 1, 1200.00);

This query defines a CTE named product_sales to aggregate sales data by product and then selects products sorted by total revenue in descending order:

WITH product_sales AS (SELECT product_name,
                              SUM(quantity)              AS total_quantity,
                              SUM(quantity * unit_price) AS total_revenue
                       FROM sales
                       GROUP BY product_name)
SELECT product_name,
       total_revenue
FROM product_sales
ORDER BY total_revenue DESC;

The result looks like this:

 product_name | total_revenue
--------------+---------------
 Laptop       |       2400.00
 Smartphone   |       1600.00
 Headphones   |        450.00
 Desk Chair   |        200.00
 T-Shirt      |        125.00
 Coffee Maker |         85.00
(6 rows)

This query uses the product_sales subquery in the FROM clause, which returns the same result as the CTE above:

SELECT product_name,
       total_revenue
FROM (SELECT product_name,
             SUM(quantity)              AS total_quantity,
             SUM(quantity * unit_price) AS total_revenue
      FROM sales
      GROUP BY product_name) AS product_sales
ORDER BY total_revenue DESC;

Multiple CTE subqueries

This query defines two CTEs: category_sales aggregates revenue by category, and product_sales aggregates sales by product. The main query joins these CTEs to display each product’s revenue along with its category’s total revenue:

WITH category_sales AS (SELECT category,
                                SUM(quantity * unit_price) AS category_revenue
                         FROM sales
                         GROUP BY category),
     product_sales AS (SELECT product_name,
                              category,
                              SUM(quantity)              AS total_quantity,
                              SUM(quantity * unit_price) AS total_revenue
                       FROM sales
                       GROUP BY product_name, category)
SELECT product_sales.product_name,
       product_sales.category,
       product_sales.total_revenue,
       category_sales.category_revenue
FROM product_sales
         JOIN category_sales ON product_sales.category = category_sales.category
ORDER BY product_sales.total_revenue DESC;

The result should look like this:

 product_name |  category   | total_revenue | category_revenue
--------------+-------------+---------------+------------------
 Laptop       | Electronics |       2400.00 |          4450.00
 Smartphone   | Electronics |       1600.00 |          4450.00
 Headphones   | Electronics |        450.00 |          4450.00
 Desk Chair   | Home        |        200.00 |           285.00
 T-Shirt      | Clothing    |        125.00 |           125.00
 Coffee Maker | Home        |         85.00 |           285.00
(6 rows)

If you rewrite this query without the WITH clause, it becomes a more complex statement with two nested subqueries in the FROM clause:

SELECT product_sales.product_name,
       product_sales.category,
       product_sales.total_revenue,
       category_sales.category_revenue
FROM (SELECT sales.product_name,
             sales.category,
             SUM(sales.quantity)                    AS total_quantity,
             SUM(sales.quantity * sales.unit_price) AS total_revenue
      FROM sales
      GROUP BY sales.product_name, sales.category) AS product_sales
         JOIN (SELECT sales.category,
                      SUM(sales.quantity * sales.unit_price) AS category_revenue
               FROM sales
               GROUP BY sales.category) AS category_sales
              ON product_sales.category = category_sales.category
ORDER BY product_sales.total_revenue DESC;

Recursive CTEs

Syntax

A recursive CTE allows you to perform iterative computations within a query, commonly used to process hierarchical data. It has the following syntax:

WITH RECURSIVE <cte_name> (<column_list>) AS (
    -- Non-recursive term: runs once
    <non_recursive_term>

    UNION [ALL]

    -- Recursive term: refers to <cte_name>, runs repeatedly
    <recursive_term>
)
<main_query>;

where:

  • <cte_name> — the name of the CTE that is referenced in the recursive term and in the main query.

  • <column_list> — (optional) a list of column names for the CTE result set.

  • <non_recursive_term> — the base query that produces the initial result set (executed once).

  • <recursive_term> — a query that references <cte_name> and is executed repeatedly, adding new rows to the result set until no more rows are returned. The result is combined with a non-recursive term using the UNION or UNION ALL operator.

  • <main_query> — the main SQL query that consumes the final result of the recursive CTE.

Greengage DB processes a recursive CTE in the following steps:

  1. Executes the non-recursive term to produce the initial result set.

  2. Executes the recursive term using the results from the previous step as input to generate additional rows.

  3. Repeats step 2 until the recursive term produces no new rows.

  4. Returns the combined result set, which is the union of all rows produced by the non-recursive and recursive terms.

Number sequence

This query uses a recursive CTE to generate a sequence of numbers from 1 to 5:

WITH RECURSIVE numbers AS (
    -- Non-recursive term
    SELECT 1 AS n

    UNION ALL

    -- Recursive term
    SELECT n + 1 FROM numbers WHERE n < 5
)
SELECT n FROM numbers;

The result looks like this:

 n
---
 1
 2
 3
 4
 5
(5 rows)

Running total

This query uses a recursive CTE named running_sum to calculate a running total for numbers from 1 to 5:

WITH RECURSIVE running_sum AS (
    SELECT 1 AS n,
           1 AS running_total

    UNION ALL

    SELECT n + 1,
           running_total + n + 1
    FROM running_sum
    WHERE n < 5
)
SELECT n,
       running_total
FROM running_sum;

The result is as follows:

 n | running_total
---+---------------
 1 |             1
 2 |             3
 3 |             6
 4 |            10
 5 |            15
(5 rows)

This query calculates a running total of the quantity of products in the sales table using a recursive CTE. It starts from a specific date and adds each day’s product quantity to the cumulative total of the previous days:

WITH RECURSIVE cumulative_quantity AS (
   SELECT date,
          quantity,
          quantity AS cumulative_total
   FROM sales
   WHERE date = '2025-01-03'

   UNION ALL

   SELECT sales.date,
          sales.quantity,
          cumulative_quantity.cumulative_total + sales.quantity
   FROM sales
   JOIN cumulative_quantity ON sales.date = cumulative_quantity.date + INTERVAL '1 day'
)
SELECT date,
       quantity,
       cumulative_total
FROM cumulative_quantity;

The result should look as follows:

    date    | quantity | cumulative_total
------------+----------+------------------
 2025-01-03 |        1 |                1
 2025-01-04 |        2 |                3
 2025-01-05 |        1 |                4
 2025-01-06 |        3 |                7
 2025-01-07 |        1 |                8
 2025-01-08 |        2 |               10
 2025-01-09 |        1 |               11
 2025-01-10 |        2 |               13
 2025-01-11 |        1 |               14
(9 rows)

Hierarchical data

To follow the examples in this section, create the customers table:

CREATE TABLE customers
(
    customer_id INT,
    name        TEXT,
    referred_by INT,
    region      TEXT
)
    WITH (appendoptimized = true)
    DISTRIBUTED BY (customer_id);

This statement inserts sample customer data with referral relationships:

INSERT INTO customers (customer_id, name, referred_by, region)
VALUES (1, 'Alice', NULL, 'North'),
       (2, 'Bob', 1, 'North'),
       (3, 'Charlie', 1, 'North'),
       (4, 'David', 2, 'South'),
       (5, 'Eve', 2, 'South'),
       (6, 'Frank', 4, 'South'),
       (7, 'Grace', 3, 'East');
/*
       Alice
       ├── Bob
       │   ├── David
       │   │   └── Frank
       │   └── Eve
       └── Charlie
           └── Grace
*/

This query uses a recursive CTE to find all customers referred by a customer with customer_id equal to 2, directly or through a chain of referrals:

WITH RECURSIVE referrals(customer_id, name, referred_by) AS (
    SELECT customer_id,
           name,
           referred_by
    FROM customers
    WHERE customer_id = 2

    UNION ALL

    SELECT customers.customer_id,
           customers.name,
           customers.referred_by
    FROM customers
    INNER JOIN referrals ON customers.referred_by = referrals.customer_id
)
SELECT * FROM referrals;

Search order

By default, recursive queries produce results in breadth-first search order. When applied to the customers table, a breadth-first search processes all immediate referrals of a customer before moving deeper into the referral chain. In depth-first order, the query follows each referral chain as far as possible before moving to the next referral.

To change the search order, you need to build a path array that tracks the sequence of referrals. This query appends each customer’s ID to the path as the recursion progresses, forming the complete referral chain for each row:

WITH RECURSIVE referrals(customer_id, name, referred_by, path) AS (
    SELECT
        customer_id,
        name,
        referred_by,
        ARRAY[customer_id] AS path
    FROM customers
    WHERE customer_id = 2

    UNION ALL

    SELECT
        customers.customer_id,
        customers.name,
        customers.referred_by,
        referrals.path || customers.customer_id
    FROM customers
    INNER JOIN referrals ON customers.referred_by = referrals.customer_id
)
SELECT * FROM referrals;

To display results in depth-first order, add ORDER BY path in the outer query:

WITH RECURSIVE referrals(customer_id, name, referred_by, path) AS (
    SELECT
        customer_id,
        name,
        referred_by,
        ARRAY[customer_id] AS path
    FROM customers
    WHERE customer_id = 2

    UNION ALL

    SELECT
        customers.customer_id,
        customers.name,
        customers.referred_by,
        referrals.path || customers.customer_id
    FROM customers
    INNER JOIN referrals ON customers.referred_by = referrals.customer_id
)
SELECT * FROM referrals ORDER BY path;

The results of these queries:

 customer_id | name  | referred_by |  path
-------------+-------+-------------+---------
           2 | Bob   |           1 | {2}
           4 | David |           2 | {2,4}
           5 | Eve   |           2 | {2,5}
           6 | Frank |           4 | {2,4,6}
(4 rows)
 customer_id | name  | referred_by |  path
-------------+-------+-------------+---------
           2 | Bob   |           1 | {2}
           4 | David |           2 | {2,4}
           6 | Frank |           4 | {2,4,6}
           5 | Eve   |           2 | {2,5}
(4 rows)

Cycle detection

In recursive queries, the recursive part must eventually return no rows to avoid an endless loop. To see cycle detection and handling in action, create a cycle by updating the customers table as follows:

UPDATE customers
SET referred_by = 6
WHERE customer_id = 1;

This query uses a recursive CTE to traverse referral relationships starting from the customer with ID equal to 2, while detecting cycles by tracking the referral path. The query includes the cycle flag that becomes true if a customer appears more than once in the referral path:

WITH RECURSIVE referrals(customer_id, name, referred_by, path, cycle) AS (
    SELECT
        customer_id,
        name,
        referred_by,
        ARRAY[customer_id] AS path,
        FALSE AS cycle
    FROM customers
    WHERE customer_id = 2

    UNION ALL

    SELECT
        customers.customer_id,
        customers.name,
        customers.referred_by,
        referrals.path || customers.customer_id,
        customers.customer_id = ANY(referrals.path) AS cycle  -- Detect a cycle
    FROM customers
    INNER JOIN referrals ON customers.referred_by = referrals.customer_id
    WHERE NOT referrals.cycle  -- Stop recursion if a cycle is found
)
SELECT * FROM referrals;
 customer_id |  name   | referred_by |     path      | cycle
-------------+---------+-------------+---------------+-------
           2 | Bob     |           1 | {2}           | f
           5 | Eve     |           2 | {2,5}         | f
           4 | David   |           2 | {2,4}         | f
           6 | Frank   |           4 | {2,4,6}       | f
           1 | Alice   |           6 | {2,4,6,1}     | f
           3 | Charlie |           1 | {2,4,6,1,3}   | f
           2 | Bob     |           1 | {2,4,6,1,2}   | t
           7 | Grace   |           3 | {2,4,6,1,3,7} | f
(8 rows)

In cases where detecting a cycle requires checking multiple fields, an array of rows can be used to track visited combinations:

WITH RECURSIVE referrals AS (
    SELECT
        customer_id,
        name,
        referred_by,
        region,
        ARRAY[ROW(customer_id, region)] AS path,
        FALSE AS cycle
    FROM customers
    WHERE customer_id = 2

    UNION ALL

    SELECT
        customers.customer_id,
        customers.name,
        customers.referred_by,
        customers.region,
        referrals.path || ROW(customers.customer_id, customers.region),
        ROW(customers.customer_id, customers.region) = ANY(referrals.path)
    FROM customers
    JOIN referrals ON customers.referred_by = referrals.customer_id
    WHERE NOT referrals.cycle
)
SELECT * FROM referrals;
 customer_id |  name   | referred_by | region |                                   path                                   | cycle
-------------+---------+-------------+--------+--------------------------------------------------------------------------+-------
           2 | Bob     |           1 | North  | {"(2,North)"}                                                            | f
           4 | David   |           2 | South  | {"(2,North)","(4,South)"}                                                | f
           5 | Eve     |           2 | South  | {"(2,North)","(5,South)"}                                                | f
           6 | Frank   |           4 | South  | {"(2,North)","(4,South)","(6,South)"}                                    | f
           1 | Alice   |           6 | North  | {"(2,North)","(4,South)","(6,South)","(1,North)"}                        | f
           2 | Bob     |           1 | North  | {"(2,North)","(4,South)","(6,South)","(1,North)","(2,North)"}            | t
           3 | Charlie |           1 | North  | {"(2,North)","(4,South)","(6,South)","(1,North)","(3,North)"}            | f
           7 | Grace   |           3 | East   | {"(2,North)","(4,South)","(6,South)","(1,North)","(3,North)","(7,East)"} | f
(8 rows)

A useful technique to prevent infinite loops when testing recursive queries is to add a LIMIT clause in the main query. For example, without LIMIT, this recursive query would run forever:

WITH RECURSIVE numbers AS (
    SELECT 1 AS n

    UNION ALL

    SELECT n + 1 FROM numbers
)
SELECT n FROM numbers LIMIT 10;
 n
----
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
(10 rows)

Turn recursive CTEs off

Recursive CTEs can be deactivated by setting the gp_recursive_cte server configuration parameter to false. This command turns recursive CTEs off for the current session:

SET gp_recursive_cte = false;

You can now check that recursive CTEs are turned off, for example:

WITH RECURSIVE numbers AS (
    SELECT 1 AS n

    UNION ALL

    SELECT n + 1 FROM numbers WHERE n < 5
)
SELECT n FROM numbers;

The query should fail with the following error:

RECURSIVE clauses in WITH queries are currently disabled

Share a CTE

In Greengage DB, GPORCA may automatically reuse the results of a CTE referenced more than once, depending on its cost-based decisions. The Postgres planner allows you to control whether a CTE referenced more than once is recomputed for each reference using the gp_cte_sharing configuration parameter:

  • If gp_cte_sharing is on, the planner computes the CTE once and shares the results among all references.

  • If gp_cte_sharing is off, each reference of the CTE is executed independently.

Data-modifying statements

For a SELECT command, you can use the data-modifying commands INSERT, UPDATE, or DELETE in the WITH clause. This enables you to perform multiple operations within a single query.

If the main query references a data-modifying statement in the WITH clause, the statement runs exactly once to completion, regardless of whether the main query consumes all or any of its output. This contrasts with a SELECT in the WITH clause, which is executed only as needed, depending on how much output the main query requests.

To follow the examples in this section, create the stg_sales table:

CREATE TABLE stg_sales
(
    id           INT,
    date         DATE,
    product_name TEXT,
    category     TEXT,
    quantity     INT,
    unit_price   NUMERIC(8, 2)
)
    DISTRIBUTED BY (id);

Insert data:

INSERT INTO stg_sales (id, date, product_name, category, quantity, unit_price)
VALUES (1, '2025-01-03', 'Laptop', 'Electronics', 1, 1200.00),
       (2, '2025-01-04', 'Headphones', 'Electronics', 2, 150.00),
       (3, '2025-01-05', 'Coffee Maker', 'Home', 1, 85.00),
       (4, '2025-01-06', 'T-Shirt', 'Clothing', 3, 25.00),
       (5, '2025-01-07', 'Desk Chair', 'Home', 1, 200.00),
       (6, '2025-01-08', 'Smartphone', 'Electronics', 2, 800.00),
       (7, '2025-01-09', 'Headphones', 'Electronics', 1, 150.00),
       (8, '2025-01-10', 'T-Shirt', 'Clothing', 2, 25.00),
       (9, '2025-01-11', 'Laptop', 'Electronics', 1, 1200.00);

Delete

This query deletes rows from stg_sales before January 6, 2025, and returns the details of the deleted rows using a data-modifying CTE:

WITH deleted_sales AS (
    DELETE FROM stg_sales
        WHERE date < '2025-01-06'
        RETURNING id, product_name, date)
SELECT *
FROM deleted_sales;
 id | product_name |    date
----+--------------+------------
  3 | Coffee Maker | 2025-01-05
  1 | Laptop       | 2025-01-03
  2 | Headphones   | 2025-01-04
(3 rows)

Note that data-modifying statements in a WITH clause must have the RETURNING clause refer to their result sets in the parent query.

NOTE

The DELETE …​ RETURNING clause is not supported for append-optimized tables.

Update

This query updates the unit price of Laptop by 10% in the stg_sales table and returns the updated rows using a data-modifying CTE:

WITH updated_sales AS (
    UPDATE stg_sales
        SET unit_price = unit_price * 1.10
        WHERE product_name = 'Laptop'
        RETURNING id, product_name, unit_price)
SELECT *
FROM updated_sales;
 id | product_name | unit_price
----+--------------+------------
  9 | Laptop       |    1320.00
(1 row)

This query updates the unit price of Laptop by 10% in the stg_sales table, but the main query retrieves all rows from the original table rather than the updated rows returned by the CTE:

WITH updated_sales AS (
    UPDATE stg_sales
        SET unit_price = unit_price * 1.10
        WHERE product_name = 'Laptop'
        RETURNING id, product_name, unit_price)
SELECT id, product_name, unit_price
FROM stg_sales;
 id | product_name | unit_price
----+--------------+------------
  4 | T-Shirt      |      25.00
  7 | Headphones   |     150.00
  8 | T-Shirt      |      25.00
  5 | Desk Chair   |     200.00
  6 | Smartphone   |     800.00
  9 | Laptop       |    1320.00
(6 rows)

Recursive CTEs

To follow the examples in this section, create the stg_customers table:

CREATE TABLE stg_customers
(
    customer_id INT,
    name        TEXT,
    referred_by INT,
    region      TEXT
)
    DISTRIBUTED BY (customer_id);

This statement inserts sample customer data with referral relationships:

INSERT INTO stg_customers (customer_id, name, referred_by, region)
VALUES (1, 'Alice', NULL, 'North'),
       (2, 'Bob', 1, 'North'),
       (3, 'Charlie', 1, 'North'),
       (4, 'David', 2, 'South'),
       (5, 'Eve', 2, 'South'),
       (6, 'Frank', 4, 'South'),
       (7, 'Grace', 3, 'East');
/*
       Alice
       ├── Bob
       │   ├── David
       │   │   └── Frank
       │   └── Eve
       └── Charlie
           └── Grace
*/

Recursive self-references in data-modifying statements are not allowed. You can include a data-modifying statement in the main query.

This query uses a recursive CTE to find all customers referred directly or indirectly by a customer with customer_id equal to 2, and then deletes them from the stg_customers table:

WITH RECURSIVE referrals(customer_id, name, referred_by) AS (
    SELECT customer_id,
           name,
           referred_by
    FROM stg_customers
    WHERE customer_id = 2

    UNION ALL

    SELECT stg_customers.customer_id,
           stg_customers.name,
           stg_customers.referred_by
    FROM stg_customers
    INNER JOIN referrals ON stg_customers.referred_by = referrals.customer_id
)
DELETE FROM stg_customers
WHERE customer_id IN (SELECT customer_id FROM referrals);

To check that customers' records are deleted, query stg_customers:

SELECT *
FROM stg_customers;

The result looks as follows:

 customer_id |  name   | referred_by | region
-------------+---------+-------------+--------
           3 | Charlie |           1 | North
           7 | Grace   |           3 | East
           1 | Alice   |             | North
(3 rows)