Common table expressions
A common table expression (CTE) is a temporary result set defined within the execution scope of a single SQL statement. CTEs allow you to simplify complex queries by splitting them into smaller, more manageable parts. They can also help optimize query execution when the same dataset needs to be accessed multiple times within the query.
This topic describes how to use CTEs in SELECT queries.
For a SELECT
command, you can also use the data-modifying commands INSERT
, UPDATE
, or DELETE
within a CTE.
Prerequisites
To follow the examples in this topic, connect to the Greengage DB master host using psql
as described in Connect to Greengage DB via psql.
Create a new database and connect to it:
CREATE DATABASE marketplace;
\c marketplace
Non-recursive CTEs
Syntax
To define a CTE, use the WITH
clause before the main query:
WITH <cte_name> (<column_list>) AS (
<cte_subquery>
)
<main_query>;
where:
-
<cte_name>
— the name used to reference the CTE within the main query. -
<column_list>
— (optional) a list of column names for the CTE result set. -
<cte_subquery>
— the SQL query that defines the CTE result set. -
<main_query>
— the main SQL query that uses the CTE.
You can also define multiple CTEs in a single WITH
clause by separating them with commas:
WITH <cte_name1> (<column_list1>) AS (
<cte_subquery1>
),
<cte_name2> (<column_list2>) AS (
<cte_subquery2>
),
...
<primary_query>;
One CTE subquery
To follow the examples in this section, create the sales
table:
CREATE TABLE sales
(
id INT,
date DATE,
product_name TEXT,
category TEXT,
quantity INT,
unit_price NUMERIC(8, 2)
)
WITH (appendoptimized = true)
DISTRIBUTED BY (id);
Insert data:
INSERT INTO sales (id, date, product_name, category, quantity, unit_price)
VALUES (1, '2025-01-03', 'Laptop', 'Electronics', 1, 1200.00),
(2, '2025-01-04', 'Headphones', 'Electronics', 2, 150.00),
(3, '2025-01-05', 'Coffee Maker', 'Home', 1, 85.00),
(4, '2025-01-06', 'T-Shirt', 'Clothing', 3, 25.00),
(5, '2025-01-07', 'Desk Chair', 'Home', 1, 200.00),
(6, '2025-01-08', 'Smartphone', 'Electronics', 2, 800.00),
(7, '2025-01-09', 'Headphones', 'Electronics', 1, 150.00),
(8, '2025-01-10', 'T-Shirt', 'Clothing', 2, 25.00),
(9, '2025-01-11', 'Laptop', 'Electronics', 1, 1200.00);
This query defines a CTE named product_sales
to aggregate sales data by product and then selects products sorted by total revenue in descending order:
WITH product_sales AS (SELECT product_name,
SUM(quantity) AS total_quantity,
SUM(quantity * unit_price) AS total_revenue
FROM sales
GROUP BY product_name)
SELECT product_name,
total_revenue
FROM product_sales
ORDER BY total_revenue DESC;
The result looks like this:
product_name | total_revenue --------------+--------------- Laptop | 2400.00 Smartphone | 1600.00 Headphones | 450.00 Desk Chair | 200.00 T-Shirt | 125.00 Coffee Maker | 85.00 (6 rows)
This query uses the product_sales
subquery in the FROM
clause, which returns the same result as the CTE above:
SELECT product_name,
total_revenue
FROM (SELECT product_name,
SUM(quantity) AS total_quantity,
SUM(quantity * unit_price) AS total_revenue
FROM sales
GROUP BY product_name) AS product_sales
ORDER BY total_revenue DESC;
Multiple CTE subqueries
This query defines two CTEs: category_sales
aggregates revenue by category, and product_sales
aggregates sales by product.
The main query joins these CTEs to display each product’s revenue along with its category’s total revenue:
WITH category_sales AS (SELECT category,
SUM(quantity * unit_price) AS category_revenue
FROM sales
GROUP BY category),
product_sales AS (SELECT product_name,
category,
SUM(quantity) AS total_quantity,
SUM(quantity * unit_price) AS total_revenue
FROM sales
GROUP BY product_name, category)
SELECT product_sales.product_name,
product_sales.category,
product_sales.total_revenue,
category_sales.category_revenue
FROM product_sales
JOIN category_sales ON product_sales.category = category_sales.category
ORDER BY product_sales.total_revenue DESC;
The result should look like this:
product_name | category | total_revenue | category_revenue --------------+-------------+---------------+------------------ Laptop | Electronics | 2400.00 | 4450.00 Smartphone | Electronics | 1600.00 | 4450.00 Headphones | Electronics | 450.00 | 4450.00 Desk Chair | Home | 200.00 | 285.00 T-Shirt | Clothing | 125.00 | 125.00 Coffee Maker | Home | 85.00 | 285.00 (6 rows)
If you rewrite this query without the WITH
clause, it becomes a more complex statement with two nested subqueries in the FROM
clause:
SELECT product_sales.product_name,
product_sales.category,
product_sales.total_revenue,
category_sales.category_revenue
FROM (SELECT sales.product_name,
sales.category,
SUM(sales.quantity) AS total_quantity,
SUM(sales.quantity * sales.unit_price) AS total_revenue
FROM sales
GROUP BY sales.product_name, sales.category) AS product_sales
JOIN (SELECT sales.category,
SUM(sales.quantity * sales.unit_price) AS category_revenue
FROM sales
GROUP BY sales.category) AS category_sales
ON product_sales.category = category_sales.category
ORDER BY product_sales.total_revenue DESC;
Recursive CTEs
Syntax
A recursive CTE allows you to perform iterative computations within a query, commonly used to process hierarchical data. It has the following syntax:
WITH RECURSIVE <cte_name> (<column_list>) AS (
-- Non-recursive term: runs once
<non_recursive_term>
UNION [ALL]
-- Recursive term: refers to <cte_name>, runs repeatedly
<recursive_term>
)
<main_query>;
where:
-
<cte_name>
— the name of the CTE that is referenced in the recursive term and in the main query. -
<column_list>
— (optional) a list of column names for the CTE result set. -
<non_recursive_term>
— the base query that produces the initial result set (executed once). -
<recursive_term>
— a query that references<cte_name>
and is executed repeatedly, adding new rows to the result set until no more rows are returned. The result is combined with a non-recursive term using the UNION orUNION ALL
operator. -
<main_query>
— the main SQL query that consumes the final result of the recursive CTE.
Greengage DB processes a recursive CTE in the following steps:
-
Executes the non-recursive term to produce the initial result set.
-
Executes the recursive term using the results from the previous step as input to generate additional rows.
-
Repeats step 2 until the recursive term produces no new rows.
-
Returns the combined result set, which is the union of all rows produced by the non-recursive and recursive terms.
Number sequence
This query uses a recursive CTE to generate a sequence of numbers from 1 to 5:
WITH RECURSIVE numbers AS (
-- Non-recursive term
SELECT 1 AS n
UNION ALL
-- Recursive term
SELECT n + 1 FROM numbers WHERE n < 5
)
SELECT n FROM numbers;
The result looks like this:
n --- 1 2 3 4 5 (5 rows)
Running total
This query uses a recursive CTE named running_sum
to calculate a running total for numbers from 1 to 5:
WITH RECURSIVE running_sum AS (
SELECT 1 AS n,
1 AS running_total
UNION ALL
SELECT n + 1,
running_total + n + 1
FROM running_sum
WHERE n < 5
)
SELECT n,
running_total
FROM running_sum;
The result is as follows:
n | running_total ---+--------------- 1 | 1 2 | 3 3 | 6 4 | 10 5 | 15 (5 rows)
This query calculates a running total of the quantity of products in the sales
table using a recursive CTE.
It starts from a specific date and adds each day’s product quantity to the cumulative total of the previous days:
WITH RECURSIVE cumulative_quantity AS (
SELECT date,
quantity,
quantity AS cumulative_total
FROM sales
WHERE date = '2025-01-03'
UNION ALL
SELECT sales.date,
sales.quantity,
cumulative_quantity.cumulative_total + sales.quantity
FROM sales
JOIN cumulative_quantity ON sales.date = cumulative_quantity.date + INTERVAL '1 day'
)
SELECT date,
quantity,
cumulative_total
FROM cumulative_quantity;
The result should look as follows:
date | quantity | cumulative_total ------------+----------+------------------ 2025-01-03 | 1 | 1 2025-01-04 | 2 | 3 2025-01-05 | 1 | 4 2025-01-06 | 3 | 7 2025-01-07 | 1 | 8 2025-01-08 | 2 | 10 2025-01-09 | 1 | 11 2025-01-10 | 2 | 13 2025-01-11 | 1 | 14 (9 rows)
Hierarchical data
To follow the examples in this section, create the customers
table:
CREATE TABLE customers
(
customer_id INT,
name TEXT,
referred_by INT,
region TEXT
)
WITH (appendoptimized = true)
DISTRIBUTED BY (customer_id);
This statement inserts sample customer data with referral relationships:
INSERT INTO customers (customer_id, name, referred_by, region)
VALUES (1, 'Alice', NULL, 'North'),
(2, 'Bob', 1, 'North'),
(3, 'Charlie', 1, 'North'),
(4, 'David', 2, 'South'),
(5, 'Eve', 2, 'South'),
(6, 'Frank', 4, 'South'),
(7, 'Grace', 3, 'East');
/*
Alice
├── Bob
│ ├── David
│ │ └── Frank
│ └── Eve
└── Charlie
└── Grace
*/
This query uses a recursive CTE to find all customers referred by a customer with customer_id
equal to 2
, directly or through a chain of referrals:
WITH RECURSIVE referrals(customer_id, name, referred_by) AS (
SELECT customer_id,
name,
referred_by
FROM customers
WHERE customer_id = 2
UNION ALL
SELECT customers.customer_id,
customers.name,
customers.referred_by
FROM customers
INNER JOIN referrals ON customers.referred_by = referrals.customer_id
)
SELECT * FROM referrals;
Search order
By default, recursive queries produce results in breadth-first search order.
When applied to the customers
table, a breadth-first search processes all immediate referrals of a customer before moving deeper into the referral chain.
In depth-first order, the query follows each referral chain as far as possible before moving to the next referral.
To change the search order, you need to build a path
array that tracks the sequence of referrals.
This query appends each customer’s ID to the path
as the recursion progresses, forming the complete referral chain for each row:
WITH RECURSIVE referrals(customer_id, name, referred_by, path) AS (
SELECT
customer_id,
name,
referred_by,
ARRAY[customer_id] AS path
FROM customers
WHERE customer_id = 2
UNION ALL
SELECT
customers.customer_id,
customers.name,
customers.referred_by,
referrals.path || customers.customer_id
FROM customers
INNER JOIN referrals ON customers.referred_by = referrals.customer_id
)
SELECT * FROM referrals;
To display results in depth-first order, add ORDER BY path
in the outer query:
WITH RECURSIVE referrals(customer_id, name, referred_by, path) AS (
SELECT
customer_id,
name,
referred_by,
ARRAY[customer_id] AS path
FROM customers
WHERE customer_id = 2
UNION ALL
SELECT
customers.customer_id,
customers.name,
customers.referred_by,
referrals.path || customers.customer_id
FROM customers
INNER JOIN referrals ON customers.referred_by = referrals.customer_id
)
SELECT * FROM referrals ORDER BY path;
The results of these queries:
customer_id | name | referred_by | path -------------+-------+-------------+--------- 2 | Bob | 1 | {2} 4 | David | 2 | {2,4} 5 | Eve | 2 | {2,5} 6 | Frank | 4 | {2,4,6} (4 rows)
customer_id | name | referred_by | path -------------+-------+-------------+--------- 2 | Bob | 1 | {2} 4 | David | 2 | {2,4} 6 | Frank | 4 | {2,4,6} 5 | Eve | 2 | {2,5} (4 rows)
Cycle detection
In recursive queries, the recursive part must eventually return no rows to avoid an endless loop.
To see cycle detection and handling in action, create a cycle by updating the customers
table as follows:
UPDATE customers
SET referred_by = 6
WHERE customer_id = 1;
This query uses a recursive CTE to traverse referral relationships starting from the customer with ID equal to 2
, while detecting cycles by tracking the referral path.
The query includes the cycle
flag that becomes true
if a customer appears more than once in the referral path:
WITH RECURSIVE referrals(customer_id, name, referred_by, path, cycle) AS (
SELECT
customer_id,
name,
referred_by,
ARRAY[customer_id] AS path,
FALSE AS cycle
FROM customers
WHERE customer_id = 2
UNION ALL
SELECT
customers.customer_id,
customers.name,
customers.referred_by,
referrals.path || customers.customer_id,
customers.customer_id = ANY(referrals.path) AS cycle -- Detect a cycle
FROM customers
INNER JOIN referrals ON customers.referred_by = referrals.customer_id
WHERE NOT referrals.cycle -- Stop recursion if a cycle is found
)
SELECT * FROM referrals;
customer_id | name | referred_by | path | cycle -------------+---------+-------------+---------------+------- 2 | Bob | 1 | {2} | f 5 | Eve | 2 | {2,5} | f 4 | David | 2 | {2,4} | f 6 | Frank | 4 | {2,4,6} | f 1 | Alice | 6 | {2,4,6,1} | f 3 | Charlie | 1 | {2,4,6,1,3} | f 2 | Bob | 1 | {2,4,6,1,2} | t 7 | Grace | 3 | {2,4,6,1,3,7} | f (8 rows)
In cases where detecting a cycle requires checking multiple fields, an array of rows can be used to track visited combinations:
WITH RECURSIVE referrals AS (
SELECT
customer_id,
name,
referred_by,
region,
ARRAY[ROW(customer_id, region)] AS path,
FALSE AS cycle
FROM customers
WHERE customer_id = 2
UNION ALL
SELECT
customers.customer_id,
customers.name,
customers.referred_by,
customers.region,
referrals.path || ROW(customers.customer_id, customers.region),
ROW(customers.customer_id, customers.region) = ANY(referrals.path)
FROM customers
JOIN referrals ON customers.referred_by = referrals.customer_id
WHERE NOT referrals.cycle
)
SELECT * FROM referrals;
customer_id | name | referred_by | region | path | cycle -------------+---------+-------------+--------+--------------------------------------------------------------------------+------- 2 | Bob | 1 | North | {"(2,North)"} | f 4 | David | 2 | South | {"(2,North)","(4,South)"} | f 5 | Eve | 2 | South | {"(2,North)","(5,South)"} | f 6 | Frank | 4 | South | {"(2,North)","(4,South)","(6,South)"} | f 1 | Alice | 6 | North | {"(2,North)","(4,South)","(6,South)","(1,North)"} | f 2 | Bob | 1 | North | {"(2,North)","(4,South)","(6,South)","(1,North)","(2,North)"} | t 3 | Charlie | 1 | North | {"(2,North)","(4,South)","(6,South)","(1,North)","(3,North)"} | f 7 | Grace | 3 | East | {"(2,North)","(4,South)","(6,South)","(1,North)","(3,North)","(7,East)"} | f (8 rows)
A useful technique to prevent infinite loops when testing recursive queries is to add a LIMIT
clause in the main query.
For example, without LIMIT
, this recursive query would run forever:
WITH RECURSIVE numbers AS (
SELECT 1 AS n
UNION ALL
SELECT n + 1 FROM numbers
)
SELECT n FROM numbers LIMIT 10;
n ---- 1 2 3 4 5 6 7 8 9 10 (10 rows)
Turn recursive CTEs off
Recursive CTEs can be deactivated by setting the gp_recursive_cte
server configuration parameter to false
.
This command turns recursive CTEs off for the current session:
SET gp_recursive_cte = false;
You can now check that recursive CTEs are turned off, for example:
WITH RECURSIVE numbers AS (
SELECT 1 AS n
UNION ALL
SELECT n + 1 FROM numbers WHERE n < 5
)
SELECT n FROM numbers;
The query should fail with the following error:
RECURSIVE clauses in WITH queries are currently disabled
Share a CTE
In Greengage DB, GPORCA may automatically reuse the results of a CTE referenced more than once, depending on its cost-based decisions.
The Postgres planner allows you to control whether a CTE referenced more than once is recomputed for each reference using the gp_cte_sharing
configuration parameter:
-
If
gp_cte_sharing
ison
, the planner computes the CTE once and shares the results among all references. -
If
gp_cte_sharing
isoff
, each reference of the CTE is executed independently.
Data-modifying statements
For a SELECT
command, you can use the data-modifying commands INSERT
, UPDATE
, or DELETE
in the WITH
clause.
This enables you to perform multiple operations within a single query.
If the main query references a data-modifying statement in the WITH
clause, the statement runs exactly once to completion, regardless of whether the main query consumes all or any of its output.
This contrasts with a SELECT
in the WITH
clause, which is executed only as needed, depending on how much output the main query requests.
To follow the examples in this section, create the stg_sales
table:
CREATE TABLE stg_sales
(
id INT,
date DATE,
product_name TEXT,
category TEXT,
quantity INT,
unit_price NUMERIC(8, 2)
)
DISTRIBUTED BY (id);
Insert data:
INSERT INTO stg_sales (id, date, product_name, category, quantity, unit_price)
VALUES (1, '2025-01-03', 'Laptop', 'Electronics', 1, 1200.00),
(2, '2025-01-04', 'Headphones', 'Electronics', 2, 150.00),
(3, '2025-01-05', 'Coffee Maker', 'Home', 1, 85.00),
(4, '2025-01-06', 'T-Shirt', 'Clothing', 3, 25.00),
(5, '2025-01-07', 'Desk Chair', 'Home', 1, 200.00),
(6, '2025-01-08', 'Smartphone', 'Electronics', 2, 800.00),
(7, '2025-01-09', 'Headphones', 'Electronics', 1, 150.00),
(8, '2025-01-10', 'T-Shirt', 'Clothing', 2, 25.00),
(9, '2025-01-11', 'Laptop', 'Electronics', 1, 1200.00);
Delete
This query deletes rows from stg_sales
before January 6, 2025, and returns the details of the deleted rows using a data-modifying CTE:
WITH deleted_sales AS (
DELETE FROM stg_sales
WHERE date < '2025-01-06'
RETURNING id, product_name, date)
SELECT *
FROM deleted_sales;
id | product_name | date ----+--------------+------------ 3 | Coffee Maker | 2025-01-05 1 | Laptop | 2025-01-03 2 | Headphones | 2025-01-04 (3 rows)
Note that data-modifying statements in a WITH
clause must have the RETURNING
clause refer to their result sets in the parent query.
The DELETE … RETURNING
clause is not supported for append-optimized tables.
Update
This query updates the unit price of Laptop
by 10% in the stg_sales
table and returns the updated rows using a data-modifying CTE:
WITH updated_sales AS (
UPDATE stg_sales
SET unit_price = unit_price * 1.10
WHERE product_name = 'Laptop'
RETURNING id, product_name, unit_price)
SELECT *
FROM updated_sales;
id | product_name | unit_price ----+--------------+------------ 9 | Laptop | 1320.00 (1 row)
This query updates the unit price of Laptop
by 10% in the stg_sales
table, but the main query retrieves all rows from the original table rather than the updated rows returned by the CTE:
WITH updated_sales AS (
UPDATE stg_sales
SET unit_price = unit_price * 1.10
WHERE product_name = 'Laptop'
RETURNING id, product_name, unit_price)
SELECT id, product_name, unit_price
FROM stg_sales;
id | product_name | unit_price ----+--------------+------------ 4 | T-Shirt | 25.00 7 | Headphones | 150.00 8 | T-Shirt | 25.00 5 | Desk Chair | 200.00 6 | Smartphone | 800.00 9 | Laptop | 1320.00 (6 rows)
Recursive CTEs
To follow the examples in this section, create the stg_customers
table:
CREATE TABLE stg_customers
(
customer_id INT,
name TEXT,
referred_by INT,
region TEXT
)
DISTRIBUTED BY (customer_id);
This statement inserts sample customer data with referral relationships:
INSERT INTO stg_customers (customer_id, name, referred_by, region)
VALUES (1, 'Alice', NULL, 'North'),
(2, 'Bob', 1, 'North'),
(3, 'Charlie', 1, 'North'),
(4, 'David', 2, 'South'),
(5, 'Eve', 2, 'South'),
(6, 'Frank', 4, 'South'),
(7, 'Grace', 3, 'East');
/*
Alice
├── Bob
│ ├── David
│ │ └── Frank
│ └── Eve
└── Charlie
└── Grace
*/
Recursive self-references in data-modifying statements are not allowed. You can include a data-modifying statement in the main query.
This query uses a recursive CTE to find all customers referred directly or indirectly by a customer with customer_id
equal to 2
, and then deletes them from the stg_customers
table:
WITH RECURSIVE referrals(customer_id, name, referred_by) AS (
SELECT customer_id,
name,
referred_by
FROM stg_customers
WHERE customer_id = 2
UNION ALL
SELECT stg_customers.customer_id,
stg_customers.name,
stg_customers.referred_by
FROM stg_customers
INNER JOIN referrals ON stg_customers.referred_by = referrals.customer_id
)
DELETE FROM stg_customers
WHERE customer_id IN (SELECT customer_id FROM referrals);
To check that customers' records are deleted, query stg_customers
:
SELECT *
FROM stg_customers;
The result looks as follows:
customer_id | name | referred_by | region -------------+---------+-------------+-------- 3 | Charlie | 1 | North 7 | Grace | 3 | East 1 | Alice | | North (3 rows)