Subqueries
A subquery is an SQL query nested inside another query. It runs first, and the main query uses its result. This topic explains how to use subqueries in SELECT queries.
Subquery classifications
Subqueries can be categorized in several ways depending on how they behave and where they are used in the main query. This section outlines the main types of subqueries based on:
-
the structure of the data they return (the number of rows and columns);
-
whether they depend on values from the outer query (correlation);
-
where they appear in the main query structure.
Returned row/column structure
This classification focuses on the shape of the subquery result in terms of rows and columns. The structure of the returned data determines the contexts in which the subquery can be used.
-
Scalar subquery
A scalar subquery is a subquery that selects a single column or expression and returns exactly one row — a single scalar value. It often returns the result of an aggregate function, such as
SUM()
,AVG()
,MIN()
, orMAX()
. Scalar subqueries can be used anywhere a single value is expected, such as in comparisons using operators like=
,<
,>
, and others. They are also commonly used in theSELECT
list to add calculated values to each row of the result. -
Table subquery
A table subquery is a subquery that returns one or more rows, optionally with multiple columns. Its usage depends on the number of rows and columns it returns:
-
One row, multiple columns
Can be used in row-wise comparisons involving multiple columns, for example,
(col1, col2) = (SELECT …)
. -
Multiple rows, single column
Commonly used in subquery expressions, such as
IN
,ANY
,ALL
, and similar operators. -
Multiple rows and columns
Often used as a derived table in the
FROM
or JOIN clause.
-
Correlation
This classification describes whether the subquery depends on values from the outer query.
-
Uncorrelated subquery
Runs independently of the outer query. It returns the same result, regardless of which row the outer query is processing.
-
Correlated subquery
Depends on values from the current row of the outer query. It cannot run on its own because it references columns from the outer query. Greengage DB uses one of the following methods to execute correlated subqueries:
-
Unnest a correlated query into join operations
This is the most efficient method for handling most correlated subqueries.
-
Run a correlated query on every row of the outer query
This method executes the subquery for every row of the outer query. It is less efficient and is used when the subquery appears in the
SELECT
list or in conditions involvingOR
.
-
Location in the main query
Subqueries can appear in different parts of a SELECT
query, each serving a specific purpose:
-
In the
WHERE
clause — to filter rows based on the results of another query.An SQL statement with a subquery in the
WHERE
clause has the following form:SELECT <columns> FROM <table_name> WHERE (<column> [, <column> ...]) [operator] (<subquery>);
This allows conditions such as
IN
,ANY
,ALL
, or comparisons (=
,<
,>
, and others) with scalar subqueries. -
In the
FROM
orJOIN
clauses — to define a temporary table (also known as a derived table).An SQL statement with a subquery in the
FROM
clause has the following form:SELECT <expression> FROM (<subquery>) AS <alias>;
An SQL statement with a subquery used in the
JOIN
clause has this form:SELECT <expression> FROM <table_name> JOIN (<subquery>) AS <alias> ON <join_condition>;
Every table in the
FROM
orJOIN
clause must have an alias. All columns in the subquerySELECT
list must have unique names. -
In the
SELECT
list — to compute values based on related data.An SQL statement with a subquery in the
SELECT
list has the following form:SELECT <expression>, (<subquery>) [AS <alias>] FROM <table_name>;
Usage examples
Scalar subquery
The following example uses a scalar subquery in the WHERE
clause to filter products whose price is above the average price of all products:
SELECT name,
price
FROM products
WHERE price > (SELECT AVG(price) FROM products);
The result should look like this:
name | price ------------+--------- Laptop | 1200.00 Smartphone | 800.00 (2 rows)
Correlated subquery
This example uses a correlated scalar subquery in the WHERE
clause to filter products whose price is above the average price within their own category:
SELECT name,
price
FROM products p1
WHERE price > (SELECT AVG(price)
FROM products p2
WHERE p2.category_code = p1.category_code);
The result looks as follows:
name | price ------------+--------- Laptop | 1200.00 Smartphone | 800.00 Jacket | 60.00 Desk Chair | 200.00 (4 rows)
Subqueries in WHERE
IN operator
This example uses a subquery with the IN
operator to filter products sold on or after March 1, 2025.
SELECT name,
price
FROM products
WHERE id IN (SELECT product_id
FROM sales
WHERE sales.date >= '2025-03-01');
The result should look as follows:
name | price --------------+--------- Headphones | 150.00 Coffee Maker | 85.00 T-Shirt | 25.00 Desk Chair | 200.00 Laptop | 1200.00 (5 rows)
The following example uses a subquery with the IN
operator to perform a row-wise comparison involving two columns.
The subquery filters products whose (category_code, price)
pair matches a pair where the price is under 100
and the category code is home
:
SELECT name,
price
FROM products
WHERE (category_code, price) IN (SELECT category_code,
price
FROM products
WHERE price < 100
AND category_code = 'home');
The result is as follows:
name | price --------------+------- Coffee Maker | 85.00 Blender | 55.00 (2 rows)
NOT IN operator
This example uses the NOT IN
operator with a subquery to find products that have never been sold:
SELECT name,
price
FROM products
WHERE id NOT IN (SELECT product_id FROM sales);
The output should look like this:
name | price ---------+------- Jacket | 60.00 Blender | 55.00 (2 rows)
ANY operator
This example uses the ANY
operator with a subquery to filter products whose price is greater than at least one product in the elec
category:
SELECT name,
price
FROM products
WHERE price > ANY (SELECT price
FROM products
WHERE category_code = 'elec');
The output looks as follows:
name | price ------------+--------- Desk Chair | 200.00 Smartphone | 800.00 Laptop | 1200.00 (3 rows)
ALL operator
The following example uses the ALL
operator with a subquery to find products whose price is greater than every product in the home
category:
SELECT name,
price
FROM products
WHERE price > ALL (SELECT price
FROM products
WHERE category_code = 'home');
The result looks like this:
name | price ------------+--------- Laptop | 1200.00 Smartphone | 800.00 (2 rows)
EXISTS operator
This example uses the EXISTS
operator with a correlated subquery to select products that have at least one matching sale:
SELECT name,
price
FROM products
WHERE EXISTS (SELECT 1
FROM sales
WHERE sales.product_id = products.id);
The output looks like this:
name | price --------------+--------- Coffee Maker | 85.00 T-Shirt | 25.00 Laptop | 1200.00 Headphones | 150.00 Desk Chair | 200.00 Smartphone | 800.00 (6 rows)
To find products that have never been sold, use NOT EXISTS
:
SELECT name,
price
FROM products
WHERE NOT EXISTS (SELECT 1
FROM sales
WHERE sales.product_id = products.id);
The output looks as follows:
name | price ---------+------- Jacket | 60.00 Blender | 55.00 (2 rows)
Single-row comparison
The following example uses a single-row comparison to find products that have the same category_code
and price
as the product named T-Shirt
:
SELECT name,
price
FROM products
WHERE (category_code, price) = (SELECT category_code,
price
FROM products
WHERE name = 'T-Shirt');
The following example performs the same single-row comparison as before but uses the explicit ROW
constructor:
SELECT name,
price
FROM products
WHERE ROW (category_code, price) = (SELECT category_code,
price
FROM products
WHERE name = 'T-Shirt');
The subquery must return exactly one row with two columns:
name | price ---------+------- T-Shirt | 25.00 (1 row)
Subqueries in FROM
This example demonstrates using a subquery as a derived table in the FROM
clause.
The subquery selects products priced under 100
, and the outer query retrieves their IDs and names:
SELECT low_price_products.id,
low_price_products.name
FROM (SELECT id, name
FROM products
WHERE price < 100) AS low_price_products;
The output looks as follows:
id | name ----+-------------- 3 | Coffee Maker 4 | T-Shirt 7 | Jacket 8 | Blender (4 rows)
The following example uses a subquery as a derived table to calculate the total quantity sold for each product.
The outer query then joins this summary with the products
table to display product names alongside their total sales quantities:
SELECT products.name,
sales_summary.total_quantity
FROM (SELECT sales.product_id,
SUM(sales.quantity) AS total_quantity
FROM sales
GROUP BY sales.product_id) AS sales_summary
JOIN products ON products.id = sales_summary.product_id;
The output should look like this:
name | total_quantity --------------+---------------- Coffee Maker | 2 T-Shirt | 7 Laptop | 3 Desk Chair | 3 Smartphone | 3 Headphones | 6 (6 rows)
Subqueries in JOIN
This example demonstrates using a derived table in JOIN
.
The subquery calculates the latest sale date per product, and the outer query joins this result with the products
table to combine product details with their most recent sale information:
SELECT products.name,
recent_sales.last_sale_date
FROM products
JOIN (SELECT sales.product_id,
MAX(sales.date) AS last_sale_date
FROM sales
GROUP BY sales.product_id) AS recent_sales ON products.id = recent_sales.product_id;
The result should look as follows:
name | last_sale_date --------------+---------------- Coffee Maker | 2025-03-01 T-Shirt | 2025-03-20 Laptop | 2025-03-15 Desk Chair | 2025-03-05 Smartphone | 2025-02-18 Headphones | 2025-03-10 (6 rows)
Subqueries in SELECT
This example uses a correlated scalar subquery in the SELECT
list to count the total sales for each product.
The subquery references the outer query’s products.id
and runs once per product to return the sales count alongside each product name:
SELECT name,
(SELECT COUNT(*)
FROM sales
WHERE sales.product_id = products.id) AS sales_count
FROM products;
This example uses LEFT JOIN
and GROUP BY
to calculate the sales count per product, producing the same result as a correlated scalar subquery above:
SELECT products.name,
COUNT(sales.product_id) AS sales_count
FROM products
LEFT JOIN sales ON sales.product_id = products.id
GROUP BY products.name;
The result should look like this:
name | sales_count --------------+------------- Coffee Maker | 2 T-Shirt | 3 Jacket | 0 Blender | 0 Headphones | 3 Laptop | 3 Desk Chair | 2 Smartphone | 2 (8 rows)