Hello, I’m DocuDroid!
Submitting feedback
Thank you for rating our AI Search!
We would be grateful if you could share your thoughts so we can improve our AI Search for you and other readers.
GitHub

Subqueries

Andrey Aksenov

A subquery is an SQL query nested inside another query. It runs first, and the main query uses its result. This topic explains how to use subqueries in SELECT queries.

Subquery classifications

Subqueries can be categorized in several ways depending on how they behave and where they are used in the main query. This section outlines the main types of subqueries based on:

  • the structure of the data they return (the number of rows and columns);

  • whether they depend on values from the outer query (correlation);

  • where they appear in the main query structure.

Returned row/column structure

This classification focuses on the shape of the subquery result in terms of rows and columns. The structure of the returned data determines the contexts in which the subquery can be used.

  • Scalar subquery

    A scalar subquery is a subquery that selects a single column or expression and returns exactly one row — a single scalar value. It often returns the result of an aggregate function, such as SUM(), AVG(), MIN(), or MAX(). Scalar subqueries can be used anywhere a single value is expected, such as in comparisons using operators like =, <, >, and others. They are also commonly used in the SELECT list to add calculated values to each row of the result.

  • Table subquery

    A table subquery is a subquery that returns one or more rows, optionally with multiple columns. Its usage depends on the number of rows and columns it returns:

    • One row, multiple columns

      Can be used in row-wise comparisons involving multiple columns, for example, (col1, col2) = (SELECT …​).

    • Multiple rows, single column

      Commonly used in subquery expressions, such as IN, ANY, ALL, and similar operators.

    • Multiple rows and columns

      Often used as a derived table in the FROM or JOIN clause.

Correlation

This classification describes whether the subquery depends on values from the outer query.

  • Uncorrelated subquery

    Runs independently of the outer query. It returns the same result, regardless of which row the outer query is processing.

  • Correlated subquery

    Depends on values from the current row of the outer query. It cannot run on its own because it references columns from the outer query. Greengage DB uses one of the following methods to execute correlated subqueries:

    • Unnest a correlated query into join operations

      This is the most efficient method for handling most correlated subqueries.

    • Run a correlated query on every row of the outer query

      This method executes the subquery for every row of the outer query. It is less efficient and is used when the subquery appears in the SELECT list or in conditions involving OR.

Location in the main query

Subqueries can appear in different parts of a SELECT query, each serving a specific purpose:

  • In the WHERE clause — to filter rows based on the results of another query.

    An SQL statement with a subquery in the WHERE clause has the following form:

    SELECT <columns>
    FROM <table_name>
    WHERE (<column> [, <column> ...]) [operator] (<subquery>);

    This allows conditions such as IN, ANY, ALL, or comparisons (=, <, >, and others) with scalar subqueries.

  • In the FROM or JOIN clauses — to define a temporary table (also known as a derived table).

    An SQL statement with a subquery in the FROM clause has the following form:

    SELECT <expression>
    FROM (<subquery>) AS <alias>;

    An SQL statement with a subquery used in the JOIN clause has this form:

    SELECT <expression>
    FROM <table_name>
    JOIN (<subquery>) AS <alias> ON <join_condition>;

    Every table in the FROM or JOIN clause must have an alias. All columns in the subquery SELECT list must have unique names.

  • In the SELECT list — to compute values based on related data.

    An SQL statement with a subquery in the SELECT list has the following form:

    SELECT <expression>,
           (<subquery>) [AS <alias>]
    FROM <table_name>;

Usage examples

Prerequisites

To follow the examples in this topic, connect to the Greengage DB master host using psql as described in Connect to Greengage DB via psql. Create a new database and connect to it:

CREATE DATABASE marketplace;
\c marketplace

Then, create three related tables and populate them with data:

Create a table to store the different product categories:

CREATE TABLE categories
(
    category_code TEXT PRIMARY KEY,
    name          TEXT
)
    DISTRIBUTED REPLICATED;

Insert data:

INSERT INTO categories
VALUES ('elec', 'Electronics'),
       ('cloth', 'Clothing'),
       ('home', 'Home');

Create a table to hold information about each product, including the category it belongs to:

CREATE TABLE products
(
    id            INT,
    name          TEXT,
    category_code TEXT REFERENCES categories (category_code),
    price         NUMERIC(8, 2),
    stock         INT
)
    WITH (appendoptimized = true)
    DISTRIBUTED BY (id);

Insert data:

INSERT INTO products
VALUES (1, 'Laptop', 'elec', 1200.00, 15),
       (2, 'Headphones', 'elec', 150.00, 40),
       (3, 'Coffee Maker', 'home', 85.00, 20),
       (4, 'T-Shirt', 'cloth', 25.00, 100),
       (5, 'Desk Chair', 'home', 200.00, 10),
       (6, 'Smartphone', 'elec', 800.00, 30),
       (7, 'Jacket', 'cloth', 60.00, 50),
       (8, 'Blender', 'home', 55.00, 25);
-- Products with IDs 7 and 8 do not exist in the 'sales' table

Create a table holding individual sales transactions:

CREATE TABLE sales
(
    id         INT,
    product_id INT,
    quantity   INT,
    date       DATE
)
    WITH (appendoptimized = true)
    DISTRIBUTED BY (product_id);

Insert data:

INSERT INTO sales
VALUES (1, 1, 1, '2025-01-03'),
       (2, 2, 2, '2025-01-05'),
       (3, 3, 1, '2025-01-07'),
       (4, 4, 3, '2025-01-10'),
       (5, 5, 1, '2025-01-15'),
       (6, 6, 2, '2025-01-22'),
       (7, 2, 1, '2025-02-01'),
       (8, 4, 2, '2025-02-03'),
       (9, 1, 1, '2025-02-07'),
       (12, 6, 1, '2025-02-18'),
       (13, 3, 1, '2025-03-01'),
       (14, 5, 2, '2025-03-05'),
       (16, 2, 3, '2025-03-10'),
       (17, 1, 1, '2025-03-15'),
       (18, 4, 2, '2025-03-20');

Scalar subquery

The following example uses a scalar subquery in the WHERE clause to filter products whose price is above the average price of all products:

SELECT name,
       price
FROM products
WHERE price > (SELECT AVG(price) FROM products);

The result should look like this:

    name    |  price
------------+---------
 Laptop     | 1200.00
 Smartphone |  800.00
(2 rows)

Correlated subquery

This example uses a correlated scalar subquery in the WHERE clause to filter products whose price is above the average price within their own category:

SELECT name,
       price
FROM products p1
WHERE price > (SELECT AVG(price)
               FROM products p2
               WHERE p2.category_code = p1.category_code);

The result looks as follows:

    name    |  price
------------+---------
 Laptop     | 1200.00
 Smartphone |  800.00
 Jacket     |   60.00
 Desk Chair |  200.00
(4 rows)

Subqueries in WHERE

IN operator

This example uses a subquery with the IN operator to filter products sold on or after March 1, 2025.

SELECT name,
       price
FROM products
WHERE id IN (SELECT product_id
             FROM sales
             WHERE sales.date >= '2025-03-01');

The result should look as follows:

     name     |  price
--------------+---------
 Headphones   |  150.00
 Coffee Maker |   85.00
 T-Shirt      |   25.00
 Desk Chair   |  200.00
 Laptop       | 1200.00
(5 rows)

The following example uses a subquery with the IN operator to perform a row-wise comparison involving two columns. The subquery filters products whose (category_code, price) pair matches a pair where the price is under 100 and the category code is home:

SELECT name,
       price
FROM products
WHERE (category_code, price) IN (SELECT category_code,
                                        price
                                 FROM products
                                 WHERE price < 100
                                   AND category_code = 'home');

The result is as follows:

     name     | price
--------------+-------
 Coffee Maker | 85.00
 Blender      | 55.00
(2 rows)

NOT IN operator

This example uses the NOT IN operator with a subquery to find products that have never been sold:

SELECT name,
       price
FROM products
WHERE id NOT IN (SELECT product_id FROM sales);

The output should look like this:

  name   | price
---------+-------
 Jacket  | 60.00
 Blender | 55.00
(2 rows)

ANY operator

This example uses the ANY operator with a subquery to filter products whose price is greater than at least one product in the elec category:

SELECT name,
       price
FROM products
WHERE price > ANY (SELECT price
                   FROM products
                   WHERE category_code = 'elec');

The output looks as follows:

    name    |  price
------------+---------
 Desk Chair |  200.00
 Smartphone |  800.00
 Laptop     | 1200.00
(3 rows)

ALL operator

The following example uses the ALL operator with a subquery to find products whose price is greater than every product in the home category:

SELECT name,
       price
FROM products
WHERE price > ALL (SELECT price
                   FROM products
                   WHERE category_code = 'home');

The result looks like this:

    name    |  price
------------+---------
 Laptop     | 1200.00
 Smartphone |  800.00
(2 rows)

EXISTS operator

This example uses the EXISTS operator with a correlated subquery to select products that have at least one matching sale:

SELECT name,
       price
FROM products
WHERE EXISTS (SELECT 1
              FROM sales
              WHERE sales.product_id = products.id);

The output looks like this:

     name     |  price
--------------+---------
 Coffee Maker |   85.00
 T-Shirt      |   25.00
 Laptop       | 1200.00
 Headphones   |  150.00
 Desk Chair   |  200.00
 Smartphone   |  800.00
(6 rows)

To find products that have never been sold, use NOT EXISTS:

SELECT name,
       price
FROM products
WHERE NOT EXISTS (SELECT 1
              FROM sales
              WHERE sales.product_id = products.id);

The output looks as follows:

  name   | price
---------+-------
 Jacket  | 60.00
 Blender | 55.00
(2 rows)

Single-row comparison

The following example uses a single-row comparison to find products that have the same category_code and price as the product named T-Shirt:

SELECT name,
       price
FROM products
WHERE (category_code, price) = (SELECT category_code,
                                       price
                                FROM products
                                WHERE name = 'T-Shirt');

The following example performs the same single-row comparison as before but uses the explicit ROW constructor:

SELECT name,
       price
FROM products
WHERE ROW (category_code, price) = (SELECT category_code,
                                           price
                                    FROM products
                                    WHERE name = 'T-Shirt');

The subquery must return exactly one row with two columns:

  name   | price
---------+-------
 T-Shirt | 25.00
(1 row)

Subqueries in FROM

This example demonstrates using a subquery as a derived table in the FROM clause. The subquery selects products priced under 100, and the outer query retrieves their IDs and names:

SELECT low_price_products.id,
       low_price_products.name
FROM (SELECT id, name
      FROM products
      WHERE price < 100) AS low_price_products;

The output looks as follows:

 id |     name
----+--------------
  3 | Coffee Maker
  4 | T-Shirt
  7 | Jacket
  8 | Blender
(4 rows)

The following example uses a subquery as a derived table to calculate the total quantity sold for each product. The outer query then joins this summary with the products table to display product names alongside their total sales quantities:

SELECT products.name,
       sales_summary.total_quantity
FROM (SELECT sales.product_id,
             SUM(sales.quantity) AS total_quantity
      FROM sales
      GROUP BY sales.product_id) AS sales_summary
         JOIN products ON products.id = sales_summary.product_id;

The output should look like this:

     name     | total_quantity
--------------+----------------
 Coffee Maker |              2
 T-Shirt      |              7
 Laptop       |              3
 Desk Chair   |              3
 Smartphone   |              3
 Headphones   |              6
(6 rows)

Subqueries in JOIN

This example demonstrates using a derived table in JOIN. The subquery calculates the latest sale date per product, and the outer query joins this result with the products table to combine product details with their most recent sale information:

SELECT products.name,
       recent_sales.last_sale_date
FROM products
         JOIN (SELECT sales.product_id,
                      MAX(sales.date) AS last_sale_date
               FROM sales
               GROUP BY sales.product_id) AS recent_sales ON products.id = recent_sales.product_id;

The result should look as follows:

     name     | last_sale_date
--------------+----------------
 Coffee Maker | 2025-03-01
 T-Shirt      | 2025-03-20
 Laptop       | 2025-03-15
 Desk Chair   | 2025-03-05
 Smartphone   | 2025-02-18
 Headphones   | 2025-03-10
(6 rows)

Subqueries in SELECT

This example uses a correlated scalar subquery in the SELECT list to count the total sales for each product. The subquery references the outer query’s products.id and runs once per product to return the sales count alongside each product name:

SELECT name,
       (SELECT COUNT(*)
        FROM sales
        WHERE sales.product_id = products.id) AS sales_count
FROM products;

This example uses LEFT JOIN and GROUP BY to calculate the sales count per product, producing the same result as a correlated scalar subquery above:

SELECT products.name,
       COUNT(sales.product_id) AS sales_count
FROM products
         LEFT JOIN sales ON sales.product_id = products.id
GROUP BY products.name;

The result should look like this:

     name     | sales_count
--------------+-------------
 Coffee Maker |           2
 T-Shirt      |           3
 Jacket       |           0
 Blender      |           0
 Headphones   |           3
 Laptop       |           3
 Desk Chair   |           2
 Smartphone   |           2
(8 rows)