Hello, I’m DocuDroid!
Submitting feedback
Thank you for rating our AI Search!
We would be grateful if you could share your thoughts so we can improve our AI Search for you and other readers.
GitHub

SELECT command overview

Andrey Aksenov

A SELECT statement allows you to retrieve rows from a table or a view. You can choose which columns to return, apply filters to narrow down the results, and sort the output. In Greengage DB, which is designed for analytical workloads, SELECT is the most frequently used statement for exploring and analyzing data.

Overview

Syntax overview

The simplest SELECT syntax for retrieving rows from a table looks as follows:

SELECT [ * | <column> [, ...] ]
  [FROM <table_name> [, ...]]
  [WHERE <condition>]

where:

  • <column> [, …​] — the names of one or more columns to retrieve from the table or view. Use * to select all columns.

  • <table_name> [, …​] — the name of one or more tables or views (optionally schema-qualified) from which to retrieve data.

  • <condition> — an expression used to filter rows returned by the query.

The SELECT statement supports many optional clauses for grouping, filtering, and ordering results:

[ WITH [ RECURSIVE ] <with_query> [, ...] ]
SELECT [ALL | DISTINCT [ON (<expression> [, ...])]]
  * | <expression> [[AS] <output_name>] [, ...]
  [FROM <from_item> [, ...]]
  [WHERE <condition>]
  [GROUP BY <grouping_element> [, ...]]
  [HAVING <condition> [, ...]]
  [WINDOW <window_name> AS (<window_definition>) [, ...]]
  [{UNION | INTERSECT | EXCEPT} [ALL | DISTINCT] <select>]
  [ORDER BY <expression> [ASC | DESC | USING <operator>] [NULLS {FIRST | LAST}] [, ...]]
  [LIMIT {<count> | ALL}]
  [OFFSET <start> [ROW | ROWS]]
  [FETCH {FIRST | NEXT} [<count>] {ROW | ROWS} ONLY]
  [FOR {UPDATE | NO KEY UPDATE | SHARE | KEY SHARE} [OF <table_name> [, ...]] [NOWAIT] [...]]

TABLE { [ONLY] <table_name> [*] | <with_query_name> }

SELECT query execution

The general processing of a SELECT query is as follows:

  1. WITH

    All queries in the WITH clause are computed. These effectively serve as temporary tables that can be referenced in the FROM list.

  2. FROM

    All elements in the FROM list are computed. Each element in the FROM list is a real or virtual table, such as a subquery or a joined table. If more than one element is specified in the FROM list, they are cross-joined together.

  3. WHERE

    If the WHERE clause is specified, all rows that do not satisfy the condition are eliminated from the output.

  4. GROUP BY / HAVING

    If the GROUP BY clause is specified or if there are aggregate function calls, the output is grouped by the specified keys. The results of aggregate functions are computed per group. If the HAVING clause is present, it eliminates groups that do not satisfy the given condition.

  5. SELECT output expressions

    The actual output rows are computed using the expressions listed in the SELECT clause. These can include constants, column references, calculations, function calls, or window functions.

  6. DISTINCT / DISTINCT ON / ALL

    These options control whether and how duplicate rows are removed from the result set:

    • SELECT DISTINCT eliminates duplicate rows from the result.

    • SELECT DISTINCT ON eliminates rows that match all the specified expressions.

    • SELECT ALL (the default) returns all candidate rows, including duplicates.

  7. UNION / INTERSECT / EXCEPT

    Using the operators UNION, INTERSECT, and EXCEPT, the outputs of more than one SELECT statement can be combined into a single result set. Each query must return the same number of columns with compatible data types:

    • The UNION operator returns all rows that appear in one or both input sets.

    • The INTERSECT operator returns only rows that appear in both sets.

    • The EXCEPT operator returns rows from the first set that do not appear in the second.

    By default, these operators eliminate duplicate rows. To retain duplicates, you can use the ALL keyword (for example, UNION ALL). Note that DISTINCT is the default for UNION, INTERSECT, and EXCEPT, even though ALL is the default for SELECT.

  8. ORDER BY

    If the ORDER BY clause is specified, the returned rows are sorted in the specified order. If not, rows are returned in an arbitrary order, typically determined by the most efficient execution plan chosen by DBMS.

  9. LIMIT / OFFSET / FETCH

    If the LIMIT, FETCH FIRST, or OFFSET clause is specified, the SELECT statement only returns a subset of result rows.

  10. FOR UPDATE / …​

    If FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, or FOR KEY SHARE is specified, the SELECT statement locks the selected rows against concurrent updates.

IMPORTANT

Queries that use ORDER BY on large datasets or DISTINCT on a large number of fields may cause spill files to be created. Learn more in the Manage spill files topic.

Usage examples

Prerequisites

To follow the examples in this topic, connect to the Greengage DB master host using psql as described in Connect to Greengage DB via psql. Create a new database and connect to it:

CREATE DATABASE marketplace;
\c marketplace

Then, create three related tables and populate them with data:

Create a table to store the different product categories:

CREATE TABLE categories
(
    category_code TEXT PRIMARY KEY,
    name          TEXT
)
    DISTRIBUTED REPLICATED;

Insert data:

INSERT INTO categories
VALUES ('elec', 'Electronics'),
       ('cloth', 'Clothing'),
       ('home', 'Home');

Create a table to hold information about each product, including the category it belongs to:

CREATE TABLE products
(
    id            INT,
    name          TEXT,
    category_code TEXT REFERENCES categories (category_code),
    price         NUMERIC(8, 2),
    stock         INT
)
    WITH (appendoptimized = true)
    DISTRIBUTED BY (id);

Insert data:

INSERT INTO products
VALUES (1, 'Laptop', 'elec', 1200.00, 15),
       (2, 'Headphones', 'elec', 150.00, 40),
       (3, 'Coffee Maker', 'home', 85.00, 20),
       (4, 'T-Shirt', 'cloth', 25.00, 100),
       (5, 'Desk Chair', 'home', 200.00, 10),
       (6, 'Smartphone', 'elec', 800.00, 30),
       (7, 'Jacket', 'cloth', 60.00, 50),
       (8, 'Blender', 'home', 55.00, 25);

Create a table holding individual sales transactions:

CREATE TABLE sales
(
    id         INT,
    product_id INT,
    quantity   INT,
    date       DATE
)
    WITH (appendoptimized = true)
    DISTRIBUTED BY (product_id);

Insert data:

INSERT INTO sales
VALUES (1, 1, 1, '2025-01-03'),
       (2, 2, 2, '2025-01-05'),
       (3, 3, 1, '2025-01-07'),
       (4, 4, 3, '2025-01-10'),
       (5, 5, 1, '2025-01-15'),
       (6, 6, 2, '2025-01-22'),
       (7, 2, 1, '2025-02-01'),
       (8, 4, 2, '2025-02-03'),
       (9, 1, 1, '2025-02-07'),
       (10, 8, 1, '2025-02-10'),
       (11, 7, 2, '2025-02-13'),
       (12, 6, 1, '2025-02-18'),
       (13, 3, 1, '2025-03-01'),
       (14, 5, 2, '2025-03-05'),
       (15, 7, 1, '2025-03-07'),
       (16, 2, 3, '2025-03-10'),
       (17, 1, 1, '2025-03-15'),
       (18, 4, 2, '2025-03-20');

Select a constant value

In a SELECT statement, it is possible to include constant values in the result set. This means you can select fixed values such as strings, numbers, or dates without referencing any table columns. This might be useful for generating static labels or returning messages.

The following query selects a constant string value and assigns it an alias:

SELECT 'Hello, world!' AS greeting;

The result looks like this:

   greeting
---------------
 Hello, world!
(1 row)

Select all table columns

To retrieve all columns and rows from a table, use SELECT *:

SELECT *
FROM categories;

Alternatively, you can use the TABLE command:

TABLE categories;

The result looks as follows:

 category_code |    name
---------------+-------------
 elec          | Electronics
 cloth         | Clothing
 home          | Home
(3 rows)

Note that the order of rows returned is not guaranteed unless an explicit ORDER BY clause is used. See the Sort data section below.

CAUTION

SELECT * should be used with caution on large datasets since it accesses all columns and can negatively impact performance.

Select specific columns

Instead of retrieving all columns with SELECT *, you can specify exactly which columns to return. For example, this query returns only the product name and price from the products table:

SELECT name,
       price
FROM products;

You can also use fully qualified column names by including the table name as a prefix:

SELECT products.name,
       products.price
FROM products;

This is useful in queries involving joins, where column names might be ambiguous. The result should look like this:

     name     |  price
--------------+---------
 Coffee Maker |   85.00
 T-Shirt      |   25.00
 Jacket       |   60.00
 Blender      |   55.00
 Laptop       | 1200.00
 Headphones   |  150.00
 Desk Chair   |  200.00
 Smartphone   |  800.00
(8 rows)

Use aliases

Aliases provide temporary names for columns or tables in a query. You can rename the output columns using the AS keyword. This is helpful when you want to give columns more meaningful names, for example:

SELECT name  AS product_name,
       price AS unit_price
FROM products;

The same query without AS:

SELECT name  product_name,
       price unit_price
FROM products;

The result looks as follows:

 product_name | unit_price
--------------+------------
 Coffee Maker |      85.00
 T-Shirt      |      25.00
 Jacket       |      60.00
 Blender      |      55.00
 Desk Chair   |     200.00
 Smartphone   |     800.00
 Laptop       |    1200.00
 Headphones   |     150.00
(8 rows)

You can also use the AS keyword to assign an alias to a table:

SELECT p.name,
       p.price
FROM products AS p;

The same query without AS:

SELECT p.name,
       p.price
FROM products p;

Use functions and expressions

In a SELECT statement, you can include function calls and expressions to compute or transform values without modifying the underlying data.

  • The UPPER() function in a query below converts product names to uppercase:

    SELECT UPPER(name) AS product_name_caps,
           price
    FROM products;

    The result looks as follows:

     product_name_caps |  price
    -------------------+---------
     COFFEE MAKER      |   85.00
     T-SHIRT           |   25.00
     JACKET            |   60.00
     BLENDER           |   55.00
     LAPTOP            | 1200.00
     DESK CHAIR        |  200.00
     SMARTPHONE        |  800.00
     HEADPHONES        |  150.00
    (8 rows)
  • The following query calculates a 10% discount on each product’s price:

    SELECT name,
           price,
           price * 0.9 AS discounted_price
    FROM products;

    The result looks like this:

         name     |  price  | discounted_price
    --------------+---------+------------------
     Coffee Maker |   85.00 |           76.500
     T-Shirt      |   25.00 |           22.500
     Jacket       |   60.00 |           54.000
     Blender      |   55.00 |           49.500
     Desk Chair   |  200.00 |          180.000
     Smartphone   |  800.00 |          720.000
     Laptop       | 1200.00 |         1080.000
     Headphones   |  150.00 |          135.000
    (8 rows)
  • The CASE expression lets you return different values based on conditional logic:

    SELECT name,
           price,
           CASE
               WHEN price >= 500 THEN 'high'
               WHEN price >= 100 THEN 'medium'
               ELSE 'low'
               END AS price_level
    FROM products;

    The result is as follows:

         name     |  price  | price_level
    --------------+---------+-------------
     Coffee Maker |   85.00 | low
     T-Shirt      |   25.00 | low
     Jacket       |   60.00 | low
     Blender      |   55.00 | low
     Laptop       | 1200.00 | high
     Desk Chair   |  200.00 | medium
     Smartphone   |  800.00 | high
     Headphones   |  150.00 | medium
    (8 rows)

Filter data

The WHERE clause allows you to filter rows based on specified conditions. This is useful for narrowing down results and retrieving only the data that meets your criteria. You can compare values, use pattern matching, or combine multiple conditions using logical operators, such as AND and OR.

  • Use the equals operator (=) to find rows where a column matches a specific value:

    SELECT name,
           price
    FROM products
    WHERE name = 'Laptop';

    The result should look like this:

      name  |  price
    --------+---------
     Laptop | 1200.00
    (1 row)
    NOTE

    To check if a value is NULL, use the IS NULL operator.

  • The IN operator lets you match a column value against a list of values:

    SELECT name,
           price
    FROM products
    WHERE name IN ('Laptop', 'Smartphone');

    The result looks like this:

        name    |  price
    ------------+---------
     Laptop     | 1200.00
     Smartphone |  800.00
    (2 rows)
  • The LIKE operator allows you to perform pattern matching using wildcard characters. The percent sign (%) matches any sequence of zero or more characters:

    SELECT name,
           price
    FROM products
    WHERE name LIKE '%phone%';

    The result is as follows:

        name    | price
    ------------+--------
     Smartphone | 800.00
     Headphones | 150.00
    (2 rows)
  • You can combine multiple conditions using AND or OR to apply more specific filters:

    SELECT name,
           price,
           stock
    FROM products
    WHERE price > 100
      AND stock > 10;

    The result should look like this:

        name    |  price  | stock
    ------------+---------+-------
     Laptop     | 1200.00 |    15
     Smartphone |  800.00 |    30
     Headphones |  150.00 |    40
    (3 rows)

Join tables

Joining tables allows you to combine rows from two or more tables based on a related column between them.

You can join tables by specifying the join condition explicitly with the ON keyword. In this example, the products table is joined with the categories table by matching their category_code columns:

SELECT products.name   AS product_name,
       categories.name AS category_name,
       products.stock
FROM products
         JOIN categories ON products.category_code = categories.category_code;

If the column used for joining has the same name in both tables, you can use the USING syntax:

SELECT products.name   AS product_name,
       categories.name AS category_name,
       products.stock
FROM products
         JOIN categories USING (category_code);

The result looks as follows:

 product_name | category_name | stock
--------------+---------------+-------
 Laptop       | Electronics   |    15
 Desk Chair   | Home          |    10
 Smartphone   | Electronics   |    30
 Headphones   | Electronics   |    40
 Coffee Maker | Home          |    20
 T-Shirt      | Clothing      |   100
 Jacket       | Clothing      |    50
 Blender      | Home          |    25
(8 rows)

Similarly, you can join more than two tables. The following query joins three tables to return product, category, and sales information:

SELECT products.name   AS product_name,
       categories.name AS category_name,
       sales.quantity  AS quantity_sold,
       sales.date      AS sale_date
FROM products
         JOIN categories ON products.category_code = categories.category_code
         JOIN sales ON products.id = sales.product_id;

Aggregate data

Aggregate functions allow you to summarize data by grouping rows according to one or more columns. Common aggregate functions include SUM(), COUNT(), AVG(), MIN(), and MAX(). When combined with GROUP BY, these functions compute aggregated values for each group.

The following queries calculate the total stock for each product category by grouping rows by category and summing the stock values:

SELECT category_code,
       SUM(stock) AS total_stock
FROM products
GROUP BY category_code;

The result looks like this:

 category_code | total_stock
---------------+-------------
 home          |          55
 elec          |          85
 cloth         |         150
(3 rows)
SELECT categories.name     AS category_name,
       SUM(products.stock) AS total_stock
FROM products
         JOIN categories USING (category_code)
GROUP BY categories.name;

The result looks like this:

 category_name | total_stock
---------------+-------------
 Home          |          55
 Clothing      |         150
 Electronics   |          85
(3 rows)

Filter aggregated data

The HAVING clause allows you to filter groups after aggregation, similar to how WHERE filters individual rows before grouping. It is useful when you want to return only those groups that meet specific summary conditions.

The following queries show how to use HAVING to filter product categories where the total stock exceeds 80:

SELECT category_code,
       SUM(stock) AS total_stock
FROM products
GROUP BY category_code
HAVING SUM(stock) > 80;

The result should look as follows:

 category_code | total_stock
---------------+-------------
 cloth         |         150
 elec          |          85
(2 rows)
SELECT categories.name     AS category_name,
       SUM(products.stock) AS total_stock
FROM products
         JOIN categories USING (category_code)
GROUP BY categories.name
HAVING SUM(products.stock) > 80;

The result should look as follows:

 category_name | total_stock
---------------+-------------
 Clothing      |         150
 Electronics   |          85
(2 rows)

Sort data

After grouping and aggregating data, you can use the ORDER BY clause to sort the results based on an aggregate value or grouping column.

The following queries sort product categories by total stock in descending order:

SELECT category_code,
       SUM(stock) AS total_stock
FROM products p
GROUP BY category_code
ORDER BY total_stock DESC;

The result looks like this:

 category_code | total_stock
---------------+-------------
 cloth         |         150
 elec          |          85
 home          |          55
(3 rows)
SELECT categories.name     AS category_name,
       SUM(products.stock) AS total_stock
FROM products
         JOIN categories USING (category_code)
GROUP BY categories.name
ORDER BY total_stock DESC;

The result looks like this:

 category_name | total_stock
---------------+-------------
 Clothing      |         150
 Electronics   |          85
 Home          |          55
(3 rows)

Select distinct values

The DISTINCT keyword is used to return only unique values from a specified column or combination of columns. For example, to retrieve the list of unique product categories from the products table, execute this query:

SELECT DISTINCT category_code
FROM products;

The result should look like this:

 category_code
---------------
 home
 elec
 cloth
(3 rows)

Limit returned data

The LIMIT clause restricts the number of rows returned by a query:

SELECT name,
       price
FROM products
LIMIT 5;

The result is as follows:

     name     |  price
--------------+---------
 Laptop       | 1200.00
 Coffee Maker |   85.00
 T-Shirt      |   25.00
 Jacket       |   60.00
 Blender      |   55.00
(5 rows)

Note that when using LIMIT without the ORDER BY clause, the rows returned are arbitrary and may vary between executions because DBMS does not guarantee any specific order.

Combine result sets

The UNION operator allows you to combine the results of two or more SELECT statements into a single result set.

The following query separates products into price categories — low for prices below 100 and high for prices above 500:

SELECT name,
       price,
       'low' AS price_level
FROM products
WHERE price < 100

UNION

SELECT name,
       price,
       'high' AS price_level
FROM products
WHERE price > 500;

The result should look like this:

     name     |  price  | price_level
--------------+---------+-------------
 Coffee Maker |   85.00 | low
 Blender      |   55.00 | low
 Jacket       |   60.00 | low
 Laptop       | 1200.00 | high
 T-Shirt      |   25.00 | low
 Smartphone   |  800.00 | high
(6 rows)

Use subqueries

A subquery is a query nested inside another query. Subqueries can be used in SELECT, FROM, or WHERE clauses to provide a dynamic value or intermediate result that the outer query depends on.

This example uses a subquery in the WHERE clause to filter products whose price is greater than the average price of all products:

SELECT name,
       price
FROM products
WHERE price > (SELECT AVG(price) FROM products);

The result looks like this:

    name    |  price
------------+---------
 Laptop     | 1200.00
 Smartphone |  800.00
(2 rows)

Common table expressions (CTE)

A common table expression (CTE) lets you define a temporary result set that can be referenced within a SELECT statement. In the following example, a CTE named avg_price calculates the average product price:

WITH avg_price AS (SELECT AVG(price) AS value FROM products)
SELECT name,
       price
FROM products,
     avg_price
WHERE price > avg_price.value;

The result should look as follows:

    name    |  price
------------+---------
 Laptop     | 1200.00
 Smartphone |  800.00
(2 rows)

Window functions

Window functions perform calculations across a set of rows that are related to the current row. They are commonly used for running totals, rankings, moving averages, and other cumulative operations.

The following queries calculate a running total of quantity for each product_id over time using the SUM() window function. The window is defined to restart for each product and to follow the order of sale dates. In this query, the window specification is defined directly inside the OVER() clause:

SELECT product_id,
       date,
       quantity,
       SUM(quantity) OVER (PARTITION BY product_id ORDER BY date) AS running_total_quantity
FROM sales
WHERE product_id IN (1, 2)
ORDER BY product_id, date;

This query assigns a name to the window specification using the WINDOW clause:

SELECT product_id,
       date,
       quantity,
       SUM(quantity) OVER w AS running_total_quantity
FROM sales
WHERE product_id IN (1, 2)
WINDOW w AS (PARTITION BY product_id ORDER BY date)
ORDER BY product_id, date;

The result looks like this:

 product_id |    date    | quantity | running_total_quantity
------------+------------+----------+------------------------
          1 | 2025-01-03 |        1 |                      1
          1 | 2025-02-07 |        1 |                      2
          1 | 2025-03-15 |        1 |                      3
          2 | 2025-01-05 |        2 |                      2
          2 | 2025-02-01 |        1 |                      3
          2 | 2025-03-10 |        3 |                      6
(6 rows)