Tables overview

Andrey Aksenov

A table is a primary object used to store data in a database. In Greengage DB, tables function like those in any relational database, but their rows are distributed across the segments of the cluster for parallel processing and scalability. Refer to the PostgreSQL documentation for a detailed explanation of basic table concepts: Table Basics.

The CREATE TABLE command creates a table and defines its structure. When you create a table, you may need to define:

Column data types

Data types define the kind of data that can be stored in columns.
Table and column constraints

Constraints allow you to apply additional limitations to data stored in a table or column.
Table storage model

Greengage DB supports several table storage models optimized for specific workload types, such as OLTP or OLAP. Learn more in the Table storage types topic.
Table distribution policy

Data distribution in Greengage DB enables parallel processing of data by spreading it across multiple segments. Learn more in the Distribution topic.
Data compression

Data compression allows you to decrease data size and improve I/O operations. Learn more in the Data compression topic.
Table partitioning strategy

Partitioning splits a large table into smaller pieces, improving query performance and allowing infrequently accessed data to be stored more cost-effectively. Learn more in the Partitioning topic.

NOTE

The CREATE TABLE AS command allows you to define a new table from the results of a query.

To create an empty table in the current database and define the table structure, use the CREATE TABLE command. The user who executes the command becomes the owner of the table.

The simplified syntax of the CREATE TABLE command is described below.

CREATE [ TEMP ] TABLE <table_name>
(
    [ { <column_name> <data_type> [<column_constraint> [ ... ] ]
      | <table_constraint>
      | LIKE <source_table> [ <like_option> ... ] }
      [, ... ]
    ]
)
[ INHERITS ( <parent_table> [, ... ] ) ]
[ WITH ( <storage_table_directive> [, ... ] ) ]
[ DISTRIBUTED BY (<column_name> [<opclass>], [ ... ] )
    | DISTRIBUTED RANDOMLY | DISTRIBUTED REPLICATED ]
[ PARTITION BY <partition_type> (<column_name>) <partition_specification>]

The main arguments for defining a table, its columns, and constraints are the following:

table_name

A table name. By default, the CREATE TABLE command creates a table in the predefined public schema. To use another schema, specify its name in the table_name argument or change the default schema search path as described in the Schemas topic.
column_name and data_type

A column name and its data type, respectively. See the recommendations for choosing data types in the Column data types section below.
column_constraint

A constraint defined for the specified column. See the Set constraints section below.
table_constraint

A constraint defined at the table level. See the Set constraints section below.

Below are the descriptions of clauses mentioned in the CREATE TABLE syntax snippet above:

TEMP

Create a temporary table that exists only for the duration of the session or until the end of the current transaction. Then the table is dropped automatically.
LIKE

Create a new table by duplicating the column definitions from an existing table, including column names, data types, and not-null constraints.
INHERITS

Create a new table that inherits columns and constraints from parent tables. All modifications applied to the parent table structure also propagate to the child one.
WITH

Define table storage parameters, such as appendoptimized and orientation.

NOTE

The gp_default_storage_options server configuration parameter allows you to set the default values for table storage options, including as appendoptimized and orientation.
DISTRIBUTED

Define the distribution policy, which determines how table rows are distributed across segments in a Greengage DB cluster.
PARTITION BY

Declare the table as partitioned and specify the partitioning strategy.

Here are some recommendations for choosing column data types:

Choose data types that match the kind of data you store.

For example, use numeric types for numbers, character types for text, and date or timestamp types for dates and times.
Use the smallest data type that still fits your data.

For example, use INT instead of BIGINT, or TEXT instead of CHAR(<n>).
Consider future growth when choosing small numeric types.

For example, SMALLINT might be enough for now, but changing to INT later can be expensive if the table already has a lot of data. If the values might increase, it’s better to use a larger type from the start.
Use built-in types designed for specific data.

Use data types such as INET, CIDR, JSON, JSONB, or MACADDR for better performance, validation, and clarity when working with IP addresses, JSON data, and so on.
Use the same data types for columns that are joined.

If data types differ, they must be converted before comparison, which can slow down queries.
Define custom data types when needed.

In particular, use an ENUM type when a column should only allow a fixed set of values (for example, low, medium, and high). ENUM values are stored more efficiently and processed faster than equivalent TEXT values.

Prerequisites

To execute commands described in the following sections, connect to the Greengage DB master host using psql as described in Connect to Greengage DB via psql. Then, create a new database and connect to it:

CREATE DATABASE marketplace;
\c marketplace

The SQL request below creates a new customers table with the following parameters:

The WITH clause is used to create a row-oriented append-optimized table.
The DISTRIBUTED BY clause specifies that the customer_id column is used as the distribution key to spread table rows across segments.

CREATE TABLE customers
(
    customer_id   INTEGER,
    name          TEXT,
    email         TEXT,
    customer_type TEXT
)
    WITH (appendoptimized = true, orientation = row)
    DISTRIBUTED BY (customer_id);

Sample data

INSERT INTO customers (customer_id, name, email, customer_type)
VALUES (1, 'Andrew Fuller', 'andrew@example.com', 'Regular'),
       (2, 'Michael Suyama', 'michael@testmail.com', 'Premium'),
       (3, 'Laura Callahan', 'laura@example.io', 'Regular'),
       (4, 'Nancy Davolio', 'nancy@samplemail.com', 'Premium'),
       (5, 'Steven Buchanan', 'steven@fastmail.net', 'Regular'),
       (6, 'Margaret Peacock', 'margaret@example.com', 'Regular');

You can optionally add a comment for the created table using the COMMENT command:

COMMENT ON TABLE customers IS 'Customer information';

The command below uses the LIKE clause to create a new stg_customers table with the same column names and data types as the customers table:

CREATE TABLE stg_customers
(
    LIKE customers
)
    WITH (appendoptimized = true, orientation = row)
    DISTRIBUTED BY (customer_id);

Note that storage parameters are not copied to the new table and should be defined explicitly using the WITH clause.

To create a table that inherits all columns and constraints from an existing table, use the INHERITS clause. In the example below, a new business_customers table inherits from the customers table and adds an additional company column:

CREATE TABLE business_customers
(
    company TEXT
)
    INHERITS (customers)
    WITH (appendoptimized = true, orientation = row)
    DISTRIBUTED BY (customer_id);

Sample data

INSERT INTO business_customers (customer_id, name, email, customer_type, company)
VALUES (7, 'Robert King', 'robert@demo.org', 'Business', 'King Enterprises'),
       (8, 'Janet Leverling', 'janet@businessmail.com', 'Business', 'Leverling Ltd');

The command below creates a temporary table from the results of the SELECT query using CREATE TABLE AS:

CREATE TEMP TABLE temp_premium_customers AS
SELECT customer_id, name, email
FROM customers
WHERE customer_type = 'Premium'
    DISTRIBUTED BY (customer_id);

To check that the temp_premium_customers table is created, use the \dt meta-command:

                          List of relations
   Schema   |          Name          | Type  |  Owner  |   Storage
------------+------------------------+-------+---------+-------------
 pg_temp_17 | temp_premium_customers | table | gpadmin | heap
 public     | business_customers     | table | gpadmin | append only
 public     | customers              | table | gpadmin | append only
 public     | stg_customers          | table | gpadmin | append only

Reconnect to the database using the \c meta-command:

\c marketplace

The temporary table is automatically dropped when the session ends. To verify this, run \dt again to list the available tables. The output should confirm that the temp_premium_customers table no longer exists:

                      List of relations
 Schema |        Name        | Type  |  Owner  |   Storage
--------+--------------------+-------+---------+-------------
 public | business_customers | table | gpadmin | append only
 public | customers          | table | gpadmin | append only
 public | stg_customers      | table | gpadmin | append only

To learn more about viewing table information, see the View information about tables section below.

The SQL request below creates the orders table with a unique constraint on the order_id column using the UNIQUE clause:

CREATE TABLE orders
(
    order_id    INTEGER UNIQUE,
    customer_id INTEGER,
    product_id  INTEGER,
    price       NUMERIC(6, 2)
)
    DISTRIBUTED BY (order_id);

Every value in the order_id column must be distinct — no duplicates are allowed. For example, the INSERT request below violates the specified unique constraint:

INSERT INTO orders (order_id, customer_id, product_id, price)
VALUES (1, 1, 11, 99.9),
       (1, 2, 35, 25.5);

The following error is raised in this case:

duplicate key value violates unique constraint "orders_order_id_key"

IMPORTANT

Note the following for unique and primary key constraints:

Defining a unique constraint implicitly creates a unique index.
A unique constraint must include all columns of the distribution key and the columns of the partitioning key (if specified).
Unique constraints are not supported for append-optimized tables.

The command below creates the orders table with a not-null constraint on the order_id column using the NOT NULL clause:

CREATE TABLE orders
(
    order_id    INTEGER NOT NULL,
    customer_id INTEGER,
    product_id  INTEGER,
    price       NUMERIC(6, 2)
)
    DISTRIBUTED BY (order_id);

The order_id column should not contain null values. For example, the INSERT request below violates the specified not-null constraint:

INSERT INTO orders (order_id, customer_id, product_id, price)
VALUES (1, 1, 11, 99.9),
       (NULL, 2, 35, 25.5);

The following error is raised:

null value in column "order_id" violates not-null constraint

A primary key constraint is a combination of unique and not-null constraints. Defining a primary key imposes limitations on how a table’s distribution policy can be defined:

A table must be hash-distributed.
Primary key columns must be the same as a table’s distribution key columns.

If a table has a primary key, this column is chosen as the distribution key by default.

The command below creates the orders table with a primary key constraint on the order_id column:

CREATE TABLE orders
(
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER,
    product_id  INTEGER,
    price       NUMERIC(6, 2)
)
    DISTRIBUTED BY (order_id);

For example, the INSERT request below violates the specified constraint:

INSERT INTO orders (order_id, customer_id, product_id, price)
VALUES (1, 1, 11, 99.9),
       (1, 2, 35, 25.5);

The following error is thrown:

duplicate key value violates unique constraint "orders_pkey"

A check constraint allows you to specify that values in one or more columns must satisfy a boolean expression.

In the example below, a check constraint requires positive product prices:

CREATE TABLE orders
(
    order_id    INTEGER,
    customer_id INTEGER,
    product_id  INTEGER,
    price       NUMERIC(6, 2) CHECK (price >= 0)
)
    WITH (appendoptimized = true, orientation = row)
    DISTRIBUTED BY (order_id);

For example, the INSERT request below violates the specified constraint:

INSERT INTO orders (order_id, customer_id, product_id, price)
VALUES (1, 1, 11, -10);

The following error is raised:

new row for relation "orders" violates check constraint "orders_price_check"

In the example below, a check constraint is specified at the table level for the price and count columns:

CREATE TABLE orders
(
    order_id    INTEGER,
    customer_id INTEGER,
    product_id  INTEGER,
    price       NUMERIC(6, 2),
    count       INTEGER,
    CHECK (price > 0 AND count >= 1)
)
    WITH (appendoptimized = true, orientation = row)
    DISTRIBUTED BY (order_id);

For example, the INSERT request below violates the specified check constraint:

INSERT INTO orders (order_id, customer_id, product_id, price, count)
VALUES (1, 1, 11, -25.5, 0);

This request raises the following error:

new row for relation "orders" violates check constraint "orders_check"

IMPORTANT

Foreign key constraints are not supported. You can declare them, but referential integrity checks cannot be enforced between distributed table segments.

This example defines the products table and the orders table with a foreign key reference to products:

CREATE TABLE products
(
    product_id INTEGER PRIMARY KEY,
    name       TEXT
)
    DISTRIBUTED REPLICATED;

CREATE TABLE orders
(
    order_id   INTEGER,
    product_id INTEGER REFERENCES products (product_id),
    price      NUMERIC(6, 2)
)
    WITH (appendoptimized = true, orientation = row)
    DISTRIBUTED BY (order_id);

The following warning should be shown when creating the orders table:

referential integrity (FOREIGN KEY) constraints are not supported in Greengage Database, will not be enforced

These requests insert valid rows into products and an orders row with a non-existent product_id:

INSERT INTO products (product_id, name)
VALUES (1, 'Chai'),
       (2, 'Chang');

INSERT INTO orders (order_id, product_id, price)
VALUES (1, 3, 19.99);

No errors or warnings should be thrown.

To change the definition of a table, use the ALTER TABLE command. The SQL request below adds a new signup_date column to the customers table:

ALTER TABLE customers
    ADD COLUMN signup_date DATE;

You can use the \d customers meta-command to check that the column is added:

    Column     |  Type   | Modifiers
---------------+---------+-----------
 customer_id   | integer |
 name          | text    |
 email         | text    |
 customer_type | text    |
 signup_date   | date    |

Note that tables created using the INHERITS clause inherit all columns from the parent tables. For example, running \d business_customers shows that it includes the signup_date column, inherited from its parent customers table:

    Column     |  Type   | Modifiers
---------------+---------+-----------
 customer_id   | integer |
 name          | text    |
 email         | text    |
 customer_type | text    |
 company       | text    |
 signup_date   | date    |

IMPORTANT

Some ALTER TABLE commands, including ADD COLUMN, might require a full table rewrite and take a lock. This blocks access to the table, including reads, until the command finishes. Run these commands during a maintenance window for large tables to avoid service downtime.

For very large tables that require a rewrite, it is more efficient to create a copy of the table, insert the data into the copy, and then rename it. This avoids locking the original table and allows changes without blocking access.

To show the list of tables, use the \dt meta-command:

\dt

The output includes different information, including the schema each table belongs to, the table name, its type, owner, and storage model:

                      List of relations
 Schema |        Name        | Type  |  Owner  |   Storage
--------+--------------------+-------+---------+-------------
 public | business_customers | table | gpadmin | append only
 public | customers          | table | gpadmin | append only
 public | stg_customers      | table | gpadmin | append only

The \dt+ meta-command returns additional fields:

\dt+

The output includes the table’s size and description:

                                      List of relations
 Schema |        Name        | Type  |  Owner  |   Storage   |  Size  |     Description
--------+--------------------+-------+---------+-------------+--------+----------------------
 public | business_customers | table | gpadmin | append only | 448 kB |
 public | customers          | table | gpadmin | append only | 449 kB | Customer information
 public | stg_customers      | table | gpadmin | append only | 320 kB |

To include system tables in the listing, use the \dtS meta-command:

\dtS

The output should include tables from the pg_catalog system catalog schema:

                           List of relations
   Schema   |           Name           | Type  |  Owner  |   Storage
------------+--------------------------+-------+---------+-------------
 pg_catalog | gp_configuration_history | table | gpadmin | heap
 pg_catalog | gp_distribution_policy   | table | gpadmin | heap
 pg_catalog | gp_fastsequence          | table | gpadmin | heap
 pg_catalog | gp_id                    | table | gpadmin | heap
 pg_catalog | gp_segment_configuration | table | gpadmin | heap
 pg_catalog | gp_version_at_initdb     | table | gpadmin | heap
 pg_catalog | pg_aggregate             | table | gpadmin | heap
 pg_catalog | pg_am                    | table | gpadmin | heap
 pg_catalog | pg_amop                  | table | gpadmin | heap
 pg_catalog | pg_amproc                | table | gpadmin | heap
...

To list tables with the specified schema name, you can use regular expressions. For example, the following command returns all tables within the information_schema schema:

\dt information_schema.*

The result should look like this:

                            List of relations
       Schema       |          Name           | Type  |  Owner  | Storage
--------------------+-------------------------+-------+---------+---------
 information_schema | sql_features            | table | gpadmin | heap
 information_schema | sql_implementation_info | table | gpadmin | heap
 information_schema | sql_languages           | table | gpadmin | heap
 information_schema | sql_packages            | table | gpadmin | heap
 information_schema | sql_parts               | table | gpadmin | heap
 information_schema | sql_sizing              | table | gpadmin | heap
 information_schema | sql_sizing_profiles     | table | gpadmin | heap

To display access privileges for database objects, including tables, use the \dp meta-command:

\dp

The result should look like this:

                                 Access privileges
 Schema |        Name        | Type  | Access privileges | Column access privileges
--------+--------------------+-------+-------------------+--------------------------
 public | business_customers | table |                   |
 public | customers          | table |                   |
 public | stg_customers      | table |                   |

The Access privileges column is empty for new tables, meaning no specific privileges are set. See Roles and privileges to learn how to grant privileges to your users to allow access to these tables.

Run the SQL query against the pg_catalog.pg_tables catalog view:

SELECT *
FROM pg_catalog.pg_tables
WHERE schemaname = 'public'
ORDER BY schemaname, tablename;

The output shows information about tables, including their schema, name, owner, and associated attributes like whether they have indexes, rules, or triggers:

 schemaname |     tablename      | tableowner | tablespace | hasindexes | hasrules | hastriggers
------------+--------------------+------------+------------+------------+----------+-------------
 public     | business_customers | gpadmin    |            | f          | f        | f
 public     | customers          | gpadmin    |            | f          | f        | f
 public     | stg_customers      | gpadmin    |            | f          | f        | f

Run the SQL query against the information_schema.tables catalog table. Use table_type = 'BASE TABLE' in the WHERE clause to list tables only:

SELECT table_catalog, table_schema, table_name, table_type
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_schema = 'public'
ORDER BY table_schema, table_name;

The result looks as follows:

 table_catalog | table_schema |     table_name     | table_type
---------------+--------------+--------------------+------------
 marketplace   | public       | business_customers | BASE TABLE
 marketplace   | public       | customers          | BASE TABLE
 marketplace   | public       | stg_customers      | BASE TABLE

To view the structure of the specified table, use the \d meta-command:

\d customers

The output provides a summary of the table’s columns, data types, and other table settings:

Append-Only Table "public.customers"
    Column     |  Type   | Modifiers
---------------+---------+-----------
 customer_id   | integer |
 name          | text    |
 email         | text    |
 customer_type | text    |
 signup_date   | date    |
Compression Type: None
Compression Level: 0
Block Size: 32768
Checksum: t
Number of child tables: 1 (Use \d+ to list them.)
Distributed by: (customer_id)

The \d+ meta-command returns additional information:

\d+ customers

The output might include column storage options, such as compression type and level, child tables created using the INHERITS clause, and so on:

                    Append-Only Table "public.customers"
    Column     |  Type   | Modifiers | Storage  | Stats target | Description
---------------+---------+-----------+----------+--------------+-------------
 customer_id   | integer |           | plain    |              |
 name          | text    |           | extended |              |
 email         | text    |           | extended |              |
 customer_type | text    |           | extended |              |
 signup_date   | date    |           | plain    |              |
Compression Type: None
Compression Level: 0
Block Size: 32768
Checksum: t
Child tables: business_customers
Distributed by: (customer_id)
Options: appendonly=true, orientation=row

The SQL request below returns a number of database relations grouped by their type:

SELECT relkind, COUNT(*)
FROM pg_catalog.pg_class
GROUP BY(relkind)
ORDER BY COUNT(*);

The result looks like this:

 relkind | count
---------+-------
 o       |     3
 M       |     3
 c       |     4
 t       |    25
 r       |    88
 i       |   166
 v       |   166

For example, r is a heap or append-optimized table, i is an index, v is a view.

The command below allows you to get information about a relation by its name:

SELECT relation.oid,
       relation.relname,
       schema.nspname,
       tablespace.spcname,
       relation.relstorage,
       relation.relpersistence,
       relation.relkind,
       relation.relpages,
       relation.reltuples
FROM pg_catalog.pg_class relation
         LEFT JOIN pg_catalog.pg_namespace schema ON schema.oid = relation.relnamespace
         LEFT JOIN pg_catalog.pg_tablespace tablespace ON tablespace.oid = relation.reltablespace
WHERE relation.relname = 'customers';

The result looks as follows:

  oid   |  relname  | nspname | spcname | relstorage | relpersistence | relkind | relpages | reltuples
--------+-----------+---------+---------+------------+----------------+---------+----------+-----------
 147889 | customers | public  |         | a          | p              | r       |        1 |         6

The command below lists TOAST tables with the main tables they are associated with:

SELECT toast_table.oid     AS "TOAST OID",
       toast_table.relname AS "TOAST name",
       main_table.oid      AS "Main OID",
       main_table.relname  AS "Main name"
FROM pg_catalog.pg_class toast_table
         LEFT JOIN pg_catalog.pg_class main_table
                   ON toast_table.oid = main_table.reltoastrelid
WHERE main_table.relname LIKE '%customers%'
ORDER BY main_table.relname;

The output should look like this:

 TOAST OID |   TOAST name    | Main OID |     Main name
-----------+-----------------+----------+--------------------
    147988 | pg_toast_147909 |   147909 | business_customers
    147978 | pg_toast_147889 |   147889 | customers
    147901 | pg_toast_147899 |   147899 | stg_customers

To drop tables, use the DROP TABLE command:

DROP TABLE stg_customers;

To drop a table and its related objects, such as views or inherited tables, use the CASCADE option:

DROP TABLE customers CASCADE;

Greengage DB provides a set of predefined system catalog tables in the pg_catalog schema. Some of these tables can be useful for monitoring your cluster and other DBA activity. Examples of such tables are:

gp_segment_configuration — contains information about mirroring and segment instance configuration.
gp_distribution_policy — includes information on tables and their policy for distributing data across the segments.
gp_configuration_history — contains information about system changes related to fault detection and recovery operations.

Schemas

Table storage types

Tables overview

Overview of table creation

Table definition

CREATE TABLE syntax

Arguments

Clauses

Column data types

Create tables

Create a new table

Create a table using the LIKE clause

Create a table using the INHERITS clause

Create a temporary table

Set constraints

Unique constraints

Not-null constraints

Primary keys

Check constraints

Column constraints

Table constraints

Foreign keys

Alter tables

View information about tables

View the list of tables

psql meta-commands

SQL commands

View the specified table

Useful SQL queries

Drop tables

System catalog tables