Tables overview
A table is a primary object used to store data in a database. In Greengage DB, tables function like those in any relational database, but their rows are distributed across the segments of the cluster for parallel processing and scalability. Refer to the PostgreSQL documentation for a detailed explanation of basic table concepts: Table Basics.
Overview of table creation
Table definition
The CREATE TABLE
command creates a table and defines its structure.
When you create a table, you may need to define:
-
Column data types
Data types define the kind of data that can be stored in columns.
-
Table and column constraints
Constraints allow you to apply additional limitations to data stored in a table or column.
-
Table storage model
Greengage DB supports several table storage models optimized for specific workload types, such as OLTP or OLAP. Learn more in the Table storage types topic.
-
Table distribution policy
Data distribution in Greengage DB enables parallel processing of data by spreading it across multiple segments. Learn more in the Distribution topic.
-
Data compression
Data compression allows you to decrease data size and improve I/O operations. Learn more in the Data compression topic.
-
Table partitioning strategy
Partitioning splits a large table into smaller pieces, improving query performance and allowing infrequently accessed data to be stored more cost-effectively. Learn more in the Partitioning topic.
The CREATE TABLE AS
command allows you to define a new table from the results of a query.
CREATE TABLE syntax
To create an empty table in the current database and define the table structure, use the CREATE TABLE
command.
The user who executes the command becomes the owner of the table.
The simplified syntax of the CREATE TABLE
command is described below.
CREATE [ TEMP ] TABLE <table_name>
(
[ { <column_name> <data_type> [<column_constraint> [ ... ] ]
| <table_constraint>
| LIKE <source_table> [ <like_option> ... ] }
[, ... ]
]
)
[ INHERITS ( <parent_table> [, ... ] ) ]
[ WITH ( <storage_table_directive> [, ... ] ) ]
[ DISTRIBUTED BY (<column_name> [<opclass>], [ ... ] )
| DISTRIBUTED RANDOMLY | DISTRIBUTED REPLICATED ]
[ PARTITION BY <partition_type> (<column_name>) <partition_specification>]
Arguments
The main arguments for defining a table, its columns, and constraints are the following:
-
table_name
A table name. By default, the
CREATE TABLE
command creates a table in the predefinedpublic
schema. To use another schema, specify its name in thetable_name
argument or change the default schema search path as described in the Schemas topic. -
column_name
anddata_type
A column name and its data type, respectively. See the recommendations for choosing data types in the Column data types section below.
-
column_constraint
A constraint defined for the specified column. See the Set constraints section below.
-
table_constraint
A constraint defined at the table level. See the Set constraints section below.
Clauses
Below are the descriptions of clauses mentioned in the CREATE TABLE
syntax snippet above:
-
TEMP
Create a temporary table that exists only for the duration of the session or until the end of the current transaction. Then the table is dropped automatically.
-
LIKE
Create a new table by duplicating the column definitions from an existing table, including column names, data types, and not-null constraints.
-
INHERITS
Create a new table that inherits columns and constraints from parent tables. All modifications applied to the parent table structure also propagate to the child one.
-
WITH
Define table storage parameters, such as
appendoptimized
andorientation
.NOTEThe
gp_default_storage_options
server configuration parameter allows you to set the default values for table storage options, including asappendoptimized
andorientation
. -
DISTRIBUTED
Define the distribution policy, which determines how table rows are distributed across segments in a Greengage DB cluster.
-
PARTITION BY
Declare the table as partitioned and specify the partitioning strategy.
Column data types
Here are some recommendations for choosing column data types:
-
Choose data types that match the kind of data you store.
For example, use numeric types for numbers, character types for text, and
date
ortimestamp
types for dates and times. -
Use the smallest data type that still fits your data.
For example, use
INT
instead ofBIGINT
, orTEXT
instead ofCHAR(<n>)
. -
Consider future growth when choosing small numeric types.
For example,
SMALLINT
might be enough for now, but changing toINT
later can be expensive if the table already has a lot of data. If the values might increase, it’s better to use a larger type from the start. -
Use built-in types designed for specific data.
Use data types such as
INET
,CIDR
,JSON
,JSONB
, orMACADDR
for better performance, validation, and clarity when working with IP addresses, JSON data, and so on. -
Use the same data types for columns that are joined.
If data types differ, they must be converted before comparison, which can slow down queries.
-
Define custom data types when needed.
In particular, use an
ENUM
type when a column should only allow a fixed set of values (for example,low
,medium
, andhigh
).ENUM
values are stored more efficiently and processed faster than equivalentTEXT
values.
Create tables
Create a new table
The SQL request below creates a new customers
table with the following parameters:
-
The
WITH
clause is used to create a row-oriented append-optimized table. -
The
DISTRIBUTED BY
clause specifies that thecustomer_id
column is used as the distribution key to spread table rows across segments.
CREATE TABLE customers
(
customer_id INTEGER,
name TEXT,
email TEXT,
customer_type TEXT
)
WITH (appendoptimized = true, orientation = row)
DISTRIBUTED BY (customer_id);
INSERT INTO customers (customer_id, name, email, customer_type)
VALUES (1, 'Andrew Fuller', 'andrew@example.com', 'Regular'),
(2, 'Michael Suyama', 'michael@testmail.com', 'Premium'),
(3, 'Laura Callahan', 'laura@example.io', 'Regular'),
(4, 'Nancy Davolio', 'nancy@samplemail.com', 'Premium'),
(5, 'Steven Buchanan', 'steven@fastmail.net', 'Regular'),
(6, 'Margaret Peacock', 'margaret@example.com', 'Regular');
You can optionally add a comment for the created table using the COMMENT
command:
COMMENT ON TABLE customers IS 'Customer information';
Create a table using the LIKE clause
The command below uses the LIKE
clause to create a new stg_customers
table with the same column names and data types as the customers
table:
CREATE TABLE stg_customers
(
LIKE customers
)
WITH (appendoptimized = true, orientation = row)
DISTRIBUTED BY (customer_id);
Note that storage parameters are not copied to the new table and should be defined explicitly using the WITH
clause.
Create a table using the INHERITS clause
To create a table that inherits all columns and constraints from an existing table, use the INHERITS
clause.
In the example below, a new business_customers
table inherits from the customers
table and adds an additional company
column:
CREATE TABLE business_customers
(
company TEXT
)
INHERITS (customers)
WITH (appendoptimized = true, orientation = row)
DISTRIBUTED BY (customer_id);
INSERT INTO business_customers (customer_id, name, email, customer_type, company)
VALUES (7, 'Robert King', 'robert@demo.org', 'Business', 'King Enterprises'),
(8, 'Janet Leverling', 'janet@businessmail.com', 'Business', 'Leverling Ltd');
Create a temporary table
The command below creates a temporary table from the results of the SELECT
query using CREATE TABLE AS
:
CREATE TEMP TABLE temp_premium_customers AS
SELECT customer_id, name, email
FROM customers
WHERE customer_type = 'Premium'
DISTRIBUTED BY (customer_id);
To check that the temp_premium_customers
table is created, use the \dt
meta-command:
List of relations Schema | Name | Type | Owner | Storage ------------+------------------------+-------+---------+------------- pg_temp_17 | temp_premium_customers | table | gpadmin | heap public | business_customers | table | gpadmin | append only public | customers | table | gpadmin | append only public | stg_customers | table | gpadmin | append only
Reconnect to the database using the \c
meta-command:
\c marketplace
The temporary table is automatically dropped when the session ends.
To verify this, run \dt
again to list the available tables.
The output should confirm that the temp_premium_customers
table no longer exists:
List of relations Schema | Name | Type | Owner | Storage --------+--------------------+-------+---------+------------- public | business_customers | table | gpadmin | append only public | customers | table | gpadmin | append only public | stg_customers | table | gpadmin | append only
To learn more about viewing table information, see the View information about tables section below.
Set constraints
Unique constraints
The SQL request below creates the orders
table with a unique constraint on the order_id
column using the UNIQUE
clause:
CREATE TABLE orders
(
order_id INTEGER UNIQUE,
customer_id INTEGER,
product_id INTEGER,
price NUMERIC(6, 2)
)
DISTRIBUTED BY (order_id);
Every value in the order_id
column must be distinct — no duplicates are allowed.
For example, the INSERT
request below violates the specified unique constraint:
INSERT INTO orders (order_id, customer_id, product_id, price)
VALUES (1, 1, 11, 99.9),
(1, 2, 35, 25.5);
The following error is raised in this case:
duplicate key value violates unique constraint "orders_order_id_key"
Note the following for unique and primary key constraints:
-
Defining a unique constraint implicitly creates a unique index.
-
A unique constraint must include all columns of the distribution key and the columns of the partitioning key (if specified).
-
Unique constraints are not supported for append-optimized tables.
Not-null constraints
The command below creates the orders
table with a not-null constraint on the order_id
column using the NOT NULL
clause:
CREATE TABLE orders
(
order_id INTEGER NOT NULL,
customer_id INTEGER,
product_id INTEGER,
price NUMERIC(6, 2)
)
DISTRIBUTED BY (order_id);
The order_id
column should not contain null values.
For example, the INSERT
request below violates the specified not-null constraint:
INSERT INTO orders (order_id, customer_id, product_id, price)
VALUES (1, 1, 11, 99.9),
(NULL, 2, 35, 25.5);
The following error is raised:
null value in column "order_id" violates not-null constraint
Primary keys
A primary key constraint is a combination of unique and not-null constraints. Defining a primary key imposes limitations on how a table’s distribution policy can be defined:
-
A table must be hash-distributed.
-
Primary key columns must be the same as a table’s distribution key columns.
If a table has a primary key, this column is chosen as the distribution key by default.
The command below creates the orders
table with a primary key constraint on the order_id
column:
CREATE TABLE orders
(
order_id INTEGER PRIMARY KEY,
customer_id INTEGER,
product_id INTEGER,
price NUMERIC(6, 2)
)
DISTRIBUTED BY (order_id);
For example, the INSERT
request below violates the specified constraint:
INSERT INTO orders (order_id, customer_id, product_id, price)
VALUES (1, 1, 11, 99.9),
(1, 2, 35, 25.5);
The following error is thrown:
duplicate key value violates unique constraint "orders_pkey"
Check constraints
A check constraint allows you to specify that values in one or more columns must satisfy a boolean expression.
Column constraints
In the example below, a check constraint requires positive product prices:
CREATE TABLE orders
(
order_id INTEGER,
customer_id INTEGER,
product_id INTEGER,
price NUMERIC(6, 2) CHECK (price >= 0)
)
WITH (appendoptimized = true, orientation = row)
DISTRIBUTED BY (order_id);
For example, the INSERT
request below violates the specified constraint:
INSERT INTO orders (order_id, customer_id, product_id, price)
VALUES (1, 1, 11, -10);
The following error is raised:
new row for relation "orders" violates check constraint "orders_price_check"
Table constraints
In the example below, a check constraint is specified at the table level for the price
and count
columns:
CREATE TABLE orders
(
order_id INTEGER,
customer_id INTEGER,
product_id INTEGER,
price NUMERIC(6, 2),
count INTEGER,
CHECK (price > 0 AND count >= 1)
)
WITH (appendoptimized = true, orientation = row)
DISTRIBUTED BY (order_id);
For example, the INSERT
request below violates the specified check constraint:
INSERT INTO orders (order_id, customer_id, product_id, price, count)
VALUES (1, 1, 11, -25.5, 0);
This request raises the following error:
new row for relation "orders" violates check constraint "orders_check"
Foreign keys
Foreign key constraints are not supported. You can declare them, but referential integrity checks cannot be enforced between distributed table segments.
This example defines the products
table and the orders
table with a foreign key reference to products
:
CREATE TABLE products
(
product_id INTEGER PRIMARY KEY,
name TEXT
)
DISTRIBUTED REPLICATED;
CREATE TABLE orders
(
order_id INTEGER,
product_id INTEGER REFERENCES products (product_id),
price NUMERIC(6, 2)
)
WITH (appendoptimized = true, orientation = row)
DISTRIBUTED BY (order_id);
The following warning should be shown when creating the orders
table:
referential integrity (FOREIGN KEY) constraints are not supported in Greengage Database, will not be enforced
These requests insert valid rows into products
and an orders
row with a non-existent product_id
:
INSERT INTO products (product_id, name)
VALUES (1, 'Chai'),
(2, 'Chang');
INSERT INTO orders (order_id, product_id, price)
VALUES (1, 3, 19.99);
No errors or warnings should be thrown.
Alter tables
To change the definition of a table, use the ALTER TABLE
command.
The SQL request below adds a new signup_date
column to the customers
table:
ALTER TABLE customers
ADD COLUMN signup_date DATE;
You can use the \d customers
meta-command to check that the column is added:
Column | Type | Modifiers ---------------+---------+----------- customer_id | integer | name | text | email | text | customer_type | text | signup_date | date |
Note that tables created using the INHERITS
clause inherit all columns from the parent tables.
For example, running \d business_customers
shows that it includes the signup_date
column, inherited from its parent customers
table:
Column | Type | Modifiers ---------------+---------+----------- customer_id | integer | name | text | email | text | customer_type | text | company | text | signup_date | date |
Some ALTER TABLE
commands, including ADD COLUMN
, might require a full table rewrite and take a lock.
This blocks access to the table, including reads, until the command finishes.
Run these commands during a maintenance window for large tables to avoid service downtime.
For very large tables that require a rewrite, it is more efficient to create a copy of the table, insert the data into the copy, and then rename it. This avoids locking the original table and allows changes without blocking access.
View information about tables
View the list of tables
psql meta-commands
-
To show the list of tables, use the
\dt
meta-command:\dt
The output includes different information, including the schema each table belongs to, the table name, its type, owner, and storage model:
List of relations Schema | Name | Type | Owner | Storage --------+--------------------+-------+---------+------------- public | business_customers | table | gpadmin | append only public | customers | table | gpadmin | append only public | stg_customers | table | gpadmin | append only
-
The
\dt+
meta-command returns additional fields:\dt+
The output includes the table’s size and description:
List of relations Schema | Name | Type | Owner | Storage | Size | Description --------+--------------------+-------+---------+-------------+--------+---------------------- public | business_customers | table | gpadmin | append only | 448 kB | public | customers | table | gpadmin | append only | 449 kB | Customer information public | stg_customers | table | gpadmin | append only | 320 kB |
-
To include system tables in the listing, use the
\dtS
meta-command:\dtS
The output should include tables from the
pg_catalog
system catalog schema:List of relations Schema | Name | Type | Owner | Storage ------------+--------------------------+-------+---------+------------- pg_catalog | gp_configuration_history | table | gpadmin | heap pg_catalog | gp_distribution_policy | table | gpadmin | heap pg_catalog | gp_fastsequence | table | gpadmin | heap pg_catalog | gp_id | table | gpadmin | heap pg_catalog | gp_segment_configuration | table | gpadmin | heap pg_catalog | gp_version_at_initdb | table | gpadmin | heap pg_catalog | pg_aggregate | table | gpadmin | heap pg_catalog | pg_am | table | gpadmin | heap pg_catalog | pg_amop | table | gpadmin | heap pg_catalog | pg_amproc | table | gpadmin | heap ...
-
To list tables with the specified schema name, you can use regular expressions. For example, the following command returns all tables within the
information_schema
schema:\dt information_schema.*
The result should look like this:
List of relations Schema | Name | Type | Owner | Storage --------------------+-------------------------+-------+---------+--------- information_schema | sql_features | table | gpadmin | heap information_schema | sql_implementation_info | table | gpadmin | heap information_schema | sql_languages | table | gpadmin | heap information_schema | sql_packages | table | gpadmin | heap information_schema | sql_parts | table | gpadmin | heap information_schema | sql_sizing | table | gpadmin | heap information_schema | sql_sizing_profiles | table | gpadmin | heap
-
To display access privileges for database objects, including tables, use the
\dp
meta-command:\dp
The result should look like this:
Access privileges Schema | Name | Type | Access privileges | Column access privileges --------+--------------------+-------+-------------------+-------------------------- public | business_customers | table | | public | customers | table | | public | stg_customers | table | |
The
Access privileges
column is empty for new tables, meaning no specific privileges are set. See Roles and privileges to learn how to grant privileges to your users to allow access to these tables.
SQL commands
-
Run the SQL query against the
pg_catalog.pg_tables
catalog view:SELECT * FROM pg_catalog.pg_tables WHERE schemaname = 'public' ORDER BY schemaname, tablename;
The output shows information about tables, including their schema, name, owner, and associated attributes like whether they have indexes, rules, or triggers:
schemaname | tablename | tableowner | tablespace | hasindexes | hasrules | hastriggers ------------+--------------------+------------+------------+------------+----------+------------- public | business_customers | gpadmin | | f | f | f public | customers | gpadmin | | f | f | f public | stg_customers | gpadmin | | f | f | f
-
Run the SQL query against the
information_schema.tables
catalog table. Usetable_type = 'BASE TABLE'
in theWHERE
clause to list tables only:SELECT table_catalog, table_schema, table_name, table_type FROM information_schema.tables WHERE table_type = 'BASE TABLE' AND table_schema = 'public' ORDER BY table_schema, table_name;
The result looks as follows:
table_catalog | table_schema | table_name | table_type ---------------+--------------+--------------------+------------ marketplace | public | business_customers | BASE TABLE marketplace | public | customers | BASE TABLE marketplace | public | stg_customers | BASE TABLE
View the specified table
-
To view the structure of the specified table, use the
\d
meta-command:\d customers
The output provides a summary of the table’s columns, data types, and other table settings:
Append-Only Table "public.customers" Column | Type | Modifiers ---------------+---------+----------- customer_id | integer | name | text | email | text | customer_type | text | signup_date | date | Compression Type: None Compression Level: 0 Block Size: 32768 Checksum: t Number of child tables: 1 (Use \d+ to list them.) Distributed by: (customer_id)
-
The
\d+
meta-command returns additional information:\d+ customers
The output might include column storage options, such as compression type and level, child tables created using the
INHERITS
clause, and so on:Append-Only Table "public.customers" Column | Type | Modifiers | Storage | Stats target | Description ---------------+---------+-----------+----------+--------------+------------- customer_id | integer | | plain | | name | text | | extended | | email | text | | extended | | customer_type | text | | extended | | signup_date | date | | plain | | Compression Type: None Compression Level: 0 Block Size: 32768 Checksum: t Child tables: business_customers Distributed by: (customer_id) Options: appendonly=true, orientation=row
Useful SQL queries
-
The SQL request below returns a number of database relations grouped by their type:
SELECT relkind, COUNT(*) FROM pg_catalog.pg_class GROUP BY(relkind) ORDER BY COUNT(*);
The result looks like this:
relkind | count ---------+------- o | 3 M | 3 c | 4 t | 25 r | 88 i | 166 v | 166
For example,
r
is a heap or append-optimized table,i
is an index,v
is a view. -
The command below allows you to get information about a relation by its name:
SELECT relation.oid, relation.relname, schema.nspname, tablespace.spcname, relation.relstorage, relation.relpersistence, relation.relkind, relation.relpages, relation.reltuples FROM pg_catalog.pg_class relation LEFT JOIN pg_catalog.pg_namespace schema ON schema.oid = relation.relnamespace LEFT JOIN pg_catalog.pg_tablespace tablespace ON tablespace.oid = relation.reltablespace WHERE relation.relname = 'customers';
The result looks as follows:
oid | relname | nspname | spcname | relstorage | relpersistence | relkind | relpages | reltuples --------+-----------+---------+---------+------------+----------------+---------+----------+----------- 147889 | customers | public | | a | p | r | 1 | 6
-
The command below lists TOAST tables with the main tables they are associated with:
SELECT toast_table.oid AS "TOAST OID", toast_table.relname AS "TOAST name", main_table.oid AS "Main OID", main_table.relname AS "Main name" FROM pg_catalog.pg_class toast_table LEFT JOIN pg_catalog.pg_class main_table ON toast_table.oid = main_table.reltoastrelid WHERE main_table.relname LIKE '%customers%' ORDER BY main_table.relname;
The output should look like this:
TOAST OID | TOAST name | Main OID | Main name -----------+-----------------+----------+-------------------- 147988 | pg_toast_147909 | 147909 | business_customers 147978 | pg_toast_147889 | 147889 | customers 147901 | pg_toast_147899 | 147899 | stg_customers
Drop tables
To drop tables, use the DROP TABLE
command:
DROP TABLE stg_customers;
To drop a table and its related objects, such as views or inherited tables, use the CASCADE
option:
DROP TABLE customers CASCADE;
System catalog tables
Greengage DB provides a set of predefined system catalog tables in the pg_catalog
schema.
Some of these tables can be useful for monitoring your cluster and other DBA activity.
Examples of such tables are:
-
gp_segment_configuration
— contains information about mirroring and segment instance configuration. -
gp_distribution_policy
— includes information on tables and their policy for distributing data across the segments. -
gp_configuration_history
— contains information about system changes related to fault detection and recovery operations.