Distribution
In Greengage DB, tables are distributed across cluster segments to enable parallel processing and improve query performance. When creating or altering a table, you can specify one of the available distribution policies to define how table rows are distributed across segments.
For optimal performance, both data and workloads should be distributed evenly across segments. This ensures that all segments start and finish their tasks simultaneously. The more evenly the data is distributed, the better the overall cluster performance.
Overview
The diagram below illustrates how rows from the customers and orders tables might be distributed across four segments. Both tables are distributed using the customer ID as the distribution key. As a result, operations such as joins, sorts, and aggregations can be performed locally on each segment, reducing data movement and improving query performance.
┌──────────────── Segment 0 ─────────────┐  ┌──────────────── Segment 1 ─────────────┐
│ customers (id, customer_name)          │  │ customers (id, customer_name)          │
│ 1   │ 'Andrew Fuller'                  │  │ 11  │ 'Robert King'                    │
│ 2   │ 'Michael Suyama'                 │  │ 12  │ 'Laura Callahan'                 │
│ ...                                    │  │ ...                                    │
│ orders (id, customer_id, product_name) │  │ orders (id, customer_id, product_name) │
│ 101 │ 1  │ 'Laptop'                    │  │ 303 │ 11 │ 'Keyboard'                  │
│ 106 │ 1  │ 'Smartphone'                │  │ 305 │ 11 │ 'Headphones'                │
│ 110 │ 2  │ 'Tablet'                    │  │ 311 │ 12 │ 'Smartwatch'                │
│ 117 │ 2  │ 'Monitor'                   │  │ 315 │ 12 │ 'Laptop'                    │
│ ...                                    │  │ ...                                    │
└────────────────────────────────────────┘  └────────────────────────────────────────┘
┌──────────────── Segment 2 ─────────────┐  ┌──────────────── Segment 3 ─────────────┐
│ customers (id, customer_name)          │  │ customers (id, customer_name)          │
│ 21  │ 'Nancy Davolio'                  │  │ 31  │ 'Steven Buchanan'                │
│ 22  │ 'Janet Leverling'                │  │ 32  │ 'Margaret Peacock'               │
│ ...                                    │  │ ...                                    │
│ orders (id, customer_id, product_name) │  │ orders (id, customer_id, product_name) │
│ 502 │ 21 │ 'Smartphone'                │  │ 701 │ 31 │ 'Smartphone'                │
│ 504 │ 21 │ 'Speakers'                  │  │ 705 │ 31 │ 'Laptop'                    │
│ 508 │ 22 │ 'Printer'                   │  │ 712 │ 32 │ 'Keyboard'                  │
│ 511 │ 22 │ 'Tablet'                    │  │ 717 │ 32 │ 'Smartwatch'                │
│ ...                                    │  │ ...                                    │
└────────────────────────────────────────┘  └────────────────────────────────────────┘
Distribution policies
To specify the distribution policy, use the DISTRIBUTED clause in the CREATE TABLE command:
CREATE TABLE <table_name> (
<column_definitions>
)
[
-- Hash distribution
DISTRIBUTED BY (<column_name> [<opclass>] [, ... ] )
-- Random distribution
| DISTRIBUTED RANDOMLY
-- Replicated distribution
| DISTRIBUTED REPLICATED
];
- Hash distribution
  Distributes data by computing a hash of one or more columns defined as the distribution key. For optimal data balance, use the table’s primary key or a unique column (or set of columns) as the distribution key. If there is no suitable column, consider using random distribution, which distributes rows across segments randomly.
  NOTE: You can specify a custom hash function for hash distribution using the opclass argument, which defines a hash operator class. A custom hash operator class lets you use a different comparison method instead of the default equality operator (=) in hash-based operations. This might be useful for supporting co-located joins on custom operators, such as case-insensitive comparisons or domain-specific equality definitions. See the CREATE OPERATOR CLASS page in the PostgreSQL documentation to learn how to define a new operator class.
- Random distribution
  Distributes data across segments randomly. Because segment selection is random, data distribution may become uneven over time. Recommended for tables without unique columns, especially when the table is expected to grow significantly.
- Replicated distribution
  Stores a complete copy of the table on every cluster segment. Suitable for small tables, such as reference tables. Using replicated distribution can help eliminate data motion during joins.
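The opclass argument mentioned in the note above might be used as sketched below. This is a hypothetical example: it assumes a case-insensitive hash operator class named text_ci_hash_ops has already been defined with CREATE OPERATOR CLASS ... USING hash; the class name and table are illustrative, not part of the product.

```sql
-- Hypothetical: text_ci_hash_ops is a user-defined hash operator class
-- that treats values differing only in case as equal, so rows that match
-- under case-insensitive comparison land on the same segment.
CREATE TABLE users
(
    user_id INTEGER,
    email   TEXT
)
DISTRIBUTED BY (email text_ci_hash_ops);
```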
Always explicitly define the distribution policy when creating a table. If the DISTRIBUTED clause is omitted, Greengage DB assigns a hash distribution policy to the table using either the PRIMARY KEY (if defined) or the first column of the table as the distribution key. This behavior is controlled by the gp_create_table_random_default_distribution configuration parameter, which is set to OFF by default.
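As an illustration of this default behavior, the hypothetical products table below is created without a DISTRIBUTED clause, so its primary key is expected to become the distribution key (assuming gp_create_table_random_default_distribution is OFF):

```sql
-- No DISTRIBUTED clause: with the default configuration,
-- the PRIMARY KEY (product_id) is used as the distribution key.
CREATE TABLE products
(
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT
);
-- \d products should then report: Distributed by: (product_id)
```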
Recommendations on choosing a distribution policy
Consider the points below when deciding on a table distribution policy.
Even data distribution
To ensure optimal query performance, data should be evenly distributed across all segments in the cluster. Uneven distribution or data skew can lead to performance degradation as segments with more data perform more work during query execution. Choose a distribution key that is unique for each record, such as the primary key.
Avoid using the following types of non-uniformly distributed columns as distribution keys:
- Dates and timestamps, which often concentrate around specific time periods.
- Columns that contain a lot of NULL values, which can concentrate data on fewer segments.
- Columns with custom data types, which may have unpredictable distribution characteristics.
Avoid using columns with potentially large data sizes (such as TEXT) as distribution keys. Hashing these columns can be computationally expensive, which may degrade performance during data insertion and join operations.
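As a sketch of this recommendation, a hypothetical articles table can carry a compact surrogate key for distribution instead of hashing its large text column:

```sql
CREATE TABLE articles
(
    article_id BIGINT,  -- compact surrogate key: cheap to hash
    body       TEXT     -- large column: avoid using it as the distribution key
)
DISTRIBUTED BY (article_id);
```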
Local and distributed operations
Local operations are more efficient than distributed ones. Query performance improves when joins, sorts, and aggregations are executed locally on each segment. Cluster-wide operations require redistributing tuples across segments, which introduces overhead. When tables use the same distribution key, joins and sorts on that column can be performed locally. In contrast, tables with a random distribution policy cannot benefit from local join operations.
Even query processing
For the best performance, all segments should handle an equal share of the query workload.
Query workload can be skewed if a table’s data distribution policy and the query predicates are not well matched. For example, suppose the sales table is distributed by product_id. If a query filters on a specific product_id in the WHERE clause to analyze sales performance for that product, the query workload is concentrated on a single segment. Columns that are frequently used in WHERE clauses are often good candidates for partition key columns.
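The sales example above might be sketched as follows (the table definition and the product ID value are hypothetical):

```sql
CREATE TABLE sales
(
    sale_id    BIGINT,
    product_id INTEGER,
    amount     DECIMAL(8, 2)
)
DISTRIBUTED BY (product_id);

-- All rows for one product reside on a single segment,
-- so the work for this query is concentrated there:
SELECT SUM(amount)
FROM sales
WHERE product_id = 42;
```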
Specify a distribution policy
Hash distribution
The query below shows how to create a new append-optimized customers table whose rows are hash-distributed by the customer_id column:
CREATE TABLE customers
(
customer_id INTEGER,
customer_name VARCHAR(25)
)
WITH (appendoptimized = true)
DISTRIBUTED BY (customer_id);
INSERT INTO customers (customer_id, customer_name)
SELECT customer_number AS customer_id,
'customer' || customer_number AS name
FROM generate_series(1, 1000) AS customer_number;
The next query creates the orders table with a customer_id column, which is also used as the distribution key. In this case, joining the customers and orders tables on this column can be performed locally:
CREATE TABLE orders
(
order_id INTEGER,
customer_id INTEGER,
amount DECIMAL(6, 2)
)
WITH (appendoptimized = true)
DISTRIBUTED BY (customer_id);
INSERT INTO orders (order_id, customer_id, amount)
SELECT order_number AS order_id,
FLOOR(RANDOM() * 1000 + 1)::INTEGER AS customer_id,
ROUND((100 + RANDOM() * 1000)::NUMERIC, 2) AS amount
FROM generate_series(1, 100000) AS order_number;
Random distribution
The query below shows how to create the events table whose rows are distributed randomly:
CREATE TABLE events
(
event_id SERIAL,
timestamp TIMESTAMP NOT NULL,
user_id INTEGER NOT NULL,
event_type VARCHAR(255) NOT NULL
)
WITH (appendoptimized = true)
DISTRIBUTED RANDOMLY;
INSERT INTO events (timestamp, user_id, event_type)
SELECT NOW() - (INTERVAL '7 days' * random()) AS timestamp,
(random() * 10)::INTEGER + 1 AS user_id,
(ARRAY ['click', 'launch', 'purchase', 'logout'])[floor(random() * 4 + 1)] AS event_type
FROM generate_series(1, 4000);
Replicated distribution
The following query demonstrates how to create the pickup_points table replicated across all segments of the cluster:
CREATE TABLE pickup_points
(
point_id INTEGER,
address TEXT,
phone_number TEXT
)
WITH (appendoptimized = true)
DISTRIBUTED REPLICATED;
INSERT INTO pickup_points (point_id, address, phone_number)
SELECT point_id_seq,
'Address ' || point_id_seq,
'123-456-' || LPAD(point_id_seq::TEXT, 4, '0')
FROM generate_series(1, 100) AS point_id_seq;
View distribution information
View table distribution settings
To view distribution options of the specified table, use the \d meta-command:
\d orders
In the output below, the Distributed by section shows that the orders table is distributed by the customer_id column:
Append-Only Table "public.orders"
   Column    |     Type     | Modifiers
-------------+--------------+-----------
 order_id    | integer      |
 customer_id | integer      |
 amount      | numeric(6,2) |
Compression Type: None
Compression Level: 0
Block Size: 32768
Checksum: t
Distributed by: (customer_id)
Check data distribution
To verify that data is distributed uniformly after both initial and incremental loads, you can use one of several methods, for example:
- The get_ao_distribution() function shows the distribution of rows in an append-optimized table:

  SELECT get_ao_distribution('orders');
The result shows how many rows are on each segment:
 get_ao_distribution
---------------------
 (1,24828)
 (0,25823)
 (3,24064)
 (2,25285)
- The gp_toolkit.gp_skew_coefficients view shows data distribution skew by calculating the coefficient of variation (CV) for the data stored on each segment:

  SELECT skcrelname, skccoeff
  FROM gp_toolkit.gp_skew_coefficients;
The result might look like this:
  skcrelname   |        skccoeff
---------------+-------------------------
 customers     | 2.865891368027290280000
 orders        | 2.979247332241260000000
 events        | 2.222611077089286890000
 pickup_points | 0.000000000000000000000
The lower the value, the better. Higher values indicate greater data skew.
- The gp_toolkit.gp_skew_idle_fractions view shows data distribution skew by calculating the percentage of the system that is idle during a table scan, which is an indicator of computational skew:

  SELECT sifrelname, siffraction
  FROM gp_toolkit.gp_skew_idle_fractions;
The result might look like this:
  sifrelname   |      siffraction
---------------+------------------------
 customers     | 0.02723735408560311284
 orders        | 0.03187081284126553847
 events        | 0.02439024390243902439
 pickup_points | 0.00000000000000000000
For example, a value of 0.02 indicates 2% skew. Tables that have more than 10% skew should have their distribution policies evaluated.
- The pg_catalog.gp_segment_configuration table contains information about the cluster segments. The gp_segment_id system column of a user table corresponds to the content column of the gp_segment_configuration table and can be used to get a row count for each segment. The following query shows the number of rows per segment as well as the percentage difference between the maximum and minimum row counts:

  SELECT 'orders' AS "Table name",
         MAX(row_count) AS "Max seg rows",
         MIN(row_count) AS "Min seg rows",
         (MAX(row_count) - MIN(row_count)) * 100.0 / MAX(row_count) AS "Percentage difference between Max & Min"
  FROM (SELECT COUNT(*) AS row_count, gp_segment_id
        FROM orders
        GROUP BY gp_segment_id) AS seg_row_counts;
The result might look like this:
 Table name | Max seg rows | Min seg rows | Percentage difference between Max & Min
------------+--------------+--------------+-----------------------------------------
 orders     |        25781 |        24056 |                      6.6909739730809511
NOTE: A query that references the gp_segment_id system column on a replicated table fails because Greengage DB does not allow queries to reference system columns of replicated tables.
Change a distribution policy
You can change the distribution policy using the ALTER TABLE command. To specify a new distribution policy, use the SET DISTRIBUTED clause:
ALTER TABLE pickup_points
SET DISTRIBUTED BY (point_id);
You can also redistribute table data using the current distribution policy. To do this, set the reorganize parameter to true in the SET WITH clause:
ALTER TABLE orders
SET WITH (reorganize = true);
This might be useful to correct data skew or when new segments have been added to the cluster.
Changing a table’s distribution redistributes its data across the segments. This operation can be resource-intensive. Run these commands during a maintenance window for large tables to avoid service downtime.
Example: Incorrect distribution choice
This section illustrates how an inappropriate distribution choice can negatively impact the performance of an analytical query that involves a join.
Recommended distribution
This query joins the customers and orders tables on the common distribution key customer_id, aggregates the total purchase amount per customer, and returns the top 5 customers by total purchases:
SELECT c.customer_id,
c.customer_name AS customer_name,
SUM(o.amount) AS total_purchases
FROM customers c
JOIN orders o USING (customer_id)
GROUP BY c.customer_id, c.customer_name
ORDER BY total_purchases DESC
LIMIT 5;
EXPLAIN provides a detailed breakdown of the query execution plan:
EXPLAIN (COSTS OFF)
SELECT c.customer_id,
c.customer_name AS customer_name,
SUM(o.amount) AS total_purchases
FROM customers c
JOIN orders o USING (customer_id)
GROUP BY c.customer_id, c.customer_name
ORDER BY total_purchases DESC
LIMIT 5;
The command output indicates that no data motion is required for this query, except for Gather Motion. This is the typical operation in which segments return their results to the master:
                                 QUERY PLAN
----------------------------------------------------------------------------
 Limit
   ->  Gather Motion 4:1  (slice1; segments: 4)
         Merge Key: (sum(o.amount))
         ->  Limit
               ->  Sort
                     Sort Key: (sum(o.amount))
                     ->  HashAggregate
                           Group Key: c.customer_id, c.customer_name
                           ->  Hash Join
                                 Hash Cond: (o.customer_id = c.customer_id)
                                 ->  Seq Scan on orders o
                                 ->  Hash
                                       ->  Seq Scan on customers c
 Optimizer: Pivotal Optimizer (GPORCA)
(14 rows)
Incorrect distribution
Change the distribution key of the orders table from customer_id to order_id:
ALTER TABLE orders
SET DISTRIBUTED BY (order_id);
Run EXPLAIN again:
EXPLAIN (COSTS OFF)
SELECT c.customer_id,
c.customer_name AS customer_name,
SUM(o.amount) AS total_purchases
FROM customers c
JOIN orders o USING (customer_id)
GROUP BY c.customer_id, c.customer_name
ORDER BY total_purchases DESC
LIMIT 5;
In comparison with the previous example, two additional motion operations are involved:
- Redistribute Motion moves tuples between the segments to complete the join. This is required because the customers table is distributed by customer_id, while the orders table is distributed by order_id.
- Broadcast Motion sends a copy of the table to all segments, as they do not contain the necessary data for the join. While this is not a concern for small tables, it can lead to performance issues when joining large tables.
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Limit
   ->  Gather Motion 4:1  (slice3; segments: 4)
         Merge Key: (pg_catalog.sum((sum(o.amount))))
         ->  Limit
               ->  Sort
                     Sort Key: (pg_catalog.sum((sum(o.amount))))
                     ->  HashAggregate
                           Group Key: c.customer_id, c.customer_name
                           ->  Redistribute Motion 4:4  (slice2; segments: 4)
                                 Hash Key: c.customer_id, c.customer_name
                                 ->  Result
                                       ->  HashAggregate
                                             Group Key: c.customer_id, c.customer_name
                                             ->  Hash Join
                                                   Hash Cond: (o.customer_id = c.customer_id)
                                                   ->  Seq Scan on orders o
                                                   ->  Hash
                                                         ->  Broadcast Motion 4:4  (slice1; segments: 4)
                                                               ->  Seq Scan on customers c
 Optimizer: Pivotal Optimizer (GPORCA)
(20 rows)
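To return to the co-located layout used at the start of this example, you can switch the orders table back to the customer_id distribution key:

```sql
ALTER TABLE orders
SET DISTRIBUTED BY (customer_id);
```

Because the new key differs from the current one, this command redistributes the table's rows across the segments, so for large tables it is best run during a maintenance window.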