Configure proxies for interconnect
Overview
Greengage DB’s networking layer, called the interconnect, is used for transferring queries and their results between cluster instances. It implements interprocess communication between the PostgreSQL instances that make up the cluster’s segments, allowing the system to function as a single logical database.
When a query is submitted to the master segment, the query dispatcher (QD) generates a parallel execution plan and establishes network connections to all segments in the cluster. These connections are used to send the query commands and receive results. On each segment, query executor (QE) processes are launched to execute their part of the plan. QEs may also initiate direct connections to other segments to exchange intermediate results. This segment-to-segment data exchange is the foundation of Greengage DB’s massively parallel query execution model.
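This segment-to-segment traffic is visible in query plans as Motion nodes. As a quick illustration, you can run EXPLAIN on a join between two distributed tables (the table and column names below are hypothetical) and look for Redistribute Motion, Broadcast Motion, and Gather Motion operators, which mark the points where rows travel over the interconnect:
EXPLAIN SELECT c.region, COUNT(*)
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
GROUP BY c.region;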
Interconnect modes
Greengage DB offers different interconnect modes, allowing you to adjust the interconnect mechanism to suit your environment.
The active mode is defined by the gp_interconnect_type configuration parameter.
By default, interconnect works in the UDPIFC mode, where each internal connection uses dedicated network ports on the participating hosts. While this mode provides high throughput, it can lead to excessive port usage and connection management overhead in large clusters.
The proxy interconnect mode addresses this issue by consolidating internal interaction to a single network connection on each cluster host. This significantly reduces the number of network ports in use, simplifies connection management, and improves the efficiency of data transfer within the cluster.
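You can check which mode a cluster currently uses by displaying the parameter value, for example, with the gpconfig utility:
$ gpconfig -s gp_interconnect_type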
When to use interconnect proxies
Interconnect proxies are particularly useful in environments where port availability is limited, such as clusters running behind firewalls or network address translation (NAT). They are also recommended for large deployments with many segments, where the default connection-per-executor model may exhaust port ranges or increase the load on the operating system’s networking stack.
Configure proxy addresses
To start using interconnect proxies in a Greengage DB cluster, define a proxy address (hostname or IP) and port for each cluster instance: the master, the standby master, and all primary and mirror segments.
The configuration is provided through the gp_interconnect_proxy_addresses configuration parameter.
Its value is a single-quoted comma-separated string of entries with the following structure:
<db_id>:<content_id>:<segment_address>:<port>
where:
- <db_id> — the unique identifier of a segment instance.
- <content_id> — the content identifier of the segment (-1 for the master).
- <segment_address> — the hostname or IP address that the instance uses.
- <port> — the TCP port to reserve for the interconnect proxy on this instance.
You can obtain <db_id>, <content_id>, and <segment_address> values from the gp_segment_configuration system catalog table:
SELECT dbid, content, address FROM gp_segment_configuration;
The output can look as follows:
dbid | content | address
------+---------+---------
1 | -1 | mdw
2 | 0 | sdw1
6 | 0 | sdw2
3 | 1 | sdw1
7 | 1 | sdw2
10 | -1 | smdw
4 | 2 | sdw2
8 | 2 | sdw1
5 | 3 | sdw2
9 | 3 | sdw1
(10 rows)
Next, assign a <port> value to each instance. Each instance must have exactly one proxy port defined, and instances that share a host must use different ports.
Prepare the configuration string by concatenating all entries without whitespace or line breaks. Separate entries with commas. Enclose the whole string in single quotes. Example:
'1:-1:mdw:40000,2:0:sdw1:40000,3:1:sdw1:40500,4:2:sdw2:40000,5:3:sdw2:40500,6:0:sdw2:41000,7:1:sdw2:41500,8:2:sdw1:41000,9:3:sdw1:41500,10:-1:smdw:40000'
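Instead of composing the string by hand, you can generate it with a query. The following sketch assumes that ports starting at 40000 and stepped by 500 are available on every host; it assigns ports in dbid order within each host and reproduces the example string above:
SELECT string_agg(dbid || ':' || content || ':' || address || ':' || port, ',' ORDER BY dbid)
FROM (
    SELECT dbid, content, address,
           -- assumed free port range per host: 40000, 40500, 41000, ...
           40000 + 500 * (row_number() OVER (PARTITION BY address ORDER BY dbid) - 1) AS port
    FROM gp_segment_configuration
) AS s;
Remember to add the enclosing single quotes when you pass the generated string to gpconfig.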
Set this string as the gp_interconnect_proxy_addresses value using the gpconfig utility:
$ gpconfig --skipvalidation -c gp_interconnect_proxy_addresses -v "'1:-1:mdw:40000,2:0:sdw1:40000,3:1:sdw1:40500,4:2:sdw2:40000,5:3:sdw2:40500,6:0:sdw2:41000,7:1:sdw2:41500,8:2:sdw1:41000,9:3:sdw1:41500,10:-1:smdw:40000'"
Note the following:
- Setting gp_interconnect_proxy_addresses requires the --skipvalidation option.
- The parameter value is a single-quoted string, which must itself be enclosed in double quotes when passed to gpconfig.
After updating the parameter, reload the cluster configuration to apply it:
$ gpstop -u
If a segment hostname resolves to a different IP address at runtime, rerun gpstop -u to reload the gp_interconnect_proxy_addresses value.
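To confirm that the parameter is set, display its value with gpconfig:
$ gpconfig -s gp_interconnect_proxy_addresses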
Verify proxy configuration
Once interconnect proxies are configured, you should verify that queries execute correctly and that performance meets expectations.
Start a test session that uses proxies by setting the gp_interconnect_type parameter to proxy in PGOPTIONS:
$ PGOPTIONS="-c gp_interconnect_type=proxy" psql postgres
Run the checks described below from this session.
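First, make sure that the proxy mode is active in the session:
SHOW gp_interconnect_type;
The command should return proxy.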
Distributed query execution
To confirm that network communication works as expected, execute a query that involves all segments. For example, the following query counts rows on each segment separately:
SELECT gp_segment_id, COUNT(*) FROM orders GROUP BY gp_segment_id;
A successful execution produces results similar to:
gp_segment_id | count
---------------+--------
3 | 249860
0 | 249933
1 | 249548
2 | 250659
(4 rows)
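If no suitable distributed table is available, you can create a small test table for this check; the proxy_test name below is arbitrary:
CREATE TABLE proxy_test (id int) DISTRIBUTED BY (id);
INSERT INTO proxy_test SELECT generate_series(1, 1000000);
SELECT gp_segment_id, COUNT(*) FROM proxy_test GROUP BY gp_segment_id;
DROP TABLE proxy_test;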
If the query hangs or returns an error, the proxy configuration is likely incorrect.
Review the values set in gp_interconnect_proxy_addresses, correct them, and reapply the configuration as described in Configure proxy addresses.
Logs and error messages
After running distributed queries, review the cluster logs for proxy-related errors or warnings. Misconfigured proxy addresses or ports may not always block queries immediately but can generate log entries such as connection failures or timeouts.
You can check logs with:
- the gplogfilter utility;
- the gp_toolkit.gp_log_* views;
- any external log parsing and monitoring tools.
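For example, the following command scans the master log for error-level messages (ERROR, FATAL, PANIC); the path assumes the default log location under the master data directory:
$ gplogfilter -t $MASTER_DATA_DIRECTORY/pg_log/gpdb-*.csv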
For more details, see Logging.
Regular log inspection is recommended after turning on proxies, especially during initial deployment.
Performance checks
Compare workload performance with and without using proxies. While interconnect proxies reduce the number of open ports and connections, the additional proxy layer can introduce slight overhead in some workloads.
Suggested checks:
- Measure query runtimes for typical reporting and analytical workloads.
- Run a few large aggregations or joins that exchange data across many segments.
- Monitor cluster resource usage (CPU, memory, and network throughput) with standard system and DBMS monitoring tools.
These tests help confirm whether proxies provide the expected balance of connection efficiency and query performance in your environment.
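For example, a rough runtime comparison for a single query might look as follows; the orders table is the same example table as above, and the interconnect mode is set explicitly for each session:
$ time PGOPTIONS="-c gp_interconnect_type=proxy" psql -d postgres \
    -c "SELECT gp_segment_id, COUNT(*) FROM orders GROUP BY gp_segment_id;"
$ time PGOPTIONS="-c gp_interconnect_type=udpifc" psql -d postgres \
    -c "SELECT gp_segment_id, COUNT(*) FROM orders GROUP BY gp_segment_id;"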
Turn on interconnect proxies
After verifying proxy configuration and testing it in a session, you can turn on interconnect proxies for all cluster workloads.
Set the gp_interconnect_type configuration parameter to proxy:
$ gpconfig -c gp_interconnect_type -v 'proxy'
Reload the configuration to apply the change:
$ gpstop -u
This change is cluster-wide: all new sessions will use proxies for interconnect communication.
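You can verify this from any new session:
$ psql -d postgres -c "SHOW gp_interconnect_type;"
 gp_interconnect_type
----------------------
 proxy
(1 row)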
Turn off interconnect proxies
If you need to stop using interconnect proxies, switch the cluster back to the default interconnect mode:
$ gpconfig -c gp_interconnect_type -v 'udpifc'
$ gpstop -u
For a full rollback, also clear the proxy address configuration:
$ gpconfig -c gp_interconnect_proxy_addresses -v ''
$ gpstop -r
When you set gp_interconnect_proxy_addresses to an empty string, this change takes effect after a full cluster restart (gpstop -r).
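After the restart, you can confirm that both parameters are back to their default values:
$ gpconfig -s gp_interconnect_type
$ gpconfig -s gp_interconnect_proxy_addresses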