Use gp_toolkit

Pavel Semyonov

Contents

Check disk usage
Check for cluster issues
Monitor locks in the cluster
View database logs
View configuration parameters
Monitor resource usage
- Resource groups
- Resource queues
Inspect append-optimized tables
- Segment files
- AO table visibility maps
View users and roles

gp_toolkit is a built-in administrative schema in Greengage DB. It provides a collection of views, functions, and helper tables that expose information from system catalogs, cluster logs, and environment status. The objects in gp_toolkit usually wrap queries against internal system tables, presenting the results in a more structured and human-friendly form. This makes them useful for everyday cluster monitoring and maintenance.

The gp_toolkit schema is accessible to all database users, although certain objects may require superuser privileges. In most cases, the output is also limited to the objects the current user is permitted to access.

TIP

To simplify the use of gp_toolkit, add the schema name to the search path. This allows you to use gp_toolkit objects without explicit schema specification:

SET search_path TO public, gp_toolkit;

This section introduces examples of using gp_toolkit for typical Greengage DB administration tasks, such as checking disk usage, identifying bloat, finding skewed data, or inspecting resource consumption. Examples highlight practical use cases, while more details on database maintenance tasks can be found in the corresponding documentation topics.

Inspect materialized views with gp_toolkit

By default, gp_toolkit views provide information only about tables. If you want them to also include materialized views, you need to update the internal view __gp_user_tables. This requires superuser privileges.

Retrieve the current definition of __gp_user_tables using the \d+ meta-command:

\d+ gp_toolkit.__gp_user_tables

The output includes the view definition, for example:

<...>
View definition:
 SELECT fn.fnnspname AS autnspname,
    fn.fnrelname AS autrelname,
    pgc.relkind AS autrelkind,
    pgc.reltuples AS autreltuples,
    pgc.relpages AS autrelpages,
    pgc.relacl AS autrelacl,
    pgc.oid AS autoid,
    pgc.reltoastrelid AS auttoastoid,
    pgc.relstorage AS autrelstorage
   FROM pg_class pgc,
    gp_toolkit.__gp_fullname fn
  WHERE (pgc.relnamespace IN ( SELECT __gp_user_namespaces.aunoid
           FROM gp_toolkit.__gp_user_namespaces)) AND pgc.relkind = 'r'::"char" AND pgc.oid = fn.fnoid;

Look for the condition that filters objects by their relation kind (relkind):

pgc.relkind = 'r'::"char"

Recreate the view with CREATE OR REPLACE VIEW AS, adding materialized views as an additional relation kind (pgc.relkind = 'm'):

CREATE OR REPLACE VIEW gp_toolkit.__gp_user_tables
AS
 SELECT fn.fnnspname AS autnspname,
    fn.fnrelname AS autrelname,
    pgc.relkind AS autrelkind,
    pgc.reltuples AS autreltuples,
    pgc.relpages AS autrelpages,
    pgc.relacl AS autrelacl,
    pgc.oid AS autoid,
    pgc.reltoastrelid AS auttoastoid,
    pgc.relstorage AS autrelstorage
   FROM pg_class pgc,
    gp_toolkit.__gp_fullname fn
  WHERE (pgc.relnamespace IN ( SELECT __gp_user_namespaces.aunoid
           FROM gp_toolkit.__gp_user_namespaces))
    AND (pgc.relkind = 'r' OR pgc.relkind = 'm')
    AND pgc.oid = fn.fnoid;

Update access permissions:

GRANT SELECT ON TABLE gp_toolkit.__gp_user_tables TO public;

Keep in mind that changes to gp_toolkit are not persistent: they are lost after a database backup and restore. You must reapply this update each time after restoring the database.

To analyze the sizes of objects stored in Greengage DB, use the gp_toolkit views gp_size_of_*. These views let you check disk usage at different levels: database, schema, table, and index. Namely, the following views are available:

gp_size_of_database — sizes of user databases in the cluster, in bytes. This includes all databases except the default databases postgres, template0, and template1. For example, to rank databases from the largest to the smallest, use this query:
```
SELECT * FROM gp_toolkit.gp_size_of_database
ORDER BY sodddatsize DESC;
```
Example output:
```
 sodddatname | sodddatsize
-------------+-------------
 books_store |    86444564
 util        |    85855700
 marketplace |    84544020
 diskquota   |    84544020
(4 rows)
```

gp_size_of_schema_disk — sizes of schemas in the current database, in bytes. The view displays two values: total table data and total index size for each schema, including public and user-defined schemas. Example:

SELECT * FROM gp_toolkit.gp_size_of_schema_disk;

Example output:

 sosdnsp  | sosdschematablesize | sosdschemaidxsize
----------+---------------------+-------------------
 internal |                   0 |                 0
 public   |           120455168 |          85360640
 sales    |              163840 |                 0
(3 rows)

gp_size_of_table_disk — sizes of tables in the current database. The size is split into components: main table data, oversized attributes (TOAST), and metadata for append-optimized tables. For example, this query finds the five largest tables in the current database:

SELECT * FROM gp_toolkit.gp_size_of_table_disk
ORDER BY sotdsize DESC
LIMIT 5;

Example output:

 sotdoid | sotdsize  | sotdtoastsize | sotdadditionalsize | sotdschemaname | sotdtablename
---------+-----------+---------------+--------------------+----------------+---------------
   16469 | 420872192 |        163840 |                  0 | public         | orders
   24718 |    131072 |        163840 |                  0 | public         | items
   16515 |     98304 |        163840 |                  0 | sales          | customers
   16431 |     68456 |             0 |             589824 | sales          | orders
(4 rows)

The gp_size_of_table_uncompressed view is useful for append-optimized tables with compression: it shows the table’s logical size before compression. For non-compressed tables, the value matches the on-disk size.

gp_size_of_index — sizes of indexes in the current database. For example, to check the sizes of all indexes of a given table:

SELECT * FROM gp_toolkit.gp_size_of_index
WHERE soitableschemaname = 'public' AND soitablename = 'orders';

Example output:

 soioid | soitableoid | soisize  | soiindexschemaname |    soiindexname    | soitableschemaname | soitablename
--------+-------------+----------+--------------------+--------------------+--------------------+--------------
  16487 |       16469 | 62750720 | public             | orders_comment_idx | public             | orders
  16486 |       16469 | 22609920 | public             | orders_total_idx   | public             | orders
(3 rows)

You can also retrieve the same aggregated information using the gp_size_of_all_table_indexes view. It reports the combined size of indexes per table.

gp_size_of_table_and_indexes_disk — total on-disk size of each table, including all its indexes. For example, to find the largest tables by the total of table data and indexes:

SELECT
    sotaidtablename,
    sotaidschemaname,
    sotaidtablesize,
    sotaididxsize,
    sotaidtablesize + sotaididxsize AS total_size
FROM gp_toolkit.gp_size_of_table_and_indexes_disk
ORDER BY total_size DESC
LIMIT 3;

Example output:

 sotaidtablename | sotaidschemaname | sotaidtablesize | sotaididxsize | total_size
-----------------+------------------+-----------------+---------------+------------
 orders          | public           |       120455168 |      85360640 |  205815808
 customers       | sales            |          262144 |        262144 |     524288
 orders          | sales            |          163840 |             0 |     163840
(3 rows)

The gp_disk_free external table displays the free space available on each segment host, in kilobytes.

The following query lists cluster hosts by free space from most occupied to least:

SELECT * FROM gp_toolkit.gp_disk_free ORDER BY dfspace;

The output may look like this:

 dfsegment | dfhostname |  dfdevice  | dfspace
-----------+------------+------------+----------
         3 |    sdw2    |  /dev/vda2 | 19906016
         2 |    sdw2    |  /dev/vda2 | 19906016
         0 |    sdw1    |  /dev/vda2 | 20304960
         1 |    sdw1    |  /dev/vda2 | 20304960
(4 rows)

Note that gp_disk_free has one row per segment. If multiple segments run on the same host, the host appears multiple times. Also, the dfspace column reflects total free space on the underlying filesystem, not space reserved for a particular segment.

TIP

For stable cluster operation, leave at least 30% of disk space free on each host. Running close to full capacity increases the risk of query failures, spill file overflow, and transaction aborts.

To monitor usage of spill files across the cluster, use the gp_workfile_* views.

With the gp_workfile_usage_per_query view, you can check which queries created spill files, how many files, and how much space they occupy on each segment. Query example:

SELECT datname, query, segid, size, numfiles
FROM gp_toolkit.gp_workfile_usage_per_query
ORDER BY size;

Example output:

   datname   |                                           query                                            | segid |   size   | numfiles
-------------+--------------------------------------------------------------------------------------------+-------+----------+----------
 marketplace | SELECT DISTINCT (item_id, price, count), value FROM items GROUP BY item_id ORDER BY value; |     2 | 18350080 |        1
 marketplace | SELECT DISTINCT (item_id, price, count), value FROM items GROUP BY item_id ORDER BY value; |     0 | 18317312 |        1
 marketplace | SELECT DISTINCT (item_id, price, count), value FROM items GROUP BY item_id ORDER BY value; |     3 | 18317312 |        1
 marketplace | SELECT DISTINCT (item_id, price, count), value FROM items GROUP BY item_id ORDER BY value; |     1 | 18219008 |        1
(4 rows)

The gp_workfile_entries view splits the same information by specific query operators — sorts, joins, and others — that triggered spilling. The gp_workfile_usage_per_segment view aggregates spill file usage information per segment, helping detect uneven disk I/O load.

To find more information about checking spill files and query examples, see Check spill files.

The gp_pgdatabase_invalid view reports segments that are currently marked down in the cluster configuration. Each row corresponds to a database segment on a failed segment instance.

SELECT * FROM gp_toolkit.gp_pgdatabase_invalid;

Example output:

 pgdbidbid | pgdbiisprimary | pgdbicontent | pgdbivalid | pgdbidefinedprimary
-----------+----------------+--------------+------------+---------------------
         4 | f              |            2 | f          | t
         5 | f              |            3 | f          | t
         6 | f              |            0 | f          | f
         7 | f              |            1 | f          | f
(4 rows)

This output shows failed segment instances, including whether they are primary or mirrors (pgdbiisprimary), and their content ID (pgdbicontent).

To learn about recovering failed segments, see Check and recover segments.

Two gp_toolkit views, gp_bloat_diag and gp_bloat_expected_pages, help diagnose bloat in heap tables:

gp_bloat_diag — lists only tables with suspected bloat. For each table, it provides a diagnostic message with an estimate of the bloat amount: moderate or significant. This estimate is based on the ratio between the actual and expected number of pages that the table uses. To quickly find tables with significant bloat, use a query like this:
```
SELECT *
FROM gp_toolkit.gp_bloat_diag
WHERE bdidiag LIKE '%significant%';
```
gp_bloat_expected_pages — shows actual and expected page counts for all heap tables. This is useful for custom checks where you want to define your own threshold:
```
SELECT *
FROM gp_toolkit.gp_bloat_expected_pages
WHERE btdrelpages / btdexppages > 5
ORDER BY btdrelpages / btdexppages;
```
You can also filter results by schema, check individual tables, or rank all tables by excess page count.

Learn more about identifying bloat and find examples in Check bloat in tables.

The gp_stats_missing view of gp_toolkit lists tables that lack statistics in the system catalog.

SELECT * FROM gp_toolkit.gp_stats_missing;

Example output:

 smischema | smitable | smisize | smicols | smirecs
-----------+----------+---------+---------+---------
 public    | test     | f       |       1 |       0
 sales     | orders   | f       |       5 |       0
(2 rows)

Such tables require statistics collection (ANALYZE) to ensure the optimizer generates efficient query plans. Learn more about database statistics in Collect statistics via ANALYZE.

gp_toolkit.gp_skew_* views provide information about data skew in the cluster. These views help identify uneven data distribution across segments, which can cause performance degradation:

gp_skew_coefficients — measures skew using the coefficient of variation (CV) for the table’s data on each segment. Higher CV values indicate greater skew. For example, this query ranks tables by CV:

SELECT skcnamespace, skcrelname, skccoeff
FROM gp_toolkit.gp_skew_coefficients
ORDER BY skccoeff DESC;

Example output:

 skcnamespace | skcrelname |         skccoeff
--------------+------------+-------------------------
 sales        | customers  | 2.979247332241260000000
 public       | orders     | 0.187996312020564400000
 sales        | orders     |                       0
(3 rows)

gp_skew_idle_fractions — measures computational skew — the fraction of system resources that remain idle during the table scan. If this value exceeds 0.1 (10%), the table may not be evenly distributed:

SELECT sifnamespace, sifrelname, siffraction
FROM gp_toolkit.gp_skew_idle_fractions;

Example output:

 sifnamespace | sifrelname |      siffraction
--------------+------------+------------------------
 sales        | orders     |                      0
 public       | orders     | 0.00262906977208079502
 sales        | customers  | 0.50000000000000000000
(3 rows)

For more details on checking and fixing skew, see Check data distribution.

The gp_toolkit.gp_locks_* views provide information about database locks. These views are useful for diagnosing blocking issues or slow queries in a Greengage DB cluster:

gp_locks_on_relation — shows locks that are currently held on database relations. For each lock, it shows the relation name, the lock type, whether the lock has been granted, and the query holding or waiting for the lock. Example:

SELECT lorlocktype, lordatabase, lorrelname, lorpid, lormode, lorgranted
FROM gp_toolkit.gp_locks_on_relation
WHERE lorcurrentquery LIKE '%gp_toolkit.gp_locks_on_relation%';

The result shows that this query itself acquires multiple locks on system catalog tables such as pg_class or pg_locks:

 lorlocktype | lordatabase |            lorrelname             | lorpid |     lormode     | lorgranted
-------------+-------------+-----------------------------------+--------+-----------------+------------
 relation    |       12812 | gp_locks_on_relation              |   3388 | AccessShareLock | t
 relation    |           0 | pg_authid                         |   3388 | AccessShareLock | t
 relation    |           0 | pg_authid_oid_index               |   3388 | AccessShareLock | t
 relation    |           0 | pg_authid_rolname_index           |   3388 | AccessShareLock | t
 relation    |           0 | pg_authid_rolresgroup_index       |   3388 | AccessShareLock | t
 relation    |           0 | pg_authid_rolresqueue_index       |   3388 | AccessShareLock | t
 relation    |       12812 | pg_class                          |   3388 | AccessShareLock | t
 relation    |       12812 | pg_class_oid_index                |   3388 | AccessShareLock | t
 relation    |       12812 | pg_class_relname_nsp_index        |   3388 | AccessShareLock | t
 relation    |       12812 | pg_class_tblspc_relfilenode_index |   3388 | AccessShareLock | t
 relation    |           0 | pg_database                       |   3388 | AccessShareLock | t
 relation    |           0 | pg_database_datname_index         |   3388 | AccessShareLock | t
 relation    |           0 | pg_database_oid_index             |   3388 | AccessShareLock | t
 relation    |       12812 | pg_locks                          |   3388 | AccessShareLock | t
 relation    |       12812 | pg_stat_activity                  |   3388 | AccessShareLock | t
(15 rows)

gp_locks_on_resqueue — displays locks in resource queues if queues are used for managing cluster resources. This view helps administrators check which resource queues currently have waiting queries due to concurrency limits or queue-level locks.

SELECT lorrsqname, lorlocktype, lorobjid, lormode, lorwaiting
FROM gp_toolkit.gp_locks_on_resqueue
WHERE lorwaiting = true;

Example output:

 lorrsqname |  lorlocktype   | lorobjid |    lormode    | lorwaiting
------------+----------------+----------+---------------+------------
 reporting  | resource queue |    16449 | ExclusiveLock | t
(1 row)

gp_toolkit provides views for accessing Greengage DB logs from the SQL console. These views are based on external tables that read cluster log files. Only superusers can access these views.

gp_log_command_timings — displays timestamps and durations of SQL commands executed in the cluster. For example, to find users that submitted queries longer than 5 seconds in the past hour:

SELECT
    logsession,
    loguser,
    logdatabase,
    logcmdcount,
    logtimemin AS start_time,
    logduration
FROM gp_toolkit.gp_log_command_timings
WHERE logduration > INTERVAL '5 seconds'
  AND logtimemin > now() - INTERVAL '1 hour'
ORDER BY logtimemin DESC;

Example result:

 logsession | loguser | logdatabase | logcmdcount |          start_time           |   logduration
------------+---------+-------------+-------------+-------------------------------+-----------------
 con8       | alice   | postgres    | cmd1        | 2025-08-28 03:49:50.140174+00 | 00:00:25.940927
(1 row)

To view query texts, use the views described below, for example, gp_log_database.

gp_log_database — shows logged events of the current database. You can filter entries by severity to view only errors, for example:

SELECT
    logtime,
    loguser,
    logdatabase,
    logseverity,
    logmessage
FROM gp_toolkit.gp_log_database
WHERE logseverity IN ('ERROR','FATAL','PANIC')
ORDER BY logtime DESC
LIMIT 5;

Example result:

            logtime            | loguser | logdatabase | logseverity |               logmessage
-------------------------------+---------+-------------+-------------+-----------------------------------------
 2025-08-28 03:50:16.081062+00 | alice   | postgres    | ERROR       | canceling statement due to user request
 2025-08-28 03:50:16.079898+00 | alice   | postgres    | ERROR       | canceling MPP operation
 2025-08-28 03:50:16.079793+00 | alice   | postgres    | ERROR       | canceling MPP operation
 2025-08-28 03:50:16.079789+00 | alice   | postgres    | ERROR       | canceling MPP operation
 2025-08-28 03:50:16.079692+00 | alice   | postgres    | ERROR       | canceling MPP operation
(5 rows)

gp_log_master_concise — provides a short-form view of the master instance’s log entries. Useful for quick scanning:

SELECT
    logtime,
    logseverity,
    logmessage
FROM gp_toolkit.gp_log_master_concise
ORDER BY logtime DESC
LIMIT 10;

A fragment of the output:

            logtime            | logseverity |                                            logmessage
-------------------------------+-------------+--------------------------------------------------------------------------------------------------
 2025-08-28 03:58:10.772125+00 | LOG         | statement: SELECT                                                                               +
                               |             |     logtime,                                                                                    +
                               |             |     logseverity,                                                                                +
                               |             |     logmessage                                                                                  +
                               |             | FROM gp_toolkit.gp_log_master_concise                                                           +
                               |             | ORDER BY logtime DESC                                                                           +
                               |             | LIMIT 5;
 2025-08-28 03:58:07.293417+00 | LOG         | An exception was encountered during the execution of statement: select pg_sleep(10) from orders;
 2025-08-28 03:58:07.293379+00 | ERROR       | canceling statement due to user request
 2025-08-28 03:58:02.799049+00 | LOG         | statement: SELECT                                                                               +
                               |             |     logtime,                                                                                    +
                               |             |     logseverity,                                                                                +
                               |             |     logmessage                                                                                  +
                               |             | FROM gp_toolkit.gp_log_master_concise                                                           +
                               |             | ORDER BY logtime DESC                                                                           +
                               |             | LIMIT 5;
 2025-08-28 03:57:48.153789+00 | LOG         | statement: select pg_sleep(10) from orders;

gp_log_system — shows detailed logs from all cluster instances: master, segments, and their mirrors. This view provides the most detail and is useful for troubleshooting distributed issues:

SELECT
    logtime,
    logseverity,
    logmessage,
    loghost,
    logpid AS pid,
    logfile
FROM gp_toolkit.gp_log_system
WHERE logseverity IN ('ERROR','FATAL','PANIC')
ORDER BY logtime DESC
LIMIT 5;

Example output:

            logtime            | logseverity |               logmessage                |   loghost    |  pid  |     logfile
-------------------------------+-------------+-----------------------------------------+--------------+-------+------------------
 2025-08-28 04:02:51.496107+00 | ERROR       | column "procpid" does not exist         | [local]      | p2418 | parse_relation.c
 2025-08-28 04:02:08.205361+00 | ERROR       | column "hostname" does not exist        | [local]      | p2418 | parse_relation.c
 2025-08-28 04:01:19.729128+00 | ERROR       | permission denied for relation orders   | [local]      | p2523 | aclchk.c
 2025-08-28 03:58:07.293379+00 | ERROR       | canceling statement due to user request | [local]      | p2523 | postgres.c
 2025-08-28 03:58:07.290905+00 | ERROR       | canceling MPP operation                 | 10.92.40.164 | p1531 | postgres.c
(5 rows)

Learn more about logging in Greengage DB in Logging.

The gp_param_setting('parameter_name') function in gp_toolkit lists values of a given configuration parameter on all segments, including master (segment ID -1):

SELECT * FROM gp_toolkit.gp_param_setting('max_connections');

The output may look as follows:

 paramsegment |    paramname    | paramvalue
--------------+-----------------+------------
           -1 | max_connections | 250
            0 | max_connections | 750
            1 | max_connections | 750
            2 | max_connections | 750
            3 | max_connections | 750
(5 rows)

The gp_param_settings_seg_value_diffs view lists parameters that have different values across segments, which can indicate configuration inconsistency. This applies only to local parameters — the ones that are set independently on each segment. Parameters expected to differ, such as port, are excluded. Example:

SELECT * FROM gp_toolkit.gp_param_settings_seg_value_diffs;

The output may look as follows:

       psdname        | psdvalue | psdcount
----------------------+----------+----------
 tcp_keepalives_count | 9        |        3
 tcp_keepalives_count | 5        |        1
(2 rows)

See Configure DBMS via GUCs for more information about cluster configuration.

If the cluster uses resource groups, you can query the gp_toolkit.gp_resgroup_* views to check their configuration and usage:

gp_resgroup_config — displays the configured limits of each resource group. Example:

SELECT
    groupname,
    concurrency,
    cpu_rate_limit,
    memory_limit
FROM gp_toolkit.gp_resgroup_config
ORDER BY groupname;

Example output:

 groupid |   groupname    | concurrency | cpu_rate_limit | memory_limit
---------+----------------+-------------+----------------+--------------
    6437 | default_group  | 20          | 30             | 0
    6438 | admin_group    | 10          | 10             | 10
   16411 | rg_plcontainer | 0           | -1             | 20
   16403 | rg_analysts    | 10          | 20             | 20
(4 rows)

gp_resgroup_status — shows current usage across all segments, including numbers of running and queued transactions. For example, this query highlights groups where queries are waiting in the queue:

SELECT rsgname,
       num_running,
       num_queued
FROM gp_toolkit.gp_resgroup_status
WHERE num_queued > 0
ORDER BY num_queued DESC;

Example output:

    rsgname     | num_running | num_queued
----------------+-------------+------------
 admin_group    |          10 |          2
 rg_analysts    |           5 |          1
(2 rows)

gp_resgroup_status_per_host and gp_resgroup_status_per_segment — aggregate current resource groups usage by hosts and segments. They are useful to identify hotspots where a single host or segment is overloaded compared to the rest of the cluster.

See Check resource groups for more details and examples.

If the cluster uses resource queues, you can monitor them through the gp_resqueue_* views:

gp_resqueue_status — shows configured limits and current usage of resource queues, for example:

SELECT
    rsqname,
    rsqcountlimit,
    rsqcountvalue,
    rsqmemorylimit,
    rsqmemoryvalue
FROM gp_toolkit.gp_resqueue_status;

Example result:

 queueid |  rsqname   | rsqcountlimit | rsqcountvalue | rsqmemorylimit | rsqmemoryvalue
---------+------------+---------------+---------------+----------------+----------------
    6055 | pg_default |            21 |             0 |             -1 |              0
   16449 | reporting  |            22 |             2 |             -1 |    2.62144e+08
   16456 | adhoc      |            15 |             0 |    2.68435e+08 |              0
(3 rows)

gp_resq_activity — shows information about queries currently running or waiting in resource queues:

SELECT * FROM gp_toolkit.gp_resq_activity;

Example result:

 resqprocpid | resqrole | resqoid | resqname  |           resqstart           | resqstatus
-------------+----------+---------+-----------+-------------------------------+------------
        3460 | charlie  |   16449 | reporting | 2025-08-04 10:25:48.318273+00 | running
        3351 | alice    |   16449 | reporting | 2025-08-04 10:25:53.574034+00 | running
(2 rows)

gp_resq_activity_by_queue — provides a summarized view of current activity per queue without query-level detail. This is useful for quickly checking the overall queue state.
gp_resq_priority_statement — shows priorities of queries in resource queues.

gp_resq_role — shows which roles are assigned to which resource queues:

SELECT * FROM gp_toolkit.gp_resq_role
WHERE rrrsqname = 'reporting';

Example result:

 rrrolname | rrrsqname
-----------+------------
 alice     | reporting
 charlie   | reporting
(4 rows)

See Check resource queues for more details and examples.

To examine the storage details of append-optimized (AO) tables, gp_toolkit provides __gp_ao* functions. There are two groups of such functions: __gp_aoseg*/__gp_aocsseg* and __gp_aovisimap*.

IMPORTANT

gp_toolkit functions for AO tables do not collect segment-level data when executed on the master. To retrieve visibility or storage information from all cluster segments, you must either call the functions on specific segments or use gp_dist_random() to get results from all segments:

SELECT gp_segment_id, (gp_toolkit.__gp_aovisimap_entry('orders')).*
FROM gp_dist_random('gp_id');

The __gp_aoseg*/__gp_aocsseg* functions return metadata from AO tables' segment files:

__gp_aoseg(regclass) — provides metadata information for the segment files of a given row-oriented AO table.

Example:

SELECT gp_segment_id, (gp_toolkit.__gp_aoseg('orders')).*
FROM gp_dist_random('gp_id');

Sample rows:

 gp_segment_id | segno |  eof  | tupcount | varblockcount | eof_uncompressed | modcount | formatversion | state
---------------+-------+-------+----------+---------------+------------------+----------+---------------+-------
             0 |     1 | 16592 |      485 |             3 |            16592 |        3 |             3 |     1
             1 |     1 | 17584 |      514 |             3 |            17584 |        3 |             3 |     1
             2 |     1 | 16360 |      478 |             3 |            16360 |        3 |             3 |     1
             3 |     1 | 17920 |      524 |             3 |            17920 |        3 |             3 |     1
(4 rows)

The output includes the tuple count per segment file (tupcount), the number of updates to this file (modcount), and the file state (1 for active segment files, 2 for files marked for removal during AO compaction).

If called in the default way on master, the function only shows general information available there:

SELECT * FROM gp_toolkit.__gp_aoseg('orders');

Result:

 segno | eof | tupcount | varblockcount | eof_uncompressed | modcount | formatversion | state
-------+-----+----------+---------------+------------------+----------+---------------+-------
     1 |   0 |     2001 |             0 |                0 |        4 |             3 |     1
(1 row)

__gp_aoseg_history(regclass) — provides a detailed history of segment file changes, useful for advanced analysis and diagnostics.
__gp_aocsseg(regclass) and __gp_aoseg_history(regclass) are the counterparts of __gp_aoseg(regclass) and __gp_aoseg_history(regclass) for column-oriented AO tables.

The __gp_aovisimap* functions return information about AO table visibility maps, which track visible and hidden rows of the table:

__gp_aovisimap(regclass) — lists all hidden rows in a given AO table, showing tuple ID, segment file, and row number:

SELECT gp_segment_id, (gp_toolkit.__gp_aovisimap('orders')).*
FROM gp_dist_random('gp_id');

Output fragment:

  gp_segment_id |      tid       | segno | row_num
----------------+----------------+-------+---------
              1 | (33554432,502) |     1 |     501
              1 | (33554432,503) |     1 |     502
              1 | (33554432,504) |     1 |     503
              1 | (33554432,505) |     1 |     504
<...>

__gp_aovisimap_hidden_info(regclass) — shows the number of hidden rows in each segment file of the specified AO table:

SELECT gp_segment_id, (gp_toolkit.__gp_aovisimap_hidden_info('orders')).*
FROM gp_dist_random('gp_id');

Example output:

  gp_segment_id | segno | hidden_tupcount | total_tupcount
----------------+-------+-----------------+----------------
              1 |     1 |              19 |            485
              0 |     1 |              30 |            514
              3 |     1 |              25 |            478
              2 |     1 |              27 |            524
(4 rows)

__gp_aovisimap_compaction_info(regclass) — provides aggregated statistics about hidden rows in an AO table, helping to determine whether compaction is needed. Example:

SELECT * FROM gp_toolkit.__gp_aovisimap_compaction_info('orders'::regclass);

Example result:

NOTICE:  gp_appendonly_compaction_threshold = 10
 content | datafile | compaction_possible | hidden_tupcount | total_tupcount | percent_hidden
---------+----------+---------------------+-----------------+----------------+----------------
       2 |        1 | f                   |              30 |            514 |           5.84
       1 |        1 | f                   |              19 |            485 |           3.92
       0 |        1 | f                   |              27 |            524 |           5.15
       3 |        1 | f                   |              25 |            478 |           5.23
(4 rows)

For details, see Vacuum append-optimized tables.

__gp_aovisimap_entry(regclass) — returns raw visibility map entries for a table as sequences of bits (bitmap). Each entry is a string of 0 and 1 characters, where 0 means a visible row and 1 means a hidden row in a segment file. Segments that don’t have hidden rows are not shown. For better readability, the output shows only the leading part of the bitmap until the last hidden row (marked 1). Trailing zeros aren’t shown. Example:
```
SELECT gp_segment_id, (gp_toolkit.__gp_aovisimap_entry('orders')).*
FROM gp_dist_random('gp_id');
```
Example result:
```
 gp_segment_id | segno | first_row_num | hidden_tupcount | bitmap
---------------+-------+---------------+-----------------+--------
             1 |     1 |             0 |               1 | 01
             2 |     1 |             0 |               2 | 0101
(1 row)
```

The gp_toolkit.gp_roles_assigned view provides a cluster-wide listing of all roles. For group roles, the view expands membership and shows all roles assigned to them, making it easier to understand effective privileges.

Example:

SELECT * FROM gp_toolkit.gp_roles_assigned;

The output can look like this:

 raroleid |  rarolename  | ramemberid | ramembername
----------+--------------+------------+--------------
    16511 | sales_team   |      16444 | alice
    16511 | sales_team   |      16455 | bob
       10 | gpadmin      |            |
    16444 | alice        |            |
    16455 | bob          |            |
    16476 | charlie      |            |
(6 rows)

Learn more about managing roles and users in Roles and privileges.

Use resource queues

Collect statistics via ANALYZE

Use gp_toolkit

Check disk usage

Database object sizes

Free disk space

Spill files

Check for cluster issues

Failed segments

Bloat

Missing statistics

Data skew

Monitor locks in the cluster

View database logs

View configuration parameters

Monitor resource usage

Resource groups

Resource queues

Inspect append-optimized tables

Segment files

AO table visibility maps

View users and roles