Hello, I’m DocuDroid!
Submitting feedback
Thank you for rating our AI Search!
We would be grateful if you could share your thoughts so we can improve our AI Search for you and other readers.
GitHub

Use diskquota

Pavel Semyonov

diskquota is a Greengage DB module that allows limiting disk space usage on cluster hosts.

In a distributed environment, uncontrolled disk consumption on hosts can affect the entire cluster and lead to query failures or service disruption. diskquota enables administrators to enforce disk usage limits for databases, schemas, roles, and tablespaces, helping isolate workloads and prevent resource exhaustion. Limits are enforced automatically, and operations that exceed configured quotas are blocked.

The source code for diskquota is available in the diskquota repository.

Build and install

Prerequisites

For building diskquota, git and cmake version 3.20 or newer are required.

To start using diskquota, you need to build it from source code and deploy the resulting binaries on all cluster hosts:

  1. Log in as gpadmin.

  2. Clone the diskquota repository, specifying the required tag:

    $ git clone --branch 2.4.1 https://github.com/GreengageDB/diskquota.git
  3. Change the current directory to diskquota:

    $ cd diskquota
  4. Create the build directory and change the current directory to it:

    $ mkdir -p build
    $ cd build
  5. Compile the module:

    $ cmake ..

    The output should end with a message like this:

    -- Build files have been written to: /home/gpadmin/diskquota/build
  6. Install the module:

    $ make install

    The command output includes a message with the location and name of the module file:

    -- Installing: /usr/local/gpdb/lib/postgresql/diskquota-2.4.so

To check that diskquota is installed successfully, connect to any database in your cluster using psql and check available extensions:

SELECT * FROM pg_available_extensions WHERE name = 'diskquota';

Result:

   name    | default_version | installed_version |         comment
-----------+-----------------+-------------------+-------------------------
 diskquota | 2.4             |                   | Disk Quota Main Program
(1 row)

Enable diskquota

To make diskquota available at the cluster level:

  1. Create a database named diskquota. The module uses it to store metadata about databases where it is enabled.

    $ createdb diskquota
  2. Append diskquota to the current list of preload libraries, specifying its version:

    $ gpconfig -c shared_preload_libraries -v '<current_value>,diskquota-2.4'

    where <current_value> is the value of the shared_preload_libraries configuration parameter before this step (can be empty).

    TIP

    You can get the current shared_preload_libraries value using gpconfig -s or use the following gpconfig call to append diskquota to the list preserving the current value:

    $ gpconfig -c shared_preload_libraries \
             -v "$(psql -At -c "SELECT array_to_string(array_append(string_to_array(current_setting('shared_preload_libraries'), ','), 'diskquota-2.4'), ',')" postgres)"
  3. Restart the cluster:

    $ gpstop -r

Activate diskquota in a database

To start enforcing disk quota limits in a database, connect to it using psql and register the diskquota extension:

CREATE EXTENSION diskquota;

Then, initialize diskquota with existing table size information by calling the diskquota.init_table_size_table() function. This operation may take some time.

SELECT diskquota.init_table_size_table();

About diskquota limits

What is counted

The disk usage for a table is calculated as the total size of its data files, indexes, TOAST tables, and free space map. For append-optimized tables, disk usage also includes their visibility map (including its B-tree index) and block directory tables.

IMPORTANT

diskquota counts the size that table files occupy on disk, not the size of the table’s visible rows. See Remove expired table rows via VACUUM for more information about visible and expired rows.

If the total size of these objects exceeds the configured limit, the table is placed in the diskquota denylist. This blocks all data-modifying operations that increase disk usage (such as INSERT, UPDATE, and COPY) on the table until one of the following conditions is met:

  • The table size falls below the limit as a result of operations such as DROP, TRUNCATE, or VACUUM FULL.

    NOTE

    Deleting rows and lazy VACUUM (without FULL) do not release disk space immediately, so these operations do not help remove tables from the denylist.

  • The limit is increased above the current table size using diskquota functions.

Read-only operations such as SELECT are always allowed.

When limits are checked

diskquota checks disk usage periodically rather than continuously. Because of this, there may be a delay between exceeding a limit and placing a table in the denylist, as well as between freeing space and removing it from the denylist.

Additionally, with soft limits (see below), quota checks are only performed before executing statements that may increase disk usage. As a result, some operations may exceed the configured limits.

To adjust the interval between disk usage checks, set the diskquota.naptime configuration parameter (in seconds).

Levels of quota enforcement

diskquota allows setting disk usage limits at the following levels:

  • Schema level — disk space available to all tables in a single schema.

  • Role level — disk space available to all tables owned by a specific role within a database. Note that although roles are cluster-wide objects, role-level quotas are enforced independently for each database.

  • Schema tablespace level — disk space available to all tables of a specific schema within a specific tablespace.

  • Role tablespace level — disk space available to all tables owned by a specific role within a specific tablespace.

  • Per-segment tablespace level — disk space that can be used on a single segment when a schema tablespace or role tablespace quota is enforced.

Soft and hard limits

By default, diskquota enforces soft limits. With soft limits, quota checks are performed before executing a statement that increases disk usage. If the limit is not exceeded at that moment, the statement is allowed to run to completion, even if it causes the limit to be exceeded during execution.

Hard limits provide stricter enforcement. When they are enabled, diskquota also checks limits during statement execution and terminates the statement if a limit is exceeded. To enable hard limits, set the diskquota.hard_limit configuration parameter to on.

IMPORTANT

Due to the implementation of PostgreSQL transaction semantics, neither soft nor hard limits guarantee that quotas will never be exceeded. For illustrations of such cases, see the examples: Example: using a schema disk quota and Example: setting a hard limit.

diskquota worker processes

diskquota uses background worker processes on the master host.

A single diskquota launcher process manages multiple worker processes. Worker processes are responsible for collecting table size information and enforcing quotas in databases. The number of worker processes is defined by the diskquota.max_workers configuration parameter.

NOTE

Worker processes are taken from the shared pool of Greengage DB background workers, whose size is determined by the max_worker_processes server configuration parameter. Therefore, diskquota.max_workers cannot exceed max_worker_processes.

If the number of diskquota-enabled databases is less than or equal to diskquota.max_workers, diskquota operates in static mode. In this mode, each database is assigned a dedicated worker process. It remains active as long as the extension is installed in that database.

If more databases use diskquota than there are available workers, diskquota switches to dynamic mode. In this mode, the launcher starts and stops worker processes for each disk usage check (that is, every diskquota.naptime seconds).

The total number of databases that can be monitored simultaneously is limited by the diskquota.max_monitored_databases configuration parameter.

Set or remove disk quota

diskquota provides the following functions to set and remove disk quota on different levels:

  • set_schema_quota() — schema-level quota;

  • set_role_quota() — role-level quota;

  • set_schema_tablespace_quota() — schema-level quota in a tablespace;

  • set_role_tablespace_quota() — role-level quota in a tablespace;

  • set_per_segment_quota() — per-segment tablespace quota.

The functions accept object identifiers and limit values as arguments. Limit values must include a unit: MB, GB, TB, or PB.

To remove a quota, specify the limit value as -1. This removes the restriction and allows the object to use all available disk space (subject to other quotas, if any).

IMPORTANT

Multiple quotas can apply to the same table simultaneously (for example, schema-level, role-level, and tablespace-level quotas). In such cases, diskquota enforces the most restrictive limit. If any of the applicable quotas is exceeded, the table is placed in the denylist and write operations are blocked.

This means that increasing one quota (for example, schema-level) may not restore write access if another quota (such as role-level or tablespace-level) is still exceeded.

Schema or role quota

To limit the size of all tables in a schema to 300 GB, call diskquota.set_schema_quota():

SELECT diskquota.set_schema_quota('sales', '300 GB');

To set a limit for all tables owned by a role in the current database, use diskquota.set_role_quota():

SELECT diskquota.set_role_quota('alice', '10 GB');

To remove a quota from the specified schema or role, call the same functions with -1 as the limit:

SELECT diskquota.set_schema_quota('sales', '-1');

Tablespace quota: schema and role levels

Tablespace quotas limit the amount of disk space that a schema or role can use within a specific tablespace. This is useful when different tablespaces are backed by different storage types, for example, fast or archival storage.

To set tablespace-level quotas, use the set_schema_tablespace_quota() and set_role_tablespace_quota() functions. Their arguments are: schema or role name, tablespace name, and quota value with a unit.

Examples:

SELECT diskquota.set_schema_tablespace_quota('sales', 'fast_storage', '10 GB');
SELECT diskquota.set_role_tablespace_quota('alice', 'fast_storage', '500 MB');

To remove a tablespace quota from the specified schema or role, call the same functions with -1 as the limit:

SELECT diskquota.set_schema_tablespace_quota('sales', 'fast_storage', '-1');

Per-segment tablespace quota

Per-segment tablespace quotas are applied in addition to schema or role tablespace quotas. They limit how much disk space a single primary segment can use within a tablespace and help prevent issues caused by data skew.

Use the diskquota.set_per_segment_quota() function to configure this limit. Its arguments are the tablespace name and a ratio value.

The ratio defines how much more disk space a single segment can use compared to the average per-segment usage. The average is calculated as:

average_per_segment = tablespace_quota / segment_num

The ratio acts as a multiplier for this average.

For example, a tablespace quota is set for a schema at 80 GB:

SELECT diskquota.set_schema_tablespace_quota('sales', 'fast_storage', '80 GB');

If the cluster has 8 primary segments, the average per-segment usage is 80 GB / 8, which is 10 GB. To allow a single segment to use up to twice this amount, set the ratio to 2.0:

SELECT diskquota.set_per_segment_quota('fast_storage', '2.0');

In this case, any single segment can use up to 20 GB in the fast_storage tablespace for the sales schema. The total for all segments cannot exceed 80 GB.

To remove a per-segment tablespace quota, call the function with -1 as the ratio:

SELECT diskquota.set_per_segment_quota('fast_storage', '-1');

This removes the per-segment quota while leaving the schema-level tablespace quota of 80 GB in effect.

Note that the per-segment quota is stored as a ratio rather than an absolute value. If the tablespace quota is changed using set_schema_tablespace_quota() or set_role_tablespace_quota(), the effective per-segment limit changes accordingly.

Check diskquota usage

View diskquota module information

The diskquota.status() function returns general information about the diskquota module:

SELECT diskquota.status();

Example result:

              status
----------------------------------
 ("soft limits",on)
 ("hard limits",off)
 ("current binary version",2.4.0)
 ("current schema version",2.4)
(4 rows)

View monitored databases

To view databases in which diskquota is enabled, query the diskquota_namespace.database_list table in the diskquota database:

\c diskquota
SELECT d.datname
FROM diskquota_namespace.database_list q,
     pg_database d
WHERE q.dbid = d.oid
ORDER BY d.datname;

Example result:

   datname
-------------
 marketplace
 books_store
(2 rows)

View configured quotas and disk usage

The following views show configured quotas and disk usage in the current database:

  • diskquota.show_fast_schema_quota_view — active schema-level quotas:

    SELECT * FROM diskquota.show_fast_schema_quota_view;

    Example result:

      schema_name | schema_oid | quota_in_mb | nspsize_in_bytes
    --------------+------------+-------------+------------------
     sales        |      16653 |        1024 |         52690944
    (1 row)
  • diskquota.show_fast_role_quota_view — active role-level quotas:

    SELECT * FROM diskquota.show_fast_role_quota_view;

    Example result:

     role_name | role_oid | quota_in_mb | rolsize_in_bytes
    -----------+----------+-------------+------------------
     alice     |    16657 |        1024 |        360972288
    (1 row)
  • diskquota.show_fast_schema_tablespace_quota_view — active schema-level tablespace quotas:

    SELECT * FROM diskquota.show_fast_schema_tablespace_quota_view;

    Example result:

      schema_name | schema_oid | tablespace_name | tablespace_oid | quota_in_mb | nspsize_tablespace_in_bytes
    --------------+------------+-----------------+----------------+-------------+-----------------------------
     sales        |      16653 | fast_storage    |          16671 |          90 |                      786432
    (1 row)
  • diskquota.show_fast_role_tablespace_quota_view — active role-level tablespace quotas:

    SELECT * FROM diskquota.show_fast_role_tablespace_quota_view;

    Example result:

     role_name | role_oid | tablespace_name | tablespace_oid | quota_in_mb | rolsize_tablespace_in_bytes
    -----------+----------+-----------------+----------------+-------------+-----------------------------
     alice     |    16657 | fast_storage    |          16671 |          90 |                           0
    (1 row)
  • diskquota.show_segment_ratio_quota_view — per-segment disk quota ratios:

    SELECT * FROM diskquota.show_segment_ratio_quota_view;

    Example result:

     tablespace_name | tablespace_oid | per_seg_quota_ratio
    -----------------+----------------+---------------------
     fast_storage    |          16671 |                   2
    (1 row)

Example: using a schema disk quota

  1. Create a schema in a database with diskquota enabled:

    CREATE SCHEMA limited_schema;
  2. Set a limit of 1 MB for this schema using diskquota.set_schema_quota():

    SELECT diskquota.set_schema_quota('limited_schema', '1 MB');
  3. Create a table in the schema and insert a small amount of data that does not exceed the limit:

    CREATE TABLE limited_schema.counter_table(i int);
    INSERT INTO limited_schema.counter_table SELECT generate_series(1,100);

    The operation succeeds because the quota is not exceeded.

    Check the current disk space usage in the diskquota.show_fast_schema_quota_view view:

    SELECT * FROM diskquota.show_fast_schema_quota_view;

    Result:

      schema_name   | schema_oid | quota_in_mb | nspsize_in_bytes
    ----------------+------------+-------------+------------------
     limited_schema |      16653 |           1 |           131072
    (1 row)
  4. Insert a large amount of data that exceeds the 1 MB limit:

    INSERT INTO limited_schema.counter_table SELECT generate_series(1,10000000);

    The operation succeeds because soft limits are enforced by default. With soft limits, the quota is checked before the statement starts. Since the limit is not exceeded at that moment, the statement is allowed to complete even though it adds more data than the quota allows.

    The diskquota.show_fast_schema_quota_view view now shows that the schema occupies significantly more space than the configured limit:

    SELECT * FROM diskquota.show_fast_schema_quota_view;

    Result:

      schema_name   | schema_oid | quota_in_mb | nspsize_in_bytes
    ----------------+------------+-------------+------------------
     limited_schema |      16653 |           1 |        360972288
    (1 row)

    During the next periodic check (after up to diskquota.naptime seconds), diskquota detects that the limit is exceeded and places the schema in the denylist. Further data-modifying operations that increase disk usage are then blocked.

  5. Try to insert one more row:

    INSERT INTO limited_schema.counter_table VALUES (10000001);

    The operation fails with the error:

    ERROR:  schema's disk space quota exceeded with name: limited_schema
  6. To remove the schema quota, set its value to -1:

    SELECT diskquota.set_schema_quota('limited_schema', '-1');

    Data-modifying operations are now allowed again:

    INSERT INTO limited_schema.counter_table VALUES (10000001);

    Result:

    INSERT 0 1

Example: setting a hard limit

To observe hard limit behavior:

  1. Set the 1 MB quota for the limited_schema again and truncate counter_table:

    SELECT diskquota.set_schema_quota('limited_schema', '1 MB');
    TRUNCATE TABLE limited_schema.counter_table;
  2. Enable hard limits by setting the diskquota.hard_limit parameter to on and reload the configuration:

    $ gpconfig -c diskquota.hard_limit -v 'on'
    $ gpstop -u
  3. Attempt to insert data that exceeds the limit:

    INSERT INTO limited_schema.counter_table SELECT generate_series(1,10000000);

    In this case, the statement is terminated during execution as soon as the quota is exceeded:

    ERROR:  schema's disk space quota exceeded with name: 16653  (seg1 10.92.41.158:10001 pid=4105)

Termination logically rolls back inserts made by the statement, so COUNT(*) shows that the table is still empty:

SELECT COUNT(*) FROM limited_schema.counter_table;

Result:

 count
-------
     0
(1 row)

However, data files that the statement has created before the termination remain on disk. You can check it with pg_total_relation_size():

SELECT pg_size_pretty(pg_total_relation_size('limited_schema.counter_table'));

Result:

 pg_size_pretty
----------------
 76 MB
(1 row)
NOTE

The output shows that just like soft limits, hard limits cannot guarantee that the quota is never exceeded.

Thus, further write operations are blocked although the table appears to be empty. To make them possible, you must reduce the physical size of the table below the quota using operations such as:

  • VACUUM FULL

  • TRUNCATE

  • DROP

These operations reclaim disk space and allow diskquota to remove the table from the denylist during the next check cycle.

Pause diskquota

Use the diskquota.pause() function to temporarily stop disk usage checks, for example, to exclude certain operations from quota enforcement or perform maintenance:

SELECT diskquota.pause();
NOTE

In some cases, VACUUM and VACUUM FULL operations can temporarily increase the size of a table, which may cause it to exceed the configured quota. To avoid unintended failures, it is recommended to pause diskquota before running VACUUM or VACUUM FULL.

To resume diskquota checks, call diskquota.resume():

SELECT diskquota.resume();

The effect of diskquota.pause() is not persistent and is lost after a cluster restart. If needed, call this function again after the restart to keep diskquota paused.

Deactivate diskquota

There are two ways to stop enforcing disk quota limits:

  • Drop the diskquota extension from a database.

  • Deactivate diskquota cluster-wide.

Deactivate in a database

To stop enforcing quotas in a specific database, drop the diskquota extension. It is recommended to call diskquota.pause() before dropping the extension to avoid possible deadlocks:

SELECT diskquota.pause();
DROP EXTENSION diskquota;

Deactivate cluster-wide

To completely stop using diskquota in the cluster, remove it from the shared_preload_libraries configuration parameter:

$ gpconfig -c shared_preload_libraries -v '<value_without_diskquota>'

where <value_without_diskquota> is the current value of shared_preload_libraries (obtained with gpconfig -s shared_preload_libraries) without the diskquota-<version> entry.

Then restart the cluster:

$ gpstop -r

To enable diskquota again, repeat the steps described in Enable diskquota.

Functions reference

init_table_size_table() returns void

Calculates sizes of database objects for the diskquota extension. Call this function after creating the diskquota extension in a database to initialize internal metadata used for quota enforcement.

set_role_quota(role_name text, quota text) returns void

Sets a disk quota for a role. The total size of all relations owned by this role in the current database cannot exceed this limit. Quota must be specified as a string consisting of a number and a unit: MB, GB, TB, or PB.

set_role_tablespace_quota(role_name text, tablespace_name text, quota text) returns void

Sets a tablespace disk quota for a role. The total size of all relations owned by this role in the specified tablespace cannot exceed this limit. Quota must be specified as a string consisting of a number and a unit: MB, GB, TB, or PB.

set_schema_quota(schema_name text, quota text) returns void

Sets a disk quota for a schema. The total size of all relations in this schema cannot exceed this limit. Quota must be specified as a string consisting of a number and a unit: MB, GB, TB, or PB.

set_schema_tablespace_quota(schema_name text, tablespace_name text, quota text) returns void

Sets a tablespace disk quota for a schema. The total size of all relations in this schema within the specified tablespace cannot exceed this limit. Quota must be specified as a string consisting of a number and a unit: MB, GB, TB, or PB.

set_per_segment_quota(tablespace_name text, ratio float4) returns void

Sets a per-segment quota ratio for a tablespace. The ratio limits how much disk space a single segment can use relative to the average per-segment usage for the corresponding tablespace quota.

pause() returns void

Temporarily disables quota enforcement by stopping error reporting when limits are exceeded. Note that diskquota background processes continue to run and collect size statistics.

See also: Pause diskquota.

resume() returns void

Resumes quota enforcement after it has been paused with pause().

See also: Pause diskquota.

status() returns table

Displays the diskquota binary and schema versions, as well as the status of soft and hard limit enforcement in the current database. The result includes two columns: name and status.

Configuration parameters reference

diskquota.hard_limit

Enables or disables hard limits on disk usage. When enabled, queries are terminated during execution as soon as a quota is exceeded.

Use this parameter in environments where strict quota enforcement is required and temporary quota overruns are unacceptable. Be aware that enabling hard limits may lead to more frequent query cancellations.

Learn more in Soft and hard limits.

Value range Default value Definition scope Context

boolean

off

master

sighup

diskquota.hashmap_overflow_report_timeout

Specifies the interval (in seconds) between warning messages about diskquota shared hashmap overflow.

Hashmap overflow indicates that internal tracking structures are undersized for the current workload. Frequent warnings may indicate that diskquota.max_active_tables or related limits should be increased.

Value range Default value Definition scope Context

0 — INT_MAX / 1000

60

master

superuser

diskquota.max_active_tables

Specifies the maximum number of relations whose sizes can be tracked simultaneously by diskquota. If the number of active tables exceeds this limit, some tables may not be tracked correctly, which can lead to delayed or inaccurate quota enforcement.

Increase this value in environments with a large number of frequently modified tables.

Value range Default value Definition scope Context

1 — INT_MAX

307200

master

postmaster

diskquota.max_monitored_databases

Specifies the maximum number of databases that diskquota can monitor simultaneously. If this limit is reached, additional databases with the diskquota extension enabled will not be monitored.

Increase this value in clusters with many databases using diskquota.

Value range Default value Definition scope Context

1 — 1024

50

master

postmaster

diskquota.max_quota_probes

Specifies the maximum number of quota probes in the cluster. A quota probe is an internal object responsible for monitoring a single specific quota, such as a role or a schema quota. diskquota creates probes for all objects whose size can be limited even if no limits are configured for them. For example, if there are 100 roles in the cluster, 100 probes are required for them even if only 10 roles have configured quotas.

Taking into account available levels of quota enforcement, the diskquota.max_quota_probes value must be at least the following:

role_num * database_num + schema_num + role_num * tablespace_num * database_num + schema_num * tablespace_num

where:

  • role_num is the number of roles in the cluster;

  • database_num is the number of databases;

  • schema_num is the number of schemas in the cluster;

  • tablespace_num is the number of tablespaces in the cluster.

Higher values allow accommodating more new objects over time, but increase memory consumption (48 bytes per probe).

Value range Default value Definition scope Context

131072 — INT_MAX

1048576

master

postmaster

diskquota.max_reject_entries

Specifies the maximum number of denylist entries per database. Each entry corresponds to an object (for example, a schema or role) that has exceeded its quota. If this limit is reached, new violations may not be tracked correctly.

Increase this value in environments with many simultaneously exceeded quotas.

Value range Default value Definition scope Context

1 — INT_MAX

8192

master

postmaster

diskquota.max_table_segments

Specifies the maximum number of table segments that diskquota can track.

This value should be greater than or equal to the total number of segment files across all monitored tables. In large clusters with many segments and partitions, this value may need to be increased.

Value range Default value Definition scope Context

131072 — INT_MAX

10485760

master

postmaster

diskquota.max_workers

Specifies the maximum number of diskquota worker processes that can run simultaneously. If the number of monitored databases exceeds this value, diskquota switches to dynamic mode, which may increase the latency of quota checks.

Increase this value to improve responsiveness of quota enforcement in clusters with many databases, keeping in mind the limit imposed by max_worker_processes.

Value range Default value Definition scope Context

1 — 20

10

master

postmaster

diskquota.naptime

Specifies the interval (in seconds) between periodic relation size calculations.

Smaller values make quota enforcement more responsive but increase system overhead. Larger values reduce overhead but increase the delay between exceeding a quota and enforcement.

Adjust this parameter based on the required balance between responsiveness and performance.

Value range Default value Definition scope Context

1 — INT_MAX

2

master

sighup

diskquota.worker_timeout

Specifies the timeout (in seconds) for waiting for a worker process to complete its task. If workers frequently time out, it may indicate system overload or insufficient worker capacity.

Increase this value in environments with large datasets or slow storage.

Value range Default value Definition scope Context

1 — INT_MAX

60

master

sighup