
Pre-installation configuration

Andrey Aksenov

This topic describes how to prepare the operating system environment before installing Greengage DB, which is based on Greenplum.

Prerequisites

Before configuring the operating system environment, ensure your host systems meet the requirements described in these topics:

Configure an operating system

Host name

You are free to assign any names to your Greengage DB cluster hosts. However, following the standard naming convention can help maintain consistency:

  • A master host — mdw.

  • A standby master host — smdw.

  • Segment hosts — sdw1, sdw2, sdw3, and so on.

NIC bonding is recommended if a cluster host has multiple network interfaces. If a cluster host has several unbonded network interfaces, the convention is to append a dash (-) and a number to the host name, for example, sdw1-1, sdw1-2, and so on.

Host name resolution

Greengage DB requires consistent name resolution across all hosts in the cluster. For example, you can use a DNS service or define mapping rules manually in the /etc/hosts file.

The /etc/hosts file should include all the host names and interface address names for every host in the cluster. For example, /etc/hosts might look as follows for a Greengage DB cluster with one master, one standby master, and two segment hosts:

# ...
192.168.1.10 mdw
192.168.1.20 smdw
192.168.1.30 sdw1
192.168.1.40 sdw2
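To confirm that resolution is consistent, you can run the same check on every host. The following is a minimal sketch, not part of Greengage DB: the unresolved_hosts function name is hypothetical, and the host names are the ones from the example above. It relies on getent, which queries the same sources (/etc/hosts, DNS) that other programs on the host use.

```shell
# Print every host name from the argument list that does NOT resolve
# on this machine. Prints nothing when all names resolve.
unresolved_hosts() {
    local host
    for host in "$@"; do
        if ! getent hosts "$host" > /dev/null 2>&1; then
            echo "$host"
        fi
    done
}

# Example, using the host names from the sample /etc/hosts file:
#   unresolved_hosts mdw smdw sdw1 sdw2
```

Any name printed by the function needs a DNS record or an /etc/hosts entry on the host where the check was run.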

Maximum transmission unit (MTU) settings

If the maximum transmission unit (MTU) size used in your Interconnect network is 9000 bytes (jumbo frame MTU), you must configure the operating system’s MTU settings for host network interfaces accordingly.
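As an illustration, the sketch below reads the current MTU of an interface from sysfs so that you can compare it with 9000. The iface_mtu helper and the eth0 interface name are assumptions made for the example. The optional second argument exists only so that the function can be pointed at a different sysfs directory.

```shell
# Print the MTU of a network interface as reported by sysfs.
# $1: interface name (eth0 below is only a placeholder)
# $2: optional sysfs directory, defaults to /sys/class/net
iface_mtu() {
    local iface="$1" sysfs_net="${2:-/sys/class/net}"
    cat "$sysfs_net/$iface/mtu"
}

# Example (commented out because the second command changes system state):
#   iface_mtu eth0
#   sudo ip link set dev eth0 mtu 9000
```

A value set with ip link does not survive a reboot. To make it permanent, use the network configuration mechanism of your operating system.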

SELinux

You need to turn off SELinux for all Greengage DB cluster hosts running RHEL or CentOS. To do this, open the /etc/selinux/config file and set the SELINUX parameter to disabled:

SELINUX=disabled

To apply the changes, reboot the system.
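To check the result, a sketch such as the following reads the configured mode back from the file. The selinux_config_mode name is hypothetical. On a running system, the getenforce command reports the mode currently in effect.

```shell
# Print the SELINUX value set in an SELinux configuration file.
# $1: optional path, defaults to /etc/selinux/config
selinux_config_mode() {
    local config="${1:-/etc/selinux/config}"
    # Match the uncommented SELINUX= line only, not SELINUXTYPE=.
    sed -n 's/^SELINUX=\([a-z]*\).*$/\1/p' "$config"
}

# Example:
#   selinux_config_mode    # configured mode, applied at the next boot
#   getenforce             # mode currently in effect
```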

Kernel parameters

The /etc/sysctl.conf file allows you to set kernel parameters that affect performance and keep behavior consistent across environments. If necessary, you can adjust these parameters according to your specific setup.

The recommended parameter values are the following:

kernel.core_pipe_limit=0
net.core.rmem_max=2097152
net.core.rmem_default=26214400
net.core.wmem_max=2097152
net.core.wmem_default=26214400
net.ipv6.conf.all.disable_ipv6=1             # If IPv4 is used
net.ipv6.conf.default.disable_ipv6=1         # If IPv4 is used
kernel.sysrq=1
kernel.core_uses_pid=1
kernel.shmmni=4096
kernel.sem=250 2048000 200 8192
kernel.msgmnb=65536
kernel.msgmax=65536
kernel.msgmni=2048
net.ipv4.tcp_syncookies=1
net.ipv4.conf.default.accept_source_route=0
net.ipv4.tcp_max_syn_backlog=4096
net.ipv4.conf.all.arp_filter=1
net.ipv4.ip_local_port_range=10000 65535     # Affects Greengage DB port settings
net.ipv4.ipfrag_high_thresh=41943040
net.ipv4.ipfrag_low_thresh=31457280
net.ipv4.ipfrag_time=60
net.core.netdev_max_backlog=10000
vm.overcommit_memory=2
vm.overcommit_ratio=95
vm.swappiness=10
vm.zone_reclaim_mode=0
vm.dirty_expire_centisecs=500
vm.dirty_writeback_centisecs=100
vm.dirty_background_ratio=0
vm.dirty_ratio=0
vm.dirty_background_bytes=1610612736
vm.dirty_bytes=4294967296

Note the following:

  • Set the net.ipv6.conf.all.disable_ipv6 and net.ipv6.conf.default.disable_ipv6 parameters to 1 if IPv4 is used for networking.

  • The port range configured using net.ipv4.ip_local_port_range and the ports set in the database configuration file shouldn’t conflict. If net.ipv4.ip_local_port_range is 10000 65535, set the PORT_BASE and MIRROR_PORT_BASE values outside this range, for example:

    PORT_BASE=6000
    MIRROR_PORT_BASE=7000
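The port check described above can be sketched as a small shell function. The port_outside_range name is hypothetical; the port and range values are the ones used in this section.

```shell
# Succeed if a port lies outside the local (ephemeral) port range.
# $1: port to check, $2: lower bound of the range, $3: upper bound
port_outside_range() {
    local port="$1" low="$2" high="$3"
    [ "$port" -lt "$low" ] || [ "$port" -gt "$high" ]
}

# Example: read the range currently in effect and check PORT_BASE.
#   read -r low high < /proc/sys/net/ipv4/ip_local_port_range
#   port_outside_range 6000 "$low" "$high" && echo "no conflict"
```

With the range 10000 65535, both 6000 and 7000 pass the check, while a value such as 12000 does not.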

To apply the changes made in /etc/sysctl.conf, run the sysctl command:

$ sudo sysctl --system
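After reloading, you may want to verify that the running kernel reports the values from the file. The sketch below is one way to do that. The sysctl_pairs and sysctl_mismatches helper names are hypothetical, and the file is assumed to use the key=value form shown above, without spaces around the equals sign.

```shell
# Print "key value" pairs from a sysctl-style file,
# dropping comments, trailing spaces, and blank lines.
sysctl_pairs() {
    sed -e 's/#.*$//' -e 's/[[:space:]]*$//' -e '/^[[:space:]]*$/d' "$1" |
        while IFS='=' read -r key value; do
            printf '%s %s\n' "$key" "$value"
        done
}

# Report every parameter whose running value differs from the file.
sysctl_mismatches() {
    sysctl_pairs "$1" | while read -r key expected; do
        # sysctl separates multi-value parameters with tabs.
        actual=$(sysctl -n "$key" 2>/dev/null | tr '\t' ' ')
        [ "$actual" = "$expected" ] ||
            echo "$key: expected '$expected', got '$actual'"
    done
}

# Example:
#   sysctl_mismatches /etc/sysctl.conf
```

No output means that every parameter listed in the file matches the running kernel.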

System resource limits

The /etc/security/limits.conf file can be used to increase resource limits for Greengage DBMS. This example shows how to set the maximum number of files and processes the gpadmin user can have open or running simultaneously:

gpadmin soft nofile 524288
gpadmin hard nofile 524288
gpadmin soft nproc 150000
gpadmin hard nproc 150000

The gpadmin user is a dedicated operating system account that should be created on each host to run and administer Greengage DBMS. You can learn more in Create the Greengage DB administrative user.
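To confirm that the limits from the example above are in effect, you can compare the values reported by ulimit with the configured ones. This is a sketch only; the limit_at_least helper name is hypothetical.

```shell
# Succeed if a limit value meets a minimum; "unlimited" always does.
# $1: actual value as printed by ulimit, $2: required minimum
limit_at_least() {
    local actual="$1" minimum="$2"
    [ "$actual" = "unlimited" ] || [ "$actual" -ge "$minimum" ]
}

# Example, run as gpadmin in a new login session:
#   limit_at_least "$(ulimit -n)" 524288 && echo "nofile ok"
#   limit_at_least "$(ulimit -u)" 150000 && echo "nproc ok"
```

Limits from /etc/security/limits.conf are applied when a session starts, so check them in a session opened after the file was changed.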

IPC object removal

The RemoveIPC option in the /etc/systemd/logind.conf file controls whether IPC objects are removed when a non-system user logs out. You need to deactivate IPC object removal by setting RemoveIPC to no:

RemoveIPC=no

To apply the changes, restart systemd-logind:

$ sudo systemctl restart systemd-logind

NOTE

You can also deactivate IPC object removal by creating the gpadmin user as a system account. To do this, pass both -r and -m options to the useradd command.

Transparent huge pages (THP)

Deactivate transparent huge pages (THP), as they might degrade Greengage DBMS performance. To learn how to do this, see your operating system documentation. For example, the Configuring Transparent Huge Pages topic describes how to manage transparent huge pages in RHEL.
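To see whether THP is currently active, you can read the kernel's sysfs entry, where the mode in effect is enclosed in square brackets. The sketch below assumes the /sys/kernel/mm/transparent_hugepage/enabled path used by most current kernels; the thp_mode helper name is hypothetical.

```shell
# Print the active THP mode, which the kernel marks with square brackets.
# For example, "always madvise [never]" yields "never".
# $1: optional path to the sysfs entry
thp_mode() {
    local file="${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
    sed -n 's/.*\[\([a-z+]*\)\].*/\1/p' "$file"
}

# Example:
#   thp_mode
```

A result of never indicates that THP is deactivated.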

XFS mount options

Use the mount command with the following recommended options to mount storage devices, depending on your operating system.

Ubuntu:

rw,nodev,noatime,inode64

RHEL/CentOS:

rw,nodev,noatime,nobarrier,inode64
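To check an already mounted file system against the recommended options, you can compare them with the option list that the kernel reports. This is a minimal sketch; the has_mount_options helper name and the /data mount point are assumptions made for the example.

```shell
# Succeed if every required option appears in a comma-separated option list.
# $1: option list, for example the output of: findmnt -no OPTIONS /data
# remaining arguments: options that must be present
has_mount_options() {
    local options=",$1," required
    shift
    for required in "$@"; do
        case "$options" in
            *",$required,"*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}

# Example:
#   has_mount_options "$(findmnt -no OPTIONS /data)" rw nodev noatime inode64
```

The kernel may report additional options next to the requested ones, so the function only checks that the required options are present.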

Disk I/O settings

  • Read-ahead value

    Each disk device file should have a read-ahead (blockdev) value of 8192.

  • Disk I/O scheduler

    The table below lists the recommended scheduler for specific storage device types.

    Storage device type                   Disk scheduler
    Non-Volatile Memory Express (NVMe)    none
    Solid-state drives (SSD)              Ubuntu: none; RHEL/CentOS: noop
    Hard disk drives (HDD)                Ubuntu: mq-deadline; RHEL/CentOS: deadline

    To learn how to set the scheduler, see the documentation for your operating system.
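The current settings can be inspected before changing anything. The sketch below is an illustration under assumptions: sdb is a placeholder device name and active_scheduler is a hypothetical helper. It relies on the sysfs scheduler file, where the scheduler in use is enclosed in square brackets.

```shell
# Print the active I/O scheduler from a sysfs scheduler file.
# For example, "mq-deadline kyber [none]" yields "none".
# $1: path such as /sys/block/sdb/queue/scheduler
active_scheduler() {
    sed -n 's/.*\[\([a-z-]*\)\].*/\1/p' "$1"
}

# Examples (sdb is a placeholder device name):
#   active_scheduler /sys/block/sdb/queue/scheduler
#   sudo blockdev --getra /dev/sdb        # read-ahead in 512-byte sectors
#   sudo blockdev --setra 8192 /dev/sdb   # lasts until reboot only
```

Values set with blockdev --setra are lost on reboot, so a persistent setup needs a mechanism provided by your operating system.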

SSH connection threshold

Greengage DB cluster management utilities, such as gpinitsystem or gpexpand, use secure shell (SSH) connections between systems to perform their tasks. In large deployments, these utilities may exceed the host’s maximum threshold for unauthenticated connections. To prevent such issues, specify MaxStartups in the /etc/ssh/sshd_config file as follows:

MaxStartups 1000:30:1022

Then, restart the sshd service:

$ sudo systemctl restart sshd
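The MaxStartups value has the form start:rate:full. According to the sshd_config manual, sshd starts refusing new unauthenticated connections with a probability of rate percent once start of them are open, and refuses all of them when their number reaches full. The sketch below only splits the value into these three fields; the describe_max_startups name is hypothetical.

```shell
# Split a MaxStartups value of the form start:rate:full into labeled fields.
describe_max_startups() {
    local value="$1" start rate full rest
    start=${value%%:*}   # text before the first colon
    rest=${value#*:}     # text after the first colon
    rate=${rest%%:*}     # middle field
    full=${value##*:}    # text after the last colon
    echo "start=$start rate=$rate% full=$full"
}

# Example:
#   describe_max_startups 1000:30:1022
#   sudo sshd -T | grep -i maxstartups   # value sshd actually uses
```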

Time synchronization

Use the Network Time Protocol (NTP) to synchronize the system clocks on all the hosts that constitute your Greengage DB cluster. The recommended NTP primary source is one of the following:

  • Master host

    In this case, the standby master and all segment hosts connect to it.

  • External NTP server

    In this case, all cluster hosts connect to it.

Depending on your operating system, NTP may be implemented by the ntpd daemon, the chronyd daemon, or another service. Refer to the corresponding documentation to learn how to synchronize the system clocks.

Create the Greengage DB administrative user

This section explains how to create a system user account used to run and administer Greengage DB.

Create the gpadmin user

  1. Create the gpadmin group:

    $ sudo groupadd gpadmin
  2. Create a system gpadmin user and add it to the gpadmin group:

    $ sudo useradd gpadmin -r -m -g gpadmin
  3. Set the password for the gpadmin user:

    $ sudo passwd gpadmin
  4. Grant the gpadmin user the ability to execute commands with superuser privileges using sudo.

    Ubuntu:

    $ sudo adduser gpadmin sudo

    RHEL/CentOS:

    $ sudo usermod -aG wheel gpadmin
  5. Allow users with superuser privileges to execute commands without entering a password. First, execute the visudo command:

    $ sudo visudo

    Then, add the following line to the opened file and save it.

    Ubuntu:

    %sudo    ALL=(ALL)    NOPASSWD: ALL

    RHEL/CentOS:

    %wheel    ALL=(ALL)    NOPASSWD: ALL

Generate an SSH key pair

The gpadmin user must have an SSH key pair installed on each cluster host. This makes it possible to set up passwordless SSH, so that gpadmin can connect from any host to any other host without entering a password or passphrase.

To generate an SSH key pair for gpadmin:

  1. Switch to the gpadmin user:

    $ su - gpadmin
  2. Switch to a Bash shell:

    $ bash
  3. Generate an SSH key pair for the gpadmin user using ssh-keygen:

    $ ssh-keygen -t rsa -b 4096
  4. Press Enter to use the default paths for SSH key files:

    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/gpadmin/.ssh/id_rsa):
  5. Press Enter two times to skip specifying a passphrase:

    Created directory '/home/gpadmin/.ssh'.
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:

    The output should include the following lines:

    Your identification has been saved in /home/gpadmin/.ssh/id_rsa
    Your public key has been saved in /home/gpadmin/.ssh/id_rsa.pub
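The interactive steps above can also be scripted, which helps when many hosts need a key pair. This is a sketch under assumptions: the generate_key_pair helper name is hypothetical, and it uses the -N and -f options of ssh-keygen to supply an empty passphrase and the key path instead of answering the prompts.

```shell
# Generate a 4096-bit RSA key pair with an empty passphrase, without prompts.
# $1: private key path; the public key is written to "$1.pub".
# The target directory must already exist.
generate_key_pair() {
    local key_path="$1"
    # -N "" sets an empty passphrase, -f sets the output file,
    # -q suppresses the progress output.
    ssh-keygen -t rsa -b 4096 -N "" -f "$key_path" -q
}

# Example, run as gpadmin:
#   mkdir -p -m 700 "$HOME/.ssh"
#   generate_key_pair "$HOME/.ssh/id_rsa"
```

If a key already exists at the given path, ssh-keygen asks before overwriting it, so the function is best used on hosts where gpadmin has no key yet.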

Configure endpoint security software

Endpoint security software can interfere with Greengage DBMS operations and affect database performance and stability.

  • Firewall

    Turn off firewall software. If it must stay enabled, configure it according to the network requirements described in this topic: Network requirements for Greengage DB installation.

  • Antivirus

    It is recommended that you turn off antivirus software before Greengage DB installation and during operation. Otherwise, contact your antivirus software vendor to determine the settings necessary to let Greengage DB operate correctly.