gpinitsystem
Initializes a Greengage DB system using configuration parameters specified in a configuration file.
Synopsis
gpinitsystem -c <cluster_configuration_file>
[ -h <hostfile_gpinitsystem> ]
[ -B <parallel_processes> ]
[ -p <postgresql_conf_param_file> ]
[ -s <standby_master_host>
[ -P <standby_master_port> ]
[ -S <standby_master_datadir>
| --standby_datadir=<standby_master_datadir> ] ]
[ --ignore-warnings ]
[ -m <number> | --max_connections=<number> ]
[ -b <size> | --shared_buffers=<size> ]
[ -n <locale> | --locale=<locale> ]
[ --lc-collate=<locale> ]
[ --lc-ctype=<locale> ]
[ --lc-messages=<locale> ]
[ --lc-monetary=<locale> ]
[ --lc-numeric=<locale> ]
[ --lc-time=<locale> ]
[ -e <password> | --su_password=<password> ]
[ --mirror-mode={group|spread} ]
[ -a ] [ -q ]
[ -l <logfile_directory> ]
[ -D ]
[ -I <input_configuration_file> ]
[ -O <output_configuration_file> ]
gpinitsystem -v | --version
gpinitsystem -? | --help
Description
The gpinitsystem utility creates a Greengage DB instance or writes an input configuration file using the values defined in a cluster configuration file and any command-line options that you provide.
See Initialization configuration file format for more information about the configuration file.
Before running this utility, make sure that you have installed the Greengage DB software on all the hosts in the array.
With the -O <output_configuration_file> option, gpinitsystem writes all provided configuration information to the specified output file.
This file can be used with the -I option to create a new cluster or re-create a cluster from a backed up configuration.
In a Greengage DBMS, each database instance (the master instance and all segment instances) must be initialized across all of the hosts in the system in such a way that they can all work together as a unified DBMS.
The gpinitsystem utility takes care of initializing the Greengage DB master and each segment instance, and configuring the system as a whole.
Before running gpinitsystem, you must set the GPHOME environment variable to point to the location of your Greengage DB installation on the master host and exchange SSH keys between all host addresses in the array using gpssh-exkeys.
gpinitsystem performs the following tasks:
-
Verifies that the parameters in the configuration file are correct.
-
Ensures that a connection can be established to each host address. If a host address cannot be reached, the utility will exit.
-
Verifies the locale settings.
-
Displays the configuration that will be used and prompts the user for confirmation.
-
Initializes the master instance.
-
Initializes the standby master instance (if specified).
-
Initializes the primary segment instances.
-
Initializes the mirror segment instances (if mirroring is configured).
-
Configures the Greengage DB system and checks for errors.
-
Starts the Greengage DB system.
gpinitsystem uses secure shell (SSH) connections between systems to perform its tasks.
In large Greengage DB deployments, cloud deployments, or deployments with a large number of segments per host, this utility may exceed the host’s maximum threshold for unauthenticated connections.
Consider updating the SSH MaxStartups and MaxSessions configuration parameters to increase this threshold.
For more information about SSH configuration options, refer to the SSH documentation for your Linux distribution.
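As a rough way to see what a host currently sets, the following bash sketch prints any explicit MaxStartups and MaxSessions lines from an sshd configuration file. The function name show_ssh_limits and the default path /etc/ssh/sshd_config are assumptions for illustration. If nothing is printed, the file sets neither parameter explicitly.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Greengage DB): print explicit
# MaxStartups/MaxSessions settings from an sshd configuration file.
# The default path below is an assumption; pass another path as $1.
show_ssh_limits() {
    local conf="${1:-/etc/ssh/sshd_config}"
    # Match uncommented lines only; sshd keywords are case-insensitive.
    grep -Ei '^[[:space:]]*(MaxStartups|MaxSessions)[[:space:]]' "$conf"
}
```

Run show_ssh_limits on each host (for example through gpssh) to compare the settings across the cluster.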
Options
- -a
-
Do not prompt the user for confirmation.
- -B <parallel_processes>
-
The number of segments to create in parallel. If not specified, the utility will start up to 4 parallel processes at a time.
- -c <cluster_configuration_file>
-
(Required) The full path and filename of the configuration file, which contains all of the defined parameters to configure and initialize a new Greengage DB system. See Initialization configuration file format for a description of this file. You must provide either -c <cluster_configuration_file> or -I <input_configuration_file>.
- -D
-
Set log output level to debug.
- -h <hostfile_gpinitsystem>
-
(Optional) The full path and filename of a file that contains the host addresses of your segment hosts. If not specified on the command line, you can specify the host file using the MACHINE_LIST_FILE parameter in the <cluster_configuration_file> file.
- -I <input_configuration_file>
-
The full path and filename of an input configuration file, which defines the Greengage DB host systems, the master instance and segment instances on the hosts, using the QD_PRIMARY_ARRAY, PRIMARY_ARRAY, and MIRROR_ARRAY parameters. The input configuration file is typically created by using gpinitsystem with the -O <output_configuration_file> option. Edit those parameters in order to initialize a new cluster or re-create a cluster from a backed up configuration. You must provide either the -c <cluster_configuration_file> option or the -I <input_configuration_file> option to gpinitsystem.
- --ignore-warnings
-
Control the value returned by gpinitsystem when warnings or an error occurs. The utility returns 0 if system initialization completes without warnings. If only warnings occur, system initialization completes and the system is operational.
With this option, gpinitsystem also returns 0 if warnings occurred during system initialization, and returns a non-zero value if a fatal error occurs.
If this option is not specified, gpinitsystem returns 1 if initialization completes with warnings, and returns a value of 2 or greater if a fatal error occurs.
See the gpinitsystem log file for warning and error messages.
- -n <locale> | --locale=<locale>
-
Set the default locale used by Greengage DB. If not specified, the default locale is en_US.utf8. A locale identifier consists of a language identifier and a region identifier, and optionally a character set encoding. For example, sv_SE is Swedish as spoken in Sweden, en_US is U.S. English, and fr_CA is French Canadian. If more than one character set can be useful for a locale, then the specifications look like this: en_US.UTF-8 (locale specification and character set encoding). On most systems, the command locale will show the locale environment settings and locale -a will show a list of all available locales.
- --lc-collate=<locale>
-
Similar to --locale, but sets the locale used for collation (sorting data). The sort order cannot be changed after Greengage DB is initialized, so it is important to choose a collation locale that is compatible with the character set encodings that you plan to use for your data. There is a special collation name of C or POSIX (byte-order sorting as opposed to dictionary-order sorting). The C collation can be used with any character encoding.
- --lc-ctype=<locale>
-
Similar to --locale, but sets the locale used for character classification (what character sequences are valid and how they are interpreted). This cannot be changed after Greengage DB is initialized, so it is important to choose a character classification locale that is compatible with the data you plan to store in Greengage DB.
- --lc-messages=<locale>
-
Similar to --locale, but sets the locale used for messages output by Greengage DB. The current version of Greengage DB does not support multiple locales for output messages (all messages are in English), so changing this setting will not have any effect.
- --lc-monetary=<locale>
-
Similar to --locale, but sets the locale used for formatting currency amounts.
- --lc-numeric=<locale>
-
Similar to --locale, but sets the locale used for formatting numbers.
- --lc-time=<locale>
-
Similar to --locale, but sets the locale used for formatting dates and times.
- -l <logfile_directory>
-
The directory in which to write the log file. Defaults to ~/gpAdminLogs.
- -m <number> | --max_connections=<number>
-
Set the maximum number of client connections allowed to the master. The default is 250.
- -O <output_configuration_file>
-
(Optional) Used during new cluster initialization. This option writes the <cluster_configuration_file> information (used with -c) to the specified <output_configuration_file>. This file defines the Greengage DB members using the QD_PRIMARY_ARRAY, PRIMARY_ARRAY, and MIRROR_ARRAY parameters. Use this file as a template for the -I <input_configuration_file> option. See Examples for more information.
- -p <postgresql_conf_param_file>
-
(Optional) The name of a file that contains postgresql.conf parameter settings that you want to set for Greengage DB. These settings will be used when the individual master and segment instances are initialized. You can also set parameters after initialization using the gpconfig utility.
- -q
-
Run in quiet mode. Command output is not displayed on the screen, but is still written to the log file.
- -b <size> | --shared_buffers=<size>
-
Set the amount of memory a Greengage DB server instance uses for shared memory buffers. You can specify sizing in kilobytes (kB), megabytes (MB) or gigabytes (GB). The default is 125MB.
- -s <standby_master_host>
-
(Optional) If you wish to configure a backup master instance, specify the host name using this option. The Greengage DB software must already be installed and configured on this host.
- -P <standby_master_port>
-
If you configure a standby master instance with -s, specify its port number using this option. The default port is the same as the master port. To run the standby and master on the same host, you must use this option to specify a different port for the standby. The Greengage DB software must already be installed and configured on the standby host.
- -S <standby_master_datadir> | --standby_datadir=<standby_master_datadir>
-
If you configure a standby master host with -s, use this option to specify its data directory. If you configure a standby on the same host as the master instance, the master and standby must have separate data directories.
- -e <password> | --su_password=<password>
-
Use this option to specify the password to set for the Greengage DB superuser account (such as gpadmin). If this option is not specified, the default password gparray is assigned to the superuser account. You can use the ALTER ROLE command to change the password at a later time.
Recommended security best practices:
-
Do not use the default password option for production environments.
-
Change the password immediately after installation.
- --mirror-mode={group|spread}
-
Use this option to specify the placement of mirror segment instances on the segment hosts. The default, group, groups the mirror segments for all of a host’s primary segments on a single alternate host. spread spreads mirror segments for the primary segments on a host across different hosts in the Greengage DB cluster. Spreading is only allowed if the number of hosts is greater than the number of segment instances per host. See Enable cluster mirroring for information about Greengage DB mirroring strategies.
- -v | --version
-
Print the gpinitsystem version and exit.
- -? | --help
-
Show help about gpinitsystem command line arguments, and exit.
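The return values described for --ignore-warnings lend themselves to a small wrapper in automation scripts. The following is a minimal bash sketch, assuming the option is not specified (so 1 means warnings and 2 or greater means a fatal error). The function name classify_gpinitsystem_rc is hypothetical and not part of Greengage DB.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a gpinitsystem exit status to a label,
# following the documented behavior WITHOUT --ignore-warnings:
#   0  -> completed without warnings
#   1  -> completed with warnings (system is operational)
#   >1 -> fatal error
classify_gpinitsystem_rc() {
    local rc="$1"
    if [ "$rc" -eq 0 ]; then
        echo "ok"
    elif [ "$rc" -eq 1 ]; then
        echo "warnings"
    else
        echo "fatal"
    fi
}

# Possible usage (commented out so the sketch has no side effects):
# gpinitsystem -c init_config -a
# classify_gpinitsystem_rc "$?"
```

In either case, the gpinitsystem log file remains the place to look for the actual warning and error messages.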
Initialization configuration file format
gpinitsystem requires a cluster configuration file with the following parameters defined.
An example initialization configuration file can be found in $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config.
To avoid port conflicts between Greengage DB and other applications, the Greengage DB port numbers should not be in the range specified by the operating system parameter net.ipv4.ip_local_port_range.
For example, if net.ipv4.ip_local_port_range = 12000 65535, you could set Greengage DB base port numbers to these values:
PORT_BASE=10000
MIRROR_PORT_BASE=10500
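To make the port check concrete, the bash sketch below tests whether a candidate base port falls inside a given local port range. The function name port_in_local_range is hypothetical, and the commented usage assumes a Linux host where the range can be read from /proc/sys/net/ipv4/ip_local_port_range.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 (true) when PORT lies inside LOW..HIGH,
# that is, when it could conflict with the operating system's local
# port range; return 1 otherwise.
port_in_local_range() {
    local port="$1" low="$2" high="$3"
    [ "$port" -ge "$low" ] && [ "$port" -le "$high" ]
}

# Possible usage on Linux (commented out; the /proc path is Linux-specific):
# read -r low high < /proc/sys/net/ipv4/ip_local_port_range
# if port_in_local_range 10000 "$low" "$high"; then
#     echo "PORT_BASE 10000 is inside $low-$high, choose another value"
# fi
```

With the range 12000 65535 from the example, both 10000 and 10500 fall outside the range, so the check returns false for them.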
- ARRAY_NAME
-
(Required) A name for the cluster you are configuring. You can use any name you like. Enclose the name in quotes if the name contains spaces.
- MACHINE_LIST_FILE
-
(Optional) Can be used in place of the -h option. This specifies the file that contains the list of the segment host address names that comprise the Greengage DB system. The master host is assumed to be the host from which you are running the utility and should not be included in this file. If your segment hosts have multiple network interfaces, then this file would include all addresses for the host. Give the absolute path to the file.
- SEG_PREFIX
-
(Required) This specifies a prefix that will be used to name the data directories on the master and segment instances. The naming convention for data directories in a Greengage DB system is <SEG_PREFIX><number>, where <number> starts with 0 for segment instances (the master is always -1). So for example, if you choose the prefix gpseg, your master instance data directory would be named gpseg-1, and the segment instances would be named gpseg0, gpseg1, gpseg2, gpseg3, and so on.
- PORT_BASE
-
(Required) This specifies the base number by which primary segment port numbers are calculated. The first primary segment port on a host is set as PORT_BASE, and then incremented by one for each additional primary segment on that host. Valid values range from 1 through 65535.
- DATA_DIRECTORY
-
(Required) This specifies the data storage location(s) where the utility will create the primary segment data directories. The number of locations in the list dictates the number of primary segments that will be created per physical host. If multiple addresses for a host are listed in the host file, the number of segments will be spread evenly across the specified interface addresses. You can specify the same data storage area multiple times if you want your data directories created in the same location. The user who runs gpinitsystem (for example, the gpadmin user) must have permission to write to these directories.
For example, this will create six primary segments per host:
declare -a DATA_DIRECTORY=(/data1/primary /data1/primary /data1/primary /data2/primary /data2/primary /data2/primary)
- MASTER_HOSTNAME
-
(Required) The host name of the master instance. This host name must exactly match the configured host name of the machine (run the hostname command to determine the correct hostname).
- MASTER_DIRECTORY
-
(Required) This specifies the location where the data directory will be created on the master host. You must make sure that the user who runs gpinitsystem (for example, the gpadmin user) has permissions to write to this directory.
- MASTER_PORT
-
(Required) The port number for the master instance. This is the port number that users and client connections will use when accessing the Greengage DB system.
- TRUSTED_SHELL
-
(Required) The shell the gpinitsystem utility uses to run commands on remote hosts. Allowed values: ssh. You must set up your trusted host environment before running the gpinitsystem utility (you can use gpssh-exkeys to do this).
- CHECK_POINT_SEGMENTS
-
(Required) Maximum distance between automatic write ahead log (WAL) checkpoints, in log file segments (each segment is normally 16 megabytes). This will set the checkpoint_segments parameter in the postgresql.conf file for each segment instance in the Greengage DB system.
- ENCODING
-
(Required) The character set encoding to use. This character set must be compatible with the --locale settings used, especially --lc-collate and --lc-ctype. Greengage DB supports the same character sets as PostgreSQL.
- DATABASE_NAME
-
(Optional) The name of a Greengage DB database to create after the system is initialized. You can always create a database later using the CREATE DATABASE command or the createdb utility.
- MIRROR_PORT_BASE
-
(Optional) This specifies the base number by which mirror segment port numbers are calculated. The first mirror segment port on a host is set as MIRROR_PORT_BASE, and then incremented by one for each additional mirror segment on that host. Valid values range from 1 through 65535 and cannot conflict with the ports calculated by PORT_BASE.
- MIRROR_DATA_DIRECTORY
-
(Optional) This specifies the data storage location(s) where the utility will create the mirror segment data directories. There must be the same number of data directories declared for mirror segment instances as for primary segment instances (see the DATA_DIRECTORY parameter). The user who runs gpinitsystem (for example, the gpadmin user) must have permission to write to these directories. For example:
declare -a MIRROR_DATA_DIRECTORY=(/data1/mirror /data1/mirror /data1/mirror /data2/mirror /data2/mirror /data2/mirror)
- QD_PRIMARY_ARRAY, PRIMARY_ARRAY, MIRROR_ARRAY
-
Required when using an input configuration file with the -I option. These parameters specify the Greengage DB master host, the primary segment hosts, and the mirror segment hosts, respectively. During new cluster initialization, use gpinitsystem with the -O <output_configuration_file> option to populate QD_PRIMARY_ARRAY, PRIMARY_ARRAY, and MIRROR_ARRAY.
To initialize a new cluster or re-create a cluster from a backed up configuration, edit these values in the input configuration file used with the -I <input_configuration_file> option. Use one of the following formats to specify the host information:
<hostname>~<address>~<port>~<data_directory>/<seg_prefix><segment_id>~<dbid>~<content_id>
or
<host>~<port>~<data_directory>/<seg_prefix><segment_id>~<dbid>~<content_id>
The first format populates the hostname and address fields in the gp_segment_configuration catalog table with the hostname and address values provided in the input configuration file. The second format populates the hostname and address fields with the same value, derived from <host>.
The Greengage DB master always uses the value -1 for the segment ID and content ID. For example, the <seg_prefix><segment_id> and <content_id> values for QD_PRIMARY_ARRAY use -1 to indicate the master instance:
QD_PRIMARY_ARRAY=mdw~mdw~5432~/gpdata/master/gpseg-1~1~-1
declare -a PRIMARY_ARRAY=(
sdw1~sdw1~40000~/gpdata/data1/gpseg0~2~0
sdw1~sdw1~40001~/gpdata/data2/gpseg1~3~1
sdw2~sdw2~40000~/gpdata/data1/gpseg2~4~2
sdw2~sdw2~40001~/gpdata/data2/gpseg3~5~3
)
declare -a MIRROR_ARRAY=(
sdw2~sdw2~50000~/gpdata/mirror1/gpseg0~6~0
sdw2~sdw2~50001~/gpdata/mirror2/gpseg1~7~1
sdw1~sdw1~50000~/gpdata/mirror1/gpseg2~8~2
sdw1~sdw1~50001~/gpdata/mirror2/gpseg3~9~3
)
To re-create a cluster using a known Greengage DB system configuration, you can edit the segment and content IDs to match the values of the system.
- HEAP_CHECKSUM
-
(Optional) This parameter specifies whether checksums are enabled for heap data. When enabled, checksums are calculated for heap storage in all databases, enabling Greengage DB to detect corruption in the I/O system. This option is set when the system is initialized and cannot be changed later.
The HEAP_CHECKSUM option is on by default and turning it off is strongly discouraged. If you set this option to off, data corruption in storage can go undetected and make recovery much more difficult.
To determine if heap checksums are enabled in a Greengage DB system, you can query the data_checksums server configuration parameter with the gpconfig management utility:
$ gpconfig -s data_checksums
- HBA_HOSTNAMES
-
(Optional) This parameter controls whether gpinitsystem uses IP addresses or host names in the pg_hba.conf file when updating the file with addresses that can connect to Greengage DB. The default value is 0, which means the utility uses IP addresses when updating the file. When initializing a Greengage DB system, specify HBA_HOSTNAMES=1 to have the utility use host names in the pg_hba.conf file.
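Putting the parameters above together, a small cluster configuration file for use with -c might look like the following sketch. It uses only the parameters described in this section. All values (host name, paths, ports, the host file location) are illustrative assumptions and must be adapted to your environment. The file uses bash syntax, so there are no spaces around the = sign.

```shell
# Illustrative cluster configuration sketch; every value is an assumption.
ARRAY_NAME="Greengage DB cluster"
SEG_PREFIX=gpseg

# Two primary segments per host, with ports 10000 and 10001.
PORT_BASE=10000
declare -a DATA_DIRECTORY=(/data1/primary /data1/primary)

MASTER_HOSTNAME=mdw
MASTER_DIRECTORY=/data1/master
MASTER_PORT=5432

TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE

# Optional: mirrors, one mirror directory per primary directory.
MIRROR_PORT_BASE=10500
declare -a MIRROR_DATA_DIRECTORY=(/data1/mirror /data1/mirror)

# Optional: host file given here instead of with -h (hypothetical path).
MACHINE_LIST_FILE=/home/gpadmin/hostfile_gpinitsystem

HEAP_CHECKSUM=on
HBA_HOSTNAMES=0
```

The example file shipped in $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config is the better starting point for a real deployment.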
Specify hosts using hostnames or IP addresses
When initializing a Greengage DB system with gpinitsystem, you can specify segment hosts using either hostnames or IP addresses.
For example, you can use hostnames or IP addresses in the file specified with the -h option:
-
If you specify a hostname, the resolution of the hostname to an IP address should be done locally for security. For example, you should use entries in a local /etc/hosts file to map a hostname to an IP address. The resolution of a hostname to an IP address should not be performed by an external service such as a public DNS server. You must stop the Greengage DB system before you change the mapping of a hostname to a different IP address.
-
If you specify an IP address, the address should not be changed after the initial configuration. When segment mirroring is enabled, replication from the primary to the mirror segment will fail if the IP address changes from the configured value. For this reason, you should use a hostname when initializing a Greengage DB system unless you have a specific requirement to use IP addresses.
When initializing the Greengage DB system, gpinitsystem uses the initialization information to populate the gp_segment_configuration catalog table and adds hosts to the pg_hba.conf file.
By default, the host IP address is added to the file.
Specify the gpinitsystem configuration file parameter HBA_HOSTNAMES=1 to add hostnames to the file.
Greengage DB uses the address value of the gp_segment_configuration catalog table when looking up host systems for Greengage DB interconnect communication between the master and segment instances and between segment instances, and for other internal communication.
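Because local hostname resolution is recommended, it can help to confirm that each segment hostname has an entry in the local hosts file before initializing. The bash sketch below is one way to do that. The function name hosts_file_has is hypothetical, and the default path /etc/hosts follows the example given above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if NAME appears as a hostname or alias
# on an uncommented line of a hosts file, 1 otherwise.
hosts_file_has() {
    local name="$1" file="${2:-/etc/hosts}"
    awk -v n="$name" '
        { sub(/#.*/, "") }    # strip comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Possible usage (commented out), with host names from the examples:
# for h in sdw1 sdw2; do
#     hosts_file_has "$h" || echo "no local entry for $h"
# done
```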
Examples
Initialize a Greengage DB system by supplying a cluster configuration file and a segment host address file, and set up a spread mirroring (--mirror-mode=spread) configuration:
$ gpinitsystem -c init_config -h hostfile_segment_hosts --mirror-mode=spread
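As noted for --mirror-mode, spreading is only allowed if the number of hosts is greater than the number of segment instances per host. A script can check this before running the command. The sketch below assumes you already know both counts; the function name can_use_spread is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 (true) when spread mirroring is allowed,
# that is, when the host count exceeds the segments-per-host count.
can_use_spread() {
    local hosts="$1" segs_per_host="$2"
    [ "$hosts" -gt "$segs_per_host" ]
}

# Possible usage (commented out): 4 segment hosts, 3 primaries per host.
# if can_use_spread 4 3; then
#     gpinitsystem -c init_config -h hostfile_segment_hosts --mirror-mode=spread
# fi
```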
Initialize a Greengage DB system and set the superuser remote password:
$ gpinitsystem -c init_config -h hostfile_segment_hosts --su_password=mypassword
Initialize a Greengage DB system with a standby master host:
$ gpinitsystem -c init_config -h hostfile_segment_hosts -s smdw
Initialize a Greengage DB system and write the provided configuration to an output file, for example cluster_init.config:
$ gpinitsystem -c init_config -h hostfile_segment_hosts -O cluster_init.config
The output file uses the QD_PRIMARY_ARRAY, PRIMARY_ARRAY, and MIRROR_ARRAY parameters to define the master, primary segment, and mirror segment instances:
ARRAY_NAME="Greengage DB cluster"
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
SEG_PREFIX=gpseg
HEAP_CHECKSUM=on
HBA_HOSTNAMES=0
QD_PRIMARY_ARRAY=mdw~mdw~5432~/data1/master/gpseg-1~1~-1
declare -a PRIMARY_ARRAY=(
sdw1~sdw1~10000~/data1/primary/gpseg0~2~0
sdw1~sdw1~10001~/data1/primary/gpseg1~3~1
sdw2~sdw2~10000~/data1/primary/gpseg2~4~2
sdw2~sdw2~10001~/data1/primary/gpseg3~5~3
)
declare -a MIRROR_ARRAY=(
sdw2~sdw2~10500~/data1/mirror/gpseg0~6~0
sdw2~sdw2~10501~/data1/mirror/gpseg1~7~1
sdw1~sdw1~10500~/data1/mirror/gpseg2~8~2
sdw1~sdw1~10501~/data1/mirror/gpseg3~9~3
)
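When reviewing or editing such a file, it can be convenient to split an array entry into its fields. The bash sketch below handles only the six-field format (<hostname>~<address>~<port>~<data_directory>/<seg_prefix><segment_id>~<dbid>~<content_id>), not the shorter <host> form. The function name describe_segment_entry is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split one six-field array entry on "~" and print
# the fields by name. The shorter five-field <host> form is not handled.
describe_segment_entry() {
    local hostname address port datadir dbid content
    IFS='~' read -r hostname address port datadir dbid content <<< "$1"
    echo "hostname=$hostname address=$address port=$port datadir=$datadir dbid=$dbid content=$content"
}

# Possible usage (commented out), with an entry from the file above:
# describe_segment_entry 'sdw1~sdw1~10000~/data1/primary/gpseg0~2~0'
```

For the master entry, the content field comes out as -1, matching the rule that the master always uses -1 for its segment ID and content ID.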
Initialize a Greengage DB system using an input configuration file (a file that defines the Greengage DB cluster with the QD_PRIMARY_ARRAY and PRIMARY_ARRAY parameters):
$ gpinitsystem -I cluster_init.config
The following example uses a host system configured with multiple NICs. If host systems are configured with multiple NICs, you can initialize a Greengage DB system to use each NIC as a Greengage DB host system. You must ensure that the host systems are configured with sufficient resources to support all the segment instances being added to the host. Also, if high availability is enabled, you must ensure that the Greengage DB system configuration supports failover if a host system fails. For information about Greengage DB mirroring schemes, see Enable cluster mirroring.
For this master and segment instance configuration, the host system gp6m is configured with two NICs, gp6m-1 and gp6m-2.
In the configuration, the QD_PRIMARY_ARRAY parameter defines the master instance using gp6m-1.
The PRIMARY_ARRAY and MIRROR_ARRAY parameters use gp6m-2 to define a primary and mirror segment instance:
QD_PRIMARY_ARRAY=gp6m~gp6m-1~5432~/data/master/gpseg-1~1~-1
declare -a PRIMARY_ARRAY=(
gp6m~gp6m-2~40000~/data/data1/gpseg0~2~0
gp6s~gp6s~40000~/data/data1/gpseg1~3~1
)
declare -a MIRROR_ARRAY=(
gp6s~gp6s~50000~/data/mirror1/gpseg0~4~0
gp6m~gp6m-2~50000~/data/mirror1/gpseg1~5~1
)