Logging

Andrey Aksenov

Greengage DB can log various events, including DBMS startup and shutdown, segment database failures, connection attempts, data manipulation events, and more. These logs can be monitored to help detect potential security threats. For example, you can track who created new user accounts and when they were created.

This topic describes how to configure logging to record information with varying levels of detail and how to view logs that match specific criteria.

Overview

Log files location

Logs are stored in a comma-separated values (CSV) format. You can use the gplogfilter utility to view and filter logs.

The master and segment hosts store their own set of log files in the following locations:

  • Master host

    On the master host, logs are stored in the $MASTER_DATA_DIRECTORY/pg_log directory. By default, the gplogfilter utility uses this location if an input log file is not supplied.

  • Segment hosts

    On segment hosts, logs are stored in the /<data_dir>/primary/gpseg<N>/pg_log and /<data_dir>/mirror/gpseg<N>/pg_log directories. For example, on the sdw1 segment host initialized in the Initialize DBMS topic, logs are stored in the following locations:

    • primary segments: /data1/primary/gpseg0/pg_log, /data1/primary/gpseg1/pg_log;

    • mirror segments: /data1/mirror/gpseg2/pg_log, /data1/mirror/gpseg3/pg_log.
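To see which log files are present on the master, you can list the pg_log directory. The gpdb*.csv file name pattern below is an assumption; actual file names depend on your installation:

```shell
# List master log files, oldest first (assumes MASTER_DATA_DIRECTORY is set)
$ ls -ltr $MASTER_DATA_DIRECTORY/pg_log/gpdb*.csv
```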

NOTE

By default, logs created by administrative utilities such as gpinitsystem, gpstart, and gpconfig are stored in the ~/gpAdminLogs directory of the Greengage DB administrative user.

Examples of log entries

Below are a few examples of log entries, shown in the format produced by gplogfilter:

  • Indicates that the master instance (seg-1) has started and is ready to accept client connections:

    2025-02-28 07:20:22.057160 UTC|||p1539|th1004996736||||0|||seg-1|||||LOG: |00000|database system is ready to accept connections|PostgreSQL 9.4.26 (Greengage Database 6.28.0 build dev) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit compiled on Jan 28 2025 14:24:22||||||0||postmaster.c|3406|
  • Shows that the gpadmin user connected locally using a Unix domain socket and created the crm database:

    2025-02-28 07:22:42.388447 UTC|gpadmin|postgres|p2062|th-1924802432|[local]||2025-02-28 07:22:08 UTC|0|con26|cmd9|seg-1||||sx1|LOG: |00000|statement: CREATE DATABASE crm;||||||CREATE DATABASE crm;|0||postgres.c|1688|
  • Shows that the alice user connected to the crm database successfully from a remote host with the 192.168.10.55 IP address:

    2025-02-28 07:23:46.030259 UTC|alice|crm|p2069|th-1924802432|192.168.10.55|49206|2025-02-28 07:23:45 UTC|0|con30||seg-1||||sx1|LOG: |00000|connection authorized: user=alice database=crm|||||||0||postinit.c|329|

Logging configuration parameters

Logging configuration is controlled by server configuration parameters that start with log_ or gp_log_. You can modify the default parameter values as described in Configure logging.

The following sections group important logging configuration parameters into related categories.

Logging levels and verbosity

  • log_min_messages — controls which message levels are written to the server log. Possible values include INFO, WARNING, FATAL, and more.

  • log_error_verbosity — controls the amount of detail written in the server log for each log message. Possible values include DEFAULT, TERSE, and VERBOSE.

  • log_statement — controls which SQL statements are logged. For example, you can enable logging of DDL statements only. Possible values include NONE, DDL, MOD, and ALL.
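Before changing any of these parameters, you can inspect their current values with gpconfig -s; this is a sketch, and the exact output format may vary between versions:

```shell
# Show the current master and segment values of logging parameters
$ gpconfig -s log_min_messages
$ gpconfig -s log_statement
```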

Client connections

  • log_connections and log_disconnections — control whether to log client connections and terminations of client sessions.

  • log_hostname — sets whether to include a client’s host name in connection log messages. Depending on your host name resolution configuration, enabling this option may introduce a noticeable performance overhead.

Performance statistics

  • log_statement_stats — writes the total performance statistics of the query parser, planner, and executor to the server log for each SQL query.

  • log_parser_stats — writes performance statistics of the query parser to the server log for each SQL query.

  • log_planner_stats — writes performance statistics of the Postgres-based planner to the server log for each SQL query.

Log file rotation

  • log_rotation_size — sets the maximum size of a log file before it is rotated. When the file reaches this size, it is closed, and a new log file is created.

  • log_rotation_age — specifies the maximum age of a log file before it is rotated. Once the specified time has passed since the log file was created, a new log file is generated. By default, log files are rotated every 24 hours.
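As a sketch, log_rotation_age can be changed in the same way as other logging parameters; the 1d value assumes the parameter accepts day units, as in PostgreSQL:

```shell
# Rotate log files once a day across the cluster (assumed unit syntax)
$ gpconfig -c log_rotation_age -v 1d
```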

Configure logging

To change a configuration parameter related to logging, use the gpconfig -c command. Below are a few examples of changing the default values for some logging parameters:

  • Enable logging of client connections on the master instance:

    $ gpconfig -c log_connections -v 'on' --masteronly
  • Configure the master instance to log only DDL statements (such as CREATE, ALTER, or DROP):

    $ gpconfig -c log_statement -v 'ddl' --masteronly
  • Set the log file rotation size to 256 MB across the entire Greengage DB cluster:

    $ gpconfig -c log_rotation_size -v 256MB

To apply the changes, either reload the configuration or restart the cluster using gpstop, depending on the parameter. For example, reloading the configuration is enough for log_connections and log_statement, while changing gp_log_format requires a full cluster restart:

$ gpstop -u
$ gpstop -ra

View logs

To view logs, use the gplogfilter utility. By default, this utility searches through the Greengage DB master log files in a location defined by the MASTER_DATA_DIRECTORY environment variable.

NOTE

The gp_toolkit administrative schema provides a set of gp_log_* views that allow you to execute SQL queries against the server logs.
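For example, recent errors can be queried through psql; the gp_log_system view and its column names below are assumptions based on gp_toolkit conventions, so check the views available in your version:

```shell
# Query the last 10 ERROR entries from the server logs via gp_toolkit
$ psql -d postgres -c "SELECT logtime, loguser, logseverity, logmessage
                       FROM gp_toolkit.gp_log_system
                       WHERE logseverity = 'ERROR'
                       ORDER BY logtime DESC
                       LIMIT 10;"
```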

Filter logs

The examples below show how to filter master log files.

  • Show ERROR, FATAL, or PANIC log entries:

    $ gplogfilter --trouble
  • Show the last 10 log entries:

    $ gplogfilter --tail 10
  • Show events that happened in the specified time range:

    $ gplogfilter --begin '2025-02-28 07:20' --end '2025-02-28 07:30'
  • Show all log messages from the last 10 minutes relative to the current time:

    $ gplogfilter --duration :10
  • Select only specific columns:

    $ gplogfilter --columns '2,3,19'

    You can see how column numbers map to column names in the Log format section.

  • Search for log entries that contain the exact string:

    $ gplogfilter --find 'connection authorized'
  • Search for log entries that match the specified regular expression:

    $ gplogfilter --match 'connection authorized:.*database=crm'
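The filter options above can also be combined in a single invocation. A sketch that narrows the search to a time range and then matches a pattern within it:

```shell
# Show authorized connections within a 10-minute window
$ gplogfilter --begin '2025-02-28 07:20' --end '2025-02-28 07:30' \
      --match 'connection authorized'
```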

Input and output options

The examples below demonstrate how to write the gplogfilter output to files and read the specified log files.

  • Write the output to the masterlog.out file instead of standard output (stdout):

    $ gplogfilter --end '2025-02-28 07:30' --out masterlog.out
  • Compress the output file using the specified compression level:

    $ gplogfilter --end '2025-02-28 07:30' --out masterlog.out --zip=7
  • Read the masterlog.out log file:

    $ gplogfilter masterlog.out
  • Read the compressed log file:

    $ gplogfilter masterlog.out.gz --unzip

View logs on segments

To read log files on the segment hosts, use the gpssh utility. The command below:

  • runs gplogfilter on all segment hosts listed in the hostfile_segment_hosts file;

  • filters the logs to show only ERROR, FATAL, or PANIC messages;

  • writes the output to the seglog.out file.

$ gpssh -f hostfile_segment_hosts -e " \
      source /usr/local/gpdb/greengage_path.sh && \
      gplogfilter /data1/*/*/pg_log/gpdb*.csv \
      --trouble \
  " > seglog.out

Log format

Each log entry contains the following fields, listed by field number with the field name, data type, and description:

  • 1. event_time (timestamp with time zone). The time when the log entry was written to the log. Example: 2025-02-28 07:22:42.388447 UTC

  • 2. user_name (varchar(100)). The database user name. Examples: gpadmin, alice

  • 3. database_name (varchar(100)). The database name. Examples: postgres, crm

  • 4. process_id (varchar(10)). The system process ID (prefixed with p). Example: p2062

  • 5. thread_id (varchar(50)). The thread ID (prefixed with th). Example: th-1924802432

  • 6. remote_host (varchar(100)). On the master, the client host address. On segments, the master host address. Examples: 192.168.10.55, [local]

  • 7. remote_port (varchar(10)). The remote port number. Example: 49206

  • 8. session_start_time (timestamp with time zone). The time when the session connection was initiated.

  • 9. transaction_id (int). The top-level transaction ID on the master. This ID is the parent of any subtransactions. On segments, it identifies a local transaction that participates in the distributed transaction.

  • 10. gp_session_id (text). The session identifier (prefixed with con).

  • 11. gp_command_count (text). The number of commands the master has received from the client (prefixed with cmd). Example: cmd9

  • 12. gp_segment (text). The segment content identifier, prefixed with seg for primaries or mir for mirrors. The master always has a content ID of -1. Examples: seg-1, seg1

  • 13. slice_id (text). The slice ID, which represents the portion of the query plan being run.

  • 14. distr_tranx_id (text). The distributed transaction ID.

  • 15. local_tranx_id (text). The local transaction ID.

  • 16. sub_tranx_id (text). The subtransaction ID.

  • 17. event_severity (varchar(10)). The event severity. Examples: LOG, ERROR, FATAL

  • 18. sql_state_code (varchar(10)). The SQL state code associated with the log message. Example: 00000

  • 19. event_message (text). The text of the log or error message. Examples: statement: CREATE DATABASE crm;, connection authorized: user=alice database=crm

  • 20. event_detail (text). The detailed message text associated with an error or warning message.

  • 21. event_hint (text). The hint message text associated with an error or warning message.

  • 22. internal_query (text). The internally generated query text.

  • 23. internal_query_pos (int). The cursor position within the internally generated query text.

  • 24. event_context (text). The context in which this message is generated.

  • 25. debug_query_string (text). A user-supplied query string with full details for debugging.

  • 26. error_cursor_pos (int). The cursor position within the query string.

  • 27. func_name (text). The function in which this message is generated.

  • 28. file_name (text). The internal code file where the message originated. Example: postgres.c

  • 29. file_line (int). The line of the code file where the message originated. Example: 1688

  • 30. stack_trace (text). The stack trace text associated with this message.