Logging
Greengage DB can log various events, including DBMS startup and shutdown, segment database failures, connection attempts, data manipulation events, and more. These logs can be monitored to help detect potential security threats. For example, you can track who created new user accounts and when they were created.
This topic describes how to configure logging to record information with varying levels of detail and how to view logs that match specific criteria.
Overview
Log files location
Logs are stored in a custom comma-separated values (CSV) format. You can use the gplogfilter utility to view and filter logs.
The master and segment hosts store their own sets of log files in the following locations:

- Master host. On the master host, logs are stored in the $MASTER_DATA_DIRECTORY/pg_log directory. By default, the gplogfilter utility uses this location if an input log file is not supplied.
- Segment hosts. On segment hosts, logs are stored in the /<data_dir>/primary/gpseg<N>/pg_log and /<data_dir>/mirror/gpseg<N>/pg_log directories. For example, on the sdw1 segment host initialized in the Initialize DBMS topic, logs are stored in the following locations:
  - primary segments: /data1/primary/gpseg0/pg_log and /data1/primary/gpseg1/pg_log;
  - mirror segments: /data1/mirror/gpseg2/pg_log and /data1/mirror/gpseg3/pg_log.

By default, logs created by administrative utilities such as gpinitsystem, gpstart, and gpconfig are stored in the ~/gpAdminLogs directory of the Greengage DB administrative user.
Examples of log entries
Below are a few examples of log entries shown in the format used by gplogfilter:

- Indicates that the master instance (seg-1) has started and is ready to accept client connections:

  2025-02-28 07:20:22.057160 UTC|||p1539|th1004996736||||0|||seg-1|||||LOG: |00000|database system is ready to accept connections|PostgreSQL 9.4.26 (Greengage Database 6.28.0 build dev) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit compiled on Jan 28 2025 14:24:22||||||0||postmaster.c|3406|

- Shows that the gpadmin user connected locally using a Unix domain socket and created the crm database:

  2025-02-28 07:22:42.388447 UTC|gpadmin|postgres|p2062|th-1924802432|[local]||2025-02-28 07:22:08 UTC|0|con26|cmd9|seg-1||||sx1|LOG: |00000|statement: CREATE DATABASE crm;||||||CREATE DATABASE crm;|0||postgres.c|1688|

- Shows that the alice user successfully connected to the crm database from a remote host with the 192.168.10.55 IP address:

  2025-02-28 07:23:46.030259 UTC|alice|crm|p2069|th-1924802432|192.168.10.55|49206|2025-02-28 07:23:45 UTC|0|con30||seg-1||||sx1|LOG: |00000|connection authorized: user=alice database=crm|||||||0||postinit.c|329|
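To make the field layout concrete, the sketch below splits the second example entry on the pipe character used in the gplogfilter rendering. This is illustrative only: the on-disk files are CSV, and gplogfilter remains the supported tool for reading them.

```python
# Split a gplogfilter-rendered log entry into its pipe-delimited fields.
# Illustrative only: real on-disk log files are CSV, and values such as
# event_message can contain delimiters that a naive split would break on.
entry = (
    "2025-02-28 07:22:42.388447 UTC|gpadmin|postgres|p2062|th-1924802432"
    "|[local]||2025-02-28 07:22:08 UTC|0|con26|cmd9|seg-1||||sx1|LOG: "
    "|00000|statement: CREATE DATABASE crm;||||||CREATE DATABASE crm;"
    "|0||postgres.c|1688|"
)
fields = entry.split("|")

print(fields[1])   # user_name: gpadmin
print(fields[2])   # database_name: postgres
print(fields[18])  # event_message: statement: CREATE DATABASE crm;
```

The zero-based indices 1, 2, and 18 correspond to fields 2, 3, and 19 in the Log format section.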
Logging configuration parameters
Logging configuration is controlled by server configuration parameters whose names start with log_ or gp_log_.
You can modify the default parameter values as described in Configure logging.
The following sections group important logging configuration parameters into related categories.
Logging levels and verbosity
- log_min_messages — controls which message levels are written to the server log. Possible values include INFO, WARNING, FATAL, and more.
- log_error_verbosity — controls the amount of detail written to the server log for each log message. Possible values include DEFAULT, TERSE, and VERBOSE.
- log_statement — controls which SQL statements are logged. For example, you can enable logging of DDL statements only. Possible values include NONE, DDL, MOD, and ALL.
Client connections
- log_connections and log_disconnections — control whether to log client connections and terminations of client sessions.
- log_hostname — sets whether to include a client's host name in connection log messages. Depending on your host name resolution configuration, enabling this option may introduce a noticeable performance overhead.
Performance statistics
- log_statement_stats — writes the total performance statistics of the query parser, planner, and executor to the server log for each SQL query.
- log_parser_stats — writes performance statistics of the query parser to the server log for each SQL query.
- log_planner_stats — writes performance statistics of the Postgres-based planner to the server log for each SQL query.
Log file rotation
- log_rotation_size — sets the maximum size of a log file before it is rotated. When the file reaches this size, it is closed and a new log file is created.
- log_rotation_age — sets the maximum age of a log file before it is rotated. Once the specified time has passed since the log file was created, a new log file is generated. By default, log files are rotated every 24 hours.
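The interaction of the two thresholds can be sketched as a simple predicate: a file is rotated when either limit is exceeded. The function below is a hypothetical illustration of that rule only; the names and default values are assumptions, not Greengage internals.

```python
from datetime import datetime, timedelta

# Hypothetical sketch of the rotation rule described above: a log file is
# rotated when it grows past log_rotation_size OR outlives log_rotation_age.
def rotation_due(size_bytes, created_at, now,
                 max_size_bytes=1024 * 1024 * 1024,  # assumed size limit (1 GB)
                 max_age=timedelta(hours=24)):       # default age: 24 hours
    return size_bytes >= max_size_bytes or (now - created_at) >= max_age

created = datetime(2025, 2, 28, 7, 0)
# A small file that is 25 hours old is rotated because of its age.
print(rotation_due(10_000, created, created + timedelta(hours=25)))  # True
# A small file that is 1 hour old is not rotated yet.
print(rotation_due(10_000, created, created + timedelta(hours=1)))   # False
```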
Configure logging
To change a configuration parameter related to logging, use the gpconfig -c command.
Below are a few examples of changing the default values for some logging parameters:

- Enable logging of client connections on the master instance:

  $ gpconfig -c log_connections -v 'on' --masteronly

- Configure the master instance to log only DDL statements (such as CREATE, ALTER, or DROP):

  $ gpconfig -c log_statement -v 'ddl' --masteronly

- Set the log file rotation size to 256 MB across the entire Greengage DB cluster:

  $ gpconfig -c log_rotation_size -v 256MB

To apply the changes, either reload the configuration or restart the cluster, depending on the parameter. For example, reloading the configuration is enough for log_connections and log_statement, while changing gp_log_format requires a full cluster restart.

- Reload the configuration:

  $ gpstop -u

- Restart the cluster:

  $ gpstop -ra
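The gpconfig invocation pattern above can be captured in a small helper. The gpconfig_cmd function below is hypothetical: it only assembles command strings for review and does not talk to a cluster.

```python
# Hypothetical helper that assembles gpconfig invocations like the examples
# above. It builds command strings only; it does not execute anything.
def gpconfig_cmd(name: str, value: str, master_only: bool = False) -> str:
    cmd = f"gpconfig -c {name} -v '{value}'"  # value is single-quoted here
    if master_only:
        cmd += " --masteronly"
    return cmd

print(gpconfig_cmd("log_connections", "on", master_only=True))
# gpconfig -c log_connections -v 'on' --masteronly
print(gpconfig_cmd("log_rotation_size", "256MB"))
# gpconfig -c log_rotation_size -v '256MB'
```

Note that this sketch always single-quotes the value, which is harmless for simple values like 256MB but differs from the unquoted form shown above.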
View logs
To view logs, use the gplogfilter utility. By default, this utility searches through the Greengage DB master log files in the location defined by the MASTER_DATA_DIRECTORY environment variable.
The gp_toolkit administrative schema provides a set of gp_log_* views that allow you to run SQL queries against the server logs.
Filter logs
The examples below show how to filter master log files.

- Show ERROR, FATAL, or PANIC log entries:

  $ gplogfilter --trouble

- Show the last 10 log entries:

  $ gplogfilter --tail 10

- Show events that happened in the specified time range:

  $ gplogfilter --begin '2025-02-28 07:20' --end '2025-02-28 07:30'

- Show all log messages from the last 10 minutes relative to the current time:

  $ gplogfilter --duration :10

- Select only specific columns:

  $ gplogfilter --columns '2,3,19'

  You can see how column numbers map to column names in the Log format section.

- Search for log entries that contain the exact string:

  $ gplogfilter --find 'connection authorized'

- Search for log entries that match the specified regular expression:

  $ gplogfilter --match 'connection authorized:.*database=crm'
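The difference between --find (exact substring) and --match (regular expression) can be illustrated over in-memory lines. This sketch is not a re-implementation of gplogfilter; the sample lines are abbreviated stand-ins for real entries.

```python
import re

# Abbreviated stand-ins for two connection log entries.
log_lines = [
    "...|alice|crm|...|connection authorized: user=alice database=crm|...",
    "...|bob|sales|...|connection authorized: user=bob database=sales|...",
]

# Like --find: keep lines containing the exact string.
found = [line for line in log_lines if "connection authorized" in line]

# Like --match: keep lines matching the regular expression.
matched = [line for line in log_lines
           if re.search(r"connection authorized:.*database=crm", line)]

print(len(found))    # 2 (both lines contain the substring)
print(len(matched))  # 1 (only the crm connection matches)
```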
Input and output options
The examples below demonstrate how to write the gplogfilter output to files and read the specified log files.

- Write the output to the masterlog.out file instead of standard output (stdout):

  $ gplogfilter --end '2025-02-28 07:30' --out masterlog.out

- Compress the output file using the specified compression level:

  $ gplogfilter --end '2025-02-28 07:30' --out masterlog.out --zip=7

- Read the masterlog.out log file:

  $ gplogfilter masterlog.out

- Read the compressed log file:

  $ gplogfilter masterlog.out.gz --unzip
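The compressed output produced with --zip is an ordinary gzip file, so the round trip can be mimicked with Python's gzip module. This is an illustration under that assumption, not a replacement for gplogfilter.

```python
import gzip
import os
import tempfile

# One sample log line to round-trip through gzip compression.
lines = "2025-02-28 07:23:46.030259 UTC|alice|crm|connection authorized\n"

path = os.path.join(tempfile.mkdtemp(), "masterlog.out.gz")

# Write compressed output, analogous to --out masterlog.out --zip=7.
with gzip.open(path, "wt", compresslevel=7) as f:
    f.write(lines)

# Read it back, analogous to --unzip.
with gzip.open(path, "rt") as f:
    restored = f.read()

print(restored == lines)  # True
```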
View logs on segments
To read log files on the segment hosts, use the gpssh utility.
The command below:

- runs gplogfilter on all segment hosts listed in the hostfile_segment_hosts file;
- filters the logs to show only ERROR, FATAL, or PANIC messages;
- writes the output to the seglog.out file.
$ gpssh -f hostfile_segment_hosts -e " \
source /usr/local/gpdb/greengage_path.sh && \
gplogfilter /data1/*/*/pg_log/gpdb*.csv \
--trouble \
" > seglog.out
Log format
The table below describes the fields of a log entry. The example values are taken from the sample entries in Examples of log entries.

| Field number | Field name | Data type | Description |
|---|---|---|---|
| 1 | event_time | timestamp with time zone | The time when the log entry was written to the log. Example: 2025-02-28 07:22:42.388447 UTC |
| 2 | user_name | varchar(100) | The database user name. Examples: gpadmin, alice |
| 3 | database_name | varchar(100) | The database name. Examples: postgres, crm |
| 4 | process_id | varchar(10) | The system process ID (prefixed with p). Example: p2062 |
| 5 | thread_id | varchar(50) | The thread ID (prefixed with th). Example: th-1924802432 |
| 6 | remote_host | varchar(100) | On the master, represents the client host address. On segments, returns the master host address. Examples: [local], 192.168.10.55 |
| 7 | remote_port | varchar(10) | The remote port number. Example: 49206 |
| 8 | session_start_time | timestamp with time zone | The time when the session connection was initiated |
| 9 | transaction_id | int | The top-level transaction ID on the master. This ID is the parent of any subtransactions. On segments, it identifies a local transaction that participates in the distributed transaction |
| 10 | gp_session_id | text | The session identifier (prefixed with con). Example: con26 |
| 11 | gp_command_count | text | The number of commands the master has received from the client (prefixed with cmd). Example: cmd9 |
| 12 | gp_segment | text | The segment content identifier (prefixed with seg). The master instance has the content identifier -1. Example: seg-1 |
| 13 | slice_id | text | The slice ID, which represents the portion of the query plan being run |
| 14 | distr_tranx_id | text | The distributed transaction ID |
| 15 | local_tranx_id | text | The local transaction ID |
| 16 | sub_tranx_id | text | The subtransaction ID |
| 17 | event_severity | varchar(10) | The event severity. Examples: LOG, ERROR, FATAL |
| 18 | sql_state_code | varchar(10) | The SQL state code associated with the log message. Example: 00000 |
| 19 | event_message | text | The text of the log or error message. Example: connection authorized: user=alice database=crm |
| 20 | event_detail | text | The detailed message text associated with an error or warning message |
| 21 | event_hint | text | A hint message text associated with an error or warning message |
| 22 | internal_query | text | The internally generated query text |
| 23 | internal_query_pos | int | The cursor position within the internally generated query text |
| 24 | event_context | text | The context in which this message is generated |
| 25 | debug_query_string | text | A user-supplied query string with full details for debugging |
| 26 | error_cursor_pos | int | The cursor position within the query string |
| 27 | func_name | text | The function in which this message is generated |
| 28 | file_name | text | The internal code file where the message originated. Example: postgres.c |
| 29 | file_line | int | The line of the code file where the message originated. Example: 1688 |
| 30 | stack_trace | text | The stack trace text associated with this message |
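As a worked example of the mapping above, the sketch below pairs the 30 field names with the pipe-separated values of the third sample entry from Examples of log entries. It is illustrative only; gplogfilter and the gp_log_* views are the supported interfaces for querying logs.

```python
# Field names in table order (fields 1-30).
FIELD_NAMES = [
    "event_time", "user_name", "database_name", "process_id", "thread_id",
    "remote_host", "remote_port", "session_start_time", "transaction_id",
    "gp_session_id", "gp_command_count", "gp_segment", "slice_id",
    "distr_tranx_id", "local_tranx_id", "sub_tranx_id", "event_severity",
    "sql_state_code", "event_message", "event_detail", "event_hint",
    "internal_query", "internal_query_pos", "event_context",
    "debug_query_string", "error_cursor_pos", "func_name", "file_name",
    "file_line", "stack_trace",
]

def parse_entry(line: str) -> dict:
    # Pair each pipe-delimited value with its field name from the table.
    # Naive split for illustration; real files are CSV with quoting.
    return dict(zip(FIELD_NAMES, line.split("|")))

entry = parse_entry(
    "2025-02-28 07:23:46.030259 UTC|alice|crm|p2069|th-1924802432"
    "|192.168.10.55|49206|2025-02-28 07:23:45 UTC|0|con30||seg-1||||sx1"
    "|LOG: |00000|connection authorized: user=alice database=crm"
    "|||||||0||postinit.c|329|"
)
print(entry["user_name"])    # alice
print(entry["remote_host"])  # 192.168.10.55
print(entry["file_line"])    # 329
```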