Administration

Standard Operating Procedures (SOP) for database administration tasks

How to maintain existing PostgreSQL clusters with Pigsty?

This section provides standard operating procedures (SOP) for common PostgreSQL administration tasks:

  • SOP: Standard operating procedures for creating/removing clusters and instances, backup & restore, rolling upgrades, etc.
  • Failure: Common failure troubleshooting strategies and handling methods, such as disk exhaustion, connection exhaustion, XID wraparound, etc.
  • Drop: Emergency procedures for handling accidental data deletion, table drops, and database drops
  • Maintain: Maintenance tasks including regular inspections, post-failover cleanup, bloat management, VACUUM FREEZE, etc.
  • Tuning: Automatic optimization strategies and adjustment methods for memory, CPU, storage parameters, etc.

1 - Troubleshooting

Common failures and analysis troubleshooting approaches

This document lists potential failures in PostgreSQL and Pigsty, as well as SOPs for locating, handling, and analyzing issues.


Disk Space Exhausted

Disk space exhaustion is the most common type of failure.

Symptoms

When the disk space where the database resides is exhausted, PostgreSQL will not work normally and may exhibit the following symptoms: database logs repeatedly report “no space left on device” errors, new data cannot be written, and PostgreSQL may even trigger a PANIC and force shutdown.

Pigsty includes a NodeFsSpaceFull alert rule that triggers when available filesystem space falls below 10%. Use the FS metrics panel in the monitoring system’s NODE Instance dashboard to locate the issue.

Diagnosis

You can also log into the database node and use df -h to view the usage of each mounted partition to determine which partition is full. For database nodes, focus on checking the following directories and their sizes to determine which category of files has filled up the space:

  • Data directory (/pg/data/base): Stores data files for tables and indexes; pay attention to heavy writes and temporary files
  • WAL directory (e.g., /pg/data/pg_wal): Stores PostgreSQL WAL; WAL accumulation (e.g., from retained replication slots) is a common cause of disk exhaustion.
  • Database log directory (e.g., /pg/log): If PG logs are not rotated in time and large volumes of errors are written, they can also consume significant space.
  • Local backup directory (e.g., /pg/backup): When using pgBackRest or similar tools to keep backups locally, backups may also fill up the disk.

If the issue occurs on the Pigsty admin node or monitoring node, also consider:

  • Monitoring data: VictoriaMetrics time-series metrics and VictoriaLogs log storage both consume disk space; check retention policies.
  • Object storage data: Pigsty’s integrated MinIO object storage may be used for PG backup storage.

After identifying the directory consuming the most space, you can further use du -sh <directory> to drill down and find specific large files or subdirectories.
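
For example, a quick drill-down session might look like this (a sketch assuming the default Pigsty directory layout; adjust paths to your environment):

df -h                                                    # find the partition that is full
du -sh /pg/data /pg/data/pg_wal /pg/log 2>/dev/null      # check the usual suspects
du -sh /pg/data/base/* | sort -h | tail -20              # largest databases under the data directory
find /pg/data/base -name 'pgsql_tmp*' -exec du -sh {} +  # leftover temporary files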

Resolution

Disk exhaustion is an emergency requiring immediate action to free space and keep the database running. If the data disk is not separated from the system disk, a full disk may even prevent shell commands from executing; in that case, delete the /pg/dummy placeholder file to free a small amount of emergency space so shell commands work again. If the database has already crashed because pg_wal filled up, restart the database service after clearing space and carefully verify data integrity.
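
A minimal emergency sequence might look like the following (a sketch assuming the default Pigsty layout; verify what you delete before deleting it):

rm -f /pg/dummy                    # remove the placeholder file to regain a little working space
df -h /pg                          # confirm some space is now available
# identify and remove/move the real culprit (rotated logs, local backups, etc.), then if PG crashed:
systemctl restart patroni          # restart PostgreSQL via Patroni and verify data integrity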


Transaction ID Wraparound

PostgreSQL uses 32-bit transaction IDs (XIDs) cyclically; when the usable XID space is exhausted, a “transaction ID wraparound” (XID Wraparound) failure occurs.

Symptoms

The typical sign in the first phase is when the age saturation in the PGSQL Persist - Age Usage panel enters the warning zone. Database logs begin to show messages like: WARNING: database "postgres" must be vacuumed within xxxxxxxx transactions.

If the problem continues to worsen, PostgreSQL enters protection mode: when remaining transaction IDs drop to about 1 million, the database switches to read-only mode; when reaching the limit of about 2.1 billion (2^31), it refuses any new transactions and forces the server to shut down to avoid data corruption.

Diagnosis

PostgreSQL and Pigsty enable automatic garbage collection (autovacuum) by default, so this type of failure usually has a deeper root cause. Common causes include: long-running transactions, misconfigured autovacuum, stuck replication slots, insufficient resources, storage engine/extension bugs, and bad disk blocks.

First identify the database with the highest age, then use the Pigsty PGCAT Database - Tables panel to confirm the age distribution of tables. Also review the database error logs, which usually contain clues to locate the root cause.
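
For example, the following queries against the standard system catalogs show where the XID age is concentrated:

-- databases ordered by transaction ID age
SELECT datname, age(datfrozenxid) AS xid_age FROM pg_database ORDER BY xid_age DESC;

-- tables with the oldest relfrozenxid in the current database
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class WHERE relkind IN ('r', 't', 'm')
ORDER BY xid_age DESC LIMIT 20;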

Resolution

  1. Immediately freeze old transactions: If the database has not yet entered read-only protection mode, immediately execute a manual VACUUM FREEZE on the affected database. You can start by freezing the most severely aged tables one by one rather than doing the entire database at once to accelerate the effect. Connect to the database as a superuser and run VACUUM FREEZE table_name; on tables identified with the largest relfrozenxid, prioritizing tables with the highest XID age. This can quickly reclaim large amounts of transaction ID space.
  2. Single-user mode rescue: If the database is already refusing writes or has crashed for protection, you need to start the database in single-user mode to perform freeze operations. In single-user mode, run VACUUM FREEZE database_name; to freeze and clean the entire database. After completion, restart the database in multi-user mode. This can lift the wraparound lock and make the database writable again. Be very careful when operating in single-user mode and ensure sufficient transaction ID margin to complete the freeze.
  3. Standby node takeover: In some complex scenarios (e.g., when hardware issues prevent vacuum from completing), consider promoting a read-only standby node in the cluster to primary to obtain a relatively clean environment for handling the freeze. For example, if the primary cannot vacuum due to bad blocks, you can manually failover to promote the standby to the new primary, then perform emergency vacuum freeze on it. After ensuring the new primary has frozen old transactions, switch the load back.
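
A minimal sketch of steps 1 and 2 above (table names are placeholders; commands assume superuser/dbsu access and the default /pg/data data directory):

-- step 1: freeze the most aged tables first, then the whole database if time permits
VACUUM FREEZE VERBOSE some_old_table;
VACUUM FREEZE;

# step 2: if the server already refuses to assign XIDs, stop it and freeze in single-user mode
systemctl stop patroni
postgres --single -D /pg/data postgres <<< 'VACUUM FREEZE;'
systemctl start patroni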

Connection Exhaustion

PostgreSQL has a maximum connections configuration (max_connections). When client connections exceed this limit, new connection requests will be rejected. The typical symptom is that applications cannot connect to the database and report errors like FATAL: remaining connection slots are reserved for non-replication superuser connections or too many clients already. This indicates that regular connections are exhausted, leaving only slots reserved for superusers or replication.

Diagnosis

Connection exhaustion is usually caused by a large number of concurrent client requests. You can directly review the database’s current active sessions through PGCAT Instance / PGCAT Database / PGCAT Locks. Determine what types of queries are filling the system and proceed with further handling. Pay special attention to whether there are many connections in the “Idle in Transaction” state and long-running transactions (as well as slow queries).
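
For example, this query against pg_stat_activity summarizes who is holding connections and for how long:

SELECT state, usename, datname, count(*) AS conns, max(now() - xact_start) AS oldest_xact
FROM pg_stat_activity
GROUP BY state, usename, datname
ORDER BY conns DESC;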

Resolution

Kill queries: For situations where exhaustion has already blocked business operations, typically use pg_terminate_backend(pid) immediately for emergency pressure relief. For cases using connection pooling, you can adjust the connection pool size parameters and execute a reload to reduce the number of connections at the database level.
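
A hedged example of such emergency relief (review the matching sessions before terminating them):

-- terminate sessions that have been idle in transaction for more than 5 minutes
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND now() - state_change > interval '5 min';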

You can also modify the max_connections parameter to a larger value, but this parameter requires a database restart to take effect.
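
In a Patroni-managed Pigsty cluster, change it through Patroni rather than editing postgresql.conf directly, for example (the value is illustrative):

pg edit-config <cls> --force -p max_connections=1000   # apply via patroni DCS config
pg restart <cls>                                        # restart the cluster to take effect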


etcd Quota Exhausted

An exhausted etcd quota will cause the PG high availability control plane to fail and prevent configuration changes.

Diagnosis

Pigsty uses etcd as the distributed configuration store (DCS) when implementing high availability. etcd itself has a storage quota (default is about 2GB). When etcd storage usage reaches the quota limit, etcd will refuse write operations and report “etcdserver: mvcc: database space exceeded”. In this case, Patroni cannot write heartbeats or update configuration to etcd, causing cluster management functions to fail.

Resolution

Pigsty versions from v2.0.0 to v2.5.1 are affected by this issue by default. Pigsty v2.6.0 added auto-compaction configuration to the deployed etcd, so if etcd is only used for PG high-availability leases, this issue no longer occurs in normal use.
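
If you do hit the quota on an affected version, a typical remediation with etcdctl looks like this (a sketch; endpoint addresses and credentials depend on your deployment):

etcdctl endpoint status -w table                                     # check DB size and alarms
rev=$(etcdctl endpoint status -w json | jq -r '.[0].Status.header.revision')
etcdctl compact "$rev"                                               # compact old revisions
etcdctl defrag                                                       # defragment each member to reclaim space
etcdctl alarm disarm                                                 # clear the NOSPACE alarm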


Defective Storage Engine

TimescaleDB’s experimental storage engine Hypercore has been shown to have defects, with reported cases where VACUUM could not reclaim dead tuples, leading to XID wraparound failures. Users of this feature should migrate to native PostgreSQL tables or TimescaleDB’s default engine promptly.

Detailed introduction: PG New Storage Engine Failure Case (Chinese)

2 - Database Management

Database management: create, modify, delete, rebuild databases, and clone databases using templates

In Pigsty, database management follows an IaC (Infrastructure as Code) approach—define in the configuration inventory, then execute playbooks.

When no baseline SQL is defined, executing the pgsql-db.yml playbook is idempotent. It adjusts the specified database in the specified cluster to match the target state in the configuration inventory.

Note that some parameters can only be specified at creation time. Modifying these parameters requires deleting and recreating the database (using state: recreate to rebuild).


Define Database

Business databases are defined in the cluster parameter pg_databases, which is an array of database definition objects. Databases in the array are created in definition order, so later-defined databases can use previously-defined databases as templates.

Here’s the database definition from the default cluster pg-meta in Pigsty’s demo environment:

pg-meta:
  hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-meta
    pg_databases:
      - { name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty] ,extensions: [{name: postgis, schema: public}, {name: timescaledb}]}
      - { name: grafana  ,owner: dbuser_grafana  ,revokeconn: true ,comment: grafana primary database }
      - { name: bytebase ,owner: dbuser_bytebase ,revokeconn: true ,comment: bytebase primary database }
      - { name: kong     ,owner: dbuser_kong     ,revokeconn: true ,comment: kong the api gateway database }
      - { name: gitea    ,owner: dbuser_gitea    ,revokeconn: true ,comment: gitea meta database }
      - { name: wiki     ,owner: dbuser_wiki     ,revokeconn: true ,comment: wiki meta database }
      - { name: noco     ,owner: dbuser_noco     ,revokeconn: true ,comment: nocodb database }

The only required field is name, which should be a valid and unique database name within the current PostgreSQL cluster—all other parameters have sensible defaults. For complete database definition parameter reference, see Database Configuration Reference.


Create Database

To create a new business database on an existing PostgreSQL cluster, add the database definition to all.children.<cls>.pg_databases, then execute:

bin/pgsql-db <cls> <dbname>    # Equivalent to: pgsql-db.yml -l <cls> -e dbname=<dbname>

Example: Create a business database named myapp

  1. Add the database definition to the configuration file:
pg-meta:
  vars:
    pg_databases:
      - name: myapp
        owner: dbuser_myapp
        schemas: [app]
        extensions:
          - { name: pg_trgm }
          - { name: btree_gin }
        comment: my application database
  2. Execute the create command:
bin/pgsql-db pg-meta myapp

Execution effects:

  • Creates database myapp on the primary
  • Sets database owner to dbuser_myapp
  • Creates schema app
  • Installs extensions pg_trgm and btree_gin
  • Configures default privileges (dbrole_readonly/readwrite/admin)
  • Adds database to Pgbouncer connection pool
  • Registers database as a Grafana data source

Modify Database

Modify database properties by updating the configuration and re-executing the playbook:

bin/pgsql-db <cls> <dbname>    # Idempotent operation, can be executed repeatedly

Modifiable Properties

| Property    | Description                                  | Example                     |
|-------------|----------------------------------------------|-----------------------------|
| owner       | Database owner                               | owner: dbuser_new           |
| tablespace  | Default tablespace (triggers data migration) | tablespace: fast_ssd        |
| is_template | Mark as template database                    | is_template: true           |
| allowconn   | Allow connections                            | allowconn: false            |
| connlimit   | Connection limit                             | connlimit: 100              |
| revokeconn  | Revoke PUBLIC CONNECT privilege              | revokeconn: true            |
| comment     | Comment                                      | comment: new comment        |
| parameters  | Database-level parameters                    | See examples below          |
| schemas     | Add/remove schemas (incremental)             | See Manage Schemas          |
| extensions  | Add/remove extensions (incremental)          | See Manage Extensions       |
| pgbouncer   | Include in connection pool                   | pgbouncer: false            |
| pool_*      | Connection pool parameters                   | See Connection Pool Config  |

Non-modifiable Properties

The following properties cannot be modified after database creation—use state: recreate to rebuild the database:

  • template - Template database
  • encoding - Character encoding
  • locale / lc_collate / lc_ctype - Locale settings
  • locale_provider / icu_locale / icu_rules / builtin_locale - Locale provider settings
  • strategy - Clone strategy

Change Owner

- name: myapp
  owner: dbuser_new_owner     # Change to new owner
bin/pgsql-db pg-meta myapp

Executed SQL:

ALTER DATABASE "myapp" OWNER TO "dbuser_new_owner";
GRANT ALL PRIVILEGES ON DATABASE "myapp" TO "dbuser_new_owner";

Change Connection Limit

- name: myapp
  connlimit: 100              # Limit to max 100 connections

Executed SQL:

ALTER DATABASE "myapp" CONNECTION LIMIT 100;

Revoke Public Connection Privilege

- name: myapp
  owner: dbuser_myapp
  revokeconn: true            # Revoke PUBLIC CONNECT privilege

Executed SQL:

REVOKE CONNECT ON DATABASE "myapp" FROM PUBLIC;
GRANT CONNECT ON DATABASE "myapp" TO "replicator";
GRANT CONNECT ON DATABASE "myapp" TO "dbuser_monitor";
GRANT CONNECT ON DATABASE "myapp" TO "dbuser_dba" WITH GRANT OPTION;
GRANT CONNECT ON DATABASE "myapp" TO "dbuser_myapp" WITH GRANT OPTION;

To restore public connection privilege, set revokeconn: false:

- name: myapp
  revokeconn: false           # Restore PUBLIC CONNECT privilege

Executed SQL:

GRANT CONNECT ON DATABASE "myapp" TO PUBLIC;

Mark as Template Database

- name: app_template
  is_template: true           # Allow any user with CREATEDB privilege to clone

Executed SQL:

ALTER DATABASE "app_template" IS_TEMPLATE true;

Manage Parameters

Database-level parameters are configured via the parameters dictionary, generating ALTER DATABASE ... SET statements.

Set Parameters

- name: myapp
  parameters:
    work_mem: '256MB'
    maintenance_work_mem: '512MB'
    statement_timeout: '30s'
    search_path: 'app,public'

Executed SQL:

ALTER DATABASE "myapp" SET "work_mem" = '256MB';
ALTER DATABASE "myapp" SET "maintenance_work_mem" = '512MB';
ALTER DATABASE "myapp" SET "statement_timeout" = '30s';
ALTER DATABASE "myapp" SET "search_path" = 'app,public';

Reset Parameters to Default

Use the special value DEFAULT (case-insensitive) to reset parameters to PostgreSQL defaults:

- name: myapp
  parameters:
    work_mem: DEFAULT         # Reset to PostgreSQL default
    statement_timeout: DEFAULT

Executed SQL:

ALTER DATABASE "myapp" SET "work_mem" = DEFAULT;
ALTER DATABASE "myapp" SET "statement_timeout" = DEFAULT;

Common Database-Level Parameters

| Parameter                           | Description                  | Example Value |
|-------------------------------------|------------------------------|---------------|
| work_mem                            | Query work memory            | '64MB'        |
| maintenance_work_mem                | Maintenance operation memory | '256MB'       |
| statement_timeout                   | Statement timeout            | '30s'         |
| lock_timeout                        | Lock wait timeout            | '10s'         |
| idle_in_transaction_session_timeout | Idle transaction timeout     | '10min'       |
| search_path                         | Schema search path           | 'app,public'  |
| default_tablespace                  | Default tablespace           | 'fast_ssd'    |
| temp_tablespaces                    | Temporary tablespaces        | 'temp_ssd'    |
| log_statement                       | Statement logging level      | 'ddl'         |

Manage Schemas

Schemas are configured via the schemas array, supporting create, assign owner, and delete operations.

Create Schemas

- name: myapp
  schemas:
    # Simple form: schema name only
    - app
    - api

    # Full form: specify owner
    - { name: core, owner: dbuser_myapp }

Executed SQL:

CREATE SCHEMA IF NOT EXISTS "app";
CREATE SCHEMA IF NOT EXISTS "api";
CREATE SCHEMA IF NOT EXISTS "core" AUTHORIZATION "dbuser_myapp";

Specify Schema Owner

Use the owner field to assign a schema owner—useful for multi-tenant or permission isolation scenarios:

- name: multi_tenant_db
  owner: dbuser_admin
  schemas:
    - { name: tenant_a, owner: dbuser_tenant_a }
    - { name: tenant_b, owner: dbuser_tenant_b }
    - { name: shared, owner: dbuser_admin }

Delete Schemas

Use state: absent to mark schemas for deletion:

- name: myapp
  schemas:
    - { name: deprecated_schema, state: absent }

Executed SQL:

DROP SCHEMA IF EXISTS "deprecated_schema" CASCADE;

Manage Extensions

Extensions are configured via the extensions array, supporting install and uninstall operations.

Install Extensions

- name: myapp
  extensions:
    # Simple form: extension name only
    - postgis
    - pg_trgm

    # Full form: specify schema and version
    - { name: vector, schema: public }
    - { name: pg_stat_statements, schema: monitor, version: '1.10' }

Executed SQL:

CREATE EXTENSION IF NOT EXISTS "postgis" CASCADE;
CREATE EXTENSION IF NOT EXISTS "pg_trgm" CASCADE;
CREATE EXTENSION IF NOT EXISTS "vector" WITH SCHEMA "public" CASCADE;
CREATE EXTENSION IF NOT EXISTS "pg_stat_statements" WITH SCHEMA "monitor" VERSION '1.10' CASCADE;

Uninstall Extensions

Use state: absent to mark extensions for uninstall:

- name: myapp
  extensions:
    - { name: pg_trgm, state: absent }    # Uninstall extension
    - { name: postgis }                    # Keep extension

Executed SQL:

DROP EXTENSION IF EXISTS "pg_trgm" CASCADE;
CREATE EXTENSION IF NOT EXISTS "postgis" CASCADE;

Delete Database

To delete a database, set its state to absent and execute the playbook:

pg_databases:
  - name: olddb
    state: absent
bin/pgsql-db <cls> olddb

Delete operation will:

  1. If database is marked is_template: true, first execute ALTER DATABASE ... IS_TEMPLATE false
  2. Force drop database with DROP DATABASE ... WITH (FORCE) (PG13+)
  3. Terminate all active connections to the database
  4. Remove database from Pgbouncer connection pool
  5. Unregister from Grafana data sources

Protection mechanisms:

  • System databases postgres, template0, template1 cannot be deleted
  • Delete operations only execute on the primary—streaming replication syncs to replicas automatically

Rebuild Database

The recreate state rebuilds a database, equivalent to delete then create:

pg_databases:
  - name: testdb
    state: recreate
    owner: dbuser_test
    baseline: test_init.sql    # Execute initialization after rebuild
bin/pgsql-db <cls> testdb

Use cases:

  • Test environment reset
  • Clear development database
  • Modify immutable properties (encoding, locale, etc.)
  • Restore database to initial state

Difference from manual DROP + CREATE:

  • Single command, no need for two operations
  • Automatically preserves Pgbouncer and Grafana configuration
  • Automatically loads baseline initialization script after execution

Clone Database

You can use an existing database as a template to create a new database, enabling quick replication of database structures.

Basic Clone

pg_databases:
  # 1. First define the template database
  - name: app_template
    owner: dbuser_app
    schemas: [core, api]
    extensions: [postgis, pg_trgm]
    baseline: app_schema.sql

  # 2. Create business database using template
  - name: app_prod
    template: app_template
    owner: dbuser_app

Specify Clone Strategy (PG15+)

- name: app_staging
  template: app_template
  strategy: FILE_COPY        # Or WAL_LOG
  owner: dbuser_app

| Strategy  | Description           | Use Case                                             |
|-----------|-----------------------|------------------------------------------------------|
| FILE_COPY | Direct data file copy | Large templates, general scenarios                   |
| WAL_LOG   | Copy via WAL logs     | Small templates, doesn’t block template connections  |

Use Custom Template Database

When using non-system templates (not template0/template1), Pigsty automatically terminates connections to the template database to allow cloning.

- name: new_db
  template: existing_db      # Use existing business database as template
  owner: dbuser_app

Mark as Template Database

By default, only superusers or database owners can use regular databases as templates. Using is_template: true allows any user with CREATEDB privilege to clone:

- name: shared_template
  is_template: true          # Allow any user with CREATEDB privilege to clone
  owner: dbuser_app

Use ICU Locale Provider

When using the icu locale provider, you must specify template: template0:

- name: myapp_icu
  template: template0        # Must use template0
  locale_provider: icu
  icu_locale: en-US
  encoding: UTF8

Connection Pool Config

By default, all business databases are added to the Pgbouncer connection pool.

Database-Level Connection Pool Parameters

- name: myapp
  pgbouncer: true              # Include in connection pool (default true)
  pool_mode: transaction       # Pool mode: transaction/session/statement
  pool_size: 64                # Default pool size
  pool_size_min: 0             # Minimum pool size
  pool_reserve: 32             # Reserved connections
  pool_connlimit: 100          # Maximum database connections
  pool_auth_user: dbuser_meta  # Auth query user

Generated Configuration

The generated configuration file is located at /etc/pgbouncer/database.txt:

myapp                       = host=/var/run/postgresql pool_mode=transaction pool_size=64

Hide Databases

Some internal databases may not need connection pool access:

- name: internal_db
  pgbouncer: false           # Don't add to connection pool

Pool Mode Explanation

| Mode        | Description                                        | Use Case                              |
|-------------|----------------------------------------------------|---------------------------------------|
| transaction | Return connection after transaction ends (default) | Most OLTP applications                |
| session     | Return connection after session ends               | Applications requiring session state  |
| statement   | Return connection after statement ends             | Stateless queries                     |

Locale Provider

PostgreSQL 15+ introduced the locale_provider parameter, supporting different locale implementations.

Use ICU Provider (PG15+)

- name: myapp_icu
  template: template0        # ICU must use template0
  locale_provider: icu
  icu_locale: en-US          # ICU locale rules
  encoding: UTF8

Use Builtin Provider (PG17+)

- name: myapp_builtin
  template: template0
  locale_provider: builtin
  builtin_locale: C.UTF-8    # Builtin locale rules
  encoding: UTF8

ICU Collation Rules (PG16+)

- name: myapp_custom_icu
  template: template0
  locale_provider: icu
  icu_locale: en-US
  icu_rules: '&V << w <<< W'  # Custom ICU collation rules

Provider Comparison

| Provider | Version Requirement | Features                                 |
|----------|---------------------|------------------------------------------|
| libc     | -                   | Traditional, depends on OS               |
| icu      | PG15+               | Cross-platform consistent, feature-rich  |
| builtin  | PG17+               | Most efficient C/C.UTF-8 collation       |

Quick Reference

Common Commands

| Operation                      | Command                                                   |
|--------------------------------|-----------------------------------------------------------|
| Create database                | bin/pgsql-db <cls> <dbname>                               |
| Modify database                | bin/pgsql-db <cls> <dbname>                               |
| Delete database                | Set state: absent then run bin/pgsql-db <cls> <dbname>    |
| Rebuild database               | Set state: recreate then run bin/pgsql-db <cls> <dbname>  |
| List databases                 | psql -c '\l'                                              |
| View connection pool databases | cat /etc/pgbouncer/database.txt                           |

Common Operation Examples

# Create basic database
- name: myapp
  owner: dbuser_myapp
  comment: my application database

# Create database with extensions
- name: geodata
  owner: dbuser_geo
  extensions: [postgis, postgis_topology]

# Private database with limited connections
- name: secure_db
  owner: dbuser_secure
  revokeconn: true
  connlimit: 10

# Set database-level parameters
- name: analytics
  owner: dbuser_analyst
  parameters:
    work_mem: '512MB'
    statement_timeout: '5min'

# Use ICU locale
- name: i18n_db
  template: template0
  locale_provider: icu
  icu_locale: zh-Hans
  encoding: UTF8

# Delete database
- name: old_db
  state: absent

# Rebuild database
- name: test_db
  state: recreate
  baseline: test_init.sql

Execution Flow

bin/pgsql-db executes these steps in order:

  1. Validate - Check dbname parameter and database definition
  2. Delete (if state=absent/recreate) - Execute DROP DATABASE
  3. Create (if state=create/recreate) - Execute CREATE DATABASE
  4. Configure - Execute ALTER DATABASE to set properties
  5. Initialize - Create schemas, install extensions, execute baseline
  6. Register - Update Pgbouncer and Grafana data sources

For database access permissions, refer to ACL: Database Privileges.

3 - HBA Management

PostgreSQL and Pgbouncer HBA rule management operations: refresh, reload, verify, and troubleshoot.

HBA rule changes require re-rendering configuration files and reloading services. This article covers HBA rule daily management operations.


Quick Reference

| Operation                  | Command                                                   |
|----------------------------|-----------------------------------------------------------|
| Refresh cluster HBA        | bin/pgsql-hba <cls>                                       |
| Refresh specific instances | bin/pgsql-hba <cls> <ip>...                               |
| Refresh PostgreSQL only    | ./pgsql.yml -l <cls> -t pg_hba,pg_reload                  |
| Refresh Pgbouncer only     | ./pgsql.yml -l <cls> -t pgbouncer_hba,pgbouncer_reload    |
| View current HBA           | psql -c "TABLE pg_hba_file_rules"                         |
| Verify HBA config          | psql -c "SELECT pg_reload_conf()"                         |

Refresh HBA Rules

After modifying HBA rules in pigsty.yml, you need to re-render configuration files and reload services.

Using the Admin Script

The recommended approach is using the bin/pgsql-hba script to refresh PostgreSQL and Pgbouncer HBA in one step:

# Refresh entire cluster's HBA rules
bin/pgsql-hba pg-meta

# Refresh specific instances (multiple IPs separated by spaces)
bin/pgsql-hba pg-meta 10.10.10.10
bin/pgsql-hba pg-meta 10.10.10.11 10.10.10.12

# View script help
bin/pgsql-hba --help

The script internally executes:

./pgsql.yml -l <cluster> -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload

Using Ansible Playbook

Directly use the relevant tags from the pgsql.yml playbook:

# Refresh PostgreSQL HBA and reload
./pgsql.yml -l pg-meta -t pg_hba,pg_reload

# Refresh Pgbouncer HBA and reload
./pgsql.yml -l pg-meta -t pgbouncer_hba,pgbouncer_reload

# Refresh both
./pgsql.yml -l pg-meta -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload

# Use extra variables to force reload
./pgsql.yml -l pg-meta -e pg_reload=true -t pg_hba,pg_reload

| Tag              | Description                                         |
|------------------|-----------------------------------------------------|
| pg_hba           | Render PostgreSQL HBA configuration file            |
| pg_reload        | Reload PostgreSQL config (requires pg_reload=true)  |
| pgbouncer_hba    | Render Pgbouncer HBA configuration file             |
| pgbouncer_reload | Reload Pgbouncer config                             |

Configuration File Locations

HBA configuration files are rendered by Ansible:

| Service    | Config File Path             | Template File                       |
|------------|------------------------------|-------------------------------------|
| PostgreSQL | /pg/data/pg_hba.conf         | roles/pgsql/templates/pg_hba.conf   |
| Pgbouncer  | /etc/pgbouncer/pgb_hba.conf  | roles/pgsql/templates/pgbouncer.hba |

Warning: Don’t edit these files directly—they will be overwritten the next time a playbook runs. All changes should be made in pigsty.yml.


Verify HBA Rules

View Currently Active HBA Rules

# Use psql to view PostgreSQL HBA rules
psql -c "TABLE pg_hba_file_rules"

# Or view the config file directly
cat /pg/data/pg_hba.conf

# View Pgbouncer HBA rules
cat /etc/pgbouncer/pgb_hba.conf

Check HBA Configuration Syntax

# PostgreSQL config reload (validates syntax)
psql -c "SELECT pg_reload_conf()"

# If there are syntax errors, check the logs
tail -f /pg/log/postgresql-*.log

Test Connection Authentication

# Test connection for specific user from specific address
psql -h <host> -p 5432 -U <user> -d <database> -c "SELECT 1"

# See which HBA rule matches the connection
psql -c "SELECT * FROM pg_hba_file_rules WHERE database @> ARRAY['<dbname>']::text[]"

Common Management Scenarios

Add New HBA Rule

  1. Edit pigsty.yml, add rule to the cluster’s pg_hba_rules:
pg-meta:
  vars:
    pg_hba_rules:
      - {user: new_user, db: new_db, addr: '192.168.1.0/24', auth: pwd, title: 'new app access'}
  2. Execute refresh:
bin/pgsql-hba pg-meta

Emergency IP Block

When detecting a malicious IP, quickly add a blocklist rule:

  1. Add high-priority (order: 0) deny rule:
pg_hba_rules:
  - {user: all, db: all, addr: '10.1.1.100/32', auth: deny, order: 0, title: 'emergency block'}
  2. Refresh immediately:
bin/pgsql-hba pg-meta

Role-Based Rules

Configure different HBA rules for primary and replica:

pg_hba_rules:
  # Only primary allows write users
  - {user: writer, db: all, addr: intra, auth: pwd, role: primary, title: 'writer on primary'}

  # Replicas allow read-only users
  - {user: reader, db: all, addr: world, auth: ssl, role: replica, title: 'reader on replica'}

After refresh, rules are automatically enabled or disabled based on the instance’s pg_role.

Refresh HBA After Cluster Expansion

When new instances are added to the cluster, rules using addr: cluster need refresh to include new members:

# Add new instance
./pgsql.yml -l 10.10.10.14

# Refresh all instances' HBA (includes new member IPs)
bin/pgsql-hba pg-meta

Refresh HBA After Failover

After Patroni failover, instance pg_role may not match the configuration. If HBA rules use role filtering:

  1. Update role definitions in pigsty.yml
  2. Refresh HBA rules
# Refresh after updating roles in config file
bin/pgsql-hba pg-meta

Troubleshooting

Connection Rejected

Symptom: FATAL: no pg_hba.conf entry for host "x.x.x.x", user "xxx", database "xxx"

Troubleshooting steps:

  1. Check current HBA rules:
psql -c "TABLE pg_hba_file_rules"
  2. Confirm whether the client IP, username, and database match any rule

  3. Check rule order (first match wins)

  4. Add the corresponding rule and refresh

Authentication Failed

Symptom: FATAL: password authentication failed for user "xxx"

Troubleshooting steps:

  1. Confirm password is correct
  2. Check password encryption method (pg_pwd_enc) compatibility with client
  3. Check if user exists: \du or SELECT * FROM pg_roles WHERE rolname = 'xxx'

HBA Rules Not Taking Effect

Troubleshooting steps:

  1. Confirm refresh command was executed
  2. Check if Ansible execution succeeded
  3. Confirm PostgreSQL reloaded:
psql -c "SELECT pg_reload_conf()"
  4. Check if the config file was updated:
head -20 /pg/data/pg_hba.conf

Rule Order Issues

HBA uses first-match-wins logic. If rules aren’t working as expected:

  1. Check order values
  2. Use psql -c "TABLE pg_hba_file_rules" to view actual order
  3. Adjust order values or rule positions

While you can directly edit /pg/data/pg_hba.conf and reload, this is not recommended:

# Direct edit (not recommended)
vi /pg/data/pg_hba.conf

# Reload config
psql -c "SELECT pg_reload_conf()"
# Or
pg_ctl reload -D /pg/data
# Or
systemctl reload postgresql

Problem: Manual changes will be overwritten the next time an Ansible playbook runs.

Correct approach: Always modify in pigsty.yml, then run bin/pgsql-hba to refresh.


Pgbouncer HBA Management

Pgbouncer HBA management is similar to PostgreSQL, with some differences:

Configuration Differences

  • Config file: /etc/pgbouncer/pgb_hba.conf
  • Doesn’t support db: replication
  • Authentication method: local connections use peer instead of ident

Refresh Commands

# Refresh Pgbouncer HBA only
./pgsql.yml -l pg-meta -t pgbouncer_hba,pgbouncer_reload

# Or use unified script (refreshes both PostgreSQL and Pgbouncer)
bin/pgsql-hba pg-meta

View Pgbouncer HBA

cat /etc/pgbouncer/pgb_hba.conf

Best Practices

  1. Always manage in config files: Don’t directly edit pg_hba.conf—all changes through pigsty.yml
  2. Verify in test environment first: HBA changes can cause connection issues—verify in test environment first
  3. Use order to control priority: Blocklist rules use order: 0 to ensure priority matching
  4. Refresh promptly: Refresh HBA after adding/removing instances or failover
  5. Principle of least privilege: Only open necessary access—avoid addr: world + auth: trust
  6. Monitor authentication failures: Watch for authentication failures in pg_stat_activity
  7. Backup configuration: Backup pigsty.yml before important changes

Command Quick Reference

# Refresh HBA (recommended)
bin/pgsql-hba <cluster>

# View PostgreSQL HBA
psql -c "TABLE pg_hba_file_rules"
cat /pg/data/pg_hba.conf

# View Pgbouncer HBA
cat /etc/pgbouncer/pgb_hba.conf

# Reload PostgreSQL config
psql -c "SELECT pg_reload_conf()"

# Test connection
psql -h <host> -U <user> -d <db> -c "SELECT 1"

# View authentication failure logs
tail -f /pg/log/postgresql-*.log | grep -i auth

4 - SOP

Common PostgreSQL administration procedures in Pigsty for maintaining production database clusters.

This document organizes common PostgreSQL administration procedures in Pigsty for maintaining production database clusters.

Here are the standard operating procedures for common PostgreSQL administration tasks:


Cheatsheet

PGSQL playbooks and shortcuts:

bin/pgsql-add   <cls>                   # create pgsql cluster <cls>
bin/pgsql-user  <cls> <username>        # create pg user <username> on <cls>
bin/pgsql-db    <cls> <dbname>          # create pg database <dbname> on <cls>
bin/pgsql-svc   <cls> [...ip]           # reload pg service of cluster <cls>
bin/pgsql-hba   <cls> [...ip]           # reload postgres/pgbouncer HBA rules of cluster <cls>
bin/pgsql-add   <cls> [...ip]           # append replicas for cluster <cls>
bin/pgsql-rm    <cls> [...ip]           # remove replicas from cluster <cls>
bin/pgsql-rm    <cls>                   # remove pgsql cluster <cls>

Patroni admin command and shortcuts:

pg list        <cls>                    # print cluster info
pg edit-config <cls>                    # edit cluster config
pg reload      <cls> [ins]              # reload cluster config
pg restart     <cls> [ins]              # restart pgsql cluster
pg reinit      <cls> [ins]              # reinit cluster members
pg pause       <cls>                    # enter maintenance mode (no auto failover)
pg resume      <cls>                    # exit maintenance mode
pg switchover  <cls>                    # switchover on cluster <cls> (primary is healthy)
pg failover    <cls>                    # failover on cluster <cls> (primary failed)

pgBackRest backup/restore command and shortcuts:

pb info                                 # print pgbackrest repo info
pg-backup                               # make a backup, incr, or full backup if necessary
pg-backup full                          # make a full backup
pg-backup diff                          # make a differential backup
pg-backup incr                          # make an incremental backup
pg-pitr -i                              # restore to most recent backup completion time (not common)
pg-pitr --time="2022-12-30 14:44:44+08" # restore to specific time point (e.g., in case of table/database drop)
pg-pitr --name="my-restore-point"       # restore to named restore point created by pg_create_restore_point
pg-pitr --lsn="0/7C82CB8" -X            # restore immediately before LSN
pg-pitr --xid="1234567" -X -P           # restore immediately before specific transaction ID, then promote to primary
pg-pitr --backup=latest                 # restore to latest backup set
pg-pitr --backup=20221108-105325        # restore to specific backup set, can be checked with pgbackrest info

Systemd components quick reference:

systemctl stop patroni                  # start stop restart reload
systemctl stop pgbouncer                # start stop restart reload
systemctl stop pg_exporter              # start stop restart reload
systemctl stop pgbouncer_exporter       # start stop restart reload
systemctl stop node_exporter            # start stop restart
systemctl stop haproxy                  # start stop restart reload
systemctl stop vip-manager              # start stop restart reload
systemctl stop postgres                 # only when patroni_mode == 'remove'

Create Cluster

To create a new Postgres cluster, first define it in the inventory, then initialize:

bin/node-add <cls>                # init nodes for cluster <cls>           # ./node.yml  -l <cls>
bin/pgsql-add <cls>               # init pgsql instances of cluster <cls>  # ./pgsql.yml -l <cls>

Note: PGSQL module requires managed nodes. Use bin/node-add to manage nodes first.
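
For example, a minimal three-node cluster definition in the inventory might look like this (IPs and names are illustrative):

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica }
  vars: { pg_cluster: pg-test }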

Example: Create Cluster

asciicast


Create User

To create a new business user on an existing Postgres cluster, add the user definition to all.children.<cls>.pg_users, then create it using:

bin/pgsql-user <cls> <username>   # ./pgsql-user.yml -l <cls> -e username=<username>
Example: Create Business User

asciicast


Delete User

To delete a user from an existing Postgres cluster, set the user’s state to absent in the definition, then execute:

bin/pgsql-user <cls> <username>   # ./pgsql-user.yml -l <cls> -e username=<username>

For example, to delete dbuser_test user:

pg_users:
  - name: dbuser_test
    state: absent

The deletion process will:

  1. Use the pg-drop-role script to safely delete the user
  2. Automatically disable user login and terminate active connections
  3. Automatically transfer database/tablespace ownership to postgres
  4. Automatically handle object ownership and permissions in all databases
  5. Revoke all role memberships
  6. Create an audit log for traceability
  7. Remove the user from the Pgbouncer user list (if previously added)
  8. Reload Pgbouncer configuration

Protected System Users:

The following system users cannot be deleted via state: absent and will be automatically skipped:

  • postgres (superuser)
  • replicator (or the user configured in pg_replication_username)
  • dbuser_dba (or the user configured in pg_admin_username)
  • dbuser_monitor (or the user configured in pg_monitor_username)
Example: pg-drop-role Script Usage
# Check user dependencies (read-only operation)
pg-drop-role dbuser_old --check

# Preview deletion operation (don't actually execute)
pg-drop-role dbuser_old --dry-run -v

# Delete user, transfer objects to postgres
pg-drop-role dbuser_old

# Delete user, transfer objects to specified user
pg-drop-role dbuser_old dbuser_new

# Force delete (terminate active connections)
pg-drop-role dbuser_old --force

Create Database

To create a new database on an existing Postgres cluster, add the database definition to all.children.<cls>.pg_databases, then create the database as follows:

bin/pgsql-db <cls> <dbname>       # ./pgsql-db.yml -l <cls> -e dbname=<dbname>

Note: If the database specifies a non-default owner, the owner user must already exist, otherwise you must Create User first.

Example: Create Business Database

asciicast


Reload Service

Services are access points exposed by PostgreSQL (reachable via PGURL), served by HAProxy on host nodes.

Use this task when cluster membership changes, for example: appending/removing replicas, switchover/failover, exposing new services, or updating existing service configurations (e.g., LB weights).

To create new services or reload existing services on entire proxy cluster or specific instances:

bin/pgsql-svc <cls>               # pgsql.yml -l <cls> -t pg_service -e pg_reload=true
bin/pgsql-svc <cls> [ip...]       # pgsql.yml -l ip... -t pg_service -e pg_reload=true
Example: Reload PG Service to Remove an Instance

asciicast


Reload HBA

When your Postgres/Pgbouncer HBA rules change, you may need to reload HBA to apply the changes.

If you have any role-specific HBA rules, or IP address ranges referencing cluster member aliases, you may also need to reload HBA after switchover/cluster scaling.

To reload postgres and pgbouncer HBA rules on entire cluster or specific instances:

bin/pgsql-hba <cls>               # pgsql.yml -l <cls> -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload -e pg_reload=true
bin/pgsql-hba <cls> [ip...]       # pgsql.yml -l ip... -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload -e pg_reload=true
Example: Reload Cluster HBA Rules

asciicast


Config Cluster

To change configuration of an existing Postgres cluster, you need to issue control commands on the admin node using the admin user (the user who installed Pigsty, with nopass ssh/sudo):

Alternatively, on any node in the database cluster, using dbsu (default postgres), you can execute admin commands, but only for this cluster.

pg edit-config <cls>              # interactive config a cluster with patronictl

Change patroni parameters and postgresql.parameters, save and apply changes according to prompts.

Example: Non-Interactive Cluster Configuration

You can skip interactive mode and override postgres parameters using the -p option, for example:

pg edit-config -p log_min_duration_statement=1000 pg-test
pg edit-config --force -p shared_preload_libraries='timescaledb, pg_cron, pg_stat_statements, auto_explain'
Example: Change Cluster Config Using Patroni REST API

You can also use the Patroni REST API to change configuration non-interactively, for example:

$ curl -s 10.10.10.11:8008/config | jq .  # get current config
$ curl -u 'postgres:Patroni.API' \
        -d '{"postgresql":{"parameters": {"log_min_duration_statement":200}}}' \
        -s -X PATCH http://10.10.10.11:8008/config | jq .

Note: Patroni sensitive API access (e.g., restart) is restricted to requests from infra/admin nodes, with HTTP basic authentication (username/password) and optional HTTPS protection.

Example: Configure Cluster with patronictl

asciicast


Append Replica

To add a new replica to an existing PostgreSQL cluster, add its definition to the inventory all.children.<cls>.hosts, then:

bin/node-add <ip>                 # add node <ip> to Pigsty management
bin/pgsql-add <cls> <ip>          # init <ip> as new replica of cluster <cls>

This will add node <ip> to pigsty and initialize it as a replica of cluster <cls>.

Cluster services will be reloaded to accept the new member.

Example: Add Replica to pg-test

asciicast

For example, if you want to add pg-test-3 / 10.10.10.13 to existing cluster pg-test, first update the inventory:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary } # existing member
    10.10.10.12: { pg_seq: 2, pg_role: replica } # existing member
    10.10.10.13: { pg_seq: 3, pg_role: replica } # <--- new member
  vars: { pg_cluster: pg-test }

Then apply the changes as follows:

bin/node-add          10.10.10.13   # add node to pigsty
bin/pgsql-add pg-test 10.10.10.13   # init new replica for cluster pg-test on 10.10.10.13

This is similar to cluster initialization but works on a single instance:

[ OK ] Initialize instance 10.10.10.11 in pgsql cluster 'pg-test':
[WARN]   Reminder: add nodes to pigsty first, then install module 'pgsql'
[HINT]     $ bin/node-add  10.10.10.11  # run this first except for infra nodes
[WARN]   Init instance from cluster:
[ OK ]     $ ./pgsql.yml -l '10.10.10.11,&pg-test'
[WARN]   Reload pg_service on existing instances:
[ OK ]     $ ./pgsql.yml -l 'pg-test,!10.10.10.11' -t pg_service

Remove Replica

To remove a replica from an existing PostgreSQL cluster:

bin/pgsql-rm <cls> <ip...>        # ./pgsql-rm.yml -l <ip>

This will remove instance <ip> from cluster <cls>. Cluster services will be reloaded to remove the instance from load balancers.

Example: Remove Replica from pg-test

asciicast

For example, if you want to remove pg-test-3 / 10.10.10.13 from existing cluster pg-test:

bin/pgsql-rm pg-test 10.10.10.13  # remove pgsql instance 10.10.10.13 from pg-test
bin/node-rm  10.10.10.13          # remove node from pigsty (optional)
vi pigsty.yml                     # remove instance definition from inventory
bin/pgsql-svc pg-test             # refresh pg_service on existing instances to remove from load balancer
[ OK ] Remove pgsql instance 10.10.10.13 from 'pg-test':
[WARN]   Remove instance from cluster:
[ OK ]     $ ./pgsql-rm.yml -l '10.10.10.13,&pg-test'

And remove the instance definition from inventory:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica } # <--- remove this line after execution
  vars: { pg_cluster: pg-test }

Finally, you can reload PG service to remove the instance from load balancers:

bin/pgsql-svc pg-test             # reload service on pg-test

Remove Cluster

To remove an entire Postgres cluster, simply run:

bin/pgsql-rm <cls>                # ./pgsql-rm.yml -l <cls>
Example: Remove Cluster

asciicast

Example: Force Remove Cluster

Note: If pg_safeguard is configured for this cluster (or globally set to true), pgsql-rm.yml will abort to avoid accidental cluster removal.

You can explicitly override it with playbook command line parameters to force removal:

./pgsql-rm.yml -l pg-meta -e pg_safeguard=false    # force remove pg cluster pg-meta

Switchover

You can use the patroni command line tool to perform PostgreSQL cluster switchover.

pg switchover <cls>   # interactive mode, you can skip the wizard with the following parameter combination
pg switchover --leader pg-test-1 --candidate=pg-test-2 --scheduled=now --force pg-test
Example: pg-test Switchover

asciicast

$ pg switchover pg-test
Master [pg-test-1]:
Candidate ['pg-test-2', 'pg-test-3'] []: pg-test-2
When should the switchover take place (e.g. 2022-12-26T07:39 )  [now]: now
Current cluster topology
+ Cluster: pg-test (7181325041648035869) -----+----+-----------+-----------------+
| Member    | Host        | Role    | State   | TL | Lag in MB | Tags            |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-1 | 10.10.10.11 | Leader  | running |  1 |           | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-2 | 10.10.10.12 | Replica | running |  1 |         0 | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-3 | 10.10.10.13 | Replica | running |  1 |         0 | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
Are you sure you want to switchover cluster pg-test, demoting current master pg-test-1? [y/N]: y
2022-12-26 06:39:58.02468 Successfully switched over to "pg-test-2"
+ Cluster: pg-test (7181325041648035869) -----+----+-----------+-----------------+
| Member    | Host        | Role    | State   | TL | Lag in MB | Tags            |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-1 | 10.10.10.11 | Replica | stopped |    |   unknown | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-2 | 10.10.10.12 | Leader  | running |  1 |           | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-3 | 10.10.10.13 | Replica | running |  1 |         0 | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+

To perform this via Patroni API (e.g., switch primary from instance 2 to instance 1 at a specified time):

curl -u 'postgres:Patroni.API' \
  -d '{"leader":"pg-test-2", "candidate": "pg-test-1","scheduled_at":"2022-12-26T14:47+08"}' \
  -s -X POST http://10.10.10.11:8008/switchover

After either switchover or failover, you need to refresh services and HBA rules after cluster membership changes. You should complete this promptly (e.g., within a few hours or a day) after the change:

bin/pgsql-svc <cls>
bin/pgsql-hba <cls>

Backup Cluster

To create backups using pgBackRest, run the following commands as local dbsu (default postgres):

pg-backup       # make a backup, incremental or full if necessary
pg-backup full  # make a full backup
pg-backup diff  # make a differential backup
pg-backup incr  # make an incremental backup
pb info         # print backup info (pgbackrest info)

See Backup & Restore for more information.

Example: Create Backup

asciicast

Example: Create Scheduled Backup Task

You can add crontab to node_crontab to specify your backup strategy.

# Full backup daily at 1 AM
- '00 01 * * * postgres /pg/bin/pg-backup full'

# Full backup on Monday at 1 AM, incremental backups on the other days of the week
- '00 01 * * 1 postgres /pg/bin/pg-backup full'
- '00 01 * * 2,3,4,5,6,7 postgres /pg/bin/pg-backup'

Restore Cluster

To restore a cluster to a previous point in time (PITR), run the Pigsty helper script pg-pitr as local dbsu user (default postgres):

pg-pitr -i                              # restore to most recent backup completion time (not common)
pg-pitr --time="2022-12-30 14:44:44+08" # restore to specific time point (e.g., in case of table/database drop)
pg-pitr --name="my-restore-point"       # restore to named restore point created by pg_create_restore_point
pg-pitr --lsn="0/7C82CB8" -X            # restore immediately before LSN
pg-pitr --xid="1234567" -X -P           # restore immediately before specific transaction ID, then promote cluster to primary
pg-pitr --backup=latest                 # restore to latest backup set
pg-pitr --backup=20221108-105325        # restore to specific backup set, can be listed with pgbackrest info

The command will output an operations manual, follow the instructions. See Backup & Restore - PITR for details.

Example: PITR Using Raw pgBackRest Commands
# Restore to latest available point (e.g., hardware failure)
pgbackrest --stanza=pg-meta restore

# PITR to specific time point (e.g., accidental table drop)
pgbackrest --stanza=pg-meta --type=time --target="2022-11-08 10:58:48" \
   --target-action=promote restore

# Restore specific backup point, then promote (or pause|shutdown)
pgbackrest --stanza=pg-meta --type=immediate --target-action=promote \
  --set=20221108-105325F_20221108-105938I restore

Adding Packages

To add new RPM packages, add them to repo_packages and repo_url_packages.

Use ./infra.yml -t repo_build subtask to rebuild local repo on Infra node. Then you can install these packages using ansible’s package module:

ansible pg-test -b -m package -a "name=pg_cron_15,topn_15,pg_stat_monitor_15*"  # install some packages with ansible
Example: Manually Update Packages in Local Repo
# Add upstream repo on infra/admin node, then manually download required packages
cd ~/pigsty; ./infra.yml -t repo_upstream,repo_cache # add upstream repo (internet)
cd /www/pigsty;  repotrack "some_new_package_name"   # download latest RPM packages

# Update local repo metadata
cd ~/pigsty; ./infra.yml -t repo_create              # recreate local repo
./node.yml -t node_repo                              # refresh YUM/APT cache on all nodes

# You can also manually refresh YUM/APT cache on nodes using Ansible
ansible all -b -a 'yum clean all'                    # clean node repo cache
ansible all -b -a 'yum makecache'                    # rebuild yum/apt cache from new repo
ansible all -b -a 'apt clean'                        # clean APT cache (Ubuntu/Debian)
ansible all -b -a 'apt update'                       # rebuild APT cache (Ubuntu/Debian)

For example, you can install or upgrade packages as follows:

ansible pg-test -b -m package -a "name=postgresql15* state=latest"

Install Extension

If you want to install extensions on a PostgreSQL cluster, add them to pg_extensions, then execute:

./pgsql.yml -t pg_extension     # install extensions

Some extensions need to be loaded in shared_preload_libraries to take effect. You can add them to pg_libs, or configure an existing cluster.

Finally, execute CREATE EXTENSION <extname>; on the cluster’s primary to complete extension installation.
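
For example, a hedged inventory snippet that installs and preloads an extension for a cluster might look like this:

pg-test:
  vars:
    pg_cluster: pg-test
    pg_extensions: [ pg_cron ]                              # packages to install on cluster nodes
    pg_libs: 'pg_cron, pg_stat_statements, auto_explain'    # preload via shared_preload_libraries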

Example: Install pg_cron Extension on pg-test Cluster
ansible pg-test -b -m package -a "name=pg_cron_15"          # install pg_cron package on all nodes
# Add pg_cron to shared_preload_libraries
pg edit-config --force -p shared_preload_libraries='timescaledb, pg_cron, pg_stat_statements, auto_explain'
pg restart --force pg-test                                  # restart cluster
psql -h pg-test -d postgres -c 'CREATE EXTENSION pg_cron;'  # install pg_cron on primary

For more details, see PGSQL Extension Installation.


Minor Upgrade

To perform minor version upgrade/downgrade, first add packages to the local repo: latest PG minor version RPM/DEB.

First perform rolling upgrade/downgrade on all replicas, then perform cluster switchover to upgrade/downgrade the primary.

ansible <cls> -b -a "yum upgrade/downgrade -y <pkg>"    # upgrade/downgrade packages
pg restart --force <cls>                                # restart cluster
Example: Downgrade PostgreSQL 15.2 to 15.1

Add 15.1 packages to repo and refresh nodes’ yum/apt cache:

cd ~/pigsty; ./infra.yml -t repo_upstream               # add upstream repo
cd /www/pigsty; repotrack postgresql15-*-15.1           # add 15.1 packages to yum repo
cd ~/pigsty; ./infra.yml -t repo_create                 # rebuild repo metadata
ansible pg-test -b -a 'yum clean all'                   # clean node repo cache
ansible pg-test -b -a 'yum makecache'                   # rebuild yum cache from new repo

# For Ubuntu/Debian users, use apt instead of yum
ansible pg-test -b -a 'apt clean'                       # clean node repo cache
ansible pg-test -b -a 'apt update'                      # rebuild apt cache from new repo

Execute downgrade and restart cluster:

ansible pg-test -b -a "yum downgrade -y postgresql15*"  # downgrade packages
pg restart --force pg-test                              # restart entire cluster to complete upgrade
Example: Upgrade PostgreSQL 15.1 Back to 15.2

This time we’ll do a rolling upgrade:

ansible pg-test -b -a "yum upgrade -y postgresql15*"    # upgrade packages (or apt upgrade)
ansible pg-test -b -a '/usr/pgsql/bin/pg_ctl --version' # check binary version is 15.2
pg restart --role replica --force pg-test               # restart replicas
pg switchover --leader pg-test-1 --candidate=pg-test-2 --scheduled=now --force pg-test    # switchover
pg restart --role primary --force pg-test               # restart primary

Major Upgrade

The easiest way to perform a major upgrade is to create a new cluster using the new version, then perform online migration through logical replication and blue-green deployment.

You can also perform in-place major upgrades. When using only the database kernel itself, this is not complicated - use PostgreSQL’s built-in pg_upgrade:

Suppose you want to upgrade PostgreSQL major version from 14 to 15. First add packages to the repo and ensure core extension plugins are installed with the same version numbers on both major versions.

./pgsql.yml -t pg_pkg -e pg_version=15                         # install pg 15 packages
sudo su - postgres; mkdir -p /data/postgres/pg-meta-15/data/   # prepare directory for 15
pg_upgrade -b /usr/pgsql-14/bin/ -B /usr/pgsql-15/bin/ -d /data/postgres/pg-meta-14/data/ -D /data/postgres/pg-meta-15/data/ -v -c # precheck
pg_upgrade -b /usr/pgsql-14/bin/ -B /usr/pgsql-15/bin/ -d /data/postgres/pg-meta-14/data/ -D /data/postgres/pg-meta-15/data/ --link -j8 -v   # perform the upgrade
rm -rf /usr/pgsql; ln -s /usr/pgsql-15 /usr/pgsql;             # fix binary link
mv /data/postgres/pg-meta-14 /data/postgres/pg-meta-15         # rename data directory
rm -rf /pg; ln -s /data/postgres/pg-meta-15 /pg                # fix data directory link

5 - Cluster Management

Standard operation guide for creating/destroying PostgreSQL clusters and scaling existing clusters.

Create Cluster

To create a new Postgres cluster, first define it in the inventory, then initialize:

bin/node-add <cls>                # init nodes for cluster <cls>           # ./node.yml  -l <cls>
bin/pgsql-add <cls>               # init pgsql instances of cluster <cls>  # ./pgsql.yml -l <cls>

Note: PGSQL module requires managed nodes. Use bin/node-add to manage nodes first.

Example: Create Cluster

asciicast


Create User

To create a new business user on an existing Postgres cluster, add the user definition to all.children.<cls>.pg_users, then create it using:

bin/pgsql-user <cls> <username>   # ./pgsql-user.yml -l <cls> -e username=<username>
Example: Create Business User

asciicast


Create Database

To create a new database on an existing Postgres cluster, add the database definition to all.children.<cls>.pg_databases, then create the database as follows:

bin/pgsql-db <cls> <dbname>       # ./pgsql-db.yml -l <cls> -e dbname=<dbname>

Note: If the database specifies a non-default owner, the owner user must already exist, otherwise you must Create User first.

Example: Create Business Database

asciicast


Reload Service

Services are access points exposed by PostgreSQL (reachable via PGURL), served by HAProxy on host nodes.

Use this task when cluster membership changes, for example: appending/removing replicas, switchover/failover, exposing new services, or updating existing service configurations (e.g., LB weights).

To create new services or reload existing services on entire proxy cluster or specific instances:

bin/pgsql-svc <cls>               # pgsql.yml -l <cls> -t pg_service -e pg_reload=true
bin/pgsql-svc <cls> [ip...]       # pgsql.yml -l ip... -t pg_service -e pg_reload=true
Example: Reload PG Service to Remove an Instance

asciicast


Reload HBA

When your Postgres/Pgbouncer HBA rules change, you may need to reload HBA to apply the changes.

If you have any role-specific HBA rules, or IP address ranges referencing cluster member aliases, you may also need to reload HBA after switchover/cluster scaling.

To reload postgres and pgbouncer HBA rules on entire cluster or specific instances:

bin/pgsql-hba <cls>               # pgsql.yml -l <cls> -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload -e pg_reload=true
bin/pgsql-hba <cls> [ip...]       # pgsql.yml -l ip... -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload -e pg_reload=true
Example: Reload Cluster HBA Rules

asciicast


Config Cluster

To change the configuration of an existing Postgres cluster, issue control commands from the admin node as the admin user (the user who installed Pigsty, with passwordless ssh/sudo). Alternatively, you can run admin commands as the dbsu user (postgres by default) on any node of the cluster, but only for that cluster:

pg edit-config <cls>              # interactive config a cluster with patronictl

Change patroni parameters and postgresql.parameters, save and apply changes according to prompts.

Example: Non-Interactive Cluster Configuration

You can skip interactive mode and override postgres parameters using the -p option, for example:

pg edit-config -p log_min_duration_statement=1000 pg-test
pg edit-config --force -p shared_preload_libraries='timescaledb, pg_cron, pg_stat_statements, auto_explain' pg-test
Example: Change Cluster Config Using Patroni REST API

You can also use the Patroni REST API to change configuration non-interactively, for example:

$ curl -s 10.10.10.11:8008/config | jq .  # get current config
$ curl -u 'postgres:Patroni.API' \
        -d '{"postgresql":{"parameters": {"log_min_duration_statement":200}}}' \
        -s -X PATCH http://10.10.10.11:8008/config | jq .

Note: Patroni sensitive API access (e.g., restart) is restricted to requests from infra/admin nodes, with HTTP basic authentication (username/password) and optional HTTPS protection.

Example: Configure Cluster with patronictl

asciicast


Append Replica

To add a new replica to an existing PostgreSQL cluster, add its definition to the inventory all.children.<cls>.hosts, then:

bin/node-add <ip>                 # add node <ip> to Pigsty management
bin/pgsql-add <cls> <ip>          # init <ip> as new replica of cluster <cls>

This will add node <ip> to pigsty and initialize it as a replica of cluster <cls>.

Cluster services will be reloaded to accept the new member.

Example: Add Replica to pg-test

asciicast

For example, if you want to add pg-test-3 / 10.10.10.13 to existing cluster pg-test, first update the inventory:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary } # existing member
    10.10.10.12: { pg_seq: 2, pg_role: replica } # existing member
    10.10.10.13: { pg_seq: 3, pg_role: replica } # <--- new member
  vars: { pg_cluster: pg-test }

Then apply the changes as follows:

bin/node-add          10.10.10.13   # add node to pigsty
bin/pgsql-add pg-test 10.10.10.13   # init new replica for cluster pg-test on 10.10.10.13

This is similar to cluster initialization but works on a single instance:

[ OK ] Initialize instance 10.10.10.13 in pgsql cluster 'pg-test':
[WARN]   Reminder: add nodes to pigsty first, then install module 'pgsql'
[HINT]     $ bin/node-add  10.10.10.13  # run this first except for infra nodes
[WARN]   Init instance from cluster:
[ OK ]     $ ./pgsql.yml -l '10.10.10.13,&pg-test'
[WARN]   Reload pg_service on existing instances:
[ OK ]     $ ./pgsql.yml -l 'pg-test,!10.10.10.13' -t pg_service

Remove Replica

To remove a replica from an existing PostgreSQL cluster:

bin/pgsql-rm <cls> <ip...>        # ./pgsql-rm.yml -l <ip>

This will remove instance <ip> from cluster <cls>. Cluster services will be reloaded to remove the instance from load balancers.

Example: Remove Replica from pg-test

asciicast

For example, if you want to remove pg-test-3 / 10.10.10.13 from existing cluster pg-test:

bin/pgsql-rm pg-test 10.10.10.13  # remove pgsql instance 10.10.10.13 from pg-test
bin/node-rm  10.10.10.13          # remove node from pigsty (optional)
vi pigsty.yml                     # remove instance definition from inventory
bin/pgsql-svc pg-test             # refresh pg_service on existing instances to remove from load balancer

The removal playbook prints a summary similar to:

[ OK ] Remove pgsql instance 10.10.10.13 from 'pg-test':
[WARN]   Remove instance from cluster:
[ OK ]     $ ./pgsql-rm.yml -l '10.10.10.13,&pg-test'

And remove the instance definition from inventory:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica } # <--- remove this line after execution
  vars: { pg_cluster: pg-test }

Finally, you can reload PG service to remove the instance from load balancers:

bin/pgsql-svc pg-test             # reload service on pg-test

Remove Cluster

To remove an entire Postgres cluster, simply run:

bin/pgsql-rm <cls>                # ./pgsql-rm.yml -l <cls>
Example: Remove Cluster

asciicast

Example: Force Remove Cluster

Note: If pg_safeguard is configured for this cluster (or globally set to true), pgsql-rm.yml will abort to avoid accidental cluster removal.

You can explicitly override it with playbook command line parameters to force removal:

./pgsql-rm.yml -l pg-meta -e pg_safeguard=false    # force remove pg cluster pg-meta

6 - User Management

Creating PostgreSQL users/roles, managing connection pool roles, refreshing expiration times, user password rotation

Creating Users

To create a new business user on an existing Postgres cluster, add the user definition to all.children.<cls>.pg_users, then create it using the following command:

bin/pgsql-user <cls> <username>   # ./pgsql-user.yml -l <cls> -e username=<username>

Example: Creating a business user

asciicast


Defining Users

Pigsty defines roles and users in database clusters through two configuration parameters:

  • pg_default_roles: Defines globally unified roles and users
  • pg_users: Defines business users and roles at the database cluster level

The former is used to define roles and users shared across the entire environment, while the latter defines business roles and users specific to individual clusters. Both have the same format, being arrays of user definition objects.

You can define multiple users/roles. They are created in order: global roles first, then cluster-level users, each in array order, so a group role should be defined before the users that belong to it.
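
For example (a hypothetical snippet with illustrative names), a shared group role must be declared before any user that joins it:

pg_users:
  - { name: dbrole_analytics ,login: false ,comment: shared analytics group role }                     # group role defined first
  - { name: dbuser_report ,password: Report.Pass ,roles: [dbrole_analytics] ,comment: report user }    # member can now reference it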

Below is the business user definition in the default cluster pg-meta in the Pigsty demo environment:

pg-meta:
  hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-meta
    pg_users:
      - {name: dbuser_meta     ,password: DBUser.Meta     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: pigsty admin user }
      - {name: dbuser_view     ,password: DBUser.Viewer   ,pgbouncer: true ,roles: [dbrole_readonly] ,comment: read-only viewer for meta database }
      - {name: dbuser_grafana  ,password: DBUser.Grafana  ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for grafana database    }
      - {name: dbuser_bytebase ,password: DBUser.Bytebase ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for bytebase database   }
      - {name: dbuser_kong     ,password: DBUser.Kong     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for kong api gateway    }
      - {name: dbuser_gitea    ,password: DBUser.Gitea    ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for gitea service       }
      - {name: dbuser_wiki     ,password: DBUser.Wiki     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for wiki.js service     }
      - {name: dbuser_noco     ,password: DBUser.Noco     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for nocodb service      }

Each user/role definition is an object that may include the following fields, using the dbuser_meta user as an example:

- name: dbuser_meta               # Required, `name` is the only mandatory field in a user definition
  password: DBUser.Meta           # Optional, password, can be a scram-sha-256 hash string or plaintext
  login: true                     # Optional, can log in by default
  superuser: false                # Optional, default is false, is this a superuser?
  createdb: false                 # Optional, default is false, can create databases?
  createrole: false               # Optional, default is false, can create roles?
  inherit: true                   # Optional, by default, can this role use inherited permissions?
  replication: false              # Optional, default is false, can this role perform replication?
  bypassrls: false                # Optional, default is false, can this role bypass row-level security?
  pgbouncer: true                 # Optional, default is false, add this user to the pgbouncer user list? (production users using connection pooling should explicitly set to true)
  connlimit: -1                   # Optional, user connection limit, default -1 disables limit
  expire_in: 3650                 # Optional, expiration time for this role: calculated as created time + n days (higher priority than expire_at)
  expire_at: '2030-12-31'         # Optional, time point when this role expires, specify a specific date using YYYY-MM-DD format string (lower priority than expire_in)
  comment: pigsty admin user      # Optional, description and comment string for this user/role
  roles: [dbrole_admin]           # Optional, default roles are: dbrole_{admin,readonly,readwrite,offline}
  parameters: {}                  # Optional, configure role-level database parameters for this role using `ALTER ROLE SET`
  pool_mode: transaction          # Optional, pgbouncer pool mode defaulting to transaction, at user level
  pool_connlimit: -1              # Optional, maximum database connections at user level, default -1 disables limit
  search_path: public             # Optional, role-level configuration parameters per the PostgreSQL documentation (e.g., the default search_path for this role)

  • The only required field is name, which should be a valid and unique username in the PostgreSQL cluster.
  • Roles don’t need a password, but for login-enabled business users, it’s usually necessary to specify a password.
  • password can be plaintext or a scram-sha-256 / md5 hash string; please avoid using plaintext passwords.
  • Users/roles are created sequentially in array order, so ensure role/group definitions come before members.
  • login, superuser, createdb, createrole, inherit, replication, bypassrls are boolean flags.
  • pgbouncer is disabled by default: to add business users to the pgbouncer user list, you should explicitly set it to true.

ACL System

Pigsty has a built-in, out-of-the-box access control / ACL system. You can easily use it by assigning the following four default roles to business users:

  • dbrole_readwrite: Role with global read-write access (production accounts primarily used by business should have database read-write permissions)
  • dbrole_readonly: Role with global read-only access (if other businesses want read-only access, they can use this role)
  • dbrole_admin: Role with DDL permissions (business administrators, scenarios requiring table creation in applications)
  • dbrole_offline: Role with restricted read-only access (can only access offline instances, typically for personal users)

If you want to redesign your own ACL system, consider customizing the relevant default role, privilege, and HBA parameters and templates.


Creating Users

Users and roles defined in pg_default_roles and pg_users will be automatically created in sequence during the PROVISION phase of cluster initialization. If you want to create users on an existing cluster, use the bin/pgsql-user tool: add the new user/role definition to all.children.<cls>.pg_users, then create the user with the following command:

bin/pgsql-user <cls> <username>    # pgsql-user.yml -l <cls> -e username=<username>

Unlike database creation, the user creation playbook is always idempotent: when the target user already exists, Pigsty modifies its attributes to match the configuration, so running it repeatedly on existing clusters typically won't cause issues.


Modifying Users

The method for modifying PostgreSQL user attributes is the same as Creating Users.

First, adjust your user definition, modify the attributes that need adjustment, then execute the following command to apply:

bin/pgsql-user <cls> <username>    # pgsql-user.yml -l <cls> -e username=<username>

Note that modifying a user does not delete and recreate it: Pigsty alters user attributes with the ALTER USER command, and grants newly listed roles with GRANT; it does not revoke existing privileges or group memberships.
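
Conceptually, applying a changed user definition boils down to statements like these (illustrative only; the actual SQL is generated by the playbook):

ALTER USER dbuser_meta PASSWORD 'DBUser.Meta' CONNECTION LIMIT 100 VALID UNTIL '2030-12-31';
GRANT dbrole_readonly TO dbuser_meta;   -- newly listed roles are granted; existing grants are not revoked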


Deleting Users

To delete a user, set its state to absent and execute the playbook:

pg_users:
  - name: dbuser_old
    state: absent

bin/pgsql-user <cls> dbuser_old

The deletion process will:

  1. Use the pg-drop-role script to safely delete the user
  2. Automatically disable user login and terminate active connections
  3. Automatically transfer database/tablespace ownership to postgres
  4. Automatically handle object ownership and permissions in all databases
  5. Revoke all role memberships
  6. Create an audit log for traceability
  7. Remove the user from the Pgbouncer user list (if previously added)
  8. Reload Pgbouncer configuration

Protected System Users:

The following system users cannot be deleted via state: absent and will be automatically skipped:

  • postgres (superuser)
  • replicator (or the user configured in pg_replication_username)
  • dbuser_dba (or the user configured in pg_admin_username)
  • dbuser_monitor (or the user configured in pg_monitor_username)

pg-drop-role Script

pg-drop-role is a safe user deletion script provided by Pigsty, located at /pg/bin/pg-drop-role.

Usage:

pg-drop-role <role_name> [successor_role] [options]

Common Options:

Option             Description
--check            Only check dependencies, don't execute deletion
--dry-run          Show SQL statements that would be executed, don't actually execute
--force            Force terminate active connections before deletion
-v, --verbose      Show verbose output
-h, --host         Database host
-p, --port         Database port

Examples:

# Check user dependencies (read-only operation)
pg-drop-role dbuser_old --check

# Preview deletion operation (don't actually execute)
pg-drop-role dbuser_old --dry-run -v

# Delete user, transfer objects to postgres
pg-drop-role dbuser_old

# Delete user, transfer objects to specified user
pg-drop-role dbuser_old dbuser_new

# Force delete (terminate active connections)
pg-drop-role dbuser_old --force

Deletion Process:

  1. Pre-check - Verify connection, check if user exists, check if protected
  2. Create audit snapshot - Record all user dependencies
  3. Disable login - ALTER ROLE ... NOLOGIN
  4. Terminate connections - Terminate active connections when using --force
  5. Transfer shared objects - Transfer database, tablespace ownership
  6. Process all databases - Execute REASSIGN OWNED + DROP OWNED in each database
  7. Revoke memberships - Revoke all role memberships
  8. Drop role - Execute DROP ROLE

Pgbouncer Users

Pgbouncer is enabled by default as connection pooling middleware, and Pigsty manages its user list automatically.

Pigsty defaults to adding all users in pg_users that explicitly have the pgbouncer: true flag to the pgbouncer user list.

Users in the Pgbouncer connection pool are listed in /etc/pgbouncer/userlist.txt:

"postgres" ""
"dbuser_wiki" "SCRAM-SHA-256$4096:+77dyhrPeFDT/TptHs7/7Q==$KeatuohpKIYzHPCt/tqBu85vI11o9mar/by0hHYM2W8=:X9gig4JtjoS8Y/o1vQsIX/gY1Fns8ynTXkbWOjUfbRQ="
"dbuser_view" "SCRAM-SHA-256$4096:DFoZHU/DXsHL8MJ8regdEw==$gx9sUGgpVpdSM4o6A2R9PKAUkAsRPLhLoBDLBUYtKS0=:MujSgKe6rxcIUMv4GnyXJmV0YNbf39uFRZv724+X1FE="
"dbuser_monitor" "SCRAM-SHA-256$4096:fwU97ZMO/KR0ScHO5+UuBg==$CrNsmGrx1DkIGrtrD1Wjexb/aygzqQdirTO1oBZROPY=:L8+dJ+fqlMQh7y4PmVR/gbAOvYWOr+KINjeMZ8LlFww="
"dbuser_meta" "SCRAM-SHA-256$4096:leB2RQPcw1OIiRnPnOMUEg==$eyC+NIMKeoTxshJu314+BmbMFpCcspzI3UFZ1RYfNyU=:fJgXcykVPvOfro2MWNkl5q38oz21nSl1dTtM65uYR1Q="
"dbuser_kong" "SCRAM-SHA-256$4096:bK8sLXIieMwFDz67/0dqXQ==$P/tCRgyKx9MC9LH3ErnKsnlOqgNd/nn2RyvThyiK6e4=:CDM8QZNHBdPf97ztusgnE7olaKDNHBN0WeAbP/nzu5A="
"dbuser_grafana" "SCRAM-SHA-256$4096:HjLdGaGmeIAGdWyn2gDt/Q==$jgoyOB8ugoce+Wqjr0EwFf8NaIEMtiTuQTg1iEJs9BM=:ed4HUFqLyB4YpRr+y25FBT7KnlFDnan6JPVT9imxzA4="
"dbuser_gitea" "SCRAM-SHA-256$4096:l1DBGCc4dtircZ8O8Fbzkw==$tpmGwgLuWPDog8IEKdsaDGtiPAxD16z09slvu+rHE74=:pYuFOSDuWSofpD9OZhG7oWvyAR0PQjJBffgHZLpLHds="
"dbuser_dba" "SCRAM-SHA-256$4096:zH8niABU7xmtblVUo2QFew==$Zj7/pq+ICZx7fDcXikiN7GLqKKFA+X5NsvAX6CMshF0=:pqevR2WpizjRecPIQjMZOm+Ap+x0kgPL2Iv5zHZs0+g="
"dbuser_bytebase" "SCRAM-SHA-256$4096:OMoTM9Zf8QcCCMD0svK5gg==$kMchqbf4iLK1U67pVOfGrERa/fY818AwqfBPhsTShNQ=:6HqWteN+AadrUnrgC0byr5A72noqnPugItQjOLFw0Wk="

User-level connection pool parameters are maintained in a separate file: /etc/pgbouncer/useropts.txt, for example:

dbuser_dba                  = pool_mode=session max_user_connections=16
dbuser_monitor              = pool_mode=session max_user_connections=8

When you create a database, Pgbouncer’s database list definition file will be refreshed and take effect through online configuration reload, without affecting existing connections.

Pgbouncer runs with the same dbsu as PostgreSQL, defaulting to the postgres OS user. You can use the pgb alias to access pgbouncer management functions using dbsu.
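
For example, assuming the pgb alias wraps psql against the local pgbouncer admin console, a quick health check might look like:

sudo su - postgres           # switch to the dbsu OS user
pgb -c 'SHOW POOLS;'         # inspect connection pool status
pgb -c 'SHOW DATABASES;'     # list databases known to pgbouncer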

Pigsty also provides a utility function, pgb-route, that can quickly switch pgbouncer database traffic to other nodes in the cluster for zero-downtime migration.

Connection pool user configuration files userlist.txt and useropts.txt will be automatically refreshed when you create users and take effect through online configuration reload, normally without affecting existing connections.

Note that the pgbouncer_auth_query parameter allows you to use dynamic queries to complete connection pool user authentication, which is a compromise solution when you don’t want to manage users in the connection pool.

7 - Parameter Tuning

Tuning Postgres Parameters

Pigsty provides four scenario-based parameter templates by default, which can be specified and used through the pg_conf parameter.

  • tiny.yml: Optimized for small nodes, VMs, and small demos (1-8 cores, 1-16GB)
  • oltp.yml: Optimized for OLTP workloads and latency-sensitive applications (4C8GB+) (default template)
  • olap.yml: Optimized for OLAP workloads and throughput (4C8G+)
  • crit.yml: Optimized for data consistency and critical applications (4C8G+)

Pigsty adopts different parameter optimization strategies for these four default scenarios, as described in the following sections.


Memory Parameter Tuning

Pigsty automatically detects the system’s memory size and uses it as the basis for setting the maximum number of connections and memory-related parameters.

  • pg_max_conn: PostgreSQL maximum connections, auto will use recommended values for different scenarios
  • pg_shared_buffer_ratio: Shared buffer memory ratio, default is 0.25

By default, Pigsty uses 25% of memory as PostgreSQL shared buffers, with the remaining 75% as the operating system cache.

By default, if the user has not set a pg_max_conn maximum connections value, Pigsty will use defaults according to the following rules:

  • oltp: 500 (pgbouncer) / 1000 (postgres)
  • crit: 500 (pgbouncer) / 1000 (postgres)
  • tiny: 300
  • olap: 300

For OLTP and CRIT templates, if the service is not pointing to the pgbouncer connection pool but directly connects to the postgres database, the maximum connections will be doubled to 1000.

After the maximum number of connections is determined, work_mem is calculated as shared_buffers divided by the maximum connections, clamped to the range of 64MB ~ 1GB.

{% raw %}
{% if pg_max_conn != 'auto' and pg_max_conn|int >= 20 %}{% set pg_max_connections = pg_max_conn|int %}{% else %}{% if pg_default_service_dest|default('postgres') == 'pgbouncer' %}{% set pg_max_connections = 500 %}{% else %}{% set pg_max_connections = 1000 %}{% endif %}{% endif %}
{% set pg_max_prepared_transactions = pg_max_connections if 'citus' in pg_libs else 0 %}
{% set pg_max_locks_per_transaction = (2 * pg_max_connections)|int if 'citus' in pg_libs or 'timescaledb' in pg_libs else pg_max_connections %}
{% set pg_shared_buffers = (node_mem_mb|int * pg_shared_buffer_ratio|float) | round(0, 'ceil') | int %}
{% set pg_maintenance_mem = (pg_shared_buffers|int * 0.25)|round(0, 'ceil')|int %}
{% set pg_effective_cache_size = node_mem_mb|int - pg_shared_buffers|int  %}
{% set pg_workmem =  ([ ([ (pg_shared_buffers / pg_max_connections)|round(0,'floor')|int , 64 ])|max|int , 1024])|min|int %}
{% endraw %}
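
As a worked example, assuming a 16GB node, the default pgbouncer service destination, and pg_shared_buffer_ratio = 0.25, the template above works out to roughly:

max_connections       = 500       # default when services point to pgbouncer
shared_buffers        = 4096MB    # 16384MB * 0.25
maintenance_work_mem  = 1024MB    # 25% of shared_buffers
effective_cache_size  = 12288MB   # 16384MB - 4096MB
work_mem              = 64MB      # min(max(floor(4096 / 500) = 8MB, 64MB), 1024MB)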

CPU Parameter Tuning

In PostgreSQL, there are 4 important parameters related to parallel queries. Pigsty automatically optimizes parameters based on the current system’s CPU cores. In all strategies, the total number of parallel processes (total budget) is usually set to CPU cores + 8, with a minimum of 16, to reserve enough background workers for logical replication and extensions. The OLAP and TINY templates vary slightly based on scenarios.

OLTP                                 Setting Logic                      Range Limits
max_worker_processes                 max(100% CPU + 8, 16)              CPU cores + 8, minimum 16
max_parallel_workers                 max(ceil(50% CPU), 2)              1/2 CPU rounded up, minimum 2
max_parallel_maintenance_workers     max(ceil(33% CPU), 2)              1/3 CPU rounded up, minimum 2
max_parallel_workers_per_gather      min(max(ceil(20% CPU), 2), 8)      1/5 CPU rounded up, minimum 2, maximum 8

OLAP                                 Setting Logic                      Range Limits
max_worker_processes                 max(100% CPU + 12, 20)             CPU cores + 12, minimum 20
max_parallel_workers                 max(ceil(80% CPU), 2)              4/5 CPU rounded up, minimum 2
max_parallel_maintenance_workers     max(ceil(33% CPU), 2)              1/3 CPU rounded up, minimum 2
max_parallel_workers_per_gather      max(floor(50% CPU), 2)             1/2 CPU rounded down, minimum 2

CRIT                                 Setting Logic                      Range Limits
max_worker_processes                 max(100% CPU + 8, 16)              CPU cores + 8, minimum 16
max_parallel_workers                 max(ceil(50% CPU), 2)              1/2 CPU rounded up, minimum 2
max_parallel_maintenance_workers     max(ceil(33% CPU), 2)              1/3 CPU rounded up, minimum 2
max_parallel_workers_per_gather      0, enable as needed

TINY                                 Setting Logic                      Range Limits
max_worker_processes                 max(100% CPU + 4, 12)              CPU cores + 4, minimum 12
max_parallel_workers                 max(floor(50% CPU), 1)             1/2 CPU rounded down, minimum 1
max_parallel_maintenance_workers     max(ceil(33% CPU), 1)              1/3 CPU rounded up, minimum 1
max_parallel_workers_per_gather      0, enable as needed

Note that the CRIT and TINY templates disable parallel queries by setting max_parallel_workers_per_gather = 0. Users can enable parallel queries as needed by setting this parameter.
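
For example, to re-enable parallel queries on such a cluster, you could raise the parameter through Patroni configuration management (this particular setting does not require a restart):

pg edit-config -p max_parallel_workers_per_gather=2 <cls>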

Both OLTP and CRIT templates additionally set the following parameters, doubling the parallel query cost to reduce the tendency to use parallel queries.

parallel_setup_cost: 2000           # double from 100 to increase parallel cost
parallel_tuple_cost: 0.2            # double from 0.1 to increase parallel cost
min_parallel_table_scan_size: 16MB  # double from 8MB to increase parallel cost
min_parallel_index_scan_size: 1024  # double from 512 to increase parallel cost

Note that adjustments to the max_worker_processes parameter only take effect after a restart. Additionally, when a replica’s configuration value for this parameter is higher than the primary’s, the replica will fail to start. This parameter must be adjusted through Patroni configuration management, which ensures consistent primary-replica configuration and prevents new replicas from failing to start during failover.
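
A sketch of adjusting it the recommended way, so the value lives in DCS and every current and future member picks it up consistently:

pg edit-config -p max_worker_processes=32 pg-test   # stored via patroni, keeping primary and replicas consistent
pg restart --force pg-test                          # the new value only takes effect after a restart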


Storage Space Parameters

Pigsty automatically detects the total space of the disk where the /data/postgres main data directory is located and uses it as the basis for specifying the following parameters:

{% raw %}
min_wal_size: {{ ([pg_size_twentieth, 200])|min }}GB                  # 1/20 disk size, max 200GB
max_wal_size: {{ ([pg_size_twentieth * 4, 2000])|min }}GB             # 2/10 disk size, max 2000GB
max_slot_wal_keep_size: {{ ([pg_size_twentieth * 6, 3000])|min }}GB   # 3/10 disk size, max 3000GB
temp_file_limit: {{ ([pg_size_twentieth, 200])|min }}GB               # 1/20 of disk size, max 200GB
{% endraw %}
  • temp_file_limit defaults to 5% of disk space, capped at 200GB.
  • min_wal_size defaults to 5% of disk space, capped at 200GB.
  • max_wal_size defaults to 20% of disk space, capped at 2TB.
  • max_slot_wal_keep_size defaults to 30% of disk space, capped at 3TB.

As a special case, the OLAP template allows 20% for temp_file_limit, capped at 2TB.
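
As a worked example, assuming a 2TB data disk (so pg_size_twentieth ≈ 100GB), the rules above yield roughly:

min_wal_size           = 100GB    # min(100, 200)
max_wal_size           = 400GB    # min(100 * 4, 2000)
max_slot_wal_keep_size = 600GB    # min(100 * 6, 3000)
temp_file_limit        = 100GB    # min(100, 200); the OLAP template would allow up to 400GB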


Manual Parameter Tuning

In addition to using Pigsty’s automatically configured parameters, you can also manually tune PostgreSQL parameters.

Use the pg edit-config <cluster> command to interactively edit cluster configuration:

pg edit-config pg-meta

Or use the -p parameter to directly set parameters:

pg edit-config -p log_min_duration_statement=1000 pg-meta
pg edit-config --force -p shared_preload_libraries='timescaledb, pg_cron, pg_stat_statements, auto_explain' pg-meta

You can also use the Patroni REST API to modify configuration:

curl -u 'postgres:Patroni.API' \
    -d '{"postgresql":{"parameters": {"log_min_duration_statement":200}}}' \
    -s -X PATCH http://10.10.10.10:8008/config | jq .

8 - Accidental Deletion

Handling accidental data deletion, table deletion, and database deletion

Accidental Data Deletion

If it’s a small-scale DELETE misoperation, you can consider using the pg_surgery or pg_dirtyread extension for in-place surgical recovery.

-- Immediately disable Auto Vacuum on this table and abort Auto Vacuum worker processes for this table
ALTER TABLE public.some_table SET (autovacuum_enabled = off, toast.autovacuum_enabled = off);

CREATE EXTENSION pg_dirtyread;
SELECT * FROM pg_dirtyread('tablename') AS t(col1 type1, col2 type2, ...);

If the deleted data has already been reclaimed by VACUUM, then use the general accidental deletion recovery process.

Accidental Object Deletion

When DROP/DELETE type misoperations occur, typically decide on a recovery plan according to the following process:

  1. Confirm whether this data can be recovered from the business system or other data systems. If yes, recover directly from the business side.
  2. Confirm whether there is a delayed replica. If yes, advance the delayed replica to the time point before deletion and query the data for recovery.
  3. If the data has been confirmed deleted, confirm backup information and whether the backup range covers the deletion time point. If it does, start PITR.
  4. Confirm whether to perform in-place cluster PITR rollback, or start a new server for replay, or use a replica for replay, and execute the recovery strategy.

Accidental Cluster Deletion

If an entire database cluster is accidentally removed through Pigsty management commands (for example, by mistakenly running the pgsql-rm.yml playbook or the bin/pgsql-rm command), the backup will be deleted along with the cluster unless you have set the pg_rm_backup parameter to false.

Warning: In this situation, your data will be unrecoverable! Think twice before proceeding!

Recommendation: For production environments, you can globally configure this parameter to false in the configuration manifest to preserve backups when removing clusters.
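
A minimal sketch of that global setting in the configuration manifest (placing it under all.vars is one reasonable location):

all:
  vars:
    pg_rm_backup: false    # keep pgbackrest backups when a cluster is removed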

9 - Clone Replicas

How to clone databases, database instances, and database clusters?

PostgreSQL can already replicate data through physical replicas and logical replicas, but sometimes you may need to quickly clone a database, database instance, or entire database cluster. The cloned database can be written to, evolve independently, and not affect the original database. In Pigsty, there are several cloning methods:

  • Clone Database: Clone a new database within the same cluster
  • Clone Instance: Clone a new instance on the same PG node
  • Clone Cluster: Create a new database cluster using PITR mechanism and restore to any point in time of the specified cluster

Clone Database

You can copy a PostgreSQL database through the template mechanism, but no active connections to the template database are allowed during this period.

If you want to clone the postgres database, run the following two statements back to back, ensuring that all connections to the postgres database have been terminated immediately before the clone starts:

SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'postgres';
CREATE DATABASE pgcopy TEMPLATE postgres STRATEGY FILE_COPY;

Instant Clone

If you are using PostgreSQL 18 or higher, Pigsty configures the file_copy_method parameter by default so that database cloning uses filesystem copy-on-write. This allows you to clone a database in roughly constant time (~200ms), regardless of its size, without physically copying data files.

However, you must explicitly use the FILE_COPY strategy to create the database. Since the STRATEGY parameter of CREATE DATABASE was introduced in PostgreSQL 15, the default value has been WAL_LOG. You need to explicitly specify FILE_COPY for instant cloning.

SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'postgres';
CREATE DATABASE pgcopy TEMPLATE postgres STRATEGY FILE_COPY;

For example, when cloning a 30 GB database, a normal clone (WAL_LOG) takes about 18 seconds, while an instant clone (FILE_COPY) completes in roughly 200 milliseconds.

Since Pigsty v4.0, you can use strategy: FILE_COPY in the pg_databases parameter to achieve instant database cloning.

    pg-meta:
      hosts:
        10.10.10.10: { pg_seq: 1, pg_role: primary }
      vars:
        pg_cluster: pg-meta
        pg_version: 18
        pg_databases:
          - name: meta
          - name: meta_dev
            template: meta
            strategy: FILE_COPY         # <---- Introduced in PG 15, instant in PG 18

After configuration, use the standard database creation SOP to create the database:

bin/pgsql-db pg-meta meta_dev

Limitations and Notes

This feature is only available on supported file systems (xfs, btrfs, zfs, apfs). If the file system doesn’t support it, PostgreSQL will fail with an error.

By default, mainstream OS distributions’ xfs have reflink=1 enabled by default, so you don’t need to worry about this in most cases.
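
If in doubt, you can verify reflink support on the filesystem holding the data directory, for example (assuming it is mounted at /data):

xfs_info /data | grep -o 'reflink=[01]'   # reflink=1 means copy-on-write cloning is available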

If your PostgreSQL version is below 15, specifying strategy will have no effect.

Please don’t use the postgres database as a template database for cloning, as management connections typically connect to the postgres database, which prevents the cloning operation.

Use instant cloning with caution in extremely high concurrency/throughput production environments, as it requires clearing all connections to the template database within the cloning window (200ms), otherwise the clone will fail.

10 - Maintenance

Common system maintenance tasks

To ensure Pigsty and PostgreSQL clusters run healthily and stably, some routine maintenance work is required.


Regular Monitoring Review

Pigsty provides an out-of-the-box monitoring platform. We recommend you browse the monitoring dashboards once a day to keep track of system status. At a minimum, we recommend you review the monitoring at least once a week, paying attention to alert events that occur, which can help you avoid most failures and issues in advance.

Here is a list of pre-defined alert rules in Pigsty.


Failover Follow-up

Pigsty’s high availability architecture allows PostgreSQL clusters to automatically perform primary-replica switchovers, meaning operations and DBAs don’t need to intervene or respond immediately. However, users still need to perform the following follow-up work at an appropriate time (e.g., the next business day), including:

  • Investigate and confirm the cause of the failure to prevent recurrence
  • Restore the cluster’s original primary-replica topology as appropriate, or modify the configuration manifest to match the new primary-replica status.
  • Refresh load balancer configuration through bin/pgsql-svc to update service routing status
  • Refresh the cluster’s HBA rules through bin/pgsql-hba to avoid primary-replica-specific rule drift
  • If necessary, use bin/pgsql-rm to remove the failed server and expand with a new replica through bin/pgsql-add
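
A rough checklist of the corresponding commands, run from the admin node (pg-test is a placeholder cluster name):

pg list pg-test            # confirm the new primary-replica topology with patronictl
bin/pgsql-svc pg-test      # refresh haproxy service routing to match the new roles
bin/pgsql-hba pg-test      # re-render role-specific HBA rules on all members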

Table Bloat Management

Long-running PostgreSQL will experience “table bloat” / “index bloat” phenomena, leading to system performance degradation.

Regularly using pg_repack to perform online rebuilding of tables and indexes helps maintain PostgreSQL’s good performance. Pigsty has already installed and enabled this extension by default in all databases, so you can use it directly.

You can use Pigsty’s PGCAT Database - Table Bloat panel to confirm table bloat and index bloat in the database. Select tables and indexes with high bloat rates (larger tables with bloat rates above 50%) and use pg_repack for online reorganization:

pg_repack dbname -t schema.table

Reorganization does not affect normal read and write operations, but the switching moment after reorganization completes requires an AccessExclusive lock on the table, blocking all access. Therefore, for high-throughput businesses, it’s recommended to perform this during off-peak periods or maintenance windows. For more details, please refer to: Managing Relation Bloat


VACUUM FREEZE

Freezing expired transaction IDs (VACUUM FREEZE) is an important PostgreSQL maintenance task used to prevent transaction ID (XID) exhaustion leading to downtime. Although PostgreSQL already provides an automatic vacuum (AutoVacuum) mechanism, for high-standard production environments, we still recommend combining both automatic and manual approaches, regularly executing database-wide VACUUM FREEZE to ensure XID safety.

You can manually execute VACUUM FREEZE on a database using the following commands:

-- Execute VACUUM FREEZE on the entire database
VACUUM FREEZE;

-- Execute VACUUM FREEZE on a specific table
VACUUM FREEZE schema.table_name;

Or set up a scheduled task through crontab, for example, execute every Sunday morning:

# Run VACUUM FREEZE on all databases every Sunday at 3 AM (system crontab / cron.d format, runs as the postgres user)
0 3 * * 0 postgres vacuumdb --all --freeze

11 - Version Upgrade

How to upgrade (or downgrade) PostgreSQL minor version kernel, and how to perform major version upgrades

Minor Version Upgrade

To perform a minor version server upgrade/downgrade, you first need to add software to your local software repository: the latest PG minor version RPM/DEB.

First perform a rolling upgrade/downgrade on all replicas, then execute a cluster switchover to upgrade/downgrade the primary.

ansible <cls> -b -a "yum upgrade/downgrade -y <pkg>"    # Upgrade/downgrade packages
pg restart --force <cls>                                # Restart cluster

For example, perform a rolling upgrade of the pg-test cluster to the latest minor release (here 15.2):

ansible pg-test -b -a "yum upgrade -y postgresql15*"    # Upgrade packages (or apt upgrade)
ansible pg-test -b -a '/usr/pgsql/bin/pg_ctl --version' # Check binary version is 15.2
pg restart --role replica --force pg-test               # Restart replicas
pg switchover --leader pg-test-1 --candidate=pg-test-2 --scheduled=now --force pg-test    # Switchover primary and replica
pg restart --role primary --force pg-test               # Restart primary

Minor Version Downgrade

For example, to downgrade the pg-test cluster back to PostgreSQL 15.1, add the 15.1 packages to the software repository and refresh the nodes' yum/apt cache:

cd ~/pigsty; ./infra.yml -t repo_upstream               # Add upstream repository
cd /www/pigsty; repotrack postgresql15-*-15.1           # Add 15.1 packages to yum repository
cd ~/pigsty; ./infra.yml -t repo_create                 # Rebuild repository metadata
ansible pg-test -b -a 'yum clean all'                   # Clean node repository cache
ansible pg-test -b -a 'yum makecache'                   # Regenerate yum cache from new repository

# For Ubuntu/Debian users, use apt instead of yum
ansible pg-test -b -a 'apt clean'                       # Clean node repository cache
ansible pg-test -b -a 'apt update'                      # Regenerate apt cache from new repository

Execute downgrade and restart cluster:

ansible pg-test -b -a "yum downgrade -y postgresql15*"  # Downgrade packages
pg restart --force pg-test                              # Restart entire cluster to complete the downgrade

Major Version Upgrade

The simplest way to perform a major version upgrade is to create a new cluster using the new version, then perform online migration through logical replication and blue-green deployment.

You can also perform an in-place major version upgrade. When you only use the database kernel itself, this is not complicated; use PostgreSQL’s built-in pg_upgrade:

Suppose you want to upgrade the PostgreSQL major version from 14 to 15. First add the new packages to the repository, and make sure the core extensions you rely on are installed with matching extension versions under both major versions.

./pgsql.yml -t pg_pkg -e pg_version=15                         # Install pg 15 packages
sudo su - postgres; mkdir -p /data/postgres/pg-meta-15/data/   # Prepare directory for 15
pg_upgrade -b /usr/pgsql-14/bin/ -B /usr/pgsql-15/bin/ -d /data/postgres/pg-meta-14/data/ -D /data/postgres/pg-meta-15/data/ -v -c # Pre-check
pg_upgrade -b /usr/pgsql-14/bin/ -B /usr/pgsql-15/bin/ -d /data/postgres/pg-meta-14/data/ -D /data/postgres/pg-meta-15/data/ --link -j8 -v # Execute upgrade
rm -rf /usr/pgsql; ln -s /usr/pgsql-15 /usr/pgsql;             # Fix binary link
mv /data/postgres/pg-meta-14 /data/postgres/pg-meta-15         # Rename data directory
rm -rf /pg; ln -s /data/postgres/pg-meta-15 /pg                # Fix data directory link