Administration
Standard Operating Procedures (SOP) for database administration tasks
How to maintain existing PostgreSQL clusters with Pigsty?
This section provides standard operating procedures (SOP) for common PostgreSQL administration tasks:
- SOP: Standard operating procedures for creating/removing clusters and instances, backup & restore, rolling upgrades, etc.
- Failure: Common failure troubleshooting strategies and handling methods, such as disk exhaustion, connection exhaustion, XID wraparound, etc.
- Drop: Emergency procedures for handling accidental data deletion, table drops, and database drops
- Maintain: Maintenance tasks including regular inspections, post-failover cleanup, bloat management, VACUUM FREEZE, etc.
- Tuning: Automatic optimization strategies and adjustment methods for memory, CPU, storage parameters, etc.
1 - Troubleshooting
Common failures and analysis troubleshooting approaches
This document lists potential failures in PostgreSQL and Pigsty, as well as SOPs for locating, handling, and analyzing issues.
Disk Space Exhausted
Disk space exhaustion is the most common type of failure.
Symptoms
When the disk space where the database resides is exhausted, PostgreSQL will not work normally and may exhibit the following symptoms: database logs repeatedly report “no space left on device” errors, new data cannot be written, and PostgreSQL may even trigger a PANIC and force shutdown.
Pigsty includes a NodeFsSpaceFull alert rule that triggers when filesystem available space is less than 10%.
Use the monitoring system’s NODE Instance panel to review the FS metrics panel to locate the issue.
Diagnosis
You can also log into the database node and use df -h to view the usage of each mounted partition to determine which partition is full.
For database nodes, focus on checking the following directories and their sizes to determine which category of files has filled up the space:
- Data directory (/pg/data/base): Stores data files for tables and indexes; pay attention to heavy writes and temporary files.
- WAL directory (e.g., /pg/data/pg_wal): Stores PG WAL; WAL accumulation / replication slot retention is a common cause of disk exhaustion.
- Database log directory (e.g., /pg/log): If PG logs are not rotated in time and large amounts of errors are written, they may also consume significant space.
- Local backup directory (e.g., /data/backups): When using pgBackRest or similar tools to save backups locally, this may also fill up the disk.
If the issue occurs on the Pigsty admin node or monitoring node, also consider:
- Monitoring data: VictoriaMetrics time-series metrics and VictoriaLogs log storage both consume disk space; check retention policies.
- Object storage data: Pigsty’s integrated MinIO object storage may be used for PG backup storage.
After identifying the directory consuming the most space, you can further use du -sh <directory> to drill down and find specific large files or subdirectories.
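For example, a typical drill-down on a database node might look like this (paths assume Pigsty's default directory layout):
df -h                                          # find which filesystem is full
du -sh /pg/data /pg/data/pg_wal /pg/log        # compare candidate directories
du -sh /pg/data/base/* | sort -h | tail -10    # locate the largest databases under the data directory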
Resolution
Disk exhaustion is an emergency issue requiring immediate action to free up space and ensure the database continues to operate.
When the data disk is not separated from the system disk, a full disk may prevent shell commands from executing. In this case, you can delete the /pg/dummy placeholder file to free up a small amount of emergency space so shell commands can work again.
If the database has crashed due to pg_wal filling up, you need to restart the database service after clearing space and carefully check data integrity.
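A minimal emergency sequence, assuming the default /pg/dummy placeholder file exists and Patroni manages the instance:
rm -f /pg/dummy              # remove the placeholder to reclaim the reserved emergency space
df -h /pg                    # confirm the filesystem has some free space again
# free more space here: rotate logs, purge old backups, clean up replication slots, or extend the disk
systemctl restart patroni    # restart the database service if it crashed on a full pg_wal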
Transaction ID Wraparound
PostgreSQL cyclically uses 32-bit transaction IDs (XIDs), and when exhausted, a “transaction ID wraparound” failure occurs (XID Wraparound).
Symptoms
The typical sign in the first phase is when the age saturation in the PGSQL Persist - Age Usage panel enters the warning zone.
Database logs begin to show messages like: WARNING: database "postgres" must be vacuumed within xxxxxxxx transactions.
If the problem continues to worsen, PostgreSQL enters protection mode: when remaining transaction IDs drop to about 1 million, the database switches to read-only mode; when reaching the limit of about 2.1 billion (2^31), it refuses any new transactions and forces the server to shut down to avoid data corruption.
Diagnosis
PostgreSQL and Pigsty enable automatic garbage collection (AutoVacuum) by default, so the occurrence of this type of failure usually has deeper root causes.
Common causes include: long-running transactions, misconfigured autovacuum, replication slot blockage, insufficient resources, storage engine/extension bugs, and disk bad blocks.
First identify the database with the highest age, then use the Pigsty PGCAT Database - Tables panel to confirm the age distribution of tables.
Also review the database error logs, which usually contain clues to locate the root cause.
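To find the databases and tables with the highest XID age from psql, you can use standard catalog queries such as:
SELECT datname, age(datfrozenxid) AS xid_age FROM pg_database ORDER BY 2 DESC;
SELECT relname, age(relfrozenxid) AS xid_age FROM pg_class WHERE relkind = 'r' ORDER BY 2 DESC LIMIT 20;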
Resolution
- Immediately freeze old transactions: If the database has not yet entered read-only protection mode, immediately run a manual VACUUM FREEZE on the affected database. You can start by freezing the most severely aged tables one by one rather than the entire database at once to accelerate the effect. Connect to the database as a superuser and run VACUUM FREEZE <table_name>; on the tables with the largest relfrozenxid age, prioritizing tables with the highest XID age. This can quickly reclaim large amounts of transaction ID space.
- Single-user mode rescue: If the database is already refusing writes or has shut down for protection, you need to start the database in single-user mode and run VACUUM FREEZE; to freeze and clean the entire database. After completion, restart the database in multi-user mode. This lifts the wraparound lock and makes the database writable again. Be very careful when operating in single-user mode and ensure there is enough transaction ID headroom to complete the freeze.
- Standby node takeover: In some complex scenarios (e.g., when hardware issues prevent vacuum from completing), consider promoting a read-only standby in the cluster to primary to obtain a relatively clean environment for the freeze. For example, if the primary cannot vacuum due to bad blocks, you can manually fail over to promote the standby to the new primary, then perform an emergency VACUUM FREEZE on it. After ensuring the new primary has frozen the old transactions, switch the load back.
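A rough sketch of the single-user rescue path (paths and service names assume a default Pigsty deployment; adapt and verify before use):
systemctl stop patroni                                    # stop HA management (this also stops postgres under Patroni)
/usr/pgsql/bin/postgres --single -D /pg/data postgres     # enter single-user mode; replace "postgres" with the affected database
backend> VACUUM FREEZE;
# exit single-user mode with Ctrl-D, then restart the cluster normally
systemctl start patroni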
Connection Exhaustion
PostgreSQL has a maximum connections configuration (max_connections). When client connections exceed this limit, new connection requests will be rejected. The typical symptom is that applications cannot connect to the database and report errors like
FATAL: remaining connection slots are reserved for non-replication superuser connections or too many clients already.
This indicates that regular connections are exhausted, leaving only slots reserved for superusers or replication.
Diagnosis
Connection exhaustion is usually caused by a large number of concurrent client requests. You can directly review the database’s current active sessions through PGCAT Instance / PGCAT Database / PGCAT Locks.
Determine what types of queries are filling the system and proceed with further handling. Pay special attention to whether there are many connections in the “Idle in Transaction” state and long-running transactions (as well as slow queries).
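From psql, you can also inspect the session breakdown directly, for example:
SELECT state, count(*) FROM pg_stat_activity GROUP BY 1 ORDER BY 2 DESC;
SELECT pid, usename, state, now() - xact_start AS xact_age, left(query, 60) AS query
  FROM pg_stat_activity WHERE state = 'idle in transaction' ORDER BY xact_age DESC LIMIT 10;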
Resolution
Kill queries: For situations where exhaustion has already blocked business operations, typically use pg_terminate_backend(pid) immediately for emergency pressure relief.
For cases using connection pooling, you can adjust the connection pool size parameters and execute a reload to reduce the number of connections at the database level.
You can also modify the max_connections parameter to a larger value, but this parameter requires a database restart to take effect.
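For example, to terminate sessions that have been idle in a transaction longer than a threshold (adjust the interval and filters to your situation):
SELECT pg_terminate_backend(pid) FROM pg_stat_activity
 WHERE state = 'idle in transaction' AND now() - state_change > interval '5 min';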
etcd Quota Exhausted
An exhausted etcd quota will cause the PG high availability control plane to fail and prevent configuration changes.
Diagnosis
Pigsty uses etcd as the distributed configuration store (DCS) when implementing high availability. etcd itself has a storage quota (default is about 2GB).
When etcd storage usage reaches the quota limit, etcd will refuse write operations and report “etcdserver: mvcc: database space exceeded”. In this case, Patroni cannot write heartbeats or update configuration to etcd, causing cluster management functions to fail.
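To check how close etcd is to its quota, query the endpoint status (assuming etcdctl v3 with endpoints and credentials already configured in the environment):
etcdctl endpoint status --write-out=table   # shows the DB size to compare against the quota
etcdctl alarm list                          # a NOSPACE alarm means the quota has been exceeded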
Resolution
Versions between Pigsty v2.0.0 and v2.5.1 are affected by this issue by default. Pigsty v2.6.0 added auto-compaction configuration for deployed etcd. If you only use it for PG high availability leases, this issue will no longer occur in regular use cases.
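For clusters already hitting the quota, a manual recovery roughly follows the standard etcd maintenance procedure (again assuming etcdctl v3 is configured and a single endpoint; verify on your version before running):
rev=$(etcdctl endpoint status --write-out=json | grep -o '"revision":[0-9]*' | grep -oE '[0-9]+')
etcdctl compact "$rev"     # compact away old revisions
etcdctl defrag             # defragment to actually release space
etcdctl alarm disarm       # clear the NOSPACE alarm so writes resume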
Defective Storage Engine
Currently, TimescaleDB’s experimental storage engine Hypercore has been proven to have defects, with cases of VACUUM being unable to reclaim leading to XID wraparound failures.
Users using this feature should migrate to PostgreSQL native tables or TimescaleDB’s default engine promptly.
Detailed introduction: PG New Storage Engine Failure Case (Chinese)
2 - Database Management
Database management: create, modify, delete, rebuild databases, and clone databases using templates
In Pigsty, database management follows an IaC (Infrastructure as Code) approach—define in the configuration inventory, then execute playbooks.
When no baseline SQL is defined, executing the pgsql-db.yml playbook is idempotent. It adjusts the specified database in the specified cluster to match the target state in the configuration inventory.
Note that some parameters can only be specified at creation time. Modifying these parameters requires deleting and recreating the database (using state: recreate to rebuild).
Define Database
Business databases are defined in the cluster parameter pg_databases, which is an array of database definition objects. Databases in the array are created in definition order, so later-defined databases can use previously-defined databases as templates.
Here’s the database definition from the default cluster pg-meta in Pigsty’s demo environment:
pg-meta:
hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
vars:
pg_cluster: pg-meta
pg_databases:
- { name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty] ,extensions: [{name: postgis, schema: public}, {name: timescaledb}]}
- { name: grafana ,owner: dbuser_grafana ,revokeconn: true ,comment: grafana primary database }
- { name: bytebase ,owner: dbuser_bytebase ,revokeconn: true ,comment: bytebase primary database }
- { name: kong ,owner: dbuser_kong ,revokeconn: true ,comment: kong the api gateway database }
- { name: gitea ,owner: dbuser_gitea ,revokeconn: true ,comment: gitea meta database }
- { name: wiki ,owner: dbuser_wiki ,revokeconn: true ,comment: wiki meta database }
- { name: noco ,owner: dbuser_noco ,revokeconn: true ,comment: nocodb database }
The only required field is name, which should be a valid and unique database name within the current PostgreSQL cluster—all other parameters have sensible defaults. For complete database definition parameter reference, see Database Configuration Reference.
Create Database
To create a new business database on an existing PostgreSQL cluster, add the database definition to all.children.<cls>.pg_databases, then execute:
bin/pgsql-db <cls> <dbname> # Equivalent to: pgsql-db.yml -l <cls> -e dbname=<dbname>
Example: Create a business database named myapp
- Add the database definition to the configuration file:
pg-meta:
vars:
pg_databases:
- name: myapp
owner: dbuser_myapp
schemas: [app]
extensions:
- { name: pg_trgm }
- { name: btree_gin }
comment: my application database
- Execute the create command:
bin/pgsql-db pg-meta myapp
Execution effects:
- Creates database myapp on the primary
- Sets database owner to dbuser_myapp
- Creates schema app
- Installs extensions pg_trgm and btree_gin
- Configures default privileges (dbrole_readonly/readwrite/admin)
- Adds database to Pgbouncer connection pool
- Registers database as a Grafana data source
Use Playbooks to Create Databases
We don’t recommend creating business databases manually with SQL, especially when using Pgbouncer connection pooling.
Using bin/pgsql-db automatically handles connection pool configuration and monitoring registration.
Modify Database
Modify database properties by updating the configuration and re-executing the playbook:
bin/pgsql-db <cls> <dbname> # Idempotent operation, can be executed repeatedly
Modifiable Properties
| Property | Description | Example |
|---|---|---|
| owner | Database owner | owner: dbuser_new |
| tablespace | Default tablespace (triggers data migration) | tablespace: fast_ssd |
| is_template | Mark as template database | is_template: true |
| allowconn | Allow connections | allowconn: false |
| connlimit | Connection limit | connlimit: 100 |
| revokeconn | Revoke PUBLIC CONNECT privilege | revokeconn: true |
| comment | Comment | comment: new comment |
| parameters | Database-level parameters | See examples below |
| schemas | Add/remove schemas (incremental) | See Manage Schemas |
| extensions | Add/remove extensions (incremental) | See Manage Extensions |
| pgbouncer | Include in connection pool | pgbouncer: false |
| pool_* | Connection pool parameters | See Connection Pool Config |
Non-modifiable Properties
The following properties cannot be modified after database creation—use state: recreate to rebuild the database:
- template - Template database
- encoding - Character encoding
- locale / lc_collate / lc_ctype - Locale settings
- locale_provider / icu_locale / icu_rules / builtin_locale - Locale provider settings
- strategy - Clone strategy
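For example, to switch a database to a different encoding or locale provider, you would mark it for rebuild (hypothetical database name; all existing data is dropped and recreated from the baseline, so back up first):
- name: legacy_db
  state: recreate            # drop and recreate to change immutable properties
  template: template0        # required when changing the locale provider
  encoding: UTF8
  locale_provider: icu
  icu_locale: en-US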
Change Owner
- name: myapp
owner: dbuser_new_owner # Change to new owner
bin/pgsql-db pg-meta myapp
Executed SQL:
ALTER DATABASE "myapp" OWNER TO "dbuser_new_owner";
GRANT ALL PRIVILEGES ON DATABASE "myapp" TO "dbuser_new_owner";
Change Connection Limit
- name: myapp
connlimit: 100 # Limit to max 100 connections
Executed SQL:
ALTER DATABASE "myapp" CONNECTION LIMIT 100;
Revoke Public Connection Privilege
- name: myapp
owner: dbuser_myapp
revokeconn: true # Revoke PUBLIC CONNECT privilege
Executed SQL:
REVOKE CONNECT ON DATABASE "myapp" FROM PUBLIC;
GRANT CONNECT ON DATABASE "myapp" TO "replicator";
GRANT CONNECT ON DATABASE "myapp" TO "dbuser_monitor";
GRANT CONNECT ON DATABASE "myapp" TO "dbuser_dba" WITH GRANT OPTION;
GRANT CONNECT ON DATABASE "myapp" TO "dbuser_myapp" WITH GRANT OPTION;
To restore public connection privilege, set revokeconn: false:
- name: myapp
revokeconn: false # Restore PUBLIC CONNECT privilege
Executed SQL:
GRANT CONNECT ON DATABASE "myapp" TO PUBLIC;
Mark as Template Database
- name: app_template
is_template: true # Allow any user with CREATEDB privilege to clone
Executed SQL:
ALTER DATABASE "app_template" IS_TEMPLATE true;
Manage Parameters
Database-level parameters are configured via the parameters dictionary, generating ALTER DATABASE ... SET statements.
Set Parameters
- name: myapp
parameters:
work_mem: '256MB'
maintenance_work_mem: '512MB'
statement_timeout: '30s'
search_path: 'app,public'
Executed SQL:
ALTER DATABASE "myapp" SET "work_mem" = '256MB';
ALTER DATABASE "myapp" SET "maintenance_work_mem" = '512MB';
ALTER DATABASE "myapp" SET "statement_timeout" = '30s';
ALTER DATABASE "myapp" SET "search_path" = 'app,public';
Reset Parameters to Default
Use the special value DEFAULT (case-insensitive) to reset parameters to PostgreSQL defaults:
- name: myapp
parameters:
work_mem: DEFAULT # Reset to PostgreSQL default
statement_timeout: DEFAULT
Executed SQL:
ALTER DATABASE "myapp" SET "work_mem" = DEFAULT;
ALTER DATABASE "myapp" SET "statement_timeout" = DEFAULT;
Common Database-Level Parameters
| Parameter | Description | Example Value |
|---|---|---|
| work_mem | Query work memory | '64MB' |
| maintenance_work_mem | Maintenance operation memory | '256MB' |
| statement_timeout | Statement timeout | '30s' |
| lock_timeout | Lock wait timeout | '10s' |
| idle_in_transaction_session_timeout | Idle transaction timeout | '10min' |
| search_path | Schema search path | 'app,public' |
| default_tablespace | Default tablespace | 'fast_ssd' |
| temp_tablespaces | Temporary tablespaces | 'temp_ssd' |
| log_statement | Statement logging level | 'ddl' |
Manage Schemas
Schemas are configured via the schemas array, supporting create, assign owner, and delete operations.
Create Schemas
- name: myapp
schemas:
# Simple form: schema name only
- app
- api
# Full form: specify owner
- { name: core, owner: dbuser_myapp }
Executed SQL:
CREATE SCHEMA IF NOT EXISTS "app";
CREATE SCHEMA IF NOT EXISTS "api";
CREATE SCHEMA IF NOT EXISTS "core" AUTHORIZATION "dbuser_myapp";
Specify Schema Owner
Use the owner field to assign a schema owner—useful for multi-tenant or permission isolation scenarios:
- name: multi_tenant_db
owner: dbuser_admin
schemas:
- { name: tenant_a, owner: dbuser_tenant_a }
- { name: tenant_b, owner: dbuser_tenant_b }
- { name: shared, owner: dbuser_admin }
Delete Schemas
Use state: absent to mark schemas for deletion:
- name: myapp
schemas:
- { name: deprecated_schema, state: absent }
Executed SQL:
DROP SCHEMA IF EXISTS "deprecated_schema" CASCADE;
CASCADE Warning
Schema deletion uses the CASCADE option, which also deletes all objects within the schema (tables, views, functions, etc.).
Ensure you understand the impact before executing delete operations.
Manage Extensions
Extensions are configured via the extensions array, supporting install and uninstall operations.
Install Extensions
- name: myapp
extensions:
# Simple form: extension name only
- postgis
- pg_trgm
# Full form: specify schema and version
- { name: vector, schema: public }
- { name: pg_stat_statements, schema: monitor, version: '1.10' }
Executed SQL:
CREATE EXTENSION IF NOT EXISTS "postgis" CASCADE;
CREATE EXTENSION IF NOT EXISTS "pg_trgm" CASCADE;
CREATE EXTENSION IF NOT EXISTS "vector" WITH SCHEMA "public" CASCADE;
CREATE EXTENSION IF NOT EXISTS "pg_stat_statements" WITH SCHEMA "monitor" VERSION '1.10' CASCADE;
Uninstall Extensions
Use state: absent to mark extensions for uninstall:
- name: myapp
extensions:
- { name: pg_trgm, state: absent } # Uninstall extension
- { name: postgis } # Keep extension
Executed SQL:
DROP EXTENSION IF EXISTS "pg_trgm" CASCADE;
CREATE EXTENSION IF NOT EXISTS "postgis" CASCADE;
CASCADE Warning
Extension uninstall uses the CASCADE option, which also drops all objects depending on that extension (views, functions, etc.).
Ensure you understand the impact before executing uninstall operations.
Delete Database
To delete a database, set its state to absent and execute the playbook:
pg_databases:
- name: olddb
state: absent
Delete operation will:
- If the database is marked is_template: true, first execute ALTER DATABASE ... IS_TEMPLATE false
- Force drop the database with DROP DATABASE ... WITH (FORCE) (PG13+)
- Terminate all active connections to the database
- Remove database from Pgbouncer connection pool
- Unregister from Grafana data sources
Protection mechanisms:
- System databases postgres, template0, and template1 cannot be deleted
- Delete operations only execute on the primary; streaming replication syncs the change to replicas automatically
Dangerous Operation Warning
Deleting a database is an irreversible operation that permanently removes all data in that database.
Before executing, ensure:
- You have the latest database backup
- No applications are using the database
- Relevant stakeholders have been notified
Rebuild Database
The recreate state rebuilds a database, equivalent to delete then create:
pg_databases:
- name: testdb
state: recreate
owner: dbuser_test
baseline: test_init.sql # Execute initialization after rebuild
bin/pgsql-db <cls> testdb
Use cases:
- Test environment reset
- Clear development database
- Modify immutable properties (encoding, locale, etc.)
- Restore database to initial state
Difference from manual DROP + CREATE:
- Single command, no need for two operations
- Automatically preserves Pgbouncer and Grafana configuration
- Automatically loads the baseline initialization script after execution
Clone Database
You can use an existing database as a template to create a new database, enabling quick replication of database structures.
Basic Clone
pg_databases:
# 1. First define the template database
- name: app_template
owner: dbuser_app
schemas: [core, api]
extensions: [postgis, pg_trgm]
baseline: app_schema.sql
# 2. Create business database using template
- name: app_prod
template: app_template
owner: dbuser_app
Specify Clone Strategy (PG15+)
- name: app_staging
template: app_template
strategy: FILE_COPY # Or WAL_LOG
owner: dbuser_app
| Strategy | Description | Use Case |
|---|---|---|
| FILE_COPY | Direct data file copy | Large templates, general scenarios |
| WAL_LOG | Copy via WAL logs | Small templates, doesn't block template connections |
Use Custom Template Database
When using non-system templates (not template0/template1), Pigsty automatically terminates connections to the template database to allow cloning.
- name: new_db
template: existing_db # Use existing business database as template
owner: dbuser_app
Mark as Template Database
By default, only superusers or database owners can use regular databases as templates.
Using is_template: true allows any user with CREATEDB privilege to clone:
- name: shared_template
is_template: true # Allow any user with CREATEDB privilege to clone
owner: dbuser_app
Use ICU Locale Provider
When using the icu locale provider, you must specify template: template0:
- name: myapp_icu
template: template0 # Must use template0
locale_provider: icu
icu_locale: en-US
encoding: UTF8
Connection Pool Config
By default, all business databases are added to the Pgbouncer connection pool.
Database-Level Connection Pool Parameters
- name: myapp
pgbouncer: true # Include in connection pool (default true)
pool_mode: transaction # Pool mode: transaction/session/statement
pool_size: 64 # Default pool size
pool_size_min: 0 # Minimum pool size
pool_reserve: 32 # Reserved connections
pool_connlimit: 100 # Maximum database connections
pool_auth_user: dbuser_meta # Auth query user
Generated Configuration
Configuration file located at /etc/pgbouncer/database.txt:
myapp = host=/var/run/postgresql pool_mode=transaction pool_size=64
Hide Databases
Some internal databases may not need connection pool access:
- name: internal_db
pgbouncer: false # Don't add to connection pool
Pool Mode Explanation
| Mode | Description | Use Case |
|---|---|---|
| transaction | Return connection after transaction ends (default) | Most OLTP applications |
| session | Return connection after session ends | Applications requiring session state |
| statement | Return connection after statement ends | Stateless queries |
Locale Provider
PostgreSQL 15+ introduced the locale_provider parameter, supporting different locale implementations.
Use ICU Provider (PG15+)
- name: myapp_icu
template: template0 # ICU must use template0
locale_provider: icu
icu_locale: en-US # ICU locale rules
encoding: UTF8
Use Builtin Provider (PG17+)
- name: myapp_builtin
template: template0
locale_provider: builtin
builtin_locale: C.UTF-8 # Builtin locale rules
encoding: UTF8
ICU Collation Rules (PG16+)
- name: myapp_custom_icu
template: template0
locale_provider: icu
icu_locale: en-US
icu_rules: '&V << w <<< W' # Custom ICU collation rules
Provider Comparison
| Provider | Version Requirement | Features |
|---|---|---|
| libc | - | Traditional, depends on OS |
| icu | PG15+ | Cross-platform consistent, feature-rich |
| builtin | PG17+ | Most efficient C/C.UTF-8 collation |
Quick Reference
Common Commands
| Operation | Command |
|---|---|
| Create database | bin/pgsql-db <cls> <dbname> |
| Modify database | bin/pgsql-db <cls> <dbname> |
| Delete database | Set state: absent then run bin/pgsql-db <cls> <dbname> |
| Rebuild database | Set state: recreate then run bin/pgsql-db <cls> <dbname> |
| List databases | psql -c '\l' |
| View connection pool databases | cat /etc/pgbouncer/database.txt |
Common Operation Examples
# Create basic database
- name: myapp
owner: dbuser_myapp
comment: my application database
# Create database with extensions
- name: geodata
owner: dbuser_geo
extensions: [postgis, postgis_topology]
# Private database with limited connections
- name: secure_db
owner: dbuser_secure
revokeconn: true
connlimit: 10
# Set database-level parameters
- name: analytics
owner: dbuser_analyst
parameters:
work_mem: '512MB'
statement_timeout: '5min'
# Use ICU locale
- name: i18n_db
template: template0
locale_provider: icu
icu_locale: zh-Hans
encoding: UTF8
# Delete database
- name: old_db
state: absent
# Rebuild database
- name: test_db
state: recreate
baseline: test_init.sql
Execution Flow
bin/pgsql-db executes these steps in order:
- Validate - Check dbname parameter and database definition
- Delete (if state=absent/recreate) - Execute DROP DATABASE
- Create (if state=create/recreate) - Execute CREATE DATABASE
- Configure - Execute ALTER DATABASE to set properties
- Initialize - Create schemas, install extensions, execute baseline
- Register - Update Pgbouncer and Grafana data sources
For database access permissions, refer to ACL: Database Privileges.
3 - HBA Management
PostgreSQL and Pgbouncer HBA rule management operations: refresh, reload, verify, and troubleshoot.
HBA rule changes require re-rendering configuration files and reloading services. This article covers HBA rule daily management operations.
Quick Reference
| Operation | Command |
|---|---|
| Refresh cluster HBA | bin/pgsql-hba <cls> |
| Refresh specific instances | bin/pgsql-hba <cls> <ip>... |
| Refresh PostgreSQL only | ./pgsql.yml -l <cls> -t pg_hba,pg_reload |
| Refresh Pgbouncer only | ./pgsql.yml -l <cls> -t pgbouncer_hba,pgbouncer_reload |
| View current HBA | psql -c "TABLE pg_hba_file_rules" |
| Verify HBA config | psql -c "SELECT pg_reload_conf()" |
Refresh HBA Rules
After modifying HBA rules in pigsty.yml, you need to re-render configuration files and reload services.
Using the Admin Script
The recommended approach is using the bin/pgsql-hba script to refresh PostgreSQL and Pgbouncer HBA in one step:
# Refresh entire cluster's HBA rules
bin/pgsql-hba pg-meta
# Refresh specific instances (multiple IPs separated by spaces)
bin/pgsql-hba pg-meta 10.10.10.10
bin/pgsql-hba pg-meta 10.10.10.11 10.10.10.12
# View script help
bin/pgsql-hba --help
The script internally executes:
./pgsql.yml -l <cluster> -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload
Using Ansible Playbook
Directly use the relevant tags from the pgsql.yml playbook:
# Refresh PostgreSQL HBA and reload
./pgsql.yml -l pg-meta -t pg_hba,pg_reload
# Refresh Pgbouncer HBA and reload
./pgsql.yml -l pg-meta -t pgbouncer_hba,pgbouncer_reload
# Refresh both
./pgsql.yml -l pg-meta -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload
# Use extra variables to force reload
./pgsql.yml -l pg-meta -e pg_reload=true -t pg_hba,pg_reload
| Tag | Description |
|---|---|
| pg_hba | Render PostgreSQL HBA configuration file |
| pg_reload | Reload PostgreSQL config (requires pg_reload=true) |
| pgbouncer_hba | Render Pgbouncer HBA configuration file |
| pgbouncer_reload | Reload Pgbouncer config |
Configuration File Locations
HBA configuration files are rendered by Ansible:
| Service | Config File Path | Template File |
|---|---|---|
| PostgreSQL | /pg/data/pg_hba.conf | roles/pgsql/templates/pg_hba.conf |
| Pgbouncer | /etc/pgbouncer/pgb_hba.conf | roles/pgsql/templates/pgbouncer.hba |
Warning: Don’t edit these files directly—they will be overwritten the next time a playbook runs. All changes should be made in pigsty.yml.
Verify HBA Rules
View Currently Active HBA Rules
# Use psql to view PostgreSQL HBA rules
psql -c "TABLE pg_hba_file_rules"
# Or view the config file directly
cat /pg/data/pg_hba.conf
# View Pgbouncer HBA rules
cat /etc/pgbouncer/pgb_hba.conf
Check HBA Configuration Syntax
# PostgreSQL config reload (validates syntax)
psql -c "SELECT pg_reload_conf()"
# If there are syntax errors, check the logs
tail -f /pg/log/postgresql-*.log
Test Connection Authentication
# Test connection for specific user from specific address
psql -h <host> -p 5432 -U <user> -d <database> -c "SELECT 1"
# See which HBA rule matches the connection
psql -c "SELECT * FROM pg_hba_file_rules WHERE database @> ARRAY['<dbname>']::text[]"
Common Management Scenarios
Add New HBA Rule
- Edit pigsty.yml, add the rule to the cluster's pg_hba_rules:
pg-meta:
vars:
pg_hba_rules:
- {user: new_user, db: new_db, addr: '192.168.1.0/24', auth: pwd, title: 'new app access'}
- Execute refresh:
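bin/pgsql-hba pg-meta    # refresh the cluster's HBA rules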
Emergency IP Block
When detecting a malicious IP, quickly add a blocklist rule:
- Add a high-priority (order: 0) deny rule:
pg_hba_rules:
- {user: all, db: all, addr: '10.1.1.100/32', auth: deny, order: 0, title: 'emergency block'}
- Refresh immediately:
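bin/pgsql-hba pg-meta    # apply the blocklist rule right away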
Role-Based Rules
Configure different HBA rules for primary and replica:
pg_hba_rules:
# Only primary allows write users
- {user: writer, db: all, addr: intra, auth: pwd, role: primary, title: 'writer on primary'}
# Replicas allow read-only users
- {user: reader, db: all, addr: world, auth: ssl, role: replica, title: 'reader on replica'}
After refresh, rules are automatically enabled or disabled based on the instance’s pg_role.
Refresh HBA After Cluster Expansion
When new instances are added to the cluster, rules using addr: cluster need refresh to include new members:
# Add new instance
./pgsql.yml -l 10.10.10.14
# Refresh all instances' HBA (includes new member IPs)
bin/pgsql-hba pg-meta
Refresh HBA After Failover
After Patroni failover, instance pg_role may not match the configuration. If HBA rules use role filtering:
- Update role definitions in pigsty.yml
- Refresh HBA rules
# Refresh after updating roles in config file
bin/pgsql-hba pg-meta
Troubleshooting
Connection Rejected
Symptom: FATAL: no pg_hba.conf entry for host "x.x.x.x", user "xxx", database "xxx"
Troubleshooting steps:
- Check current HBA rules: psql -c "TABLE pg_hba_file_rules"
- Confirm whether the client IP, username, and database match any rule
- Check rule order (first match wins)
- Add the corresponding rule and refresh
Authentication Failed
Symptom: FATAL: password authentication failed for user "xxx"
Troubleshooting steps:
- Confirm password is correct
- Check password encryption method (pg_pwd_enc) compatibility with the client
- Check if the user exists: \du or SELECT * FROM pg_roles WHERE rolname = 'xxx'
HBA Rules Not Taking Effect
Troubleshooting steps:
- Confirm refresh command was executed
- Check if Ansible execution succeeded
- Confirm PostgreSQL reloaded: psql -c "SELECT pg_reload_conf()"
- Check if the config file was updated: head -20 /pg/data/pg_hba.conf
Rule Order Issues
HBA uses first-match-wins logic. If rules aren’t working as expected:
- Check order values
- Use psql -c "TABLE pg_hba_file_rules" to view the actual order
- Adjust order values or rule positions
Online HBA Modification (Not Recommended)
While you can directly edit /pg/data/pg_hba.conf and reload, this is not recommended:
# Direct edit (not recommended)
vi /pg/data/pg_hba.conf
# Reload config
psql -c "SELECT pg_reload_conf()"
# Or
pg_ctl reload -D /pg/data
# Or
systemctl reload postgresql
Problem: Manual changes will be overwritten the next time an Ansible playbook runs.
Correct approach: Always modify in pigsty.yml, then run bin/pgsql-hba to refresh.
Pgbouncer HBA Management
Pgbouncer HBA management is similar to PostgreSQL, with some differences:
Configuration Differences
- Config file: /etc/pgbouncer/pgb_hba.conf
- Doesn't support db: replication
- Authentication method: local connections use peer instead of ident
Refresh Commands
# Refresh Pgbouncer HBA only
./pgsql.yml -l pg-meta -t pgbouncer_hba,pgbouncer_reload
# Or use unified script (refreshes both PostgreSQL and Pgbouncer)
bin/pgsql-hba pg-meta
View Pgbouncer HBA
cat /etc/pgbouncer/pgb_hba.conf
Best Practices
- Always manage in config files: don't edit pg_hba.conf directly; make all changes through pigsty.yml
- Verify in a test environment first: HBA changes can cause connection issues
- Use order to control priority: blocklist rules use order: 0 to ensure they match first
- Refresh promptly: refresh HBA after adding/removing instances or failover
- Principle of least privilege: only open necessary access; avoid addr: world + auth: trust
- Monitor authentication failures: watch for authentication failures in pg_stat_activity
- Backup configuration: back up pigsty.yml before important changes
Command Quick Reference
# Refresh HBA (recommended)
bin/pgsql-hba <cluster>
# View PostgreSQL HBA
psql -c "TABLE pg_hba_file_rules"
cat /pg/data/pg_hba.conf
# View Pgbouncer HBA
cat /etc/pgbouncer/pgb_hba.conf
# Reload PostgreSQL config
psql -c "SELECT pg_reload_conf()"
# Test connection
psql -h <host> -U <user> -d <db> -c "SELECT 1"
# View authentication failure logs
tail -f /pg/log/postgresql-*.log | grep -i auth
4 - SOP
Common PostgreSQL administration procedures in Pigsty for maintaining production database clusters.
This document organizes common PostgreSQL administration procedures in Pigsty for maintaining production database clusters.
Here are the standard operating procedures for common PostgreSQL administration tasks:
Cheatsheet
PGSQL playbooks and shortcuts:
bin/pgsql-add <cls> # create pgsql cluster <cls>
bin/pgsql-user <cls> <username> # create pg user <username> on <cls>
bin/pgsql-db <cls> <dbname> # create pg database <dbname> on <cls>
bin/pgsql-svc <cls> [...ip] # reload pg service of cluster <cls>
bin/pgsql-hba <cls> [...ip] # reload postgres/pgbouncer HBA rules of cluster <cls>
bin/pgsql-add <cls> [...ip] # append replicas for cluster <cls>
bin/pgsql-rm <cls> [...ip] # remove replicas from cluster <cls>
bin/pgsql-rm <cls> # remove pgsql cluster <cls>
Patroni admin command and shortcuts:
pg list <cls> # print cluster info
pg edit-config <cls> # edit cluster config
pg reload <cls> [ins] # reload cluster config
pg restart <cls> [ins] # restart pgsql cluster
pg reinit <cls> [ins] # reinit cluster members
pg pause <cls> # entering maintenance mode (no auto failover)
pg resume <cls> # exiting maintenance mode
pg switchover <cls> # switchover on cluster <cls> (primary is healthy)
pg failover <cls> # failover on cluster <cls> (primary failed)
pgBackRest backup/restore command and shortcuts:
pb info # print pgbackrest repo info
pg-backup # make a backup, incr, or full backup if necessary
pg-backup full # make a full backup
pg-backup diff # make a differential backup
pg-backup incr # make an incremental backup
pg-pitr -i # restore to most recent backup completion time (not common)
pg-pitr --time="2022-12-30 14:44:44+08" # restore to specific time point (e.g., in case of table/database drop)
pg-pitr --name="my-restore-point" # restore to named restore point created by pg_create_restore_point
pg-pitr --lsn="0/7C82CB8" -X # restore immediately before LSN
pg-pitr --xid="1234567" -X -P # restore immediately before specific transaction ID, then promote to primary
pg-pitr --backup=latest # restore to latest backup set
pg-pitr --backup=20221108-105325 # restore to specific backup set, can be checked with pgbackrest info
Systemd components quick reference:
systemctl stop patroni # start stop restart reload
systemctl stop pgbouncer # start stop restart reload
systemctl stop pg_exporter # start stop restart reload
systemctl stop pgbouncer_exporter # start stop restart reload
systemctl stop node_exporter # start stop restart
systemctl stop haproxy # start stop restart reload
systemctl stop vip-manager # start stop restart reload
systemctl stop postgres # only when patroni_mode == 'remove'
Create Cluster
To create a new Postgres cluster, first define it in the inventory, then initialize:
bin/node-add <cls> # init nodes for cluster <cls> # ./node.yml -l <cls>
bin/pgsql-add <cls> # init pgsql instances of cluster <cls> # ./pgsql.yml -l <cls>
Note: PGSQL module requires managed nodes. Use bin/node-add to manage nodes first.
Example: Create Cluster

Create User
To create a new business user on an existing Postgres cluster, add the user definition to all.children.<cls>.pg_users, then create it using:
bin/pgsql-user <cls> <username> # ./pgsql-user.yml -l <cls> -e username=<username>
Example: Create Business User

Delete User
To delete a user from an existing Postgres cluster, set the user’s state to absent in the definition, then execute:
bin/pgsql-user <cls> <username> # ./pgsql-user.yml -l <cls> -e username=<username>
For example, to delete dbuser_test user:
pg_users:
- name: dbuser_test
state: absent
The deletion process will:
- Use the pg-drop-role script to safely delete the user
- Automatically disable user login and terminate active connections
- Automatically transfer database/tablespace ownership to postgres
- Automatically handle object ownership and permissions in all databases
- Revoke all role memberships
- Create an audit log for traceability
- Remove the user from the Pgbouncer user list (if previously added)
- Reload Pgbouncer configuration
Protected System Users:
The following system users cannot be deleted via state: absent and will be automatically skipped:
- postgres (superuser)
- replicator (or the user configured in pg_replication_username)
- dbuser_dba (or the user configured in pg_admin_username)
- dbuser_monitor (or the user configured in pg_monitor_username)
Example: pg-drop-role Script Usage
# Check user dependencies (read-only operation)
pg-drop-role dbuser_old --check
# Preview deletion operation (don't actually execute)
pg-drop-role dbuser_old --dry-run -v
# Delete user, transfer objects to postgres
pg-drop-role dbuser_old
# Delete user, transfer objects to specified user
pg-drop-role dbuser_old dbuser_new
# Force delete (terminate active connections)
pg-drop-role dbuser_old --force
Create Database
To create a new database on an existing Postgres cluster, add the database definition to all.children.<cls>.pg_databases, then create the database as follows:
bin/pgsql-db <cls> <dbname> # ./pgsql-db.yml -l <cls> -e dbname=<dbname>
Note: If the database specifies a non-default owner, the owner user must already exist, otherwise you must Create User first.
Example: Create Business Database

Reload Service
Services are access points exposed by PostgreSQL (reachable via PGURL), served by HAProxy on host nodes.
Use this task when cluster membership changes, for example: appending/removing replicas, switchover/failover, exposing new services, or updating existing service configurations (e.g., LB weights).
To create new services or reload existing services on entire proxy cluster or specific instances:
bin/pgsql-svc <cls> # pgsql.yml -l <cls> -t pg_service -e pg_reload=true
bin/pgsql-svc <cls> [ip...] # pgsql.yml -l ip... -t pg_service -e pg_reload=true
Example: Reload PG Service to Remove an Instance

Reload HBA
When your Postgres/Pgbouncer HBA rules change, you may need to reload HBA to apply the changes.
If you have any role-specific HBA rules, or IP address ranges referencing cluster member aliases, you may also need to reload HBA after switchover/cluster scaling.
To reload postgres and pgbouncer HBA rules on entire cluster or specific instances:
bin/pgsql-hba <cls> # pgsql.yml -l <cls> -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload -e pg_reload=true
bin/pgsql-hba <cls> [ip...] # pgsql.yml -l ip... -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload -e pg_reload=true
Example: Reload Cluster HBA Rules

Config Cluster
To change configuration of an existing Postgres cluster, you need to issue control commands on the admin node using the admin user (the user who installed Pigsty, with nopass ssh/sudo):
Alternatively, on any node in the database cluster, using dbsu (default postgres), you can execute admin commands, but only for this cluster.
pg edit-config <cls> # interactive config a cluster with patronictl
Change patroni parameters and postgresql.parameters, save and apply changes according to prompts.
Example: Non-Interactive Cluster Configuration
You can skip interactive mode and override postgres parameters using the -p option, for example:
pg edit-config -p log_min_duration_statement=1000 pg-test
pg edit-config --force -p shared_preload_libraries='timescaledb, pg_cron, pg_stat_statements, auto_explain'
Example: Change Cluster Config Using Patroni REST API
You can also use the Patroni REST API to change configuration non-interactively, for example:
$ curl -s 10.10.10.11:8008/config | jq . # get current config
$ curl -u 'postgres:Patroni.API' \
-d '{"postgresql":{"parameters": {"log_min_duration_statement":200}}}' \
-s -X PATCH http://10.10.10.11:8008/config | jq .
Note: Patroni sensitive API access (e.g., restart) is restricted to requests from infra/admin nodes, with HTTP basic authentication (username/password) and optional HTTPS protection.
Example: Configure Cluster with patronictl

Append Replica
To add a new replica to an existing PostgreSQL cluster, add its definition to the inventory all.children.<cls>.hosts, then:
bin/node-add <ip> # add node <ip> to Pigsty management
bin/pgsql-add <cls> <ip> # init <ip> as new replica of cluster <cls>
This will add node <ip> to pigsty and initialize it as a replica of cluster <cls>.
Cluster services will be reloaded to accept the new member.
Example: Add Replica to pg-test

For example, if you want to add pg-test-3 / 10.10.10.13 to existing cluster pg-test, first update the inventory:
pg-test:
hosts:
10.10.10.11: { pg_seq: 1, pg_role: primary } # existing member
10.10.10.12: { pg_seq: 2, pg_role: replica } # existing member
10.10.10.13: { pg_seq: 3, pg_role: replica } # <--- new member
vars: { pg_cluster: pg-test }
Then apply the changes as follows:
bin/node-add 10.10.10.13 # add node to pigsty
bin/pgsql-add pg-test 10.10.10.13 # init new replica for cluster pg-test on 10.10.10.13
This is similar to cluster initialization but works on a single instance:
[ OK ] Initialize instance 10.10.10.11 in pgsql cluster 'pg-test':
[WARN] Reminder: add nodes to pigsty first, then install module 'pgsql'
[HINT] $ bin/node-add 10.10.10.11 # run this first except for infra nodes
[WARN] Init instance from cluster:
[ OK ] $ ./pgsql.yml -l '10.10.10.11,&pg-test'
[WARN] Reload pg_service on existing instances:
[ OK ] $ ./pgsql.yml -l 'pg-test,!10.10.10.11' -t pg_service
Remove Replica
To remove a replica from an existing PostgreSQL cluster:
bin/pgsql-rm <cls> <ip...> # ./pgsql-rm.yml -l <ip>
This will remove instance <ip> from cluster <cls>. Cluster services will be reloaded to remove the instance from load balancers.
Example: Remove Replica from pg-test

For example, if you want to remove pg-test-3 / 10.10.10.13 from existing cluster pg-test:
bin/pgsql-rm pg-test 10.10.10.13 # remove pgsql instance 10.10.10.13 from pg-test
bin/node-rm 10.10.10.13 # remove node from pigsty (optional)
vi pigsty.yml # remove instance definition from inventory
bin/pgsql-svc pg-test # refresh pg_service on existing instances to remove from load balancer
[ OK ] Remove pgsql instance 10.10.10.13 from 'pg-test':
[WARN] Remove instance from cluster:
[ OK ] $ ./pgsql-rm.yml -l '10.10.10.13,&pg-test'
And remove the instance definition from inventory:
pg-test:
hosts:
10.10.10.11: { pg_seq: 1, pg_role: primary }
10.10.10.12: { pg_seq: 2, pg_role: replica }
10.10.10.13: { pg_seq: 3, pg_role: replica } # <--- remove this line after execution
vars: { pg_cluster: pg-test }
Finally, you can reload PG service to remove the instance from load balancers:
bin/pgsql-svc pg-test # reload service on pg-test
Remove Cluster
To remove an entire Postgres cluster, simply run:
bin/pgsql-rm <cls> # ./pgsql-rm.yml -l <cls>
Example: Remove Cluster

Example: Force Remove Cluster
Note: If pg_safeguard is configured for this cluster (or globally set to true), pgsql-rm.yml will abort to avoid accidental cluster removal.
You can explicitly override it with playbook command line parameters to force removal:
./pgsql-rm.yml -l pg-meta -e pg_safeguard=false # force remove pg cluster pg-meta
Switchover
You can use the patroni command line tool to perform PostgreSQL cluster switchover.
pg switchover <cls> # interactive mode, you can skip the wizard with the following parameter combination
pg switchover --leader pg-test-1 --candidate=pg-test-2 --scheduled=now --force pg-test
Example: pg-test Switchover

$ pg switchover pg-test
Master [pg-test-1]:
Candidate ['pg-test-2', 'pg-test-3'] []: pg-test-2
When should the switchover take place (e.g. 2022-12-26T07:39 ) [now]: now
Current cluster topology
+ Cluster: pg-test (7181325041648035869) -----+----+-----------+-----------------+
| Member | Host | Role | State | TL | Lag in MB | Tags |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-1 | 10.10.10.11 | Leader | running | 1 | | clonefrom: true |
| | | | | | | conf: tiny.yml |
| | | | | | | spec: 1C.2G.50G |
| | | | | | | version: '15' |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-2 | 10.10.10.12 | Replica | running | 1 | 0 | clonefrom: true |
| | | | | | | conf: tiny.yml |
| | | | | | | spec: 1C.2G.50G |
| | | | | | | version: '15' |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-3 | 10.10.10.13 | Replica | running | 1 | 0 | clonefrom: true |
| | | | | | | conf: tiny.yml |
| | | | | | | spec: 1C.2G.50G |
| | | | | | | version: '15' |
+-----------+-------------+---------+---------+----+-----------+-----------------+
Are you sure you want to switchover cluster pg-test, demoting current master pg-test-1? [y/N]: y
2022-12-26 06:39:58.02468 Successfully switched over to "pg-test-2"
+ Cluster: pg-test (7181325041648035869) -----+----+-----------+-----------------+
| Member | Host | Role | State | TL | Lag in MB | Tags |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-1 | 10.10.10.11 | Replica | stopped | | unknown | clonefrom: true |
| | | | | | | conf: tiny.yml |
| | | | | | | spec: 1C.2G.50G |
| | | | | | | version: '15' |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-2 | 10.10.10.12 | Leader | running | 1 | | clonefrom: true |
| | | | | | | conf: tiny.yml |
| | | | | | | spec: 1C.2G.50G |
| | | | | | | version: '15' |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-3 | 10.10.10.13 | Replica | running | 1 | 0 | clonefrom: true |
| | | | | | | conf: tiny.yml |
| | | | | | | spec: 1C.2G.50G |
| | | | | | | version: '15' |
+-----------+-------------+---------+---------+----+-----------+-----------------+
To perform this via Patroni API (e.g., switch primary from instance 2 to instance 1 at a specified time):
curl -u 'postgres:Patroni.API' \
-d '{"leader":"pg-test-2", "candidate": "pg-test-1","scheduled_at":"2022-12-26T14:47+08"}' \
-s -X POST http://10.10.10.11:8008/switchover
After either switchover or failover, you need to refresh services and HBA rules after cluster membership changes. You should complete this promptly (e.g., within a few hours or a day) after the change:
bin/pgsql-svc <cls>
bin/pgsql-hba <cls>
Backup Cluster
To create backups using pgBackRest, run the following commands as local dbsu (default postgres):
pg-backup # make a backup, incremental or full if necessary
pg-backup full # make a full backup
pg-backup diff # make a differential backup
pg-backup incr # make an incremental backup
pb info # print backup info (pgbackrest info)
See Backup & Restore for more information.
Example: Create Backup

Example: Create Scheduled Backup Task
You can add crontab to node_crontab to specify your backup strategy.
# Full backup daily at 1 AM
- '00 01 * * * postgres /pg/bin/pg-backup full'
# Full backup on Monday at 1 AM, incremental backups on other weekdays
- '00 01 * * 1 postgres /pg/bin/pg-backup full'
- '00 01 * * 2,3,4,5,6,7 postgres /pg/bin/pg-backup'
Restore Cluster
To restore a cluster to a previous point in time (PITR), run the Pigsty helper script pg-pitr as local dbsu user (default postgres):
pg-pitr -i # restore to most recent backup completion time (not common)
pg-pitr --time="2022-12-30 14:44:44+08" # restore to specific time point (e.g., in case of table/database drop)
pg-pitr --name="my-restore-point" # restore to named restore point created by pg_create_restore_point
pg-pitr --lsn="0/7C82CB8" -X # restore immediately before LSN
pg-pitr --xid="1234567" -X -P # restore immediately before specific transaction ID, then promote cluster to primary
pg-pitr --backup=latest # restore to latest backup set
pg-pitr --backup=20221108-105325 # restore to specific backup set, can be listed with pgbackrest info
The command will output an operations manual, follow the instructions. See Backup & Restore - PITR for details.
Example: PITR Using Raw pgBackRest Commands
# Restore to latest available point (e.g., hardware failure)
pgbackrest --stanza=pg-meta restore
# PITR to specific time point (e.g., accidental table drop)
pgbackrest --stanza=pg-meta --type=time --target="2022-11-08 10:58:48" \
--target-action=promote restore
# Restore specific backup point, then promote (or pause|shutdown)
pgbackrest --stanza=pg-meta --type=immediate --target-action=promote \
--set=20221108-105325F_20221108-105938I restore
Adding Packages
To add new RPM packages, add them to repo_packages and repo_url_packages.
Use ./infra.yml -t repo_build subtask to rebuild local repo on Infra node. Then you can install these packages using ansible’s package module:
ansible pg-test -b -m package -a "name=pg_cron_15,topn_15,pg_stat_monitor_15*" # install some packages with ansible
Example: Manually Update Packages in Local Repo
# Add upstream repo on infra/admin node, then manually download required packages
cd ~/pigsty; ./infra.yml -t repo_upstream,repo_cache # add upstream repo (internet)
cd /www/pigsty; repotrack "some_new_package_name" # download latest RPM packages
# Update local repo metadata
cd ~/pigsty; ./infra.yml -t repo_create # recreate local repo
./node.yml -t node_repo # refresh YUM/APT cache on all nodes
# You can also manually refresh YUM/APT cache on nodes using Ansible
ansible all -b -a 'yum clean all' # clean node repo cache
ansible all -b -a 'yum makecache' # rebuild yum/apt cache from new repo
ansible all -b -a 'apt clean' # clean APT cache (Ubuntu/Debian)
ansible all -b -a 'apt update' # rebuild APT cache (Ubuntu/Debian)
For example, you can install or upgrade packages as follows:
ansible pg-test -b -m package -a "name=postgresql15* state=latest"
Install Extension
If you want to install extensions on a PostgreSQL cluster, add them to pg_extensions, then execute:
./pgsql.yml -t pg_extension # install extensions
Some extensions need to be loaded in shared_preload_libraries to take effect. You can add them to pg_libs, or configure an existing cluster.
Finally, execute CREATE EXTENSION <extname>; on the cluster’s primary to complete extension installation.
Example: Install pg_cron Extension on pg-test Cluster
ansible pg-test -b -m package -a "name=pg_cron_15" # install pg_cron package on all nodes
# Add pg_cron to shared_preload_libraries
pg edit-config --force -p shared_preload_libraries='timescaledb, pg_cron, pg_stat_statements, auto_explain'
pg restart --force pg-test # restart cluster
psql -h pg-test -d postgres -c 'CREATE EXTENSION pg_cron;' # install pg_cron on primary
For more details, see PGSQL Extension Installation.
Minor Upgrade
To perform minor version upgrade/downgrade, first add packages to the local repo: latest PG minor version RPM/DEB.
First perform rolling upgrade/downgrade on all replicas, then perform cluster switchover to upgrade/downgrade the primary.
ansible <cls> -b -a "yum upgrade/downgrade -y <pkg>" # upgrade/downgrade packages
pg restart --force <cls> # restart cluster
Example: Downgrade PostgreSQL 15.2 to 15.1
Add 15.1 packages to repo and refresh nodes’ yum/apt cache:
cd ~/pigsty; ./infra.yml -t repo_upstream # add upstream repo
cd /www/pigsty; repotrack postgresql15-*-15.1 # add 15.1 packages to yum repo
cd ~/pigsty; ./infra.yml -t repo_create # rebuild repo metadata
ansible pg-test -b -a 'yum clean all' # clean node repo cache
ansible pg-test -b -a 'yum makecache' # rebuild yum cache from new repo
# For Ubuntu/Debian users, use apt instead of yum
ansible pg-test -b -a 'apt clean' # clean node repo cache
ansible pg-test -b -a 'apt update' # rebuild apt cache from new repo
Execute downgrade and restart cluster:
ansible pg-test -b -a "yum downgrade -y postgresql15*" # downgrade packages
pg restart --force pg-test # restart entire cluster to complete upgrade
Example: Upgrade PostgreSQL 15.1 Back to 15.2
This time we’ll do a rolling upgrade:
ansible pg-test -b -a "yum upgrade -y postgresql15*" # upgrade packages (or apt upgrade)
ansible pg-test -b -a '/usr/pgsql/bin/pg_ctl --version' # check binary version is 15.2
pg restart --role replica --force pg-test # restart replicas
pg switchover --leader pg-test-1 --candidate=pg-test-2 --scheduled=now --force pg-test # switchover
pg restart --role primary --force pg-test # restart primary
Major Upgrade
The easiest way to perform a major upgrade is to create a new cluster using the new version, then perform online migration through logical replication and blue-green deployment.
You can also perform in-place major upgrades. When using only the database kernel itself, this is not complicated - use PostgreSQL’s built-in pg_upgrade:
Suppose you want to upgrade PostgreSQL major version from 14 to 15. First add packages to the repo and ensure core extension plugins are installed with the same version numbers on both major versions.
./pgsql.yml -t pg_pkg -e pg_version=15 # install pg 15 packages
sudo su - postgres; mkdir -p /data/postgres/pg-meta-15/data/ # prepare directory for 15
pg_upgrade -b /usr/pgsql-14/bin/ -B /usr/pgsql-15/bin/ -d /data/postgres/pg-meta-14/data/ -D /data/postgres/pg-meta-15/data/ -v -c # precheck
pg_upgrade -b /usr/pgsql-14/bin/ -B /usr/pgsql-15/bin/ -d /data/postgres/pg-meta-14/data/ -D /data/postgres/pg-meta-15/data/ --link -j8 -v # run the actual upgrade
rm -rf /usr/pgsql; ln -s /usr/pgsql-15 /usr/pgsql; # fix binary link
mv /data/postgres/pg-meta-14 /data/postgres/pg-meta-15 # rename data directory
rm -rf /pg; ln -s /data/postgres/pg-meta-15 /pg # fix data directory link
5 - Cluster Management
Standard operation guide for creating/destroying PostgreSQL clusters and scaling existing clusters.
Create Cluster
To create a new Postgres cluster, first define it in the inventory, then initialize:
bin/node-add <cls> # init nodes for cluster <cls> # ./node.yml -l <cls>
bin/pgsql-add <cls> # init pgsql instances of cluster <cls> # ./pgsql.yml -l <cls>
Note: PGSQL module requires managed nodes. Use bin/node-add to manage nodes first.
Example: Create Cluster

Create User
To create a new business user on an existing Postgres cluster, add the user definition to all.children.<cls>.pg_users, then create it using:
bin/pgsql-user <cls> <username> # ./pgsql-user.yml -l <cls> -e username=<username>
Example: Create Business User

Create Database
To create a new database on an existing Postgres cluster, add the database definition to all.children.<cls>.pg_databases, then create the database as follows:
bin/pgsql-db <cls> <dbname> # ./pgsql-db.yml -l <cls> -e dbname=<dbname>
Note: If the database specifies a non-default owner, the owner user must already exist, otherwise you must Create User first.
Example: Create Business Database

Reload Service
Services are access points exposed by PostgreSQL (reachable via PGURL), served by HAProxy on host nodes.
Use this task when cluster membership changes, for example: appending/removing replicas, switchover/failover, exposing new services, or updating existing service configurations (e.g., LB weights).
To create new services or reload existing services on entire proxy cluster or specific instances:
bin/pgsql-svc <cls> # pgsql.yml -l <cls> -t pg_service -e pg_reload=true
bin/pgsql-svc <cls> [ip...] # pgsql.yml -l ip... -t pg_service -e pg_reload=true
Example: Reload PG Service to Remove an Instance

Reload HBA
When your Postgres/Pgbouncer HBA rules change, you may need to reload HBA to apply the changes.
If you have any role-specific HBA rules, or IP address ranges referencing cluster member aliases, you may also need to reload HBA after switchover/cluster scaling.
To reload postgres and pgbouncer HBA rules on entire cluster or specific instances:
bin/pgsql-hba <cls> # pgsql.yml -l <cls> -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload -e pg_reload=true
bin/pgsql-hba <cls> [ip...] # pgsql.yml -l ip... -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload -e pg_reload=true
Example: Reload Cluster HBA Rules

Config Cluster
To change configuration of an existing Postgres cluster, you need to issue control commands on the admin node using the admin user (the user who installed Pigsty, with nopass ssh/sudo):
Alternatively, on any node in the database cluster, using dbsu (default postgres), you can execute admin commands, but only for this cluster.
pg edit-config <cls> # interactive config a cluster with patronictl
Change patroni parameters and postgresql.parameters, save and apply changes according to prompts.
Example: Non-Interactive Cluster Configuration
You can skip interactive mode and override postgres parameters using the -p option, for example:
pg edit-config -p log_min_duration_statement=1000 pg-test
pg edit-config --force -p shared_preload_libraries='timescaledb, pg_cron, pg_stat_statements, auto_explain' pg-test
Example: Change Cluster Config Using Patroni REST API
You can also use the Patroni REST API to change configuration non-interactively, for example:
$ curl -s 10.10.10.11:8008/config | jq . # get current config
$ curl -u 'postgres:Patroni.API' \
-d '{"postgresql":{"parameters": {"log_min_duration_statement":200}}}' \
-s -X PATCH http://10.10.10.11:8008/config | jq .
Note: Patroni sensitive API access (e.g., restart) is restricted to requests from infra/admin nodes, with HTTP basic authentication (username/password) and optional HTTPS protection.
Example: Configure Cluster with patronictl

Append Replica
To add a new replica to an existing PostgreSQL cluster, add its definition to the inventory all.children.<cls>.hosts, then:
bin/node-add <ip> # add node <ip> to Pigsty management
bin/pgsql-add <cls> <ip> # init <ip> as new replica of cluster <cls>
This will add node <ip> to pigsty and initialize it as a replica of cluster <cls>.
Cluster services will be reloaded to accept the new member.
Example: Add Replica to pg-test

For example, if you want to add pg-test-3 / 10.10.10.13 to existing cluster pg-test, first update the inventory:
pg-test:
hosts:
10.10.10.11: { pg_seq: 1, pg_role: primary } # existing member
10.10.10.12: { pg_seq: 2, pg_role: replica } # existing member
10.10.10.13: { pg_seq: 3, pg_role: replica } # <--- new member
vars: { pg_cluster: pg-test }
Then apply the changes as follows:
bin/node-add 10.10.10.13 # add node to pigsty
bin/pgsql-add pg-test 10.10.10.13 # init new replica for cluster pg-test on 10.10.10.13
This is similar to cluster initialization but works on a single instance:
[ OK ] Initialize instance 10.10.10.13 in pgsql cluster 'pg-test':
[WARN] Reminder: add nodes to pigsty first, then install module 'pgsql'
[HINT] $ bin/node-add 10.10.10.13 # run this first except for infra nodes
[WARN] Init instance from cluster:
[ OK ] $ ./pgsql.yml -l '10.10.10.13,&pg-test'
[WARN] Reload pg_service on existing instances:
[ OK ] $ ./pgsql.yml -l 'pg-test,!10.10.10.13' -t pg_service
Remove Replica
To remove a replica from an existing PostgreSQL cluster:
bin/pgsql-rm <cls> <ip...> # ./pgsql-rm.yml -l <ip>
This will remove instance <ip> from cluster <cls>. Cluster services will be reloaded to remove the instance from load balancers.
Example: Remove Replica from pg-test

For example, if you want to remove pg-test-3 / 10.10.10.13 from existing cluster pg-test:
bin/pgsql-rm pg-test 10.10.10.13 # remove pgsql instance 10.10.10.13 from pg-test
bin/node-rm 10.10.10.13 # remove node from pigsty (optional)
vi pigsty.yml # remove instance definition from inventory
bin/pgsql-svc pg-test # refresh pg_service on existing instances to remove from load balancer
[ OK ] Remove pgsql instance 10.10.10.13 from 'pg-test':
[WARN] Remove instance from cluster:
[ OK ] $ ./pgsql-rm.yml -l '10.10.10.13,&pg-test'
And remove the instance definition from inventory:
pg-test:
hosts:
10.10.10.11: { pg_seq: 1, pg_role: primary }
10.10.10.12: { pg_seq: 2, pg_role: replica }
10.10.10.13: { pg_seq: 3, pg_role: replica } # <--- remove this line after execution
vars: { pg_cluster: pg-test }
Finally, you can reload PG service to remove the instance from load balancers:
bin/pgsql-svc pg-test # reload service on pg-test
Remove Cluster
To remove an entire Postgres cluster, simply run:
bin/pgsql-rm <cls> # ./pgsql-rm.yml -l <cls>
Example: Remove Cluster

Example: Force Remove Cluster
Note: If pg_safeguard is configured for this cluster (or globally set to true), pgsql-rm.yml will abort to avoid accidental cluster removal.
You can explicitly override it with playbook command line parameters to force removal:
./pgsql-rm.yml -l pg-meta -e pg_safeguard=false # force remove pg cluster pg-meta
6 - User Management
Creating PostgreSQL users/roles, managing connection pool roles, refreshing expiration times, user password rotation
Creating Users
To create a new business user on an existing Postgres cluster, add the user definition to all.children.<cls>.pg_users, then create it using the following command:
bin/pgsql-user <cls> <username> # ./pgsql-user.yml -l <cls> -e username=<username>
Example: Creating a business user

Defining Users
Pigsty defines roles and users in database clusters through two configuration parameters:
- pg_default_roles: Defines globally unified roles and users
- pg_users: Defines business users and roles at the database cluster level
The former is used to define roles and users shared across the entire environment, while the latter defines business roles and users specific to individual clusters. Both have the same format, being arrays of user definition objects.
You can define multiple users/roles. They will be created sequentially first globally, then by cluster, and finally in array order, so later users can belong to previously defined roles.
Below is the business user definition in the default cluster pg-meta in the Pigsty demo environment:
pg-meta:
hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
vars:
pg_cluster: pg-meta
pg_users:
- {name: dbuser_meta ,password: DBUser.Meta ,pgbouncer: true ,roles: [dbrole_admin] ,comment: pigsty admin user }
- {name: dbuser_view ,password: DBUser.Viewer ,pgbouncer: true ,roles: [dbrole_readonly] ,comment: read-only viewer for meta database }
- {name: dbuser_grafana ,password: DBUser.Grafana ,pgbouncer: true ,roles: [dbrole_admin] ,comment: admin user for grafana database }
- {name: dbuser_bytebase ,password: DBUser.Bytebase ,pgbouncer: true ,roles: [dbrole_admin] ,comment: admin user for bytebase database }
- {name: dbuser_kong ,password: DBUser.Kong ,pgbouncer: true ,roles: [dbrole_admin] ,comment: admin user for kong api gateway }
- {name: dbuser_gitea ,password: DBUser.Gitea ,pgbouncer: true ,roles: [dbrole_admin] ,comment: admin user for gitea service }
- {name: dbuser_wiki ,password: DBUser.Wiki ,pgbouncer: true ,roles: [dbrole_admin] ,comment: admin user for wiki.js service }
- {name: dbuser_noco ,password: DBUser.Noco ,pgbouncer: true ,roles: [dbrole_admin] ,comment: admin user for nocodb service }
Each user/role definition is an object that may include the following fields, using the dbuser_meta user as an example:
- name: dbuser_meta # Required, `name` is the only mandatory field in a user definition
password: DBUser.Meta # Optional, password, can be a scram-sha-256 hash string or plaintext
login: true # Optional, can log in by default
superuser: false # Optional, default is false, is this a superuser?
createdb: false # Optional, default is false, can create databases?
createrole: false # Optional, default is false, can create roles?
inherit: true # Optional, by default, can this role use inherited permissions?
replication: false # Optional, default is false, can this role perform replication?
bypassrls: false # Optional, default is false, can this role bypass row-level security?
pgbouncer: true # Optional, default is false, add this user to the pgbouncer user list? (production users using connection pooling should explicitly set to true)
connlimit: -1 # Optional, user connection limit, default -1 disables limit
expire_in: 3650 # Optional, expiration time for this role: calculated as created time + n days (higher priority than expire_at)
expire_at: '2030-12-31' # Optional, time point when this role expires, specify a specific date using YYYY-MM-DD format string (lower priority than expire_in)
comment: pigsty admin user # Optional, description and comment string for this user/role
roles: [dbrole_admin] # Optional, default roles are: dbrole_{admin,readonly,readwrite,offline}
parameters: {} # Optional, configure role-level database parameters for this role using `ALTER ROLE SET`
pool_mode: transaction # Optional, pgbouncer pool mode defaulting to transaction, at user level
pool_connlimit: -1 # Optional, maximum database connections at user level, default -1 disables limit
search_path: public # Optional, key-value configuration parameters according to postgresql documentation (e.g., use pigsty as default search_path)
- The only required field is name, which should be a valid and unique username in the PostgreSQL cluster.
- Roles don't need a password, but for login-enabled business users it's usually necessary to specify a password. password can be plaintext or a scram-sha-256 / md5 hash string; please avoid using plaintext passwords.
- Users/roles are created sequentially in array order, so ensure role/group definitions come before their members.
- login, superuser, createdb, createrole, inherit, replication, bypassrls are boolean flags.
- pgbouncer is disabled by default: to add business users to the pgbouncer user list, you should explicitly set it to true.
ACL System
Pigsty has a built-in, out-of-the-box access control / ACL system. You can easily use it by assigning the following four default roles to business users:
- dbrole_readwrite: Role with global read-write access (production accounts primarily used by business should have database read-write permissions)
- dbrole_readonly: Role with global read-only access (if other businesses want read-only access, they can use this role)
- dbrole_admin: Role with DDL permissions (business administrators, scenarios requiring table creation in applications)
- dbrole_offline: Role with restricted read-only access (can only access offline instances, typically for personal users)
If you want to redesign your own ACL system, you can customize the relevant default role and privilege parameters and templates.
Creating Users
Users and roles defined in pg_default_roles and pg_users will be automatically created sequentially during the PROVISION phase of cluster initialization.
If you want to create users on an existing cluster, you can use the bin/pgsql-user tool.
Add the new user/role definition to all.children.<cls>.pg_users and create the user as follows:
bin/pgsql-user <cls> <username> # pgsql-user.yml -l <cls> -e username=<username>
Unlike databases, the user creation playbook is always idempotent. When the target user already exists, Pigsty will modify the target user’s attributes to conform to the configuration. So running it repeatedly on existing clusters typically won’t cause issues.
Please Use Playbook to Create Users
We do not recommend creating business users manually, especially users that go through the default pgbouncer connection pool, unless you are willing to manually maintain the Pgbouncer user list and keep it consistent with PostgreSQL.
When creating a new user with the bin/pgsql-user tool or the pgsql-user.yml playbook, the user will also be added to the Pgbouncer user list.
Modifying Users
The method for modifying PostgreSQL user attributes is the same as Creating Users.
First, adjust your user definition, modify the attributes that need adjustment, then execute the following command to apply:
bin/pgsql-user <cls> <username> # pgsql-user.yml -l <cls> -e username=<username>
Note that modifying a user does not delete it; user attributes are changed with the ALTER USER command. Existing privileges and group memberships are not revoked; new roles are granted with the GRANT command.
Deleting Users
To delete a user, set its state to absent and execute the playbook:
pg_users:
- name: dbuser_old
state: absent
bin/pgsql-user <cls> dbuser_old
The deletion process will:
- Use the pg-drop-role script to safely delete the user
- Automatically disable user login and terminate active connections
- Automatically transfer database/tablespace ownership to postgres
- Automatically handle object ownership and permissions in all databases
- Revoke all role memberships
- Create an audit log for traceability
- Remove the user from the Pgbouncer user list (if previously added)
- Reload Pgbouncer configuration
Protected System Users:
The following system users cannot be deleted via state: absent and will be automatically skipped:
- postgres (superuser)
- replicator (or the user configured in pg_replication_username)
- dbuser_dba (or the user configured in pg_admin_username)
- dbuser_monitor (or the user configured in pg_monitor_username)
Safe Deletion
Pigsty uses the pg-drop-role script to safely delete users. This script will:
- Automatically handle objects owned by the user (databases, tablespaces, schemas, tables, etc.)
- Automatically terminate active connections (using --force)
- Transfer object ownership to the postgres user
- Create an audit log at /tmp/pg_drop_role_<user>_<timestamp>.log
No need to manually handle dependent objects - the script handles everything automatically.
pg-drop-role Script
pg-drop-role is a safe user deletion script provided by Pigsty, located at /pg/bin/pg-drop-role.
Usage:
pg-drop-role <role_name> [successor_role] [options]
Common Options:
| Option | Description |
|---|---|
| --check | Only check dependencies, don't execute deletion |
| --dry-run | Show SQL statements that would be executed, don't actually execute |
| --force | Force terminate active connections before deletion |
| -v, --verbose | Show verbose output |
| -h, --host | Database host |
| -p, --port | Database port |
Examples:
# Check user dependencies (read-only operation)
pg-drop-role dbuser_old --check
# Preview deletion operation (don't actually execute)
pg-drop-role dbuser_old --dry-run -v
# Delete user, transfer objects to postgres
pg-drop-role dbuser_old
# Delete user, transfer objects to specified user
pg-drop-role dbuser_old dbuser_new
# Force delete (terminate active connections)
pg-drop-role dbuser_old --force
Deletion Process:
- Pre-check - Verify connection, check if user exists, check if protected
- Create audit snapshot - Record all user dependencies
- Disable login - ALTER ROLE ... NOLOGIN
- Terminate connections - Terminate active connections when using --force
- Transfer shared objects - Transfer database and tablespace ownership
- Process all databases - Execute REASSIGN OWNED + DROP OWNED in each database
- Revoke memberships - Revoke all role memberships
- Drop role - Execute DROP ROLE
Pgbouncer Users
Pgbouncer is enabled by default and serves as connection pool middleware, with its user list managed by Pigsty.
Pigsty defaults to adding all users in pg_users that explicitly have the pgbouncer: true flag to the pgbouncer user list.
Users in the Pgbouncer connection pool are listed in /etc/pgbouncer/userlist.txt:
"postgres" ""
"dbuser_wiki" "SCRAM-SHA-256$4096:+77dyhrPeFDT/TptHs7/7Q==$KeatuohpKIYzHPCt/tqBu85vI11o9mar/by0hHYM2W8=:X9gig4JtjoS8Y/o1vQsIX/gY1Fns8ynTXkbWOjUfbRQ="
"dbuser_view" "SCRAM-SHA-256$4096:DFoZHU/DXsHL8MJ8regdEw==$gx9sUGgpVpdSM4o6A2R9PKAUkAsRPLhLoBDLBUYtKS0=:MujSgKe6rxcIUMv4GnyXJmV0YNbf39uFRZv724+X1FE="
"dbuser_monitor" "SCRAM-SHA-256$4096:fwU97ZMO/KR0ScHO5+UuBg==$CrNsmGrx1DkIGrtrD1Wjexb/aygzqQdirTO1oBZROPY=:L8+dJ+fqlMQh7y4PmVR/gbAOvYWOr+KINjeMZ8LlFww="
"dbuser_meta" "SCRAM-SHA-256$4096:leB2RQPcw1OIiRnPnOMUEg==$eyC+NIMKeoTxshJu314+BmbMFpCcspzI3UFZ1RYfNyU=:fJgXcykVPvOfro2MWNkl5q38oz21nSl1dTtM65uYR1Q="
"dbuser_kong" "SCRAM-SHA-256$4096:bK8sLXIieMwFDz67/0dqXQ==$P/tCRgyKx9MC9LH3ErnKsnlOqgNd/nn2RyvThyiK6e4=:CDM8QZNHBdPf97ztusgnE7olaKDNHBN0WeAbP/nzu5A="
"dbuser_grafana" "SCRAM-SHA-256$4096:HjLdGaGmeIAGdWyn2gDt/Q==$jgoyOB8ugoce+Wqjr0EwFf8NaIEMtiTuQTg1iEJs9BM=:ed4HUFqLyB4YpRr+y25FBT7KnlFDnan6JPVT9imxzA4="
"dbuser_gitea" "SCRAM-SHA-256$4096:l1DBGCc4dtircZ8O8Fbzkw==$tpmGwgLuWPDog8IEKdsaDGtiPAxD16z09slvu+rHE74=:pYuFOSDuWSofpD9OZhG7oWvyAR0PQjJBffgHZLpLHds="
"dbuser_dba" "SCRAM-SHA-256$4096:zH8niABU7xmtblVUo2QFew==$Zj7/pq+ICZx7fDcXikiN7GLqKKFA+X5NsvAX6CMshF0=:pqevR2WpizjRecPIQjMZOm+Ap+x0kgPL2Iv5zHZs0+g="
"dbuser_bytebase" "SCRAM-SHA-256$4096:OMoTM9Zf8QcCCMD0svK5gg==$kMchqbf4iLK1U67pVOfGrERa/fY818AwqfBPhsTShNQ=:6HqWteN+AadrUnrgC0byr5A72noqnPugItQjOLFw0Wk="
User-level connection pool parameters are maintained in a separate file: /etc/pgbouncer/useropts.txt, for example:
dbuser_dba = pool_mode=session max_user_connections=16
dbuser_monitor = pool_mode=session max_user_connections=8
When you create a database, Pgbouncer’s database list definition file will be refreshed and take effect through online configuration reload, without affecting existing connections.
Pgbouncer runs with the same dbsu as PostgreSQL, defaulting to the postgres OS user. You can use the pgb alias to access pgbouncer management functions using dbsu.
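For instance, assuming the pgb alias opens an admin console session to the local Pgbouncer as the dbsu, you can inspect and reload the pool with standard Pgbouncer console commands (a sketch, not an exhaustive list):
sudo su - postgres   # switch to the dbsu
pgb                  # open the pgbouncer admin console (Pigsty alias)
SHOW POOLS;          -- inspect connection pools
SHOW DATABASES;      -- list pooled databases
RELOAD;              -- reload pgbouncer configuration online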
Pigsty also provides a utility function pgb-route that can quickly switch pgbouncer database traffic to other nodes in the cluster for zero-downtime migration.
Connection pool user configuration files userlist.txt and useropts.txt will be automatically refreshed when you create users and take effect through online configuration reload, normally without affecting existing connections.
Note that the pgbouncer_auth_query parameter allows you to use dynamic queries to complete connection pool user authentication, which is a compromise solution when you don’t want to manage users in the connection pool.
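For reference, the underlying PgBouncer mechanism looks roughly like the following auth_query settings; this is a generic PgBouncer sketch (the auth_user value is an assumption), not the exact configuration Pigsty generates:
; pgbouncer.ini fragment (illustrative)
auth_user  = dbuser_monitor   ; an existing login user allowed to run the query (assumed)
auth_query = SELECT usename, passwd FROM pg_shadow WHERE usename = $1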
7 - Parameter Tuning
Tuning Postgres Parameters
Pigsty provides four scenario-based parameter templates by default, which can be specified and used through the pg_conf parameter.
- tiny.yml: Optimized for small nodes, VMs, and small demos (1-8 cores, 1-16GB)
- oltp.yml: Optimized for OLTP workloads and latency-sensitive applications (4C8GB+) (default template)
- olap.yml: Optimized for OLAP workloads and throughput (4C8G+)
- crit.yml: Optimized for data consistency and critical applications (4C8G+)
Pigsty adopts different parameter optimization strategies for these four default scenarios, as described in the sections below.
Memory Parameter Tuning
Pigsty automatically detects the system’s memory size and uses it as the basis for setting the maximum number of connections and memory-related parameters.
- pg_max_conn: PostgreSQL maximum connections; auto will use recommended values for different scenarios
- pg_shared_buffer_ratio: Shared buffer memory ratio, default is 0.25
By default, Pigsty uses 25% of memory as PostgreSQL shared buffers, with the remaining 75% as the operating system cache.
By default, if the user has not set a pg_max_conn maximum connections value, Pigsty will use defaults according to the following rules:
- oltp: 500 (pgbouncer) / 1000 (postgres)
- crit: 500 (pgbouncer) / 1000 (postgres)
- tiny: 300
- olap: 300
For OLTP and CRIT templates, if the service is not pointing to the pgbouncer connection pool but directly connects to the postgres database, the maximum connections will be doubled to 1000.
After determining the maximum connections, work_mem is calculated from shared memory size / maximum connections and limited to the range of 64MB ~ 1GB.
{% raw %}
{% if pg_max_conn != 'auto' and pg_max_conn|int >= 20 %}{% set pg_max_connections = pg_max_conn|int %}{% else %}{% if pg_default_service_dest|default('postgres') == 'pgbouncer' %}{% set pg_max_connections = 500 %}{% else %}{% set pg_max_connections = 1000 %}{% endif %}{% endif %}
{% set pg_max_prepared_transactions = pg_max_connections if 'citus' in pg_libs else 0 %}
{% set pg_max_locks_per_transaction = (2 * pg_max_connections)|int if 'citus' in pg_libs or 'timescaledb' in pg_libs else pg_max_connections %}
{% set pg_shared_buffers = (node_mem_mb|int * pg_shared_buffer_ratio|float) | round(0, 'ceil') | int %}
{% set pg_maintenance_mem = (pg_shared_buffers|int * 0.25)|round(0, 'ceil')|int %}
{% set pg_effective_cache_size = node_mem_mb|int - pg_shared_buffers|int %}
{% set pg_workmem = ([ ([ (pg_shared_buffers / pg_max_connections)|round(0,'floor')|int , 64 ])|max|int , 1024])|min|int %}
{% endraw %}
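As a worked illustration of these formulas, consider a hypothetical 64GB node using the default OLTP template with services pointing at pgbouncer and pg_shared_buffer_ratio = 0.25:
# node_mem_mb = 65536, pg_max_connections = 500 (pgbouncer as service destination)
# pg_shared_buffers       = ceil(65536 * 0.25)                   = 16384MB (16GB)
# pg_maintenance_mem      = ceil(16384 * 0.25)                   = 4096MB  (4GB)
# pg_effective_cache_size = 65536 - 16384                        = 49152MB (48GB)
# pg_workmem              = min(max(floor(16384 / 500), 64), 1024) = 64MB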
CPU Parameter Tuning
In PostgreSQL, there are 4 important parameters related to parallel queries. Pigsty automatically optimizes parameters based on the current system’s CPU cores.
In all strategies, the total number of parallel processes (total budget) is usually set to CPU cores + 8, with a minimum of 16, to reserve enough background workers for logical replication and extensions. The OLAP and TINY templates vary slightly based on scenarios.
| OLTP | Setting Logic | Range Limits |
|---|---|---|
| max_worker_processes | max(100% CPU + 8, 16) | CPU cores + 8, minimum 16 |
| max_parallel_workers | max(ceil(50% CPU), 2) | 1/2 CPU rounded up, minimum 2 |
| max_parallel_maintenance_workers | max(ceil(33% CPU), 2) | 1/3 CPU rounded up, minimum 2 |
| max_parallel_workers_per_gather | min(max(ceil(20% CPU), 2), 8) | 1/5 CPU rounded up, minimum 2, maximum 8 |

| OLAP | Setting Logic | Range Limits |
|---|---|---|
| max_worker_processes | max(100% CPU + 12, 20) | CPU cores + 12, minimum 20 |
| max_parallel_workers | max(ceil(80% CPU), 2) | 4/5 CPU rounded up, minimum 2 |
| max_parallel_maintenance_workers | max(ceil(33% CPU), 2) | 1/3 CPU rounded up, minimum 2 |
| max_parallel_workers_per_gather | max(floor(50% CPU), 2) | 1/2 CPU rounded down, minimum 2 |

| CRIT | Setting Logic | Range Limits |
|---|---|---|
| max_worker_processes | max(100% CPU + 8, 16) | CPU cores + 8, minimum 16 |
| max_parallel_workers | max(ceil(50% CPU), 2) | 1/2 CPU rounded up, minimum 2 |
| max_parallel_maintenance_workers | max(ceil(33% CPU), 2) | 1/3 CPU rounded up, minimum 2 |
| max_parallel_workers_per_gather | 0, enable as needed | |

| TINY | Setting Logic | Range Limits |
|---|---|---|
| max_worker_processes | max(100% CPU + 4, 12) | CPU cores + 4, minimum 12 |
| max_parallel_workers | max(ceil(50% CPU), 1) | 1/2 CPU rounded up, minimum 1 |
| max_parallel_maintenance_workers | max(ceil(33% CPU), 1) | 1/3 CPU rounded up, minimum 1 |
| max_parallel_workers_per_gather | 0, enable as needed | |
Note that the CRIT and TINY templates disable parallel queries by setting max_parallel_workers_per_gather = 0.
Users can enable parallel queries as needed by setting this parameter.
Both OLTP and CRIT templates additionally set the following parameters, doubling the parallel query cost to reduce the tendency to use parallel queries.
parallel_setup_cost: 2000 # double from 1000 to increase parallel cost
parallel_tuple_cost: 0.2 # double from 0.1 to increase parallel cost
min_parallel_table_scan_size: 16MB # double from 8MB to increase parallel cost
min_parallel_index_scan_size: 1024 # double from 512 to increase parallel cost
Note that adjustments to the max_worker_processes parameter only take effect after a restart. Additionally, if a replica's configured value for this parameter is lower than the primary's, the replica will fail to start.
This parameter must be adjusted through Patroni configuration management, which ensures consistent primary-replica configuration and prevents new replicas from failing to start during failover.
Storage Space Parameters
Pigsty automatically detects the total space of the disk where the /data/postgres main data directory is located and uses it as the basis for specifying the following parameters:
{% raw %}
min_wal_size: {{ ([pg_size_twentieth, 200])|min }}GB # 1/20 disk size, max 200GB
max_wal_size: {{ ([pg_size_twentieth * 4, 2000])|min }}GB # 2/10 disk size, max 2000GB
max_slot_wal_keep_size: {{ ([pg_size_twentieth * 6, 3000])|min }}GB # 3/10 disk size, max 3000GB
temp_file_limit: {{ ([pg_size_twentieth, 200])|min }}GB # 1/20 of disk size, max 200GB
{% endraw %}
- temp_file_limit defaults to 5% of disk space, capped at 200GB.
- min_wal_size defaults to 5% of disk space, capped at 200GB.
- max_wal_size defaults to 20% of disk space, capped at 2TB.
- max_slot_wal_keep_size defaults to 30% of disk space, capped at 3TB.
As a special case, the OLAP template allows 20% for temp_file_limit, capped at 2TB.
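As a concrete illustration, on a hypothetical 2000GB (2TB) data disk, pg_size_twentieth = 100GB, so the defaults work out to:
# min_wal_size           = min(100, 200)      = 100GB
# max_wal_size           = min(100 * 4, 2000) = 400GB
# max_slot_wal_keep_size = min(100 * 6, 3000) = 600GB
# temp_file_limit        = min(100, 200)      = 100GB   (OLAP template: 20% of disk, capped at 2TB)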
Manual Parameter Tuning
In addition to using Pigsty’s automatically configured parameters, you can also manually tune PostgreSQL parameters.
Use the pg edit-config <cluster> command to interactively edit the cluster configuration, or use the -p option to set parameters directly:
pg edit-config -p log_min_duration_statement=1000 pg-meta
pg edit-config --force -p shared_preload_libraries='timescaledb, pg_cron, pg_stat_statements, auto_explain' pg-meta
You can also use the Patroni REST API to modify configuration:
curl -u 'postgres:Patroni.API' \
-d '{"postgresql":{"parameters": {"log_min_duration_statement":200}}}' \
-s -X PATCH http://10.10.10.10:8008/config | jq .
8 - Accidental Deletion
Handling accidental data deletion, table deletion, and database deletion
Accidental Data Deletion
If it’s a small-scale DELETE misoperation, you can consider using the pg_surgery or pg_dirtyread extension for in-place surgical recovery.
-- Immediately disable Auto Vacuum on this table and abort Auto Vacuum worker processes for this table
ALTER TABLE public.some_table SET (autovacuum_enabled = off, toast.autovacuum_enabled = off);
CREATE EXTENSION pg_dirtyread;
SELECT * FROM pg_dirtyread('tablename') AS t(col1 type1, col2 type2, ...);
If the deleted data has already been reclaimed by VACUUM, then use the general accidental deletion recovery process.
Accidental Object Deletion
When DROP/DELETE type misoperations occur, typically decide on a recovery plan according to the following process:
- Confirm whether this data can be recovered from the business system or other data systems. If yes, recover directly from the business side.
- Confirm whether there is a delayed replica. If yes, advance the delayed replica to the time point before deletion and query the data for recovery.
- If the data has been confirmed deleted, confirm backup information and whether the backup range covers the deletion time point. If it does, start PITR.
- Confirm whether to perform in-place cluster PITR rollback, or start a new server for replay, or use a replica for replay, and execute the recovery strategy.
Accidental Cluster Deletion
If an entire database cluster is accidentally deleted through Pigsty management commands (for example, by incorrectly executing the pgsql-rm.yml playbook or the bin/pgsql-rm command), note that unless you have set the pg_rm_backup parameter to false, the backup will be deleted along with the database cluster.
Warning: In this situation, your data will be unrecoverable! Please think three times before proceeding!
Recommendation: For production environments, you can globally configure this parameter to false in the configuration manifest to preserve backups when removing clusters.
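A minimal sketch of that global setting in the inventory (placement under all.vars is illustrative):
all:
  vars:
    pg_rm_backup: false   # keep backups when removing clusters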
9 - Clone Replicas
How to clone databases, database instances, and database clusters?
PostgreSQL can already replicate data through physical replicas and logical replicas, but sometimes you may need to quickly clone a database, database instance, or entire database cluster. The cloned database can be written to, evolve independently, and not affect the original database. In Pigsty, there are several cloning methods:
- Clone Database: Clone a new database within the same cluster
- Clone Instance: Clone a new instance on the same PG node
- Clone Cluster: Create a new database cluster using PITR mechanism and restore to any point in time of the specified cluster
Clone Database
You can copy a PostgreSQL database through the template mechanism, but no active connections to the template database are allowed during this period.
If you want to clone the postgres database, execute the following two statements together, and make sure all connections to the postgres database are terminated before the clone runs:
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'postgres';
CREATE DATABASE pgcopy TEMPLATE postgres STRATEGY FILE_COPY;
Instant Clone
If you are using PostgreSQL 18 or higher, Pigsty sets file_copy_method = clone by default. This parameter allows you to clone a database in O(1) time (~200ms) without copying data files.
However, you must explicitly use the FILE_COPY strategy to create the database. Since the STRATEGY parameter of CREATE DATABASE was introduced in PostgreSQL 15, the default value has been WAL_LOG. You need to explicitly specify FILE_COPY for instant cloning.
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'postgres';
CREATE DATABASE pgcopy TEMPLATE postgres STRATEGY FILE_COPY;
For example, cloning a 30 GB database: normal clone (WAL_LOG) takes 18 seconds, while instant clone (FILE_COPY) only needs constant time of 200 milliseconds.
Since Pigsty v4.0, you can use strategy: FILE_COPY in the pg_databases parameter to achieve instant database cloning.
pg-meta:
hosts:
10.10.10.10: { pg_seq: 1, pg_role: primary }
vars:
pg_cluster: pg-meta
pg_version: 18
pg_databases:
- name: meta
- name: meta_dev
template: meta
strategy: FILE_COPY # <---- Introduced in PG 15, instant in PG18
After configuration, use the standard database creation SOP to create the database:
bin/pgsql-db pg-meta meta_dev
Limitations and Notes
This feature is only available on supported file systems (xfs, btrfs, zfs, apfs). If the file system doesn’t support it, PostgreSQL will fail with an error.
Mainstream OS distributions create xfs filesystems with reflink=1 enabled by default, so this is usually not a concern.
If your PostgreSQL version is below 15, specifying strategy will have no effect.
Please don’t use the postgres database as a template database for cloning, as management connections typically connect to the postgres database, which prevents the cloning operation.
Use instant cloning with caution in extremely high concurrency/throughput production environments, as it requires clearing all connections to the template database within the cloning window (200ms), otherwise the clone will fail.
10 - Maintenance
Common system maintenance tasks
To ensure Pigsty and PostgreSQL clusters run healthily and stably, some routine maintenance work is required.
Regular Monitoring Review
Pigsty provides an out-of-the-box monitoring platform. We recommend you browse the monitoring dashboards once a day to keep track of system status.
At a minimum, we recommend you review the monitoring at least once a week, paying attention to alert events that occur, which can help you avoid most failures and issues in advance.
Here is a list of pre-defined alert rules in Pigsty.
Failover Follow-up
Pigsty’s high availability architecture allows PostgreSQL clusters to automatically perform primary-replica switchovers, meaning operations and DBAs don’t need to intervene or respond immediately.
However, users still need to perform the following follow-up work at an appropriate time (e.g., the next business day), including:
- Investigate and confirm the cause of the failure to prevent recurrence
- Restore the cluster’s original primary-replica topology as appropriate, or modify the configuration manifest to match the new primary-replica status.
- Refresh load balancer configuration through bin/pgsql-svc to update service routing status
- Refresh the cluster's HBA rules through bin/pgsql-hba to avoid primary/replica-specific rule drift
- If necessary, use bin/pgsql-rm to remove the failed server and expand with a new replica through bin/pgsql-add
Table Bloat Management
Long-running PostgreSQL will experience “table bloat” / “index bloat” phenomena, leading to system performance degradation.
Regularly using pg_repack to perform online rebuilding of tables and indexes helps maintain PostgreSQL’s good performance.
Pigsty has already installed and enabled this extension by default in all databases, so you can use it directly.
You can use Pigsty's PGCAT Database - Table Bloat panel to confirm table and index bloat in the database. Select tables and indexes with high bloat rates (larger tables with bloat rates above 50%) and use pg_repack for online reorganization:
pg_repack dbname -t schema.table
Reorganization does not affect normal read and write operations, but the switching moment after reorganization completes requires an AccessExclusive lock on the table, blocking all access.
Therefore, for high-throughput businesses, it’s recommended to perform this during off-peak periods or maintenance windows. For more details, please refer to: Managing Relation Bloat
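For instance, a hedged sketch of typical invocations (database and table names are hypothetical; check pg_repack --help for the options available in your installed version):
pg_repack -d mydb -t public.orders        # online rebuild of one bloated table
pg_repack -d mydb -t public.orders -j 4   # use 4 parallel jobs to speed up index rebuilds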
VACUUM FREEZE
Freezing expired transaction IDs (VACUUM FREEZE) is an important PostgreSQL maintenance task used to prevent transaction ID (XID) exhaustion leading to downtime.
Although PostgreSQL already provides an automatic vacuum (AutoVacuum) mechanism, for high-standard production environments,
we still recommend combining both automatic and manual approaches, regularly executing database-wide VACUUM FREEZE to ensure XID safety.
You can manually execute VACUUM FREEZE on a database using the following commands:
-- Execute VACUUM FREEZE on the entire database
VACUUM FREEZE;
-- Execute VACUUM FREEZE on a specific table
VACUUM FREEZE schema.table_name;
Or set up a scheduled task through crontab, for example, execute every Sunday morning:
# Execute VACUUM FREEZE on database dbname every Sunday at 3 AM
0 3 * * 0 postgres psql -c 'VACUUM FREEZE;' dbname
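To decide when a manual freeze is warranted, you can check per-database transaction ID age with a standard catalog query; values approaching autovacuum_freeze_max_age (default 200 million) deserve attention:
-- age of the oldest unfrozen XID per database
SELECT datname, age(datfrozenxid) AS xid_age FROM pg_database ORDER BY xid_age DESC;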
11 - Version Upgrade
How to upgrade (or downgrade) PostgreSQL minor version kernel, and how to perform major version upgrades
Minor Version Upgrade
To perform a minor version upgrade/downgrade, first add the target PG minor version RPM/DEB packages to your local software repository.
First perform a rolling upgrade/downgrade on all replicas, then execute a cluster switchover to upgrade/downgrade the primary.
ansible <cls> -b -a "yum upgrade/downgrade -y <pkg>" # Upgrade/downgrade packages
pg restart --force <cls> # Restart cluster
This time we’ll perform a rolling upgrade:
ansible pg-test -b -a "yum upgrade -y postgresql15*" # Upgrade packages (or apt upgrade)
ansible pg-test -b -a '/usr/pgsql/bin/pg_ctl --version' # Check binary version is 15.2
pg restart --role replica --force pg-test # Restart replicas
pg switchover --leader pg-test-1 --candidate=pg-test-2 --scheduled=now --force pg-test # Switchover primary and replica
pg restart --role primary --force pg-test # Restart primary
Minor Version Downgrade
Add 15.1 packages to the software repository and refresh the node’s yum/apt cache:
cd ~/pigsty; ./infra.yml -t repo_upstream # Add upstream repository
cd /www/pigsty; repotrack postgresql15-*-15.1 # Add 15.1 packages to yum repository
cd ~/pigsty; ./infra.yml -t repo_create # Rebuild repository metadata
ansible pg-test -b -a 'yum clean all' # Clean node repository cache
ansible pg-test -b -a 'yum makecache' # Regenerate yum cache from new repository
# For Ubuntu/Debian users, use apt instead of yum
ansible pg-test -b -a 'apt clean' # Clean node repository cache
ansible pg-test -b -a 'apt update' # Regenerate apt cache from new repository
Execute downgrade and restart cluster:
ansible pg-test -b -a "yum downgrade -y postgresql15*" # Downgrade packages
pg restart --force pg-test # Restart entire cluster to complete upgrade
Major Version Upgrade
The simplest way to perform a major version upgrade is to create a new cluster using the new version, then perform online migration through logical replication and blue-green deployment.
You can also perform an in-place major version upgrade. When you only use the database kernel itself, this is not complicated; use PostgreSQL’s built-in pg_upgrade:
Suppose you want to upgrade the PostgreSQL major version from 14 to 15. First add the packages to the repository and ensure that core extensions are installed with the same version numbers on both major versions.
./pgsql.yml -t pg_pkg -e pg_version=15 # Install pg 15 packages
sudo su - postgres; mkdir -p /data/postgres/pg-meta-15/data/ # Prepare directory for 15
pg_upgrade -b /usr/pgsql-14/bin/ -B /usr/pgsql-15/bin/ -d /data/postgres/pg-meta-14/data/ -D /data/postgres/pg-meta-15/data/ -v -c # Pre-check
pg_upgrade -b /usr/pgsql-14/bin/ -B /usr/pgsql-15/bin/ -d /data/postgres/pg-meta-14/data/ -D /data/postgres/pg-meta-15/data/ --link -j8 -v # Run the actual upgrade (without -c) once the pre-check passes
rm -rf /usr/pgsql; ln -s /usr/pgsql-15 /usr/pgsql; # Fix binary link
mv /data/postgres/pg-meta-14 /data/postgres/pg-meta-15 # Rename data directory
rm -rf /pg; ln -s /data/postgres/pg-meta-15 /pg # Fix data directory link
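After pg_upgrade completes, a few follow-up steps are usually still needed; the commands below are standard PostgreSQL tooling, sketched under the assumption that the paths from the example above are in effect:
# restart PostgreSQL through your normal service manager (Patroni / systemd), then:
vacuumdb --all --analyze-in-stages   # regenerate planner statistics, as recommended by pg_upgrade
./delete_old_cluster.sh              # script generated by pg_upgrade; remove old cluster files only after verification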