Module: PILOT
1 - Module: MySQL
MySQL used to be the “most popular open-source relational database in the world”.
Installation | Configuration | Administration | Playbook | Monitoring | Parameters
Overview
MySQL module is currently available in Pigsty Pro as a Beta Preview. Note that you should NOT use this MySQL deployment for production environments.
Installation
You can install MySQL 8.0 directly from the official software repository on nodes managed by Pigsty, on both EL and Debian / Ubuntu systems.
# el 7,8,9
./node.yml -t node_install -e '{"node_repo_modules":"node,mysql","node_packages":["mysql-community-server,mysql-community-client"]}'
# debian / ubuntu
./node.yml -t node_install -e '{"node_repo_modules":"node,mysql","node_packages":["mysql-server"]}'
You can also add the MySQL package to the local repo and use the playbook mysql.yml for production deployment.
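The install commands above all pass the same two variables to node.yml as a JSON string. A minimal sketch of a helper that builds such a command line is shown here; the function name node_install_command is hypothetical and not part of Pigsty, and it only assembles the string, it does not run the playbook.

```python
import json
import shlex


def node_install_command(repo_modules, packages):
    """Build a ./node.yml command line that installs packages from the given repo modules."""
    extra_vars = {
        "node_repo_modules": ",".join(repo_modules),  # comma-separated module list
        "node_packages": [",".join(packages)],        # one comma-separated entry, as in the commands above
    }
    # compact separators keep the JSON free of spaces; shlex.quote wraps it in single quotes
    payload = json.dumps(extra_vars, separators=(",", ":"))
    return f"./node.yml -t node_install -e {shlex.quote(payload)}"


if __name__ == "__main__":
    print(node_install_command(["node", "mysql"], ["mysql-community-server", "mysql-community-client"]))
    print(node_install_command(["node", "mysql"], ["mysql-server"]))
```

Generating the JSON with json.dumps avoids hand-written quoting mistakes when the package list grows.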
Configuration
This config snippet defines a single-node MySQL instance, along with its Databases and Users.
my-test:
  hosts: { 10.10.10.10: { mysql_seq: 1, mysql_role: primary } }
  vars:
    mysql_cluster: my-test
    mysql_databases:
      - { name: meta }
    mysql_users:
      - { name: dbuser_meta ,host: '%' ,password: 'dbuesr_meta' ,priv: { "*.*": "SELECT, UPDATE, DELETE, INSERT" } }
      - { name: dbuser_dba ,host: '%' ,password: 'DBUser.DBA' ,priv: { "*.*": "ALL PRIVILEGES" } }
      - { name: dbuser_monitor ,host: '%' ,password: 'DBUser.Monitor' ,priv: { "*.*": "SELECT, PROCESS, REPLICATION CLIENT" } ,connlimit: 3 }
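To make the user definitions above more concrete, here is a rough sketch of the MySQL statements one entry could translate into. This is an assumption for illustration, not the playbook's actual implementation: the function name user_statements is hypothetical, and mapping connlimit to MAX_USER_CONNECTIONS is a guess based on the field name.

```python
def user_statements(user):
    """Render CREATE USER / GRANT statements for one mysql_users entry.

    Illustration only: values are interpolated without escaping,
    so do not feed untrusted input through this function.
    """
    account = f"'{user['name']}'@'{user.get('host', '%')}'"
    create = f"CREATE USER IF NOT EXISTS {account} IDENTIFIED BY '{user['password']}'"
    if "connlimit" in user:
        # assumption: connlimit corresponds to MySQL's per-account connection limit
        create += f" WITH MAX_USER_CONNECTIONS {user['connlimit']}"
    statements = [create + ";"]
    # priv maps a scope such as "*.*" to a comma-separated privilege list
    for scope, privileges in user.get("priv", {}).items():
        statements.append(f"GRANT {privileges} ON {scope} TO {account};")
    return statements


if __name__ == "__main__":
    monitor = {
        "name": "dbuser_monitor", "host": "%", "password": "DBUser.Monitor",
        "priv": {"*.*": "SELECT, PROCESS, REPLICATION CLIENT"}, "connlimit": 3,
    }
    for statement in user_statements(monitor):
        print(statement)
```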
Administration
Here are some basic MySQL cluster management operations:
Create MySQL cluster with mysql.yml:
./mysql.yml -l my-test
Playbook
Pigsty has the following playbooks related to the MYSQL module:
mysql.yml: Deploy MySQL according to the inventory
mysql.yml
The playbook mysql.yml contains the following subtasks:
mysql-id : generate mysql instance identity
mysql_clean : remove existing mysql instance (DANGEROUS)
mysql_dbsu : create os user mysql
mysql_install : install mysql rpm/deb packages
mysql_dir : create mysql data & conf dir
mysql_config : generate mysql config file
mysql_boot : bootstrap mysql cluster
mysql_launch : launch mysql service
mysql_pass : write mysql password
mysql_db : create mysql biz database
mysql_user : create mysql biz user
mysql_exporter : launch mysql exporter
mysql_register : register mysql service to prometheus
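If the subtasks listed above are exposed as Ansible tags (an assumption, by analogy with ./node.yml -t node_install), a single step could be re-run on its own. The sketch below only builds the command string and rejects names that are not in the list; mysql_playbook_command is a hypothetical helper name.

```python
MYSQL_SUBTASKS = [
    "mysql-id", "mysql_clean", "mysql_dbsu", "mysql_install", "mysql_dir",
    "mysql_config", "mysql_boot", "mysql_launch", "mysql_pass", "mysql_db",
    "mysql_user", "mysql_exporter", "mysql_register",
]


def mysql_playbook_command(cluster, tags=()):
    """Build a ./mysql.yml command limited to one cluster and, optionally, some subtasks."""
    unknown = [tag for tag in tags if tag not in MYSQL_SUBTASKS]
    if unknown:
        raise ValueError(f"unknown subtask(s): {', '.join(unknown)}")
    command = ["./mysql.yml", "-l", cluster]
    if tags:
        # assumption: subtasks are selected with -t, like node.yml -t node_install
        command += ["-t", ",".join(tags)]
    return " ".join(command)


if __name__ == "__main__":
    print(mysql_playbook_command("my-test"))
    print(mysql_playbook_command("my-test", ["mysql_db", "mysql_user"]))
```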
Monitoring
Pigsty has two built-in MYSQL dashboards:
MYSQL Overview: MySQL cluster overview
MYSQL Instance: MySQL instance overview
Parameters
MySQL’s available parameters:
#-----------------------------------------------------------------
# MYSQL_IDENTITY
#-----------------------------------------------------------------
# mysql_cluster: #CLUSTER # mysql cluster name, required identity parameter
# mysql_role: replica #INSTANCE # mysql role, required, could be primary,replica
# mysql_seq: 0 #INSTANCE # mysql instance seq number, required identity parameter
#-----------------------------------------------------------------
# MYSQL_BUSINESS
#-----------------------------------------------------------------
# mysql business object definition, overwrite in group vars
mysql_users: [] # mysql business users
mysql_databases: [] # mysql business databases
mysql_services: [] # mysql business services
# global credentials, overwrite in global vars
mysql_root_username: root
mysql_root_password: DBUser.Root
mysql_replication_username: replicator
mysql_replication_password: DBUser.Replicator
mysql_admin_username: dbuser_dba
mysql_admin_password: DBUser.DBA
mysql_monitor_username: dbuser_monitor
mysql_monitor_password: DBUser.Monitor
#-----------------------------------------------------------------
# MYSQL_INSTALL
#-----------------------------------------------------------------
# - install - #
mysql_dbsu: mysql # os dbsu name, mysql by default, better not change it
mysql_dbsu_uid: 27 # os dbsu uid and gid, 27 for default mysql users and groups
mysql_dbsu_home: /var/lib/mysql # mysql home directory, `/var/lib/mysql` by default
mysql_dbsu_ssh_exchange: true # exchange mysql dbsu ssh key among same mysql cluster
mysql_packages: # mysql packages to be installed, `mysql-community*` by default
- mysql-community*
- mysqld_exporter
# - bootstrap - #
mysql_data: /data/mysql # mysql data directory, `/data/mysql` by default
mysql_listen: '0.0.0.0' # mysql listen addresses, comma separated IP list
mysql_port: 3306 # mysql listen port, 3306 by default
mysql_sock: /var/lib/mysql/mysql.sock # mysql socket file, `/var/lib/mysql/mysql.sock` by default
mysql_pid: /var/run/mysqld/mysqld.pid # mysql pid file, `/var/run/mysqld/mysqld.pid` by default
mysql_conf: /etc/my.cnf # mysql config file, `/etc/my.cnf` by default
mysql_log_dir: /var/log # mysql log dir, `/var/log/mysql` by default
mysql_exporter_port: 9104 # mysqld_exporter listen port, 9104 by default
mysql_parameters: {} # extra parameters for mysqld
mysql_default_parameters: # default parameters for mysqld
2 - Module: Kafka
Kafka is an open-source distributed event streaming platform: Installation | Configuration | Administration | Playbook | Monitoring | Parameters | Resources
Overview
Kafka module is currently available in Pigsty Pro as a Beta Preview.
Installation
If you are using the open-source version of Pigsty, you can install Kafka and its Java dependencies on the specified node using the following command.
Pigsty provides Kafka 3.8.0 RPM and DEB packages in the official Infra repository, which can be downloaded and installed directly.
./node.yml -t node_install -e '{"node_repo_modules":"infra","node_packages":["kafka"]}'
Kafka requires a Java runtime environment, so you need to install an available JDK when installing Kafka (OpenJDK 17 is used by default, but other JDKs and versions, such as 8 and 11, can also be used).
# EL7 (no JDK 17 support)
./node.yml -t node_install -e '{"node_repo_modules":"node","node_packages":["java-11-openjdk-headless"]}'
# EL8 / EL9 (use OpenJDK 17)
./node.yml -t node_install -e '{"node_repo_modules":"node","node_packages":["java-17-openjdk-headless"]}'
# Debian / Ubuntu (use OpenJDK 17)
./node.yml -t node_install -e '{"node_repo_modules":"node","node_packages":["openjdk-17-jdk"]}'
Configuration
A single-node Kafka configuration example. Note that in Pigsty's single-node deployment mode, port 9093 on the admin node is already occupied by AlertManager.
When installing Kafka on the admin node, use another peer port, such as 9095.
kf-main:
  hosts:
    10.10.10.10: { kafka_seq: 1, kafka_role: controller }
  vars:
    kafka_cluster: kf-main
    kafka_data: /data/kafka
    kafka_peer_port: 9095 # 9093 is already held by alertmanager
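Before picking a peer port, it can help to confirm that the candidate is actually free on the target node. A minimal sketch using only the Python standard library is shown here; port_is_free is a hypothetical helper, and it reports the state at the moment of the check, nothing more.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # typically "address already in use": another process holds the port
            return False
    return True


if __name__ == "__main__":
    for candidate in (9093, 9095):
        state = "free" if port_is_free(candidate) else "in use"
        print(f"port {candidate}: {state}")
```

Run it on the admin node itself, since the result only describes the machine where the script executes.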
3-node KRaft mode Kafka cluster configuration example:
kf-test:
  hosts:
    10.10.10.11: { kafka_seq: 1, kafka_role: controller }
    10.10.10.12: { kafka_seq: 2, kafka_role: controller }
    10.10.10.13: { kafka_seq: 3, kafka_role: controller }
  vars:
    kafka_cluster: kf-test
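In KRaft mode, every node needs to know the controller quorum, which Kafka expresses in the controller.quorum.voters property as a list of id@host:port entries. The sketch below derives that value from the host definitions above, assuming that kafka_seq is used as the node id and that controllers listen on kafka_peer_port. Both are assumptions about how the playbook renders its config, and quorum_voters is a hypothetical helper.

```python
def quorum_voters(hosts, peer_port=9093):
    """Render a controller.quorum.voters value from {ip: {"kafka_seq": n, "kafka_role": r}}."""
    entries = sorted(
        (spec["kafka_seq"], ip)
        for ip, spec in hosts.items()
        if spec.get("kafka_role", "controller") != "broker"  # pure brokers do not vote
    )
    return ",".join(f"{seq}@{ip}:{peer_port}" for seq, ip in entries)


kf_test = {
    "10.10.10.11": {"kafka_seq": 1, "kafka_role": "controller"},
    "10.10.10.12": {"kafka_seq": 2, "kafka_role": "controller"},
    "10.10.10.13": {"kafka_seq": 3, "kafka_role": "controller"},
}

if __name__ == "__main__":
    print(quorum_voters(kf_test))
    # 1@10.10.10.11:9093,2@10.10.10.12:9093,3@10.10.10.13:9093
```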
Administration
Here are some basic Kafka cluster management operations:
Create Kafka clusters with kafka.yml playbook:
./kafka.yml -l kf-main
./kafka.yml -l kf-test
Create a topic named test:
kafka-topics.sh --create --topic test --partitions 1 --replication-factor 1 --bootstrap-server localhost:9092
Here --replication-factor 1 means each partition is stored as a single copy with no additional replicas, and --partitions 1 means the topic has only one partition.
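The relationship between the two options is simple arithmetic: a topic needs partitions × replication factor partition copies in total, and the replication factor cannot exceed the number of brokers because each copy of a partition lives on a different broker. A small sketch follows; topic_layout is a hypothetical helper, not a Kafka API.

```python
def topic_layout(partitions, replication_factor, brokers):
    """Summarize how many partition copies a topic needs across the cluster."""
    if partitions < 1 or replication_factor < 1:
        raise ValueError("partitions and replication factor must be at least 1")
    if replication_factor > brokers:
        # Kafka places each copy of a partition on a different broker
        raise ValueError("replication factor cannot exceed the number of brokers")
    return {
        "copies_per_partition": replication_factor,
        "extra_replicas_per_partition": replication_factor - 1,
        "total_partition_copies": partitions * replication_factor,
    }


if __name__ == "__main__":
    print(topic_layout(partitions=1, replication_factor=1, brokers=1))
    print(topic_layout(partitions=3, replication_factor=3, brokers=3))
```

For the test topic above, the single-node cluster holds exactly one copy of one partition, so losing that node loses the data.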
Use the following command to view the list of Topics in Kafka:
kafka-topics.sh --bootstrap-server localhost:9092 --list
Use the built-in Kafka producer to send messages to the test Topic:
kafka-console-producer.sh --topic test --bootstrap-server localhost:9092
>haha
>xixi
>hoho
>hello
>world
> ^D
Use the built-in Kafka consumer to read messages from the test Topic:
kafka-console-consumer.sh --topic test --from-beginning --bootstrap-server localhost:9092
Playbook
Pigsty provides 1 playbook related to the Kafka module for managing Kafka clusters.
kafka.yml
The kafka.yml playbook for deploying Kafka KRaft mode cluster contains the following subtasks:
kafka-id : generate kafka instance identity
kafka_clean : remove existing kafka instance (DANGEROUS)
kafka_user : create os user kafka
kafka_pkg : install kafka rpm/deb packages
kafka_link : create symlink to /usr/kafka
kafka_path : add kafka bin path to /etc/profile.d
kafka_svc : install kafka systemd service
kafka_dir : create kafka data & conf dir
kafka_config : generate kafka config file
kafka_boot : bootstrap kafka cluster
kafka_launch : launch kafka service
kafka_exporter : launch kafka exporter
kafka_register : register kafka service to prometheus
Monitoring
Pigsty provides two monitoring dashboards for the KAFKA module:
KAFKA Overview shows the overall monitoring metrics of the Kafka cluster.
KAFKA Instance shows the monitoring metrics details of a single Kafka instance.
Parameters
Available parameters for Kafka module:
#kafka_cluster: #CLUSTER # kafka cluster name, required identity parameter
#kafka_role: controller #INSTANCE # kafka role, controller, broker, or controller-only
#kafka_seq: 0 #INSTANCE # kafka instance seq number, required identity parameter
kafka_clean: false # cleanup kafka during init? false by default
kafka_data: /data/kafka # kafka data directory, `/data/kafka` by default
kafka_version: 3.8.0 # kafka version string
scala_version: 2.13 # kafka binary scala version
kafka_port: 9092 # kafka broker listen port
kafka_peer_port: 9093 # kafka broker peer listen port, 9093 by default (conflict with alertmanager)
kafka_exporter_port: 9308 # kafka exporter listen port, 9308 by default
kafka_parameters: # kafka parameters to be added to server.properties
  num.network.threads: 3
  num.io.threads: 8
  socket.send.buffer.bytes: 102400
  socket.receive.buffer.bytes: 102400
  socket.request.max.bytes: 104857600
  num.partitions: 1
  num.recovery.threads.per.data.dir: 1
  offsets.topic.replication.factor: 1
  transaction.state.log.replication.factor: 1
  transaction.state.log.min.isr: 1
  log.retention.hours: 168
  log.segment.bytes: 1073741824
  log.retention.check.interval.ms: 300000
  #log.retention.bytes: 1073741824
  #log.flush.interval.ms: 1000
  #log.flush.interval.messages: 10000
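Since kafka_parameters is described as being added to server.properties, each entry presumably ends up as one key=value line. The sketch below shows that mapping under this assumption; render_properties is a hypothetical helper and does not reflect the playbook's actual template.

```python
def render_properties(parameters):
    """Render a dict as Java-properties style key=value lines, in insertion order."""
    return "\n".join(f"{key}={value}" for key, value in parameters.items())


if __name__ == "__main__":
    overrides = {
        "num.network.threads": 3,
        "num.io.threads": 8,
        "log.retention.hours": 168,
    }
    print(render_properties(overrides))
```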
Resources
Pigsty provides some Kafka-related extension plugins for PostgreSQL:
kafka_fdw: A useful FDW that allows users to read and write Kafka Topic data directly from PostgreSQL
wal2json: Used to logically decode WAL from PostgreSQL and generate JSON-formatted change data
wal2mongo: Used to logically decode WAL from PostgreSQL and generate BSON-formatted change data
decoder_raw: Used to logically decode WAL from PostgreSQL and generate SQL-formatted change data
test_decoding: Used to logically decode WAL from PostgreSQL and generate RAW-formatted change data
3 - Module: DuckDB
DuckDB is a fast in-process analytical database: Installation | Resources
Overview
DuckDB is an embedded database, so it does not require deployment or service management. You only need to install the DuckDB package on the node to use it.
Installation
Pigsty provides DuckDB packages (RPM / DEB) in the Infra software repository. You can install them with the following command:
./node.yml -t node_install -e '{"node_repo_modules":"infra","node_packages":["duckdb"]}'
Resources
There are some DuckDB-related extension plugins provided by Pigsty for PostgreSQL:
pg_analytics: Add OLAP capabilities to PostgreSQL based on DuckDB
pg_lakehouse: Data lakehouse plugin by ParadeDB, wrapping DuckDB. (Currently planned to be renamed back to pg_analytics)
duckdb_fdw: Foreign data wrapper for DuckDB, read/write DuckDB data files from PG
pg_duckdb: WIP extension plugin by DuckDB official MotherDuck and Hydra (only available on EL systems as a pilot)
4 - Module: TigerBeetle
TigerBeetle is a financial accounting transaction database offering extreme performance and reliability.
Overview
The TigerBeetle module is currently available for Beta preview only in the Pigsty Professional Edition.
Installation
The Pigsty Infra Repo has RPM / DEB packages for TigerBeetle. Use the following command to install them:
./node.yml -t node_install -e '{"node_repo_modules":"infra","node_packages":["tigerbeetle"]}'
After installation, please refer to the official documentation for configuration: https://github.com/tigerbeetle/tigerbeetle
Please note that TigerBeetle supports only Linux kernel version 5.5 or higher, making it incompatible by default with EL7 (3.10) and EL8 (4.18) systems.
To install TigerBeetle, please use EL9 (5.14), Ubuntu 22.04 (5.15), Debian 12 (6.1), Debian 11 (5.10), or another supported system.
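The kernel requirement above can be checked before installation. Here is a minimal sketch that parses a kernel release string and compares it with the 5.5 minimum; the helper names are hypothetical, and the release strings in the comments are illustrative examples of the distributions mentioned above.

```python
import platform
import re


def kernel_version(release):
    """Extract (major, minor) from a kernel release string such as '5.14.0-362.el9.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def supports_tigerbeetle(release, minimum=(5, 5)):
    """True if the kernel release meets the 5.5 minimum quoted above."""
    # tuple comparison handles (5, 14) >= (5, 5) correctly, unlike string comparison
    return kernel_version(release) >= minimum


if __name__ == "__main__":
    release = platform.release()  # kernel release of the current machine
    verdict = "ok" if supports_tigerbeetle(release) else "too old"
    print(f"{release}: {verdict}")
```

Comparing integer tuples avoids the classic mistake of comparing version strings, where "5.14" would sort before "5.5".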
5 - Module: Kubernetes
Kubernetes is a production-grade, open-source container orchestration platform. It helps you automate, deploy, scale, and manage containerized applications.
Pigsty has native support for ETCD clusters, which can be used by Kubernetes. Therefore, the pro version also provides the KUBE module for deploying production-grade Kubernetes clusters.
The KUBE module is currently in Beta status and only available for Pro edition customers.
However, you can directly specify node repositories in Pigsty, install Kubernetes packages, and use Pigsty to adjust environment configurations and provision nodes for K8S deployment, solving the last mile delivery problem.
SealOS
SealOS is a lightweight, high-performance, and easy-to-use Kubernetes distribution. It is designed to simplify the deployment and management of Kubernetes clusters.
Pigsty provides SealOS 5.0 RPM and DEB packages in the Infra repository. You can download and install them directly, then use SealOS to manage clusters.
./node.yml -t node_install -e '{"node_repo_modules":"infra","node_packages":["sealos"]}'
Kubernetes
If you prefer to deploy Kubernetes using the classic Kubeadm, please refer to the module reference below.
./node.yml -t node_install -e '{"node_repo_modules":"kube","node_packages":["kubeadm,kubelet,kubectl"]}'
Kubernetes supports multiple container runtimes. If you want to use Containerd as the container runtime, please make sure Containerd is installed on the node.
./node.yml -t node_install -e '{"node_repo_modules":"node,docker","node_packages":["containerd.io"]}'
If you want to use Docker as the container runtime, you need to install Docker and bridge with the cri-dockerd project (not available on EL9/D11/U20 yet):
./node.yml -t node_install -e '{"node_repo_modules":"node,infra,docker","node_packages":["docker-ce,docker-compose-plugin,cri-dockerd"]}'
Playbook
kube.yml playbook (TBD)
Monitoring
TBD
Parameters
Kubernetes module parameters:
#kube_cluster: #IDENTITY# # define kubernetes cluster name
kube_role: node # default kubernetes role (master|node)
kube_version: 1.31.0 # kubernetes version
kube_registry: registry.aliyuncs.com/google_containers # kubernetes image registry, aliyun k8s mirror repository
kube_pod_cidr: "10.11.0.0/16" # kubernetes pod network cidr
kube_service_cidr: "10.12.0.0/16" # kubernetes service network cidr
kube_dashboard_admin_user: dashboard-admin-sa # kubernetes dashboard admin user name
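When changing kube_pod_cidr or kube_service_cidr, the two ranges should not overlap each other or the node network. The sketch below checks this with the standard ipaddress module; overlapping_pairs is a hypothetical helper, and 10.10.10.0/24 is only assumed to be the node network because the examples in this document use 10.10.10.x addresses.

```python
import ipaddress
from itertools import combinations


def overlapping_pairs(cidrs):
    """Return every pair of CIDR strings whose address ranges overlap."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    return [(str(a), str(b)) for a, b in combinations(networks, 2) if a.overlaps(b)]


if __name__ == "__main__":
    pod_cidr = "10.11.0.0/16"       # kube_pod_cidr default
    service_cidr = "10.12.0.0/16"   # kube_service_cidr default
    node_cidr = "10.10.10.0/24"     # assumption: node network used in the examples
    conflicts = overlapping_pairs([pod_cidr, service_cidr, node_cidr])
    print("conflicts:", conflicts if conflicts else "none")
    print("pod addresses:", ipaddress.ip_network(pod_cidr).num_addresses)
```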
6 - Module: Consul
Consul is a distributed DCS + KV + DNS + service registry/discovery component.
In the old version (1.x) of Pigsty, Consul was used as the default high-availability DCS. Now this support has been removed, but it will be provided as a separate module in the future.
Configuration
To deploy Consul, you need to add the IP addresses and hostnames of all nodes to the consul group.
At least one node should be designated as the consul server with consul_role: server, while other nodes default to consul_role: node.
consul:
  hosts:
    10.10.10.10: { nodename: meta , consul_role: server }
    10.10.10.11: { nodename: node-1 }
    10.10.10.12: { nodename: node-2 }
    10.10.10.13: { nodename: node-3 }
For production deployments, we recommend using an odd number of Consul Servers, preferably three.
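The odd-number recommendation follows from majority quorum arithmetic: a cluster of n servers needs floor(n/2) + 1 of them to agree, so adding a fourth server to a three-server cluster raises the quorum without raising the number of failures it can survive. A short worked sketch (the function names are illustrative):

```python
def quorum(servers):
    """Smallest majority of a cluster with the given number of servers."""
    return servers // 2 + 1


def tolerated_failures(servers):
    """How many servers can fail while a majority is still available."""
    return servers - quorum(servers)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"{n} server(s): quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Three servers tolerate one failure, four still tolerate only one, and five tolerate two.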
Parameters
#-----------------------------------------------------------------
# CONSUL
#-----------------------------------------------------------------
consul_role: node # consul role, node or server, node by default
consul_dc: pigsty # consul data center name, `pigsty` by default
consul_data: /data/consul # consul data dir, `/data/consul`
consul_clean: true # consul purge flag, if true, clean consul during init
consul_ui: false # enable consul ui, the default value for consul server is true
7 - Module: Victoria
VictoriaMetrics is a drop-in replacement for Prometheus, offering better performance and compression ratio.
Overview
Victoria is currently only available in the Pigsty Professional Edition Beta preview. It includes the deployment and management of VictoriaMetrics and VictoriaLogs components.
Installation
The Pigsty Infra Repo has RPM / DEB packages for VictoriaMetrics. Use the following commands to install them:
./node.yml -t node_install -e '{"node_repo_modules":"infra","node_packages":["victoria-metrics"]}'
./node.yml -t node_install -e '{"node_repo_modules":"infra","node_packages":["victoria-metrics-cluster"]}'
./node.yml -t node_install -e '{"node_repo_modules":"infra","node_packages":["victoria-metrics-utils"]}'
./node.yml -t node_install -e '{"node_repo_modules":"infra","node_packages":["victoria-logs"]}'
For common users, installing the standalone version of VictoriaMetrics is sufficient.
If you need to deploy a cluster, you can install the victoria-metrics-cluster package.
8 - Module: Jupyter
To run Jupyter notebook with Docker, you have to:
- Change the default password in .env: JUPYTER_TOKEN
- Create the data dir with proper permission: make dir, owned by 1000:100
- Run make up to pull up Jupyter with docker compose
cd ~/pigsty/app/jupyter ; make dir up
Visit http://lab.pigsty or http://10.10.10.10:8888, the default password is pigsty
Prepare
Create a data directory /data/jupyter, with the default uid & gid 1000:100:
make dir # mkdir -p /data/jupyter; chown -R 1000:100 /data/jupyter
Connect to Postgres
Use the Jupyter terminal to install psycopg2-binary & psycopg2 package.
pip install psycopg2-binary psycopg2
# install with a mirror
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple psycopg2-binary psycopg2
pip install --upgrade pip
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
Or, when installing with conda, add the mirror channels first:
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/
Then use the driver in your notebook:
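import psycopg2

# the '[email protected]' part of the URL is masked on this page:
# substitute <password>@<host> for your environment
conn = psycopg2.connect('postgres://dbuser_dba:[email protected]:5432/meta')
cursor = conn.cursor()
cursor.execute('SELECT * FROM pg_stat_activity')
for i in cursor.fetchall():
    print(i)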
Alias
make up # pull up jupyter with docker compose
make dir # create required /data/jupyter and set owner
make run # launch jupyter with docker
make view # print jupyter access point
make log # tail -f jupyter logs
make info # introspect jupyter with jq
make stop # stop jupyter container
make clean # remove jupyter container
make pull # pull latest jupyter image
make rmi # remove jupyter image
make save # save jupyter image to /tmp/docker/jupyter.tgz
make load # load jupyter image from /tmp/docker/jupyter.tgz