Monitoring
How to perform self-monitoring of infrastructure in Pigsty?
This document describes monitoring dashboards and alert rules for the INFRA module in Pigsty.
Dashboards
Pigsty provides the following monitoring dashboards for the INFRA module:
| Dashboard | Description |
|---|---|
| Pigsty Home | Pigsty monitoring system homepage |
| INFRA Overview | Pigsty infrastructure self-monitoring overview |
| Nginx Instance | Nginx metrics and logs |
| Grafana Instance | Grafana metrics and logs |
| VictoriaMetrics Instance | VictoriaMetrics scraping/query status |
| VMAlert Instance | Alert rule execution status |
| Alertmanager Instance | Alert aggregation and notifications |
| VictoriaLogs Instance | Log ingestion, querying, and indexing |
| Logs Instance | View log information on a single node |
| VictoriaTraces Instance | Trace storage and querying |
| Inventory CMDB | CMDB visualization |
| ETCD Overview | etcd cluster monitoring |
Alert Rules
Pigsty provides the following two alert rules for the INFRA module:
| Alert Rule | Description |
|---|---|
| InfraDown | Infrastructure component is down |
| AgentDown | Monitoring agent is down |
You can modify or add new infrastructure alert rules in files/victoria/rules/infra.yml.
Alert Rule Configuration
################################################################
#                  Infrastructure Alert Rules                  #
################################################################
- name: infra-alert
  rules:

    #==============================================================#
    #                       Infra Aliveness                        #
    #==============================================================#
    # infra components (victoria,grafana) down for 1m triggers a P1 alert
    - alert: InfraDown
      expr: infra_up < 1
      for: 1m
      labels: { level: 0, severity: CRIT, category: infra }
      annotations:
        summary: "CRIT InfraDown {{ $labels.type }}@{{ $labels.instance }}"
        description: |
          infra_up[type={{ $labels.type }}, instance={{ $labels.instance }}] = {{ $value | printf "%.2f" }} < 1

    #==============================================================#
    #                       Agent Aliveness                        #
    #==============================================================#
    # agent aliveness is determined directly by exporter aliveness
    # including: node_exporter, pg_exporter, pgbouncer_exporter, haproxy_exporter
    - alert: AgentDown
      expr: agent_up < 1
      for: 1m
      labels: { level: 0, severity: CRIT, category: infra }
      annotations:
        summary: 'CRIT AgentDown {{ $labels.ins }}@{{ $labels.instance }}'
        description: |
          agent_up[ins={{ $labels.ins }}, instance={{ $labels.instance }}] = {{ $value | printf "%.2f" }} < 1
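Both rules above combine a threshold expression (`infra_up < 1`, `agent_up < 1`) with `for: 1m`, so a brief dip should not fire an alert on its own. The following Python sketch illustrates that behavior under the usual Prometheus-style semantics: the alert is pending while the expression holds and fires once it has held continuously for the `for` duration. The names `Sample` and `alert_states` are hypothetical, and the 15-second evaluation interval in the usage example is an assumption, not a value taken from Pigsty's configuration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Sample:
    ts: float     # evaluation timestamp in seconds
    value: float  # value of infra_up (or agent_up) at that evaluation


def alert_states(
    samples: Iterable[Sample],
    threshold: float = 1.0,      # mirrors `expr: infra_up < 1`
    hold_seconds: float = 60.0,  # mirrors `for: 1m`
) -> List[Tuple[float, str]]:
    """Return (timestamp, state) per evaluation: inactive, pending, or firing."""
    states: List[Tuple[float, str]] = []
    active_since = None  # timestamp at which the expression first became true
    for s in samples:
        if s.value < threshold:
            if active_since is None:
                active_since = s.ts
            # Fire only after the condition has held for the full duration.
            held = s.ts - active_since
            state = "firing" if held >= hold_seconds else "pending"
        else:
            # Any healthy evaluation resets the timer.
            active_since = None
            state = "inactive"
        states.append((s.ts, state))
    return states


# Usage: a component that is down from t=15s to t=90s (assumed 15s interval).
values = [1, 0, 0, 0, 0, 0, 0, 1]
history = [Sample(ts=i * 15.0, value=v) for i, v in enumerate(values)]
for ts, state in alert_states(history):
    print(f"t={ts:>5.0f}s  {state}")
```

In this sketch the alert stays pending for the first minute of downtime and only turns to firing at the evaluation where the condition has held for 60 seconds, which is why an outage shorter than one minute would not trigger `InfraDown` or `AgentDown`.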