pgBalancer Documentation

Metrics & Monitoring

Monitoring Options

✅ REST API - Real-time metrics via HTTP/JSON endpoints
✅ bctl CLI - Command-line statistics and status
✅ Prometheus - Time-series metrics collection
✅ Grafana - Visual dashboards and alerting

REST API Metrics

The pgBalancer REST API serves real-time metrics as JSON over HTTP, on port 8080 by default (the port is configurable):

Backend Node Statistics

curl -s http://localhost:8080/api/v1/backends | jq

Connection Pool Status

curl -s http://localhost:8080/api/v1/pool/summary | jq
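
For scripted polling, the same endpoints can be read from Python's standard library. This is a minimal sketch, not an official client: it assumes the backends endpoint returns a JSON array of records and that each record may carry a numeric `health_score` field. `fetch_json` and `summarize_backends` are hypothetical helper names.

```python
import json
from urllib.request import urlopen

# Assumption: same host and port as the curl examples above.
BASE_URL = "http://localhost:8080"


def fetch_json(path, base_url=BASE_URL, timeout=5.0):
    """GET a REST endpoint and decode the JSON body."""
    with urlopen(base_url + path, timeout=timeout) as resp:
        return json.load(resp)


def summarize_backends(backends):
    """Reduce a list of backend records to a few headline numbers.

    Assumes each record is a dict; records without a "health_score"
    field are counted but left out of the min/average.
    """
    scores = [b["health_score"] for b in backends if "health_score" in b]
    return {
        "backends": len(backends),
        "min_health": min(scores) if scores else None,
        "avg_health": sum(scores) / len(scores) if scores else None,
    }
```

A typical call would be `summarize_backends(fetch_json("/api/v1/backends"))`, run on a schedule or from a cron job.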

bctl CLI Monitoring

Use the bctl command-line tool for real-time monitoring and statistics:

Node Status

# View all backend nodes
bctl node-status

# Table format (human-readable)
bctl node-status --format=table

# JSON format (for scripts)
bctl node-status --format=json
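
The JSON format is the one to consume from scripts. A possible Python wrapper is sketched below. It relies only on the `--format=json` flag shown above; the shape of the decoded output is not documented here, so the sketch returns it untouched. `bctl_json` is a hypothetical helper name.

```python
import json
import subprocess


def bctl_json(subcommand, run=subprocess.run):
    """Run `bctl <subcommand> --format=json` and return the decoded output.

    `run` is injectable so the function can be exercised on a machine
    where bctl is not installed.
    """
    proc = run(
        ["bctl", subcommand, "--format=json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(proc.stdout)
```

For example, `bctl_json("node-status")` would return whatever structure bctl prints for the node list.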

Pool Status

# View connection pool status
bctl pool-status --format=table

Statistics

# Get comprehensive statistics
bctl stats --format=table

# Process count
bctl proc-count

Prometheus Integration

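Export pgBalancer metrics to Prometheus by scraping the REST API, and use postgres_exporter for metrics from the PostgreSQL backends themselves. Prometheus only ingests its text exposition format, so if the scraped endpoint returns JSON, place a translating exporter in front of it: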

Prometheus Configuration

# prometheus.yml
scrape_configs:
  # pgBalancer REST API metrics
  - job_name: 'pgbalancer'
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:8080']
    metrics_path: '/api/stats'
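
If custom scraping is needed, the translation step from JSON to the Prometheus text exposition format is small. The sketch below is one way to do it, under stated assumptions: backend records are dicts with an `id` and optional numeric `health_score` and `active_connections` fields, and the `pgbalancer_backend_*` metric names are invented here for illustration.

```python
def render_metrics(backends):
    """Render backend records as Prometheus text exposition format.

    Assumes each record has an "id"; a gauge is emitted only for the
    fields that are actually present on a record.
    """
    # Hypothetical metric names, keyed by the record field they read.
    gauges = {
        "health_score": "pgbalancer_backend_health_score",
        "active_connections": "pgbalancer_backend_active_connections",
    }
    lines = []
    for field, metric in gauges.items():
        samples = [(b["id"], b[field]) for b in backends if field in b]
        if not samples:
            continue
        lines.append(f"# TYPE {metric} gauge")
        for backend_id, value in samples:
            lines.append(f'{metric}{{backend="{backend_id}"}} {value}')
    return "".join(line + "\n" for line in lines)
```

Served from a small HTTP handler, this output is what a `scrape_configs` target is expected to return.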

Grafana Dashboards

Create Grafana dashboards to visualize pgBalancer metrics:

🎯 Key Dashboard Panels

• Backend Health Scores - AI health scoring visualization
• Query Distribution - Queries per backend over time
• Connection Pool Usage - Active vs idle connections
• Response Time - Average query latency by backend
• Failover Events - Backend up/down timeline
• Load Balance Efficiency - Query distribution fairness
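
The panel list does not say how distribution fairness is computed. One common choice, used here as a stand-in and not as pgBalancer's own formula, is Jain's fairness index over the per-backend query counts:

```python
def jain_fairness(query_counts):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Returns 1.0 for a perfectly even spread and 1/n when a single
    backend receives every query. Empty or all-zero input returns 1.0,
    since there is no imbalance to report.
    """
    n = len(query_counts)
    total = sum(query_counts)
    squares = sum(x * x for x in query_counts)
    if n == 0 or squares == 0:
        return 1.0
    return (total * total) / (n * squares)
```

Plotted over time, a value drifting away from 1.0 indicates that some backends are receiving a disproportionate share of queries.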

MQTT Event Monitoring

Monitor pgBalancer events in real time using MQTT:

MQTT Event Subscription

# Subscribe to all pgBalancer events
mosquitto_sub -h localhost -t 'pgbalancer/#' -v

# Subscribe to specific event types
mosquitto_sub -h localhost -t 'pgbalancer/node/status' -v
mosquitto_sub -h localhost -t 'pgbalancer/failover' -v
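
With `-v`, mosquitto_sub prints each message as the topic, a space, and the payload, which makes the stream easy to post-process. The sketch below splits such lines in Python. The payload format is not documented here, so it is kept as an opaque string; the helper names are hypothetical.

```python
def parse_event_line(line):
    """Split one `mosquitto_sub -v` output line into (topic, payload).

    Splits on the first space only, so payloads may contain spaces.
    Topics that themselves contain spaces are not handled.
    """
    topic, _, payload = line.rstrip("\n").partition(" ")
    return topic, payload


def is_failover(topic):
    """True for the failover topic used in the subscription above."""
    return topic == "pgbalancer/failover"
```

A script built on these helpers could read `sys.stdin` line by line with the `pgbalancer/#` subscription piped into it, and raise a notification whenever `is_failover` matches.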

Logging and Log Analysis

Log Monitoring

# View pgBalancer logs
tail -f /var/log/pgbalancer/pgbalancer.log

# Filter for errors only
tail -f /var/log/pgbalancer/pgbalancer.log | grep ERROR

# View systemd logs
sudo journalctl -u pgbalancer -f
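
To turn the `grep ERROR` filter into a number that can be tracked, the share of recent lines containing `ERROR` can be computed. This is a rough sketch: the log line format is not specified here, so it uses the same plain substring match as the grep command, and the 1000-line window is an arbitrary default.

```python
from collections import deque


def error_fraction(lines, window=1000):
    """Fraction of the last `window` log lines that contain "ERROR"."""
    recent = deque(lines, maxlen=window)  # keeps only the newest lines
    if not recent:
        return 0.0
    return sum("ERROR" in line for line in recent) / len(recent)
```

For example, `error_fraction(open("/var/log/pgbalancer/pgbalancer.log"))` reports the error share over the tail of the log file.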

Performance Monitoring Queries

Real-Time Health Monitoring

# Monitor AI health scores in real time (refreshes every 5 seconds)
watch -n 5 'curl -s http://localhost:8080/api/v1/backends | jq ".[] | {id, hostname, health_score, avg_response_time_ms}"'

Alerting Best Practices

⚠️ Critical Alerts

• Backend node down (health_score = 0)
• All backends unavailable
• Connection pool exhaustion (utilization > 90%)
• Failover events
• Health check failures (> 3 consecutive)

⚡ Warning Alerts

• Low health score (< 0.5)
• High response time (> 100ms avg)
• Connection pool usage (> 70%)
• Uneven query distribution
• Increased error rate
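
The numeric thresholds in the two lists above can be encoded directly. The sketch below is one possible mapping from readings to a severity level. The function names are hypothetical, and pool utilization is assumed to be expressed as a fraction between 0.0 and 1.0.

```python
def classify_backend(health_score, avg_response_time_ms):
    """Map one backend's readings to "critical", "warning", or "ok"."""
    if health_score == 0:
        return "critical"  # backend node down
    if health_score < 0.5 or avg_response_time_ms > 100:
        return "warning"  # low health score or high average latency
    return "ok"


def classify_pool(utilization):
    """Map pool utilization (a fraction from 0.0 to 1.0) to a severity."""
    if utilization > 0.90:
        return "critical"  # connection pool exhaustion
    if utilization > 0.70:
        return "warning"
    return "ok"


def health_checks_critical(consecutive_failures):
    """True once more than 3 health checks have failed in a row."""
    return consecutive_failures > 3
```

Keeping the thresholds in one place like this makes it easier to tune them per environment; items without a numeric threshold, such as uneven query distribution, still need a rule of their own.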

Key Metrics Reference

Metric	Type	Description	Source
`backend_status`	Gauge	Backend up (1) or down (0)	REST API, bctl
`health_score`	Gauge	AI health score (0.0 to 1.0)	REST API
`total_queries`	Counter	Total queries processed	REST API, bctl
`active_connections`	Gauge	Currently active connections	REST API, bctl
