pgbalancer blog header

pgbalancer: AI-Powered PostgreSQL Connection Pooler

πŸ“¦ View on GitHub | πŸ“₯ Download Latest Release | πŸ“– Documentation

Executive Summary

PostgreSQL applications need efficient connection pooling to handle thousands of concurrent clients without overwhelming database servers. pgbalancer solves this by providing AI-powered connection pooling with machine learning load balancing, REST API management, and distributed MQTT coordination. Built as a modern evolution of pgpool-II, it adds intelligent query routing and contemporary DevOps-friendly tooling.

Introduction

Database connections are expensive. Each connection consumes memory, CPU, and file descriptors. Applications scale to thousands of concurrent users. Direct database connections become a bottleneck.

Traditional connection poolers manage resources. They lack intelligence in query distribution. pgbalancer combines connection pooling with machine learning. It learns query patterns and optimizes backend selection in real-time.

What Makes pgbalancer Different

AI-Powered Load Balancing

pgbalancer uses machine learning to route queries:

  • Adaptive Learning: Learns from query execution patterns and response times
  • Health Scoring: Real-time backend scoring based on performance (0.0-1.0 scale)
  • Predictive Routing: Forecasts query execution time and optimal backend selection
  • Weighted Selection: Exploration vs exploitation strategy (20% exploration rate)

The AI engine tracks metrics for each backend node:

  • Average response time
  • Current load and query count
  • Success/failure rates and error tracking
  • Predicted load based on historical patterns
  • Health scores updated after each query

REST API

pgbalancer provides an HTTP/JSON API:

POST   /api/v1/auth/login           # JWT authentication
GET    /api/v1/status               # Cluster status
GET    /api/v1/nodes                # Backend node list
POST   /api/v1/nodes/{id}/attach    # Node management
GET    /api/v1/health/stats         # Health metrics
POST   /api/v1/control/reload       # Configuration reload

The REST API runs as a dedicated child process with under 10ms response time. It provides access to cluster state and management functions.

MQTT Distributed Coordination

For multi-node deployments, pgbalancer uses MQTT for distributed coordination:

Topics:
pgbalancer/cluster/health # Health check broadcasts
pgbalancer/cluster/failover # Failover event notifications
pgbalancer/cluster/config # Configuration updates

This enables:

  • Automatic node discovery
  • Coordinated failover across instances
  • Real-time event streaming for monitoring
  • Integration with existing MQTT infrastructure

Professional CLI Tool (bctl)

The bctl command-line tool replaces multiple legacy pcp_* commands with a single unified interface:

# Modern table output with box-drawing
bctl --table nodes
β”Œβ”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ ID β”‚ Host β”‚ Port β”‚ Status β”‚ Weight β”‚ Role β”‚
β”œβ”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ 0 β”‚ localhostβ”‚ 5433 β”‚ up β”‚ 1.0 β”‚ primary β”‚
β”‚ 1 β”‚ localhostβ”‚ 5434 β”‚ up β”‚ 1.0 β”‚ standby β”‚
β”‚ 2 β”‚ localhostβ”‚ 5435 β”‚ up β”‚ 1.0 β”‚ standby β”‚
β””β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
# JSON output for scripting
bctl --json nodes | jq '.nodes[] | select(.status=="up")'
# Real-time cluster monitoring
bctl --verbose health

Core Architecture

The Intelligence Layer

pgbalancer's machine learning engine sits between the connection pooler and backend databases:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚         Client Applications             β”‚
β”‚      (PostgreSQL wire protocol)         β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                β”‚
                β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚         pgbalancer Main Process         β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚    Connection Pool Management      β”‚ β”‚
β”‚  β”‚  β€’ 32 init children (configurable) β”‚ β”‚
β”‚  β”‚  β€’ 4 connections per child         β”‚ β”‚
β”‚  β”‚  β€’ Session/transaction pooling     β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                β”‚
                β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚      AI Load Balancing Engine           β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚  Query Analysis                    β”‚ β”‚
β”‚  β”‚  β€’ Parse complexity (0-100 scale)  β”‚ β”‚
β”‚  β”‚  β€’ Detect read/write operations    β”‚ β”‚
β”‚  β”‚  β€’ Estimate row count              β”‚ β”‚
β”‚  β”‚  β€’ Predict execution time          β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚  Backend Selection                 β”‚ β”‚
β”‚  β”‚  β€’ Health scoring (0.0-1.0)        β”‚ β”‚
β”‚  β”‚  β€’ Weighted random selection       β”‚ β”‚
β”‚  β”‚  β€’ Exploration vs exploitation     β”‚ β”‚
β”‚  β”‚  β€’ Load prediction                 β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚  Adaptive Learning                 β”‚ β”‚
β”‚  β”‚  β€’ Learning rate: 10%              β”‚ β”‚
β”‚  β”‚  β€’ Metric decay for freshness      β”‚ β”‚
β”‚  β”‚  β€’ Continuous model updates        β”‚ β”‚
β”‚  β”‚  β€’ Success rate tracking           β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                β”‚
                β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚       PostgreSQL Backend Nodes          β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”β”‚
β”‚  β”‚ Primary  β”‚  β”‚ Standby 1β”‚  β”‚Standby2β”‚β”‚
β”‚  β”‚ RW       β”‚  β”‚ RO       β”‚  β”‚ RO     β”‚β”‚
β”‚  β”‚ Health:1.0β”‚ β”‚Health:0.98β”‚ β”‚ Health:β”‚β”‚
β”‚  β”‚          β”‚  β”‚          β”‚  β”‚  0.99  β”‚β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

       β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
       β”‚            β”‚            β”‚
       β–Ό            β–Ό            β–Ό
  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚REST API β”‚  β”‚ MQTT   β”‚  β”‚ bctl    β”‚
  β”‚Port 8080β”‚  β”‚Events  β”‚  β”‚ CLI     β”‚
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

How It Works: The Query Lifecycle

  1. Connection Acceptance: Client connects via PostgreSQL protocol
  2. Connection Pooling: Reuse existing backend connection or create new
  3. Query Analysis: AI parses query complexity and type (read/write)
  4. Backend Selection: ML algorithm selects optimal backend based on:
    • Current health scores
    • Predicted response time
    • Backend load levels
    • Historical success rates
  5. Query Execution: Forward query to selected backend
  6. Feedback Learning: Update backend metrics based on actual performance
  7. Response Return: Stream results back to client
  8. Connection Return: Release connection back to pool

This cycle continuously improves routing decisions through machine learning feedback.

Installation and Configuration

Prerequisites

  • PostgreSQL 13, 14, 15, 16, 17, or 18
  • C compiler (gcc/clang)
  • Standard build tools (autoconf, automake, libtool, make)

Installation Steps

# Clone the repository
git clone https://github.com/pgElephant/pgbalancer.git
cd pgbalancer
# Generate configure script
autoreconf -fi
# Configure with options
./configure --with-openssl --with-pam
# Build and install
make
sudo make install

Basic Configuration

Create /etc/pgbalancer/pgbalancer.conf:

# Connection settings
listen_addresses = '*'
port = 5432
socket_dir = '/tmp'
pcp_listen_addresses = '*'
pcp_port = 9898
# Backend PostgreSQL servers
backend_hostname0 = 'localhost'
backend_port0 = 5433
backend_weight0 = 1
backend_data_directory0 = '/usr/local/pgsql/data1'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'localhost'
backend_port1 = 5434
backend_weight1 = 1
backend_data_directory1 = '/usr/local/pgsql/data2'
backend_flag1 = 'ALLOW_TO_FAILOVER'
backend_hostname2 = 'localhost'
backend_port2 = 5435
backend_weight2 = 1
backend_data_directory2 = '/usr/local/pgsql/data3'
backend_flag2 = 'ALLOW_TO_FAILOVER'
# Connection pooling
num_init_children = 32
max_pool = 4
child_life_time = 300
child_max_connections = 0
connection_cache = on
reset_query_list = 'ABORT; DISCARD ALL'
# Load balancing
load_balance_mode = on
ignore_leading_white_space = on
# Health checking
health_check_period = 30
health_check_timeout = 20
health_check_user = 'postgres'
health_check_password = 'postgres'
health_check_database = 'postgres'
health_check_max_retries = 3
# AI Load Balancing (NEW)
ai_load_balancing = on
ai_learning_rate = 0.01
ai_exploration_rate = 0.1
ai_health_weight = 0.4
ai_response_time_weight = 0.3
ai_load_weight = 0.3
# REST API Server (NEW)
rest_api_enabled = on
rest_api_port = 8080
rest_api_jwt_secret = 'your-secret-key-here'
rest_api_jwt_expiry = 3600
# MQTT Event Publishing (NEW)
mqtt_enabled = on
mqtt_broker = 'localhost'
mqtt_port = 1883
mqtt_client_id = 'pgbalancer'
mqtt_topic_prefix = 'pgbalancer'
# Watchdog
use_watchdog = on
wd_hostname = 'localhost'
wd_port = 9000

AI Configuration Parameters

The AI load balancing engine can be tuned via configuration:

Learning Rate (0.0-1.0): Controls how quickly the model adapts to new data

  • Default: 0.01 (1% adjustment per query)
  • Higher values: Faster adaptation, less stability
  • Lower values: More stability, slower adaptation

Exploration Rate (0.0-1.0): Balances exploration vs exploitation

  • Default: 0.1 (10% random selection for exploration)
  • Higher values: More experimentation with different backends
  • Lower values: More focus on best-performing backends

Health Weight (0.0-1.0): Importance of backend health in selection

  • Default: 0.4 (40% weight)
  • Combined with response_time_weight and load_weight (must sum to 1.0)

Additional Features

Intelligent Query Cache

pgbalancer includes ML-driven query caching:

# Query cache configuration
memory_cache_enabled = on
memqcache_method = 'shmem'
memqcache_total_size = 64MB
memqcache_max_num_cache = 1000000
memqcache_expire = 0
memqcache_cache_block_size = 1MB
# AI cache warming
ai_cache_warming = on
ai_cache_prefetch = on

The AI engine:

  • Learns which queries benefit from caching
  • Predicts cache hit probability
  • Automatically warms frequently accessed data
  • Prefetches based on query patterns

Automatic Failover

When a backend fails, pgbalancer handles it automatically:

failover_on_backend_error = off
detach_false_primary = on
auto_failover = on
failover_command = '/usr/local/bin/failover.sh %d %h %p %D %M'

Failover process:

  1. Detection: Health check detects failed backend
  2. AI Updates: AI immediately adjusts health score to 0.0
  3. Routing: New queries automatically routed to healthy backends
  4. MQTT Event: Publishes failover event to MQTT broker
  5. Recovery: When backend returns, health gradually improves
  6. Reintegration: Backend automatically rejoins rotation

Watchdog Clustering

For multi-pgbalancer deployments:

# Watchdog configuration
use_watchdog = on
wd_hostname = 'pgbalancer1'
wd_port = 9000
wd_priority = 1
# Other pgbalancer nodes
other_pgpool_hostname0 = 'pgbalancer2'
other_pgpool_port0 = 9999
other_wd_port0 = 9000
other_pgpool_hostname1 = 'pgbalancer3'
other_pgpool_port1 = 9999
other_wd_port1 = 9000
# Virtual IP
delegate_ip = '192.168.1.100'
if_up_cmd = '/usr/local/bin/if_up.sh'
if_down_cmd = '/usr/local/bin/if_down.sh'

REST API Usage

Authentication (Optional JWT)

# Get JWT token
curl -X POST http://localhost:8080/api/v1/auth/login \
-H "Content-Type: application/json" \
-d '{"username":"admin","password":"secret"}'
# Response
{
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"expires_in": 3600
}
# Use token for authenticated requests
curl -H "Authorization: Bearer YOUR_TOKEN" \
http://localhost:8080/api/v1/status

Cluster Management

# Get cluster status
curl http://localhost:8080/api/v1/status
# Response
{
"status": "running",
"uptime": 86400,
"connections": 45,
"nodes": 3,
"healthy_nodes": 3,
"ai_enabled": true
}
# List backend nodes
curl http://localhost:8080/api/v1/nodes | jq '.'
# Response
{
"nodes": [
{
"id": 0,
"host": "localhost",
"port": 5433,
"status": "up",
"role": "primary",
"weight": 1.0,
"health_score": 1.0,
"connections": 15,
"queries": 1250,
"avg_response_ms": 2.5
},
{
"id": 1,
"host": "localhost",
"port": 5434,
"status": "up",
"role": "standby",
"weight": 1.0,
"health_score": 0.98,
"connections": 12,
"queries": 980,
"avg_response_ms": 2.8
},
{
"id": 2,
"host": "localhost",
"port": 5435,
"status": "up",
"role": "standby",
"weight": 1.0,
"health_score": 0.99,
"connections": 13,
"queries": 1050,
"avg_response_ms": 2.6
}
]
}
# Get health statistics
curl http://localhost:8080/api/v1/health/stats | jq '.'
# Attach a node
curl -X POST http://localhost:8080/api/v1/nodes/1/attach
# Reload configuration
curl -X POST http://localhost:8080/api/v1/control/reload

AI Statistics

# Get AI model statistics
curl http://localhost:8080/api/v1/ai/stats | jq '.'
# Response
{
"enabled": true,
"learning_rate": 0.01,
"exploration_rate": 0.1,
"total_queries": 15280,
"ai_routed_queries": 14752,
"exploration_queries": 1528,
"backends": [
{
"id": 0,
"avg_response_time": 2.5,
"current_load": 0.35,
"total_queries": 5120,
"successful_queries": 5098,
"failed_queries": 22,
"error_rate": 0.0043,
"predicted_load": 0.33,
"health_score": 1.0
}
]
}

CLI Tool Usage

Installation

The bctl tool is built with the main project:

# Already installed with 'make install'
which bctl
# /usr/local/bin/bctl
# Check version
bctl --version

Basic Commands

# Check cluster status
bctl status
# List nodes (default format)
bctl nodes
# List nodes with table format
bctl --table nodes
# List nodes with JSON output
bctl --json nodes
# Get node count
bctl nodes-count
# Attach a node
bctl nodes-attach 1
# Detach a node
bctl nodes-detach 1
# Promote a node to primary
bctl nodes-promote 2
# List processes
bctl processes
# Get health status
bctl health
# Reload configuration
bctl reload
# Watchdog status
bctl watchdog-status

Remote Management

# Connect to remote pgbalancer instance
bctl -H pgbalancer1.example.com -p 8080 -U admin status
# Verbose output
bctl -v --table nodes
# JSON output for scripting
bctl --json nodes | jq '.nodes[] | select(.health_score > 0.9)'

MQTT Integration

Subscribing to Events

# Subscribe to all pgbalancer events
mosquitto_sub -h localhost -t 'pgbalancer/#' -v
# Subscribe to node status changes
mosquitto_sub -h localhost -t 'pgbalancer/nodes/status' -v
# Subscribe to failover events
mosquitto_sub -h localhost -t 'pgbalancer/cluster/failover' -v
# Subscribe to health checks
mosquitto_sub -h localhost -t 'pgbalancer/health' -v

Event Examples

Node Status Change:

{
"event": "node_status_change",
"timestamp": "2025-11-03T10:30:45Z",
"node_id": 1,
"old_status": "up",
"new_status": "down",
"reason": "health_check_timeout",
"health_score": 0.0
}

Failover Event:

{
"event": "failover",
"timestamp": "2025-11-03T10:30:46Z",
"failed_node_id": 1,
"new_primary_id": 2,
"promotion_reason": "primary_failure",
"cluster_status": "degraded"
}

Health Check Result:

{
"event": "health_check",
"timestamp": "2025-11-03T10:31:00Z",
"node_id": 0,
"status": "healthy",
"response_time_ms": 2.3,
"health_score": 1.0,
"connections": 15,
"load": 0.35
}

Performance Optimization

Connection Pool Tuning

# Number of pre-forked child processes
num_init_children = 32
# Connections per child process
max_pool = 4
# Total connections = num_init_children Γ— max_pool
# Example: 32 Γ— 4 = 128 total backend connections
# Child process lifetime (seconds)
child_life_time = 300
# Maximum connections per child (0 = unlimited)
child_max_connections = 0
# Connection cache
connection_cache = on

Sizing Guidelines:

  • num_init_children: Set to expected concurrent sessions
  • max_pool: Typically 2-4 connections per child
  • Total connections should be < max_connections on PostgreSQL

AI Performance Tuning

# Fast adaptation for dynamic workloads
ai_learning_rate = 0.05
ai_exploration_rate = 0.15
# Stable performance for consistent workloads
ai_learning_rate = 0.01
ai_exploration_rate = 0.05
# Balanced (recommended starting point)
ai_learning_rate = 0.01
ai_exploration_rate = 0.1

Query Cache Optimization

# Large cache for read-heavy workloads
memqcache_total_size = 128MB
memqcache_max_num_cache = 2000000
# Smaller cache for write-heavy workloads
memqcache_total_size = 32MB
memqcache_max_num_cache = 500000

Monitoring and Observability

Prometheus Integration

pgbalancer exposes Prometheus metrics:

# Prometheus scrape configuration
scrape_configs:
- job_name: 'pgbalancer'
static_configs:
- targets: ['localhost:9191']

Key Metrics:

  • pgbalancer_connections_total: Total active connections
  • pgbalancer_queries_total: Total queries processed
  • pgbalancer_backend_health_score: Backend health scores
  • pgbalancer_query_response_time: Query response time histogram
  • pgbalancer_ai_routing_decisions: AI routing decisions
  • pgbalancer_cache_hits_total: Query cache hit rate

Grafana Dashboard

Example Grafana queries:

# Average response time per backend
avg(pgbalancer_query_response_time) by (backend_id)
# Health score over time
pgbalancer_backend_health_score
# Cache hit rate
rate(pgbalancer_cache_hits_total[5m]) /
rate(pgbalancer_queries_total[5m])
# AI exploration vs exploitation ratio
rate(pgbalancer_ai_exploration_queries[5m]) /
rate(pgbalancer_ai_routing_decisions[5m])

Comparison with Other Poolers

pgbalancer vs pgpool-II

Featurepgbalancerpgpool-II
AI Load Balancingβœ… Machine learning❌ Static algorithms
REST APIβœ… Full HTTP/JSON API⚠️ Binary PCP protocol
MQTT Clusteringβœ… Distributed events❌ None
CLI Toolβœ… Unified bctl⚠️ Multiple pcp_* tools
JWT Authenticationβœ… HMAC-SHA256❌ Password only
Predictive Routingβœ… AI-powered❌ None
Health Scoringβœ… 0.0-1.0 continuous⚠️ Binary up/down
Query Cacheβœ… AI-drivenβœ… Static
Performance<0.5ms routing~1ms routing

pgbalancer vs PgBouncer

FeaturepgbalancerPgBouncer
Connection Poolingβœ… Advancedβœ… Excellent
Load Balancingβœ… AI-powered❌ None
Query Routingβœ… Intelligent❌ None
Health Monitoringβœ… Comprehensive⚠️ Basic
Failoverβœ… Automatic❌ Manual
REST APIβœ… Full API❌ None
Query Cacheβœ… AI-driven❌ None
Use CaseFull clusterConnection pooling only

pgbalancer vs HAProxy

FeaturepgbalancerHAProxy
PostgreSQL Protocolβœ… Native⚠️ TCP only
Load Balancingβœ… AI query-aware⚠️ TCP-level
Connection Poolingβœ… Built-in❌ None
Session Managementβœ… PostgreSQL-aware❌ Generic
Health Checksβœ… PostgreSQL-specific⚠️ TCP/HTTP
Configurationβœ… PostgreSQL-focused⚠️ Generic

Production Deployment

High Availability Setup

Three-pgbalancer cluster with watchdog:

                  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                  β”‚  Virtual IP     β”‚
                  β”‚ 192.168.1.100   β”‚
                  β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                           β”‚
          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
          β”‚                β”‚                β”‚
    β”Œβ”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”
    β”‚pgbalancer1β”‚    β”‚pgbalancer2β”‚    β”‚pgbalancer3β”‚
    β”‚ Primary   β”‚    β”‚ Standby   β”‚    β”‚ Standby   β”‚
    β”‚ Priority:1β”‚    β”‚ Priority:2β”‚    β”‚ Priority:3β”‚
    β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜
          β”‚                β”‚                β”‚
          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                           β”‚
              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
              β”‚            β”‚            β”‚
        β”Œβ”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”
        β”‚ PostgreSQLβ”‚ PostgreSQLβ”‚ PostgreSQLβ”‚
        β”‚  Primary  β”‚ Standby 1 β”‚ Standby 2 β”‚
        β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Docker Deployment

FROM postgres:17
# Install dependencies
RUN apt-get update && apt-get install -y \
build-essential \
autoconf \
automake \
libtool \
postgresql-server-dev-17
# Build pgbalancer
COPY . /usr/src/pgbalancer
WORKDIR /usr/src/pgbalancer
RUN autoreconf -fi && \
./configure --with-openssl && \
make && \
make install
# Copy configuration
COPY pgbalancer.conf /etc/pgbalancer/pgbalancer.conf
EXPOSE 5432 8080 9898 9000
CMD ["pgbalancer", "-f", "/etc/pgbalancer/pgbalancer.conf", "-n"]
# docker-compose.yml
version: '3.8'
services:
pgbalancer:
build: .
ports:
- "5432:5432" # PostgreSQL protocol
- "8080:8080" # REST API
- "9898:9898" # PCP protocol
- "9000:9000" # Watchdog
environment:
- AI_LEARNING_RATE=0.01
- AI_EXPLORATION_RATE=0.1
- MQTT_BROKER=mqtt-broker
depends_on:
- pg-primary
- pg-replica1
- pg-replica2
- mqtt-broker
pg-primary:
image: postgres:17
environment:
POSTGRES_PASSWORD: postgres
ports:
- "5433:5432"
pg-replica1:
image: postgres:17
environment:
POSTGRES_PASSWORD: postgres
ports:
- "5434:5432"
pg-replica2:
image: postgres:17
environment:
POSTGRES_PASSWORD: postgres
ports:
- "5435:5432"
mqtt-broker:
image: eclipse-mosquitto:2
ports:
- "1883:1883"

Troubleshooting

Common Issues

Connection Refused:

# Check if pgbalancer is running
ps aux | grep pgbalancer
# Check listen address configuration
grep listen_addresses /etc/pgbalancer/pgbalancer.conf
# Verify port is listening
netstat -tlnp | grep 5432

Backend Connection Failures:

# Check backend health
bctl --table nodes
# View health check logs
tail -f /var/log/pgbalancer/pgbalancer.log | grep health_check
# Test direct backend connection
psql -h localhost -p 5433 -U postgres

AI Not Working:

# Verify AI is enabled
grep ai_load_balancing /etc/pgbalancer/pgbalancer.conf
# Check AI statistics via REST API
curl http://localhost:8080/api/v1/ai/stats | jq '.enabled'
# View AI routing decisions in logs
tail -f /var/log/pgbalancer/pgbalancer.log | grep "AI routing"

REST API Not Responding:

# Check if REST API child process is running
ps aux | grep pgbalancer | grep "PT_REST_API"
# Verify REST API is enabled
grep rest_api_enabled /etc/pgbalancer/pgbalancer.conf
# Test REST API
curl -v http://localhost:8080/api/v1/status

MQTT Connection Issues:

# Test MQTT broker connectivity
mosquitto_pub -h localhost -t 'test' -m 'hello'
# Check pgbalancer MQTT configuration
grep mqtt_ /etc/pgbalancer/pgbalancer.conf
# Monitor MQTT messages
mosquitto_sub -h localhost -t 'pgbalancer/#' -v

Security Considerations

JWT Authentication

# Generate strong secret
rest_api_jwt_secret = 'your-256-bit-secret-key-here'
# Set appropriate expiry
rest_api_jwt_expiry = 3600 # 1 hour
# Use HTTPS in production
# Configure nginx/HAProxy to terminate SSL

Network Security

# Restrict API access
rest_api_listen_addresses = '127.0.0.1'
# Restrict PCP access
pcp_listen_addresses = '127.0.0.1'
# Use SSL for backend connections
backend_flag0 = 'ALLOW_TO_FAILOVER|SSL'

MQTT Security

# Use MQTT with TLS
mqtt_broker = 'mqtts://secure-broker:8883'
# MQTT authentication
mqtt_username = 'pgbalancer'
mqtt_password = 'secure-password'

Best Practices

Configuration Management

  1. Version Control: Store pgbalancer.conf in git
  2. Environment Variables: Use different configs per environment
  3. Automated Testing: Test configuration changes before deployment
  4. Gradual Rollout: Update one node at a time in multi-node setups

AI Model Training

  1. Initial Training Period: Allow 1-2 weeks for AI to learn patterns
  2. Monitor Metrics: Track health scores and query distribution
  3. Tune Gradually: Adjust learning/exploration rates incrementally
  4. Document Changes: Record parameter changes and their effects

Capacity Planning

  1. Connection Limits: num_init_children Γ— max_pool < PostgreSQL max_connections
  2. Memory: ~10MB per child process + cache size
  3. CPU: AI routing adds ~0.1-0.5ms per query
  4. Network: Ensure low latency between pgbalancer and backends

Monitoring Checklist

  • βœ… Prometheus metrics scraping
  • βœ… Grafana dashboards for visualization
  • βœ… MQTT event monitoring
  • βœ… Log aggregation (ELK/Loki)
  • βœ… Alerting on health score degradation
  • βœ… Alerting on failover events
  • βœ… Regular health check validation

Roadmap

Upcoming Features

  • Enhanced AI Models: Deep learning for complex query patterns
  • Kubernetes Operator: Native Kubernetes deployment
  • GraphQL API: Alternative API interface
  • WebAssembly Plugins: Custom routing logic via WASM
  • Multi-Region Support: Geographic load distribution
  • Cost Optimization: Cloud cost-aware routing decisions

Conclusion

pgbalancer combines pooling technologies with machine learning, APIs, and distributed coordination. Whether you scale a single application or manage PostgreSQL clusters, pgbalancer provides the tools needed for production deployments.

Key Points:

  • Load balancing learns and adapts to your workload
  • REST API enables DevOps workflows
  • MQTT clustering provides distributed coordination
  • CLI tool simplifies management
  • Built on pgpool-II foundation
  • Compatible with PostgreSQL 13-18

Ready to deploy? Check out the Getting Started Guide or explore the GitHub repository.


Resources

Related Blog Posts

Support

For questions, issues, or commercial support, contact admin@pgelephant.com

Share This Article

Suggested hashtags:

#PostgreSQL#AI#LoadBalancing#DevOps#Database#MachineLearning#ConnectionPooling#RESTAPI#pgElephant#MQTT#CloudNative
πŸ“‹ View copy-ready text for manual posting
pgbalancer: AI-Powered PostgreSQL Connection Pooler

AI Load Balancing, REST API, MQTT Clustering - Modern PostgreSQL Connection Pooling with Machine Learning Optimization

#PostgreSQL #AI #LoadBalancing #DevOps #Database #MachineLearning #ConnectionPooling #RESTAPI #pgElephant #MQTT #CloudNative

https://www.pgelephant.com/blog/pgbalancer