pgraft Configuration Reference

Key Requirements

  • shared_preload_libraries = 'pgraft' on every server.
  • Identical cluster-wide settings for consensus and membership; only node identity differs.
  • Create and secure pgraft.data_dir manually before starting PostgreSQL.
  • Use the same pgraft.cluster_id across the deployment to prevent accidental partitioning.

Cluster Identity & Networking

Define the Raft cluster name and per-node identifiers. All nodes must agree on pgraft.cluster_id; each node receives a unique pgraft.node_id and port.

postgresql.conf identity block

shared_preload_libraries = 'pgraft'

# Cluster-wide identity
pgraft.cluster_id = 'production-cluster'

# Node-specific identity
pgraft.node_id = 1
pgraft.address = '10.0.0.11'
pgraft.port = 7001
pgraft.data_dir = '/var/lib/postgresql/pgraft'

# Optional descriptive metadata
pgraft.node_role = 'leader'
pgraft.zone = 'us-east-1a'

Identity Parameters

  • pgraft.cluster_id — Shared string that uniquely names the cluster.
  • pgraft.node_id — Integer node identifier (1-indexed). Do not reuse an ID until the previous node has been fully removed from the cluster.
  • pgraft.address / pgraft.port — Host and port used for Raft RPCs.
  • pgraft.data_dir — Filesystem path for Raft log and snapshots.

Access Control

pg_hba.conf snippet

# Allow Raft communication within the cluster
host    replication     pgraft_cluster  10.0.0.0/24      md5
host    all             pgraft_cluster  10.0.0.0/24      md5

# Local management connections
local   all             postgres        trust

Consensus Timing & Batching

Tune election behaviour to balance availability against stability. As a rule of thumb, keep election_timeout at or above 10 × heartbeat_interval.

Consensus defaults

# Milliseconds
pgraft.heartbeat_interval = 100
pgraft.election_timeout = 1000
pgraft.append_batch_size = 512
pgraft.max_inflight_batches = 4
pgraft.quorum_required = 3

Low Latency

Set heartbeat to 40 ms, election timeout to 400 ms, and batch size to 256. Ideal for single-data-centre deployments.

Balanced (Default)

Heartbeat 100 ms, election 1000 ms, batch size 512. Works for most regional clusters.

Geo-Distributed

Heartbeat 180 ms, election 2200 ms, batch size 1024 to absorb WAN latency.
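The three profiles above can be captured as data and checked against the 10 × timing rule. This is a sketch for sanity-checking your own tuning, not part of pgraft; the profile values are copied from the text (all times in milliseconds).

```python
# The three tuning profiles from the text, with a check that each keeps
# election_timeout at or above 10x heartbeat_interval. Values in ms.
PROFILES = {
    "low_latency":     {"heartbeat_interval": 40,  "election_timeout": 400,  "append_batch_size": 256},
    "balanced":        {"heartbeat_interval": 100, "election_timeout": 1000, "append_batch_size": 512},
    "geo_distributed": {"heartbeat_interval": 180, "election_timeout": 2200, "append_batch_size": 1024},
}

def ratio_ok(profile, minimum=10):
    """True when the election timeout is at least `minimum` heartbeats long."""
    return profile["election_timeout"] >= minimum * profile["heartbeat_interval"]

for name, p in PROFILES.items():
    assert ratio_ok(p), f"{name} violates the timing rule"
```

Note that the geo-distributed profile runs at roughly 12 × the heartbeat, deliberately above the minimum, to absorb WAN jitter.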

Storage & Snapshot Settings

Snapshots bound Raft log growth by periodically materializing cluster state. Adjust the thresholds to trade disk usage against recovery time.

Snapshot configuration

pgraft.snapshot_interval = 10000   # Entries between snapshot checks
pgraft.snapshot_threshold = 8000   # Entries before forcing snapshot
pgraft.snapshot_retention = 3      # Number of snapshots retained per node
pgraft.log_retention_mb = 256      # Keep additional log for diagnostics

Keep snapshot_threshold lower than the volume of writes generated during maintenance windows so followers can recover using snapshots instead of full log replay.
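The sizing advice above is simple arithmetic. The sketch below uses assumed numbers (a 50 writes/s workload and a 30-minute maintenance window, neither from the text) to show the comparison you would make with your own measured rates.

```python
# Illustrative arithmetic with assumed numbers: check that snapshot_threshold
# stays well below the log entries produced during one maintenance window.
writes_per_second = 50          # assumed steady write rate, not measured
window_minutes = 30             # assumed maintenance window length
entries_in_window = writes_per_second * 60 * window_minutes

snapshot_threshold = 8000       # from the snapshot configuration above

# A follower rejoining after the window is entries_in_window entries behind.
# Because the threshold is far below that backlog, at least one snapshot has
# been taken in the meantime, so the follower can install a snapshot instead
# of replaying the full log.
assert snapshot_threshold < entries_in_window
```

If your measured write volume per window is lower than the threshold, raise the snapshot frequency or shorten the window.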

Runtime Configuration API

Use the SQL helper functions exposed by pgraft to inspect and change settings at runtime, without restarting PostgreSQL.

Inspect and modify configuration

-- Show current values (includes default + overrides)
SELECT * FROM pgraft_get_config();

-- Adjust heartbeat interval at runtime
SELECT pgraft_set_config('heartbeat_interval', '75');

-- Persist changes to disk so they survive restart
SELECT pgraft_save_config();

Rolling configuration across nodes

-- Example: raise quorum requirement to 5
BEGIN;
  SELECT pgraft_set_config('quorum_required', '5');
  SELECT pgraft_save_config();
COMMIT;

-- Repeat on each node or automate via Ansible/Terraform
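When automating the repeat-on-each-node step, it is safer to generate one statement batch per node than to type the SQL by hand. A minimal sketch, assuming the node list and setting are yours to fill in; the SQL mirrors the pgraft_set_config / pgraft_save_config calls shown above, and actually executing each batch (psql, Ansible, Terraform) is left to your tooling.

```python
def rollout_statements(setting, value):
    """Return the SQL batch to apply and persist one setting on a single node."""
    return [
        "BEGIN;",
        f"SELECT pgraft_set_config('{setting}', '{value}');",
        "SELECT pgraft_save_config();",
        "COMMIT;",
    ]

# Hypothetical node addresses for illustration; substitute your inventory.
nodes = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
plans = {node: rollout_statements("quorum_required", "5") for node in nodes}
```

Generating identical batches per node keeps the cluster-wide settings in lockstep, which is exactly the invariant the Key Requirements section asks for.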

Validation Checklist

  • Run SELECT * FROM pgraft_get_cluster_status(); to confirm leader identity and quorum.
  • Verify file permissions on pgraft.data_dir (chown postgres:postgres, mode 700).
  • Ensure firewall rules allow TCP traffic on every configured pgraft.port.
  • Monitor pgraft_log_get_stats() for unexpected increases in pending_snapshots after tuning thresholds.