Technical Deep Dive

RALE - Resilient Adaptive Leader Election

A deep dive into RALE, a distributed consensus algorithm for leader election and consistent state management in distributed systems.

pgElephant Team
December 15, 2024
8 min read

Introduction

RALE (Resilient Adaptive Leader Election) is a distributed consensus and key-value store system built with modern C engineering practices. It provides reliable distributed coordination and persistent storage for distributed applications with strong consistency guarantees.

Unlike traditional consensus algorithms that focus solely on leader election, RALE combines consensus with a distributed key-value store, making it particularly well-suited for PostgreSQL clustering scenarios where both coordination and state management are critical.

System Architecture

Core Components

  • librale - Core consensus and distributed store library
  • raled - Daemon process for cluster management
  • ralectrl - Command-line interface for operations

Key Features

  • Thread-safe operations with proper synchronization
  • Memory-safe allocation/deallocation
  • TCP/UDP communication with failover
  • Unified cluster database storage

Cluster Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   (Node 1)      │◄──►│   (Node 2)      │◄──►│   (Node 3)      │
│   raled         │    │   raled         │    │   raled         │
│   + librale     │    │   + librale     │    │   + librale     │
│   + cluster.db  │    │   + cluster.db  │    │   + cluster.db  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
       ▲                        ▲                        ▲
       │                        │                        │
┌─────────────┐        ┌─────────────┐        ┌─────────────┐
│  ralectrl   │        │  ralectrl   │        │  ralectrl   │
│    (CLI)    │        │    (CLI)    │        │    (CLI)    │
└─────────────┘        └─────────────┘        └─────────────┘

RALE Consensus Protocol

The RALE consensus protocol ensures distributed agreement and consistency across cluster nodes. It implements a variant of the Raft algorithm optimized for PostgreSQL clustering scenarios.

Leader Election Process

  1. Candidate Selection: Nodes transition to candidate state during leader timeout
  2. Vote Collection: Candidates request votes from cluster members
  3. Majority Decision: Node with majority votes becomes leader
  4. Heartbeat Maintenance: Leader sends regular heartbeats to maintain authority

Safety Properties

  • Election Safety: At most one leader per term
  • Leader Append-Only: Leaders never overwrite log entries
  • Log Matching: Consistent log replication across nodes
  • Leader Completeness: Leader contains all committed entries

Log Replication Mechanism

  1. Client Request: Application submits operation to leader
  2. Log Append: Leader appends entry to local log
  3. Replication: Leader replicates entry to follower nodes
  4. Commit: Entry committed when majority acknowledges
  5. Application: State machine applies committed entries

Distributed Store (DStore)

The DStore provides a distributed key-value storage layer with strong consistency guarantees. It uses a unified cluster database approach for efficient state management.

Storage Architecture

  • Hash Table: In-memory hash table for fast key lookups
  • Persistence: File-based storage for durability
  • Replication: Automatic replication across cluster nodes
  • Atomic Operations: Consistent read/write operations

Unified Database Benefits

  • Faster Startup: Single file load vs multiple file parsing
  • Consistent State: No race conditions between components
  • Efficient Storage: Binary format vs text parsing
  • Data Integrity: Atomic operations with mutex protection

API Design and Usage

Core API Functions

/* Core RALE API */
int rale_init(const config_t *config);   /* initialize the consensus subsystem */
int rale_finit(void);                    /* shut down and release resources */
int rale_process_command(const char *command,
                         char *response,
                         size_t response_size);  /* execute a cluster command */
int rale_get_status(char *status,
                    size_t status_size); /* fetch current node/cluster status */
int rale_quram_process(void);            /* run one round of quorum processing */

RALE State Structure

typedef struct rale_state_t {
  int32_t current_term;     /* latest term this node has seen */
  int32_t voted_for;        /* candidate voted for in current term */
  int32_t leader_id;        /* known leader for the current term */
  rale_role_t role;         /* follower, candidate, or leader */
  int32_t last_log_index;   /* index of the last entry in the local log */
  int32_t last_log_term;    /* term of the last log entry */
  int32_t commit_index;     /* highest log index known to be committed */
  int32_t last_applied;     /* highest log index applied to the state machine */
  time_t last_heartbeat;    /* time of the last heartbeat seen */
  time_t election_deadline; /* when to start an election if no heartbeat */
} rale_state_t;

Quick Start Example

# Clone and build
git clone https://github.com/pgElephant/rale.git
cd rale && ./build.sh

# Start single node
raled --config conf/raled1.conf

# Use CLI to add nodes
ralectrl ADD --node-id 1 --node-name "node1" \
  --node-ip "127.0.0.1" --rale-port 7400 \
  --dstore-port 7401

Conclusion

RALE represents a significant advancement in distributed consensus systems, specifically designed for PostgreSQL clustering scenarios. By combining robust leader election with a distributed key-value store, RALE provides the foundation for highly available PostgreSQL deployments.

The unified database approach, thread-safe operations, and memory-safe design make RALE suitable for production environments where reliability and performance are paramount. Its clean API design and comprehensive documentation make it accessible to developers building distributed PostgreSQL solutions.

Next Steps

Ready to implement RALE in your PostgreSQL cluster? Check out our comprehensive documentation and examples.