Implementing HAProxy High Availability: A Complete Migration Guide

Migrating from a single HAProxy instance to high availability using keepalived and VRRP. This guide covers implementing automatic failover with Virtual IP management, SSL certificate synchronization and zero-downtime service continuity.


Running a single HAProxy instance creates a critical single point of failure for your entire infrastructure. When that server goes down, all your services become unreachable regardless of how many backend servers you have running. Having previously migrated from Citrix NetScaler to HAProxy (detailed in my earlier post), I found myself missing one key enterprise feature: automatic high availability failover.

The Problem: Missing Enterprise-Grade Availability

In my previous NetScaler setup, I had enjoyed the benefits of an active-passive HA pair with automatic failover. When I migrated to a single HAProxy instance, I gained significant cost savings and configuration simplicity, but lost the reliability that comes with redundant load balancers.

Starting Point: A Single Point of Failure

My existing setup consisted of:

  • Single HAProxy server (10.0.1.10) handling all traffic
  • 18 different services across multiple domains
  • Backend services distributed across three Docker hosts (10.0.1.20, 10.0.1.21, 10.0.1.22)
  • SSL termination with wildcard certificates

While the backend services were load-balanced and resilient, the HAProxy server itself was a critical vulnerability. Any hardware failure, software crash, or maintenance requirement would take down all services.

The Solution: HAProxy High Availability with Keepalived

The solution involves:

  • Two HAProxy servers in an active-passive configuration
  • Virtual IP (VIP) managed by keepalived using VRRP
  • Automatic SSL certificate synchronization
  • Health monitoring and automatic failover

Architecture Overview

Internet → VIP (10.0.1.100) → Active HAProxy → Backend Services
                    ↓
              Standby HAProxy (ready to take over)

Implementation Steps

Step 1: Prepare the Secondary Server

First, I set up a second Ubuntu server (10.0.1.11) with HAProxy and keepalived:

sudo apt update
sudo apt install haproxy keepalived -y

Step 2: Configure the Virtual IP Strategy

I chose 10.0.1.100 as the Virtual IP address that floats between the two HAProxy servers, and gave each server its own VRRP priority:

  • Primary server: 10.0.1.10 (priority 110)
  • Secondary server: 10.0.1.11 (priority 100)
  • Virtual IP: 10.0.1.100

The server with the higher effective priority becomes the master and holds the VIP.

Step 3: Keepalived Configuration

Primary Server Configuration (/etc/keepalived/keepalived.conf):

vrrp_script chk_haproxy {
    script "/usr/bin/killall -0 haproxy"
    interval 2
    weight -15
    fall 3
    rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface ens18
    virtual_router_id 51
    priority 110
    advert_int 1
    authentication {
        auth_type PASS
        # keepalived uses only the first 8 characters of auth_pass
        auth_pass your_secure_password
    }
    virtual_ipaddress {
        10.0.1.100/24
    }
    track_script {
        chk_haproxy
    }
    notify_master "/etc/keepalived/master.sh"
    notify_backup "/etc/keepalived/backup.sh"
    notify_fault "/etc/keepalived/fault.sh"
}

Secondary Server Configuration - identical except:

  • state BACKUP
  • priority 100
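Because the two files differ in only two lines, the secondary's configuration can be generated from the primary's rather than edited by hand. The following is a minimal sketch, assuming the exact values shown above (state MASTER, priority 110); the helper name to_backup_conf and the output path are hypothetical.

```shell
# to_backup_conf: read a primary keepalived.conf on stdin and write the
# secondary variant to stdout (hypothetical helper, not part of keepalived).
to_backup_conf() {
    sed -e 's/^\([[:space:]]*state[[:space:]]\{1,\}\)MASTER[[:space:]]*$/\1BACKUP/' \
        -e 's/^\([[:space:]]*priority[[:space:]]\{1,\}\)110[[:space:]]*$/\1100/'
}

# Example (run on the primary, then copy the result to the secondary):
#   to_backup_conf < /etc/keepalived/keepalived.conf > /tmp/keepalived-secondary.conf
```

Only lines that consist of exactly "state MASTER" or "priority 110" are rewritten, so the rest of the file, including the interface name and virtual_router_id, passes through unchanged.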

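The key setting here is weight -15. After three consecutive failed checks (fall 3 at interval 2, so roughly six seconds), keepalived subtracts 15 from that server's priority. For the primary that gives 110 - 15 = 95, which is below the healthy secondary's 100, so the secondary takes over the VIP.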
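The three notify scripts referenced in the configuration are not reproduced in this post. As a rough sketch of what such a script might contain, assuming it only needs to record the state change, something like the following would do. The function name log_transition and the log path are assumptions, not part of keepalived.

```shell
# log_transition STATE [LOGFILE]: append a timestamped state change.
# Hypothetical helper; /var/log/keepalived-state.log is an assumed path.
log_transition() {
    local state="$1"
    local logfile="${2:-/var/log/keepalived-state.log}"
    echo "$(date '+%Y-%m-%d %H:%M:%S') $(uname -n): keepalived entered $state state" >> "$logfile"
}

# /etc/keepalived/master.sh would then call:  log_transition MASTER
# backup.sh and fault.sh would pass BACKUP and FAULT respectively.
```

Since keepalived runs these scripts as root, it is advisable to keep them owned by root and not writable by other users.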

Step 4: HAProxy Configuration Synchronization

Both servers need identical HAProxy configurations. I copied the existing configuration:

scp /etc/haproxy/haproxy.cfg admin@10.0.1.11:/etc/haproxy/

The original HAProxy configuration remained unchanged - the HA functionality comes entirely from keepalived managing the VIP.
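Since failover only works if both servers really do carry the same configuration, it can help to check for drift after each change. The sketch below compares SHA-256 digests of the two files; the helper names config_digest and configs_in_sync are hypothetical, and the example assumes the admin user can read haproxy.cfg on the secondary.

```shell
# config_digest: print the SHA-256 of a config read from stdin.
config_digest() {
    sha256sum | awk '{print $1}'
}

# configs_in_sync LOCAL_DIGEST REMOTE_DIGEST: report whether two digests match.
configs_in_sync() {
    if [ "$1" = "$2" ]; then
        echo "in sync"
    else
        echo "DRIFT: primary and secondary configs differ"
        return 1
    fi
}

# Example, run on the primary:
#   local_sum=$(config_digest < /etc/haproxy/haproxy.cfg)
#   remote_sum=$(ssh admin@10.0.1.11 cat /etc/haproxy/haproxy.cfg | config_digest)
#   configs_in_sync "$local_sum" "$remote_sum"
```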

Step 5: SSL Certificate Synchronization

This proved to be the most complex part of the implementation. SSL certificates need to be available on both servers for seamless failover.

Created an automated sync script (/usr/local/bin/sync-certs.sh):

#!/bin/bash
PRIMARY_HOST="10.0.1.10"
SECONDARY_HOST="10.0.1.11"
CERT_DIR="/etc/haproxy/certs"
LOG_FILE="/var/log/cert-sync.log"

# Only run on the primary server. Match against every local address, because
# the first address reported by hostname -I is not guaranteed to be the primary IP.
if ! hostname -I | tr ' ' '\n' | grep -qxF "$PRIMARY_HOST"; then
    echo "$(date): This script should only run on the primary server" >> "$LOG_FILE"
    exit 1
fi

# Check secondary server connectivity
if ! ping -c 1 -W 2 "$SECONDARY_HOST" > /dev/null 2>&1; then
    echo "$(date): Secondary server unreachable" >> "$LOG_FILE"
    exit 1
fi

# Sync certificates using rsync. D755,F644 keeps the directory traversable;
# a bare --chmod=644 would also strip the execute bit from directories.
if rsync -avz --delete --timeout=30 --perms --chmod=D755,F644 "$CERT_DIR/" "admin@$SECONDARY_HOST:$CERT_DIR/" >> "$LOG_FILE" 2>&1; then
    echo "$(date): Certificate sync successful" >> "$LOG_FILE"

    # Fix ownership on the secondary
    ssh "admin@$SECONDARY_HOST" "sudo chown root:haproxy $CERT_DIR/*.pem 2>/dev/null" >> "$LOG_FILE" 2>&1

    # Reload HAProxy only if the configuration validates with the new certificates
    if ssh "admin@$SECONDARY_HOST" "sudo haproxy -f /etc/haproxy/haproxy.cfg -c" >> "$LOG_FILE" 2>&1; then
        ssh "admin@$SECONDARY_HOST" "sudo systemctl reload haproxy" >> "$LOG_FILE" 2>&1 \
            && echo "$(date): HAProxy reloaded successfully" >> "$LOG_FILE"
    else
        echo "$(date): HAProxy config check failed on secondary, reload skipped" >> "$LOG_FILE"
    fi
else
    echo "$(date): Certificate sync failed" >> "$LOG_FILE"
    exit 1
fi

Automated the sync with cron (every 5 minutes):

*/5 * * * * /usr/local/bin/sync-certs.sh
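As written, the sync script reloads HAProxy on the secondary after every successful run, so every five minutes, whether or not a certificate changed. One way to avoid needless reloads is to fingerprint the certificate directory and skip the sync when nothing has changed since the last successful run. This is only a sketch: cert_fingerprint and the state file path are hypothetical, and it assumes GNU coreutils and findutils.

```shell
# cert_fingerprint DIR: print one SHA-256 covering the names and contents of
# every .pem file directly inside DIR (hypothetical helper).
cert_fingerprint() {
    [ -d "$1" ] || return 1
    (
        cd "$1" || exit 1
        find . -maxdepth 1 -type f -name '*.pem' -print0 \
            | sort -z \
            | xargs -0 -r sha256sum
    ) | sha256sum | awk '{print $1}'
}

# Example use near the top of the sync script (state file path is assumed):
#   new=$(cert_fingerprint /etc/haproxy/certs)
#   old=$(cat /var/tmp/cert-sync.fingerprint 2>/dev/null)
#   [ "$new" = "$old" ] && exit 0      # nothing changed, skip sync and reload
#   ... rsync, validate, reload ...
#   echo "$new" > /var/tmp/cert-sync.fingerprint   # only after a successful sync
```

Writing the fingerprint only after a successful sync means a failed run is retried on the next cron tick.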

Step 6: SSH and Sudo Configuration

The certificate sync required passwordless SSH and sudo access. I configured:

  1. SSH key authentication between servers
  2. Sudoers configuration on the secondary server for specific commands:
admin ALL=(ALL) NOPASSWD: /usr/bin/rsync, /bin/chown, /bin/chmod, /usr/sbin/haproxy, /bin/systemctl reload haproxy, /bin/systemctl status haproxy

Step 7: Permission Troubleshooting

Initially, certificate sync failed due to permission issues. The solution required:

# Fix directory ownership and permissions
sudo chown -R root:haproxy /etc/haproxy/certs/
sudo chmod 755 /etc/haproxy/certs/
sudo chmod 644 /etc/haproxy/certs/*.pem

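This allows HAProxy to read the certificates. Keep in mind that HAProxy PEM bundles normally contain the private key as well, so mode 644 leaves the key readable by every local user; where the sync user's access allows it, 640 with root:haproxy ownership is the tighter choice.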
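Permission drift between the two servers is easier to catch with a small audit than by waiting for a failed reload. The following sketch lists certificate files whose mode differs from the expected one; perm_mismatches is a hypothetical helper and relies on GNU stat.

```shell
# perm_mismatches DIR MODE: list .pem files in DIR whose octal mode is not MODE.
# Prints nothing when every file matches. Relies on GNU stat (-c).
perm_mismatches() {
    local dir="$1" want="$2" f mode
    for f in "$dir"/*.pem; do
        [ -e "$f" ] || continue          # glob matched nothing
        mode=$(stat -c '%a' "$f")
        [ "$mode" = "$want" ] || echo "$f has mode $mode, expected $want"
    done
}

# Example, run on each server:
#   perm_mismatches /etc/haproxy/certs 644
```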

Critical Configuration Issues and Solutions

Problem 1: Failover Not Triggering

Initial testing revealed that when HAProxy failed on the primary server, the VIP didn't transfer to the secondary server. The issue was with priority weighting:

Original configuration:

  • Primary healthy: 110 + 2 = 112
  • Primary failed: 110 + 0 = 110
  • Secondary healthy: 100 + 2 = 102

Even with HAProxy failed, the primary (110) still had higher priority than the secondary (102).

Solution: Changed weight 2 to weight -15:

  • Primary healthy: 110 priority
  • Primary failed: 110 - 15 = 95 priority
  • Secondary healthy: 100 priority

Now when the primary fails (95), the secondary (100) correctly takes over.
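The arithmetic above can be captured in a few lines of shell, which makes it easy to sanity-check a priority and weight combination before deploying it. This is a simplified model of the rule described in this section, not keepalived's own code, and effective_priority is a hypothetical name.

```shell
# effective_priority BASE WEIGHT STATUS
#   STATUS is "ok" when the tracked script succeeds, "fail" otherwise.
#   Models the rule described above: a positive weight is added while the
#   check succeeds; a negative weight is subtracted while it fails.
effective_priority() {
    local base="$1" weight="$2" status="$3"
    if [ "$weight" -gt 0 ] && [ "$status" = "ok" ]; then
        echo $((base + weight))
    elif [ "$weight" -lt 0 ] && [ "$status" = "fail" ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

# Original configuration (weight 2):
#   effective_priority 110 2 fail    # prints 110, still above the secondary's 102
#   effective_priority 100 2 ok      # prints 102
# Fixed configuration (weight -15):
#   effective_priority 110 -15 fail  # prints 95, now below the secondary's 100
#   effective_priority 100 -15 ok    # prints 100
```

For failover to trigger, the failed primary's result must be lower than the healthy secondary's, which requires the magnitude of the negative weight to exceed the gap between the two base priorities (here, more than 10).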

Problem 2: HAProxy Version Compatibility

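HAProxy 2.8.5 reported validation errors on some directives I had initially added:

  • expose-fd listeners in stats socket
  • server-state-file in global section
  • load-server-state-from-file in defaults

Removing these options resolved the configuration validation errors. All three directives have existed since the HAProxy 1.x series, so the errors probably came from how I had written or placed them rather than from missing support in 2.8.5. None of them is required for keepalived-based failover.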

Testing the Implementation

Failover Test Results

Before failure:

  • VIP on primary server (10.0.1.10)
  • Services accessible via 10.0.1.100

During primary server failure:

  • HAProxy stopped on primary
  • VIP automatically moved to secondary server (10.0.1.11)
  • Services remained accessible with no downtime
  • Keepalived logs showed priority changes and master election

After primary recovery:

  • HAProxy restarted on primary
  • VIP returned to primary server (preemption due to higher base priority)
  • Services continued operating normally
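To put a number on how long services are actually unreachable during a failover, the VIP can be probed in a loop while HAProxy is stopped on the primary. The sketch below counts failed probes; measure_outage is a hypothetical helper, and the fractional sleep in the example assumes GNU sleep.

```shell
# measure_outage COUNT DELAY CMD...: run CMD COUNT times, DELAY seconds apart,
# and report how many runs failed (hypothetical helper).
measure_outage() {
    local count="$1" delay="$2" failed=0 i=0
    shift 2
    while [ "$i" -lt "$count" ]; do
        if ! "$@" > /dev/null 2>&1; then
            failed=$((failed + 1))
            echo "$(date '+%H:%M:%S') probe failed" >&2
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "$failed/$count probes failed"
}

# Example: probe the VIP twice a second for about 30 seconds while stopping
# HAProxy on the primary in another terminal.
#   measure_outage 60 0.5 curl -ksf --max-time 1 -o /dev/null \
#       -H "Host: example.com" https://10.0.1.100/
```

With the check settings used here (fall 3, interval 2), a handful of failed probes around the moment of failure would be expected before the VIP moves.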

Verification Commands

# Check VIP assignment (-F treats the dots literally; the trailing slash avoids partial matches)
ip -4 addr show | grep -F '10.0.1.100/'

# Monitor keepalived status
sudo journalctl -u keepalived -f

# Test service accessibility
curl -k -H "Host: example.com" https://10.0.1.100/ -I
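For scripted checks, an exact address match is more robust than grep, since a plain substring search for 10.0.1.10 would also match 10.0.1.100. The following sketch parses the one-line output format of ip; has_vip is a hypothetical helper.

```shell
# has_vip VIP: read `ip -4 -o addr show` output on stdin and succeed if any
# interface carries exactly VIP (hypothetical helper).
has_vip() {
    awk -v vip="$1" '{ split($4, a, "/"); if (a[1] == vip) found = 1 }
                     END { exit(found ? 0 : 1) }'
}

# Example, run on either server:
#   if ip -4 -o addr show | has_vip 10.0.1.100; then
#       echo "this node holds the VIP"
#   fi
```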

Performance and Reliability Outcomes

The implementation achieved:

  • Zero-downtime failover: Services remain accessible during server failures
  • Automatic recovery: No manual intervention required for failover or failback
  • Certificate synchronization: SSL certificates automatically synced every 5 minutes
  • Health monitoring: keepalived checks every 2 seconds that the HAProxy process is running
  • Service continuity: All 18 services across multiple domains maintained availability

Network Configuration Strategy

Rather than updating DNS records, I implemented the migration at the network level: I changed the existing port forwarding rules on the public IP so that they point to the new VIP (10.0.1.100) instead of the original HAProxy server IP (10.0.1.10).

This method provides several advantages:

  • Immediate failover capability without DNS propagation delays
  • No dependency on DNS TTL settings or caching
  • Simplified rollback if issues arise during implementation
  • Transparent to existing monitoring and health check systems

Lessons Learned

  1. Priority calculation matters: Understanding how keepalived calculates effective priority is crucial for proper failover behavior.
  2. Permission management is critical: SSL certificate synchronization requires careful attention to file ownership and permissions across systems.
  3. Version compatibility: Always verify that configuration directives are supported in your specific software versions.
  4. Testing is essential: Theoretical failover and actual failover behavior can differ significantly.
  5. Monitoring and logging: Comprehensive logging of both keepalived and the sync processes is vital for troubleshooting.

Maintenance Considerations

Planned maintenance can be performed by temporarily lowering the priority of the server being maintained, forcing failover to the secondary server:

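# Force failover for maintenance: drop the primary's priority below the secondary's (100)
sudo sed -i 's/^\([[:space:]]*priority\) 110$/\1 90/' /etc/keepalived/keepalived.conf
sudo systemctl reload keepalived
# After maintenance, restore the original priority
sudo sed -i 's/^\([[:space:]]*priority\) 90$/\1 110/' /etc/keepalived/keepalived.conf
sudo systemctl reload keepalived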

Certificate updates are handled automatically by the sync script, requiring certificates to be placed only on the primary server.

Conclusion

Migrating from a single HAProxy instance to a highly available configuration significantly improved infrastructure resilience. The implementation provides automatic failover with zero downtime, continuous health monitoring and simplified certificate management.

The key success factors were:

  • Proper priority weighting configuration
  • Automated certificate synchronization
  • Comprehensive testing of failover scenarios
  • Careful attention to permissions and authentication

This setup now provides enterprise-grade availability for all services while maintaining the simplicity of the original HAProxy configuration.