This is article 51 in the Big Data series, covering Redis high availability architecture: master-slave replication, Sentinel mode, and distributed lock design.

High Availability Basics

High Availability (HA) refers to a system’s ability to stay operational while minimizing unplanned downtime, typically measured in “nines”:

| SLA | Max Annual Downtime |
| --- | --- |
| 99.9% (3 nines) | ~8.76 hours |
| 99.99% (4 nines) | ~52.6 minutes |
| 99.999% (5 nines) | ~5.26 minutes |
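The figures in the table follow directly from the SLA percentage. A minimal sketch (ignoring leap years):

```python
HOURS_PER_YEAR = 365 * 24  # 8760; ignores leap years

def max_annual_downtime_hours(sla_percent: float) -> float:
    """Maximum allowed downtime per year, in hours, for a given SLA."""
    return HOURS_PER_YEAR * (1 - sla_percent / 100)

# 99.9% ("3 nines") allows roughly 8.76 hours of downtime per year
assert round(max_annual_downtime_hours(99.9), 2) == 8.76
```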

From a CAP-theorem perspective, Redis HA solutions prioritize availability (A) and partition tolerance (P), so brief data inconsistency is possible during extreme network partitions.

Four pillars of HA: redundancy, failure detection, automatic recovery, and load balancing.

Redis Master-Slave Replication

Master-slave replication is the foundation of Redis HA. Slave nodes sync data from the master to provide read scaling and data redundancy.

Configuration

Add to slave’s redis.conf:

replicaof 192.168.1.100 6379

Or dynamically at runtime:

REPLICAOF 192.168.1.100 6379

Three Sync Modes

1. Full Sync

Triggered on the first connection, or when the slave has fallen too far behind for the master's replication backlog to cover the gap:

  1. Slave sends PSYNC ? -1 requesting full sync
  2. Master executes BGSAVE to generate RDB snapshot
  3. Master transfers RDB file to slave
  4. Slave loads RDB, meanwhile master sends buffered write commands
  5. Enter incremental sync mode
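The master's choice between full and partial resynchronization can be sketched as pure logic. This is a simplified model with illustrative parameter names; real Redis also tracks a secondary replication ID from failover history:

```python
def psync_response(slave_replid: str, slave_offset: int,
                   master_replid: str, backlog_start: int, backlog_end: int) -> str:
    """Decide between full and partial resync for a (re)connecting slave."""
    # "?" / -1 is what a brand-new slave sends: always a full sync.
    if slave_replid == "?" or slave_offset == -1:
        return "FULLRESYNC"
    # Replication ID mismatch: the slave followed a different master history.
    if slave_replid != master_replid:
        return "FULLRESYNC"
    # The slave's offset must still be covered by the in-memory backlog.
    if backlog_start <= slave_offset <= backlog_end:
        return "CONTINUE"  # partial resync: replay only the missed commands
    return "FULLRESYNC"
```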

2. Incremental Sync

During normal operation, master sends each write command to all slaves in real-time.

3. Heartbeat

The slave sends REPLCONF ACK <offset> once per second, letting the master track each slave's health and replication offset.
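These ACKs also power Redis's min-replicas-to-write / min-replicas-max-lag safety options: the master can refuse writes when too few replicas have acknowledged recently. A minimal sketch of that check (the two threshold values below are illustrative, not defaults):

```python
MIN_REPLICAS_TO_WRITE = 1   # illustrative value for min-replicas-to-write
MIN_REPLICAS_MAX_LAG = 10   # illustrative value for min-replicas-max-lag, seconds

def writes_allowed(ack_lags: list[float]) -> bool:
    """ack_lags: seconds since each replica's last REPLCONF ACK."""
    healthy = sum(1 for lag in ack_lags if lag <= MIN_REPLICAS_MAX_LAG)
    return healthy >= MIN_REPLICAS_TO_WRITE
```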

Common Topologies

  • One Master, Multiple Slaves: most common; the master handles writes while slaves serve reads
  • Chain Replication: a slave has its own sub-slaves, offloading replication traffic from the master

Redis Sentinel Mode

Sentinel mode adds automatic failover capability on top of master-slave replication. Sentinels are independent processes that continuously monitor Redis instance health.

Sentinel Core Responsibilities

| Responsibility | Description |
| --- | --- |
| Monitoring | Continuously checks master/slave health via PING |
| Notification | Notifies admins or other programs via API when anomalies occur |
| Failover | Automatically elects a new master and reconfigures slaves when the master fails |
| Config Provider | Clients obtain the current master address through Sentinel |

Failure Detection & Failover Flow

  1. A single sentinel detects the master is unresponsive and marks it Subjectively Down (SDOWN): that sentinel alone suspects failure
  2. The sentinel asks its peers to vote; once at least quorum sentinels agree, the master is marked Objectively Down (ODOWN)
  3. A Raft-like election chooses a leader sentinel to run the failover
  4. The leader sentinel executes the failover:
     1. Select the best candidate among the slaves (replica priority, replication offset, run ID)
     2. Promote the candidate to master (REPLICAOF NO ONE)
     3. Reconfigure the remaining slaves to replicate from the new master
     4. Rewrite configuration files and publish the new master address
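The slave-selection rule in the failover's first step can be sketched as a sort. This is a simplified model with illustrative field names; real Sentinel first filters out disconnected or long-stale replicas:

```python
from dataclasses import dataclass

@dataclass
class Replica:
    replica_priority: int  # lower wins; 0 means "never promote"
    repl_offset: int       # higher wins: most up-to-date data
    run_id: str            # lexicographically smaller wins as the final tiebreak

def select_new_master(replicas: list[Replica]) -> Replica:
    """Pick the promotion candidate among eligible replicas."""
    candidates = [r for r in replicas if r.replica_priority > 0]
    return min(candidates,
               key=lambda r: (r.replica_priority, -r.repl_offset, r.run_id))
```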

Docker Deployment Example

docker-compose.yml for 1 master + 2 slaves + 3 sentinels:

version: '3'
services:
  redis-master:
    image: redis:7
    ports:
      - "6379:6379"

  redis-slave-1:
    image: redis:7
    ports:
      - "6380:6379"
    command: redis-server --replicaof redis-master 6379
    depends_on:
      - redis-master

  redis-slave-2:
    image: redis:7
    ports:
      - "6381:6379"
    command: redis-server --replicaof redis-master 6379
    depends_on:
      - redis-master

  # Each sentinel mounts its own copy of sentinel.conf: Sentinel rewrites
  # its config file at runtime, so the file cannot be shared or read-only.
  sentinel-1:
    image: redis:7
    ports:
      - "26379:26379"
    volumes:
      - ./sentinel-1.conf:/etc/sentinel.conf
    command: redis-sentinel /etc/sentinel.conf
    depends_on:
      - redis-master
      - redis-slave-1
      - redis-slave-2

  sentinel-2:
    image: redis:7
    ports:
      - "26380:26379"
    volumes:
      - ./sentinel-2.conf:/etc/sentinel.conf
    command: redis-sentinel /etc/sentinel.conf
    depends_on:
      - redis-master

  sentinel-3:
    image: redis:7
    ports:
      - "26381:26379"
    volumes:
      - ./sentinel-3.conf:/etc/sentinel.conf
    command: redis-sentinel /etc/sentinel.conf
    depends_on:
      - redis-master

Sentinel config sentinel.conf key parameters:

# Monitor the master named mymaster; quorum=2 means 2 sentinels must agree to declare ODOWN
sentinel monitor mymaster redis-master 6379 2
# Redis >= 6.2: allow hostnames (e.g. Docker service names) instead of IPs
sentinel resolve-hostnames yes
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
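Note that quorum only controls ODOWN detection; authorizing the actual failover always requires a strict majority of all sentinels. A quick arithmetic illustration:

```python
def failover_majority(total_sentinels: int) -> int:
    """Votes needed to authorize a failover: strict majority of all sentinels."""
    return total_sentinels // 2 + 1

# With 3 sentinels and quorum=2: 2 sentinels suffice to declare ODOWN,
# and 2 (the majority of 3) must also agree before failover proceeds.
assert failover_majority(3) == 2
assert failover_majority(5) == 3
```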

Distributed Lock Basics

Redis distributed lock core command:

SET lock_key unique_value NX PX 30000
  • NX: Set only if key doesn’t exist (atomic lock acquisition)
  • PX 30000: 30 second TTL (prevents deadlocks)
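To make these semantics concrete, here is a minimal in-memory simulation of SET key value NX PX plus the token-checked release. It is illustrative only: it models a single Redis node with a Python dict, not a real client:

```python
import time
import uuid

class FakeRedisLock:
    """In-memory model of `SET key value NX PX ttl` and safe release."""
    def __init__(self):
        self._store = {}  # key -> (token, expires_at)

    def acquire(self, key: str, ttl_ms: int):
        """Returns a unique token on success, None if the lock is held."""
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is None or entry[1] <= now:  # absent or expired: NX succeeds
            token = uuid.uuid4().hex
            self._store[key] = (token, now + ttl_ms / 1000)
            return token
        return None

    def release(self, key: str, token: str) -> bool:
        """Delete the key only if it still holds our token."""
        entry = self._store.get(key)
        if entry and entry[0] == token:
            del self._store[key]
            return True
        return False
```

In a real deployment the check-and-delete in release must run atomically on the server, via the Lua script below, because a client-side GET followed by DEL races with key expiry and other acquirers.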

Release requires verifying value belongs to current holder, typically using Lua script for atomicity:

if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end

In Sentinel mode there is a lock-loss risk during master failover: a write may succeed on the master but not yet be replicated to the slave that gets promoted. Production systems should evaluate the Redlock algorithm or a strongly consistent coordinator such as ZooKeeper.

Summary

  • Master-slave replication provides data redundancy and read scaling, foundation of HA
  • Sentinel mode adds automatic failover on top of master-slave, improving availability
  • Distributed locks rely on Redis's atomic operations, but edge cases during HA failover require careful handling
  • Recommended for production: at least 3 sentinel nodes, with quorum set to (sentinel_count / 2) + 1