Redis RDB Persistence: Snapshot Principles, Configuration and Trade-offs

This is article 46 in the Big Data series. It analyzes how the Redis RDB persistence mechanism works, its core configuration, and production practices.

What is RDB Persistence

RDB (Redis Database) is Redis’s default persistence method. It is essentially a snapshot: at a specific moment, all data in memory is serialized into a compact binary file and saved to disk. When a background save is triggered, Redis forks a child process. The child writes the in-memory data to a temporary file, then atomically renames it over the old file. Apart from the fork call itself, which blocks briefly, the parent process keeps handling requests throughout.

Trigger Methods

Automatic triggers:

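Save rules in redis.conf (time window + change-count threshold): when a rule is satisfied, Redis starts a background save
Normal shutdown: Redis performs a final blocking save before exiting, provided at least one save rule is configured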

Manual triggers:

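SAVE: Synchronous execution in the main thread; blocks all client requests until the file is written, so use it with caution in production
BGSAVE: Asynchronous execution in a forked child; the server keeps serving requests, apart from the short pause during fork
FLUSHALL: Clears all databases and, when save rules are configured, then rewrites the RDB file with the now-empty dataset (so the old snapshot is not preserved)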

Special scenarios:

During a master-replica full sync, the master automatically runs a background save and transfers the resulting RDB to the replica
When AOF is not enabled, Redis loads the RDB file to recover data on restart; when AOF is enabled, the AOF is loaded instead

Core Configuration Parameters

# Disable RDB snapshots (use this line on its own, instead of the save rules below)
save ""

# Trigger condition: at least N changes (write operations) within the time window (seconds)
# Comments sit on their own lines because redis.conf does not accept trailing comments
# At least 1 change within 15 minutes
save 900 1
# At least 10 changes within 5 minutes
save 300 10
# At least 10000 changes within 1 minute
save 60 10000

# Filename and storage directory
dbfilename dump.rdb
dir /var/lib/redis

# Enable LZF compression of string values (default yes; the saving depends on how compressible the data is, at some CPU cost)
rdbcompression yes

# Write a CRC64 checksum at the end of the file (default yes; lets Redis detect a corrupted file at load time)
rdbchecksum yes
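
To make the save rule semantics concrete, here is a minimal Python sketch of how such rules could be parsed and evaluated. It is an illustration rather than Redis's actual code: the names `parse_save_rules` and `should_snapshot` are hypothetical, and the comparison (enough changes, and more than the configured number of seconds since the last successful save) is my reading of the behavior described above.

```python
from typing import List, Tuple

# Each rule is (seconds, changes), mirroring one "save <seconds> <changes>" line.
SaveRule = Tuple[int, int]


def parse_save_rules(conf_text: str) -> List[SaveRule]:
    """Collect save rules from redis.conf-style text.

    A line of the form `save ""` clears every rule seen so far.
    """
    rules: List[SaveRule] = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        parts = line.split()
        if parts[0].lower() != "save":
            continue  # not a save directive
        args = parts[1:]
        if args == ['""']:
            rules.clear()
            continue
        # Arguments come in (seconds, changes) pairs.
        for i in range(0, len(args) - 1, 2):
            rules.append((int(args[i]), int(args[i + 1])))
    return rules


def should_snapshot(rules: List[SaveRule],
                    changes_since_last_save: int,
                    seconds_since_last_save: int) -> bool:
    """True when at least one rule has both of its thresholds met."""
    return any(
        changes_since_last_save >= changes and seconds_since_last_save > seconds
        for seconds, changes in rules
    )


if __name__ == "__main__":
    rules = parse_save_rules("save 900 1\nsave 300 10\nsave 60 10000\n")
    # One change, ten minutes after the last snapshot: no rule fires yet.
    print(should_snapshot(rules, changes_since_last_save=1,
                          seconds_since_last_save=600))
```

Under these rules a single write made right after a snapshot is not persisted until the 900-second rule fires, which is worth keeping in mind when estimating the worst-case data-loss window.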

BGSAVE Execution Flow

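Parent process check: If a child process is already running a persistence task (an RDB save or an AOF rewrite), the command returns an error instead of starting a second one
Fork child process: The fork call itself blocks the parent briefly (typically milliseconds); the pause grows with memory size, so watch this duration on large instances
Parent process recovery: Once fork returns, the parent immediately resumes handling client requests
Child process write: The child serializes the dataset as it existed at fork time to a temporary .rdb file, relying on the operating system's Copy-on-Write to keep that view stable
Atomic replacement: After the child finishes writing, the temporary file is renamed to dump.rdb, so readers see either the old file or the complete new one
Notify parent: The child exits, and the parent records the result and updates its persistence statistics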

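Copy-on-Write ensures that after fork, the parent process’s writes do not change the memory pages the child is reading: the kernel copies a page only when the parent modifies it. This gives snapshot semantics while keeping the impact on the main thread small. The cost is extra memory, since every page the parent writes during the save is duplicated, so a write-heavy workload can noticeably raise memory usage while a snapshot is in progress.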
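
The flow above can be imitated in a few lines of Python. This is only a sketch under stated assumptions: it uses `os.fork`, so it runs on POSIX systems only, JSON stands in for the real RDB encoding, and the function names `bgsave_sketch` and `demo` are hypothetical. What it does show faithfully is the pattern of fork, write to a temporary file, and atomic rename, plus the fact that writes made by the parent after the fork do not appear in the snapshot.

```python
import json
import os
import tempfile


def bgsave_sketch(data: dict, target_path: str) -> int:
    """Fork a child that dumps `data` to a temp file, then renames it into place.

    Returns the child's PID; the caller is responsible for waiting on it.
    """
    pid = os.fork()
    if pid == 0:
        # Child: its view of `data` is frozen at the moment of fork.
        status = 1
        try:
            directory = os.path.dirname(target_path) or "."
            tmp_path = os.path.join(directory, f"temp-{os.getpid()}.rdb")
            with open(tmp_path, "w") as f:
                json.dump(data, f)
                f.flush()
                os.fsync(f.fileno())  # make sure the bytes reach the disk
            os.replace(tmp_path, target_path)  # atomic rename on POSIX
            status = 0
        finally:
            os._exit(status)  # never fall back into the parent's code path
    return pid


def demo() -> dict:
    """Run one background save while the parent keeps mutating the dataset."""
    data = {"counter": 1}
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "dump.json")
        pid = bgsave_sketch(data, target)
        # Parent keeps writing; these changes are not part of the snapshot.
        data["counter"] = 2
        data["added_after_fork"] = True
        _, status = os.waitpid(pid, 0)
        if not (os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0):
            raise RuntimeError("background save failed")
        with open(target) as f:
            return json.load(f)


if __name__ == "__main__":
    print(demo())  # prints {'counter': 1}
```

The snapshot holds the pre-fork value regardless of how the two processes are scheduled, because the child works on its own copy-on-write view of memory.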

RDB File Structure

Field	Description
5-byte magic number	Fixed value `"REDIS"`
4-byte version number	RDB format version
Auxiliary fields	Metadata like Redis version, creation time
Database number + size	Identifies which database stores the data
Expiration information	Expiration timestamp for each key
Key-value pair data	Actual serialized key-value data
End marker	`0xFF`
CRC64 checksum	File integrity verification
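
As a rough illustration of the outer layout in the table, the following Python sketch reads only the envelope of a dump: the magic string, the version, the end marker, and the trailing checksum bytes. It assumes a standalone RDB file in which the version is stored as four ASCII digits and the last 8 bytes hold the checksum in little-endian order. It does not decode key-value data or verify the CRC64, and the names `RdbEnvelope` and `read_rdb_envelope` are hypothetical.

```python
import struct
from typing import NamedTuple


class RdbEnvelope(NamedTuple):
    version: int          # RDB format version parsed from the header
    has_eof_marker: bool  # True if 0xFF sits right before the checksum
    checksum: int         # raw 8-byte trailer; 0 when checksums are disabled


def read_rdb_envelope(blob: bytes) -> RdbEnvelope:
    """Parse the header and trailer of an RDB dump held in memory."""
    # 5 bytes magic + 4 bytes version + 1 byte end marker + 8 bytes checksum
    if len(blob) < 5 + 4 + 1 + 8:
        raise ValueError("too short to be an RDB file")
    if blob[:5] != b"REDIS":
        raise ValueError("missing REDIS magic number")
    version_text = blob[5:9]
    if not version_text.isdigit():
        raise ValueError("version field is not four ASCII digits")
    version = int(version_text.decode("ascii"))
    has_eof_marker = blob[-9] == 0xFF
    (checksum,) = struct.unpack("<Q", blob[-8:])
    return RdbEnvelope(version, has_eof_marker, checksum)


if __name__ == "__main__":
    # Synthetic, empty dump: header, end marker, zeroed checksum.
    sample = b"REDIS0009" + b"\xff" + bytes(8)
    print(read_rdb_envelope(sample))
```

For a real file, the bytes would come from something like `open("dump.rdb", "rb").read()`; for large dumps, reading only the first 9 and last 9 bytes would be the more sensible variant.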

Pros and Cons Analysis

Advantages:

Compact binary file, further reduced by LZF compression when values are compressible (the often-quoted ~25% of original size is a rough figure that depends on the data), suitable for backup and migration
Serialization runs in a child process, so the parent’s main thread is barely affected apart from the fork itself
Restoring large datasets is fast, because the file is loaded directly rather than replayed command by command as with AOF

Disadvantages:

Data written after the last snapshot is lost on a crash; with the save rules shown above, that is up to about 5 minutes once 10 or more changes have accumulated, and up to 15 minutes on a lightly written instance
Fork on large memory instances (tens of GB) may take hundreds of milliseconds, causing a brief service stall
Cannot achieve real-time persistence, so it is not suitable on its own for scenarios that cannot tolerate losing recent writes

RDB vs AOF

Dimension	RDB	AOF
Persistence method	Data snapshot (result)	Operation log (process)
File size	Small (binary compression)	Large (text commands)
Recovery speed	Fast	Slow (needs replay)
Data safety	Low (lost within snapshot interval)	High (fsync every second/every command)
Write performance impact	Low (fork pause only)	Low with fsync every second (background fsync); higher with fsync on every command

Hybrid Persistence (Redis 4.0+)

aof-use-rdb-preamble yes

With hybrid persistence enabled, an AOF rewrite stores an RDB snapshot at the head of the AOF file, followed by the incremental commands in AOF format. The option only takes effect when AOF itself is enabled (appendonly yes). It retains RDB’s fast recovery while keeping AOF’s small data-loss window, and is generally the recommended configuration for production environments.
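
One quick way to see whether an AOF file carries an RDB preamble is to look at its first bytes: an RDB section starts with the magic string "REDIS", while a plain AOF starts with a command in the Redis protocol format. The Python sketch below does exactly that and nothing more. The function name `has_rdb_preamble` is hypothetical, and the path passed in is an assumption: newer Redis versions split the AOF into several files, so the file to inspect may differ by version.

```python
def has_rdb_preamble(path: str) -> bool:
    """Return True if the file at `path` begins with the RDB magic string."""
    with open(path, "rb") as f:
        return f.read(5) == b"REDIS"


if __name__ == "__main__":
    import os
    import tempfile

    with tempfile.TemporaryDirectory() as workdir:
        hybrid = os.path.join(workdir, "hybrid.aof")
        plain = os.path.join(workdir, "plain.aof")
        # Synthetic stand-ins: an RDB-style header vs. a protocol-encoded command.
        with open(hybrid, "wb") as f:
            f.write(b"REDIS0009" + b"\xff" + bytes(8))
        with open(plain, "wb") as f:
            f.write(b"*2\r\n$6\r\nSELECT\r\n$1\r\n0\r\n")
        print(has_rdb_preamble(hybrid), has_rdb_preamble(plain))
```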

Production Practice Recommendations

For large memory instances (> 8 GB), monitor fork duration closely (for example the latest_fork_usec field in INFO); use the latency monitor when necessary
Run BGSAVE for manual backups during business off-peak hours, and regularly upload the resulting dump.rdb to object storage
For scenarios requiring high data safety, enable AOF and use hybrid persistence mode
Keep rdbchecksum yes so that a file corrupted by a disk failure is detected at load time instead of being loaded silently
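
For the fork-duration monitoring mentioned above, a small check over the text returned by the INFO command is often enough to start with. The Python sketch below parses INFO-style output and flags two conditions using the `latest_fork_usec` and `rdb_last_bgsave_status` fields. The sample text is made up for illustration, the 100 ms threshold is an arbitrary assumption to tune per instance, and the function names `parse_info` and `persistence_warnings` are hypothetical.

```python
from typing import Dict, List


def parse_info(text: str) -> Dict[str, str]:
    """Turn INFO output ("key:value" lines, "# Section" headers) into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def persistence_warnings(info: Dict[str, str],
                         max_fork_usec: int = 100_000) -> List[str]:
    """Return human-readable warnings about fork time and the last BGSAVE."""
    warnings: List[str] = []
    fork_usec = int(info.get("latest_fork_usec", "0"))
    if fork_usec > max_fork_usec:
        warnings.append(f"fork took {fork_usec / 1000:.1f} ms")
    if info.get("rdb_last_bgsave_status", "ok") != "ok":
        warnings.append("last BGSAVE failed")
    return warnings


# Illustrative sample only; real INFO output contains many more fields.
SAMPLE_INFO = (
    "# Persistence\r\n"
    "rdb_changes_since_last_save:42\r\n"
    "rdb_last_bgsave_status:ok\r\n"
    "\r\n"
    "# Stats\r\n"
    "latest_fork_usec:250000\r\n"
)

if __name__ == "__main__":
    print(persistence_warnings(parse_info(SAMPLE_INFO)))
    # prints ['fork took 250.0 ms']
```

In practice the text would come from `redis-cli INFO` or a client library, and the warnings would feed whatever alerting system is already in place.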