This is article 49 in the Big Data series. This article systematically explains five classic Redis problems in high-concurrency scenarios and their solutions.
Full illustrated version (with screenshots): CSDN Original | Juejin
Cache Penetration
Problem Description
Requested data exists in neither the cache nor the database, so every request bypasses the cache and hits the database directly, generating useless query load. Common in malicious attacks (massive requests using random or invalid IDs).
Solutions
Solution 1: Cache Null Values
When a database query returns empty, write a null or empty-marker value to the cache with a short TTL (30-60 seconds) to prevent repeated hits on the database for the same key.
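The pattern above can be sketched as follows. This is a minimal in-JVM illustration: a `ConcurrentHashMap` stands in for Redis (which is also why a sentinel string is needed, since `ConcurrentHashMap` rejects null values), and the TTL on the marker is omitted; the class and names are hypothetical, not from the original article.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Sketch of the cache-null-values pattern, with a map standing in for Redis. */
class NullCachingLoader {
    private static final String NULL_MARKER = "__NULL__"; // sentinel meaning "absent in DB"
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> db; // stand-in for the database query

    NullCachingLoader(Function<String, String> db) { this.db = db; }

    public String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            // Cache hit -- possibly a cached miss (the null marker)
            return NULL_MARKER.equals(cached) ? null : cached;
        }
        String value = db.apply(key);
        // Real code would give the marker a short TTL (30-60 s); omitted here.
        cache.put(key, value == null ? NULL_MARKER : value);
        return value;
    }
}
```

Repeated lookups of a non-existent key now hit the cached marker instead of the database.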
Solution 2: Bloom Filter
Deploy a Bloom filter in front of the cache layer and preload all valid keys into it. When a request arrives, query the Bloom filter first; keys it reports as non-existent are rejected immediately without touching the cache or the database.
Bloom filter principle: uses a bit array of length m and k independent hash functions. When writing an element, set its k hashed positions to 1; when querying, if any position is 0 the element definitely does not exist, while if all are 1 it probably exists (there is a false positive rate).
False positive rate: p ≈ (1 - e^(-kn/m))^k for n inserted elements; it can be reduced by increasing m, and for fixed m and n the optimal number of hash functions is k = (m/n)·ln 2.
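The principle can be sketched in a few lines. This is an illustrative implementation only (a production system would use Redis modules or a library such as Guava's `BloomFilter`); it derives the k positions from two base hashes, a common double-hashing trick, and all names here are hypothetical.

```java
import java.util.BitSet;

/** Minimal Bloom filter sketch: m-bit array, k hash positions per element. */
class SimpleBloomFilter {
    private final BitSet bits;
    private final int m;   // bit array length
    private final int k;   // number of hash functions

    SimpleBloomFilter(int m, int k) {
        this.bits = new BitSet(m);
        this.m = m;
        this.k = k;
    }

    // Derive the i-th position from two base hashes (double hashing).
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, m);
    }

    public void add(String key) {
        for (int i = 0; i < k; i++) bits.set(position(key, i));
    }

    /** false => definitely absent; true => probably present. */
    public boolean mightContain(String key) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(position(key, i))) return false;
        }
        return true;
    }
}
```

Note that there are no false negatives: a key that was added always reports present.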
Solution 3: Parameter Validation + Rate Limiting
Enforce strict validation of request parameter legality in business layer, combine with interface rate limiting and abnormal request monitoring to intercept illegal requests in advance.
Cache Avalanche
Problem Description
A large number of keys expire at the same moment, or the Redis cluster goes down, so massive request volume penetrates to the database all at once; in the extreme case this triggers a cascading system collapse.
Solutions
Solution 1: Stagger Expiration Times
Avoid setting the same TTL when writing in batches; add a random offset to the base time:

```java
// Base TTL plus a random 0-300 s jitter so keys do not all expire together
int expireTime = baseTime + ThreadLocalRandom.current().nextInt(0, 300);
redisTemplate.expire(key, expireTime, TimeUnit.SECONDS);
```
Solution 2: Multi-Level Cache
Build three levels of protection: local cache (Caffeine/Guava) → distributed cache (Redis) → database. When Redis fails, the local cache can serve as a fallback.
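The read path through the three levels can be sketched as below. Plain maps stand in for both Caffeine and the Redis client here; real code would use those libraries, and the class and field names are hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of a three-level read path: local cache -> Redis -> database. */
class TieredCache {
    private final Map<String, String> local = new ConcurrentHashMap<>();
    private final Map<String, String> redis;    // stand-in for the Redis client
    private final Map<String, String> database; // stand-in for the DB

    TieredCache(Map<String, String> redis, Map<String, String> database) {
        this.redis = redis;
        this.database = database;
    }

    public Optional<String> get(String key) {
        String v = local.get(key);                 // level 1: process memory
        if (v == null) {
            v = redis.get(key);                    // level 2: distributed cache
            if (v == null) {
                v = database.get(key);             // level 3: source of truth
                if (v != null) redis.put(key, v);  // backfill Redis
            }
            if (v != null) local.put(key, v);      // backfill local cache
        }
        return Optional.ofNullable(v);
    }
}
```

If the Redis map were replaced by a client that throws on outage, the same structure lets reads keep succeeding from the local level.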
Solution 3: High-Availability Architecture
- Deploy Redis Sentinel or Redis Cluster to avoid single point of failure
- Integrate circuit breaking and degradation in the business layer (e.g., Sentinel, Hystrix); when Redis is unavailable, degrade to the database or return fallback data
- Pre-warm critical data during off-peak hours
Cache Breakdown
Problem Description
At the moment a single hot key expires, massive concurrent requests experience cache miss simultaneously, all hit database to rebuild cache, creating “thundering herd” effect.
Solutions
Solution 1: Mutex Lock
On a cache miss, only one thread is allowed to acquire a distributed lock, query the database, and rebuild the cache; other threads wait or spin-retry, ensuring the database is queried only once.
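The single-rebuild guarantee can be sketched with an in-JVM lock and the double-check idiom; a real deployment would use a distributed lock (e.g., `SET key value NX EX ...`) instead of `ReentrantLock`, and these class/field names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Sketch of mutex-guarded cache rebuild: one thread queries the database,
 *  the others block on the lock and re-check the cache afterwards. */
class MutexRebuildCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final ReentrantLock lock = new ReentrantLock();
    final AtomicInteger dbQueries = new AtomicInteger(); // exposed for inspection
    private final Function<String, String> db;

    MutexRebuildCache(Function<String, String> db) { this.db = db; }

    public String get(String key) {
        String v = cache.get(key);
        if (v != null) return v;
        lock.lock();
        try {
            v = cache.get(key);        // double-check: another thread may have rebuilt
            if (v == null) {
                dbQueries.incrementAndGet();
                v = db.apply(key);     // only one thread reaches the database
                cache.put(key, v);
            }
            return v;
        } finally {
            lock.unlock();
        }
    }
}
```

The double-check after acquiring the lock is what prevents the waiting threads from repeating the database query once the first thread has finished.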
Solution 2: Never Expire (Logical Expiration)
The hot key has no physical expiration time; instead, a logical expiration timestamp is stored inside the value. A background task asynchronously refreshes the cache before the logical expiry, so users always read from the cache (and may briefly see stale data).
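A minimal sketch of logical expiration, assuming an in-memory map in place of Redis: the value carries its own expiry timestamp, and a stale read returns the old data immediately while kicking off a refresh. Production code would dispatch the refresh through a thread pool guarded by a lock so only one refresh runs; the names here are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of logical expiration: keys never expire physically. */
class LogicalExpiryCache {
    static class Entry {
        final String value;
        final long logicalExpireAt; // epoch millis
        Entry(String value, long logicalExpireAt) {
            this.value = value;
            this.logicalExpireAt = logicalExpireAt;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    public void put(String key, String value, long ttlMillis) {
        cache.put(key, new Entry(value, System.currentTimeMillis() + ttlMillis));
    }

    /** Always returns from cache; schedules a refresh if logically expired. */
    public String get(String key, Runnable refreshTask) {
        Entry e = cache.get(key);
        if (e == null) return null; // only before the initial pre-warm
        if (System.currentTimeMillis() > e.logicalExpireAt) {
            new Thread(refreshTask).start(); // async rebuild; pool + lock in real code
        }
        return e.value; // possibly stale, but the request never hits the DB
    }
}
```

Because `get` never blocks on the database, request latency stays flat even at the moment the data goes stale.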
Solution 3: Early Renewal
Monitor remaining TTL of key, actively refresh before expiration to fundamentally eliminate expiration gap.
Data Consistency Problem
Delayed Double Delete Strategy
1. Delete cache before updating database
2. Update database
3. Wait roughly 200 ms to 2 s (long enough for in-flight read requests from other threads to complete)
4. Delete the cache again (clears old data that a concurrent read may have written back)
5. Set reasonable TTL on cache as final fallback
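The five steps above can be sketched as follows, with maps standing in for Redis and the database and a hypothetical class name. The delay is what lets reads that loaded the old value finish before the second delete clears whatever they wrote back; step 5 (the fallback TTL) belongs to the read path and is omitted.

```java
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Sketch of delayed double delete; maps stand in for Redis and the DB. */
class DelayedDoubleDelete {
    private final Map<String, String> cache;
    private final Map<String, String> database;

    DelayedDoubleDelete(Map<String, String> cache, Map<String, String> database) {
        this.cache = cache;
        this.database = database;
    }

    public void update(String key, String newValue) {
        cache.remove(key);                // 1. delete cache
        database.put(key, newValue);      // 2. update database
        try {                             // 3. wait ~200 ms (tune to read latency)
            TimeUnit.MILLISECONDS.sleep(200);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        cache.remove(key);                // 4. delete cache again
        // 5. the TTL set by the read path remains the final fallback
    }
}
```

The wait duration is a trade-off: too short and a slow reader can still write stale data back after the second delete; too long and the window of inconsistency grows.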
A more thorough solution is to subscribe to the database binlog (e.g., with Canal) and precisely invalidate the corresponding cache entries when data-change events arrive, achieving near-real-time cache consistency.
Hot Key Problem
Problem Description
Massive requests concentrate on the same key, exceeding single Redis node’s network bandwidth or CPU processing capability, which may cause node crash and trigger avalanche.
Detection Methods
- Offline: `redis-cli --hotkeys` (based on LFU statistics; requires `maxmemory-policy` to be set to an LFU policy)
- Online: the `MONITOR` command captures request traffic (has a performance impact, use with caution)
- Stream computing: integrate Flink/Spark for real-time access-frequency statistics and write hot-key info to ZooKeeper to notify the application layer
Solutions
- Local cache fallback: Replicate hot key to application process memory (Caffeine), accept cost of brief data inconsistency
- Sharded reading: Replicate hot key to multiple Redis nodes (key_1, key_2…key_N), randomly select node when reading
- Rate limiting + circuit breaking: rate-limit keys with abnormally high access frequency and apply circuit breaking to protect the backend
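The sharded-reading idea can be sketched as below: the key-naming scheme (`key_1`…`key_N`) follows the article, while the class and method names are hypothetical. Writes must fan out to every replica to keep the copies consistent; reads pick one at random.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Sketch of sharded reads for a hot key replicated as key_1..key_N. */
class HotKeySharding {
    private final int replicas;

    HotKeySharding(int replicas) { this.replicas = replicas; }

    /** Replica key to read, e.g. "product:42" -> "product:42_3". */
    public String readKey(String hotKey) {
        int slot = ThreadLocalRandom.current().nextInt(1, replicas + 1);
        return hotKey + "_" + slot;
    }

    /** All replica keys a write must update to keep the copies consistent. */
    public String[] writeKeys(String hotKey) {
        String[] keys = new String[replicas];
        for (int i = 0; i < replicas; i++) keys[i] = hotKey + "_" + (i + 1);
        return keys;
    }
}
```

In a Redis Cluster the suffix changes the hash slot, so the replicas land on different nodes and the read load is spread across them.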
Big Key Problem
Problem Description
A single key's value is too large (e.g., a String over 10 KB, or a collection with more than 5,000 elements), which causes:
- Uneven memory distribution, affecting cluster data migration and rebalancing
- Long blocking times for read/write operations, increasing latency for other requests
- `DEL` on a big key blocks the main thread directly (it is a synchronous operation)
Detection Methods
- `redis-cli --bigkeys`: scans the entire keyspace (time-consuming at large data volumes)
- RDB file analysis tools (like rdbtools, redis-rdb-tools): offline analysis that does not affect production
Solutions
- Split big key: Split large String into multiple keys (e.g., sharded storage), split large Hash/List/Set by hash bucketing
- External storage: Store extra large values (images, documents, serialized objects) in MongoDB or CDN, only store reference ID in Redis
- Lazy deletion: use `UNLINK` instead of `DEL`; it deletes the big key asynchronously in the background, avoiding blocking the main thread

```
# Safely delete big key
UNLINK big_key_name
```
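The hash-bucketing split mentioned above can be sketched as follows: a field is routed to a sub-key derived from its hash, so no single Redis key holds all the elements. The sub-key format and the class name are hypothetical choices for illustration.

```java
/** Sketch of splitting a big Hash into buckets: each field goes to
 *  sub-key "<baseKey>:<bucket>" instead of one giant key. */
class HashBucketSplitter {
    private final int buckets;

    HashBucketSplitter(int buckets) { this.buckets = buckets; }

    /** Redis key that should store the given field, e.g. "user:fav:7". */
    public String subKey(String baseKey, String field) {
        int bucket = Math.floorMod(field.hashCode(), buckets);
        return baseKey + ":" + bucket;
    }
}
```

Reads and writes for a field always compute the same bucket, so a caller would do `HSET subKey(base, field) field value` instead of writing everything under one key.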
Summary
| Problem | Root Cause | Primary Solution |
|---|---|---|
| Cache Penetration | Request non-existent keys | Bloom filter |
| Cache Avalanche | Many keys expire simultaneously | Random TTL + multi-level cache |
| Cache Breakdown | Hot key expiration moment concurrency | Mutex lock / never expire |
| Hot Key | Traffic concentrated on single node | Local cache + sharded reading |
| Big Key | Single key data volume too large | Split + UNLINK delete |