TL;DR
- Scenario: Operating EVCache as a large-scale distributed cache requires understanding the memory management and eviction behavior of the underlying Memcached
- Conclusion: EVCache essentially relies on Memcached with Slab Allocation, Lazy Expiration, and an approximate LRU; performance is good, but there are engineering risks such as memory fragmentation and eviction jitter
- Output: A walkthrough of the Memcached architecture underneath EVCache, the Slab Class/Item structures, and the Lazy Expiration and LRU eviction logic
1. Memcached Architecture
Memcached is a high-performance distributed in-memory object caching system developed by Danga Interactive, originally for LiveJournal.
Core Features:
- C/S Architecture: Uses classic client-server model, server runs as daemon listening on port 11211
- libevent-based High-performance Event Processing: Uses libevent library for efficient event-driven model
- Multi-threading Support: A dispatcher thread accepts connections and hands them to worker threads, so requests are served concurrently
Workflow:
- Application first checks if desired data exists in Memcached
- If cache hits, return result directly
- If miss, query database
- Store database result in Memcached (with reasonable expiration time)
- Return data to application
2. Slab Allocation Memory Management Mechanism
Memcached manages memory with Slab Allocation, which avoids the fragmentation that general-purpose malloc/free would accumulate under constant allocation and freeing.
Core Principles:
- Memory Allocation Basics: memory is allocated in fixed-size Pages, 1MB per Page by default
- Memory Segmentation Mechanism: each Page is divided into Chunks of equal size; different Slab Classes use different Chunk sizes, which grow by a fixed growth factor (default 1.25)
- Data Storage Strategy: when storing data, Memcached selects the smallest Slab Class whose Chunks can hold the item
3. Data Item Structure
Each stored object is an item; items in the same slab class are chained into a doubly linked LRU list (the struct below is simplified — the real one also carries a hash-chain pointer and flags):

```c
typedef struct _stritem {
    struct _stritem *next;      /* next item in the LRU list (toward the tail) */
    struct _stritem *prev;      /* previous item in the LRU list (toward the head) */
    rel_time_t time;            /* last access time (drives LRU ordering) */
    rel_time_t exptime;         /* expiration time; 0 = never expires */
    int nbytes;                 /* length of the value area */
    unsigned short refcount;    /* reference count; nonzero = in use */
    uint8_t slabs_clsid;        /* slab class this item belongs to */
    uint8_t nkey;               /* key length */
} item;
```
4. Expiration Mechanism
4.1 Lazy Expiration
Memcached uses a lazy expiration strategy: it never scans for and proactively removes expired data; an item is only checked for expiration when a get operation touches it.
Advantages:
- Saves system resources: No need for extra background threads or scheduled tasks
- On-demand processing: Expiration check only when truly accessing data
4.2 LRU (Least Recently Used)
When the memory used by Memcached exceeds the configured maximum (-m), the LRU algorithm is triggered to reclaim space.
Eviction Execution Steps:
- The server scans backward from the tail of the corresponding slab class's LRU list
- It prefers items with refcount=0, which can be freed immediately
- If no refcount=0 item is found, an item whose last access time is more than 3 hours old is assumed to have a leaked reference count and is reclaimed anyway (this describes the classic LRU; recent Memcached versions default to a segmented hot/warm/cold LRU)
5. Error Quick Reference
| Symptom | Root Cause | Fix |
|---|---|---|
| Hit rate is normal but there are occasional latency spikes | -m configured too small, or the hot data set has grown | Increase the -m quota or split the cache by business domain |
| Some keys occasionally go missing | LRU eviction preferentially reclaims long-unaccessed items | Adjust cache granularity and access patterns |
| Memory usage looks low but new data cannot be inserted | Slab Allocation leaves some slab classes full while others waste space | Re-plan key/value sizes, or enable slab rebalancing |
| Expired keys still occupy memory long after their expiration time | Lazy Expiration only reclaims an item when it is accessed | For critical keys, trigger cleanup with a get, or delete them explicitly |
| High CAS update failure ratio | Severe write contention on hot keys | Split hot keys into buckets/shards |