TL;DR
- Scenario: Operating EVCache as a large-scale distributed cache requires understanding the memory management and eviction behavior of the underlying Memcached
- Conclusion: EVCache essentially relies on Memcached with Slab Allocation, Lazy Expiration, and an approximate LRU; performance is good, but there are engineering risks such as memory fragmentation and eviction jitter
- Output: A walkthrough of the Memcached architecture underneath EVCache, the Slab Class/Item structures, and the Lazy Expiration and LRU eviction logic
1. Memcached Architecture
Memcached is a high-performance distributed in-memory object caching system developed by Danga Interactive, originally for LiveJournal.
Core Features:
- C/S Architecture: Uses classic client-server model, server runs as daemon listening on port 11211
- libevent-based High-performance Event Processing: Uses libevent library for efficient event-driven model
- Multi-threading Support: A dispatcher thread accepts connections and hands them to worker threads, so requests are served concurrently
Workflow:
- Application first checks if desired data exists in Memcached
- If cache hits, return result directly
- If miss, query database
- Store database result in Memcached (with reasonable expiration time)
- Return data to application
2. Slab Allocation Memory Management Mechanism
Memcached manages memory with Slab Allocation, which avoids the fragmentation that general-purpose malloc/free would accumulate under constant allocation and freeing.
Core Principles:
- Memory Allocation Basics: memory is allocated in fixed-size Pages, 1MB per Page by default
- Memory Segmentation Mechanism: each Page is divided into Chunks of equal size; different Slab Classes use different Chunk sizes, which grow by a fixed growth factor (default 1.25)
- Data Storage Strategy: when storing data, Memcached selects the smallest Slab Class whose Chunks can hold the item
3. Data Item Structure
Each stored object is an item; items in the same slab class are chained into a doubly linked LRU list (the struct below is simplified — the real one also carries a hash-chain pointer and flags):

```c
typedef struct _stritem {
    struct _stritem *next;      /* next item in the LRU list (toward the tail) */
    struct _stritem *prev;      /* previous item in the LRU list (toward the head) */
    rel_time_t time;            /* last access time (drives LRU ordering) */
    rel_time_t exptime;         /* expiration time; 0 = never expires */
    int nbytes;                 /* length of the value area */
    unsigned short refcount;    /* reference count; nonzero = in use */
    uint8_t slabs_clsid;        /* slab class this item belongs to */
    uint8_t nkey;               /* key length */
} item;
```
4. Expiration Mechanism
4.1 Lazy Expiration
Memcached uses a lazy expiration strategy: it never scans for and proactively removes expired data; an item is only checked for expiration when a get operation touches it.
Advantages:
- Saves system resources: No need for extra background threads or scheduled tasks
- On-demand processing: Expiration check only when truly accessing data
4.2 LRU (Least Recently Used)
When the memory used by Memcached exceeds the configured maximum (-m), the LRU algorithm is triggered to reclaim space.
Eviction Execution Steps:
- The server scans backward from the tail of the corresponding slab class's LRU list
- It prefers items with refcount=0, which can be freed immediately
- If no refcount=0 item is found, an item whose last access time is more than 3 hours old is assumed to have a leaked reference count and is reclaimed anyway (this describes the classic LRU; recent Memcached versions default to a segmented hot/warm/cold LRU)
5. Error Quick Reference
| Symptom | Root Cause | Fix |
|---|---|---|
| Hit rate is normal but there are occasional latency spikes | -m configured too small, or the hot data set has grown | Increase the -m quota or split the cache by business domain |
| Some keys occasionally go missing | LRU eviction preferentially reclaims long-unaccessed items | Adjust cache granularity and access patterns |
| Memory usage looks low but new data cannot be inserted | Slab Allocation leaves some slab classes full while others waste space | Re-plan key/value sizes, or enable slab rebalancing |
| Expired keys still occupy memory long after their expiration time | Lazy Expiration only reclaims an item when it is accessed | For critical keys, trigger cleanup with a get, or delete them explicitly |
| High CAS update failure ratio | Severe write contention on hot keys | Split hot keys into buckets/shards |