TL;DR

  • Scenario: Operating EVCache as a large-scale distributed cache requires understanding the memory management and eviction behavior of the underlying Memcached
  • Conclusion: EVCache ultimately rests on Memcached plus Slab Allocation, Lazy Expiration, and approximate LRU; performance is good, but engineering risks such as memory fragmentation and eviction jitter remain
  • Output: A walkthrough of the Memcached architecture beneath EVCache, the Slab Class/Item structures, and the Lazy Expiration and LRU eviction logic

1. Memcached Architecture

Memcached is a high-performance distributed in-memory object caching system developed by Danga Interactive, originally built for LiveJournal.

Core Features:

  • C/S Architecture: Classic client-server model; the server runs as a daemon listening on port 11211 by default
  • libevent-based Event Handling: Uses the libevent library for an efficient event-driven I/O model
  • Multi-threading Support: The server is multi-threaded, with a listener thread dispatching connections to worker threads

Workflow:

  1. The application first checks whether the desired data exists in Memcached
  2. On a cache hit, the result is returned directly
  3. On a miss, the database is queried
  4. The database result is stored in Memcached (with a reasonable expiration time)
  5. The data is returned to the application
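
The read path above (cache-aside) can be sketched as follows. The one-slot in-process "cache", `db_query()`, and `load_user()` are hypothetical stand-ins for illustration, not the EVCache or libmemcached API:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char cached_key[64], cached_val[64];   /* toy single-entry "cache" */
static int db_queries;                        /* counts trips to the "DB" */

static const char *cache_get(const char *key) {
    return strcmp(cached_key, key) == 0 ? cached_val : NULL;
}

static void cache_set(const char *key, const char *val, int ttl) {
    (void)ttl;                                /* TTL ignored in this toy */
    snprintf(cached_key, sizeof cached_key, "%s", key);
    snprintf(cached_val, sizeof cached_val, "%s", val);
}

static const char *db_query(const char *key) {
    (void)key;
    db_queries++;
    return "alice";                           /* pretend database row */
}

static const char *load_user(const char *key) {
    const char *v = cache_get(key);           /* 1. check the cache */
    if (v != NULL) return v;                  /* 2. hit: return directly */
    v = db_query(key);                        /* 3. miss: query the DB */
    cache_set(key, v, 300);                   /* 4. populate with a TTL */
    return v;                                 /* 5. return to the caller */
}
```

Calling `load_user("user:42")` twice results in a single database query: the first call misses and populates the cache, the second is served from it.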

2. Slab Allocation Memory Management Mechanism

Memcached manages memory with Slab Allocation, which pre-allocates fixed-size regions to mitigate the fragmentation caused by general-purpose malloc/free.

Core Principles:

  1. Memory Allocation Basics

    • Memory is allocated in fixed-size Pages, 1MB per Page by default
  2. Memory Segmentation Mechanism

    • Each Page is divided into Chunks of equal size; Chunks of the same size make up a Slab Class
    • Chunk sizes follow a geometric progression; the default growth factor is 1.25 (tunable with -f)
  3. Data Storage Strategy

    • When storing data, Memcached selects the smallest Slab Class whose Chunk can hold the item; the unused remainder of the Chunk is wasted (internal fragmentation)
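
The sizing and selection logic can be sketched under stated assumptions: a 96-byte base chunk and the default 1.25 growth factor. Real memcached derives the base size from sizeof(item) plus a configurable minimum, and likewise rounds each size up to an 8-byte boundary:

```c
#include <assert.h>

#define MAX_CLASSES 64
#define PAGE_SIZE (1024 * 1024)

static unsigned int sizes[MAX_CLASSES];     /* chunk size per slab class */
static int nclasses;

static void build_slab_classes(double factor) {
    unsigned int size = 96;                 /* assumed base chunk size */
    nclasses = 0;
    while (nclasses < MAX_CLASSES && size <= PAGE_SIZE / 2) {
        if (size % 8 != 0)                  /* round up to 8-byte alignment */
            size += 8 - size % 8;
        sizes[nclasses++] = size;
        size = (unsigned int)(size * factor);
    }
}

/* data storage strategy: pick the smallest class whose chunk fits the item */
static int slab_clsid(unsigned int needed) {
    for (int i = 0; i < nclasses; i++)
        if (sizes[i] >= needed)
            return i;
    return -1;                              /* larger than any chunk */
}
```

With these assumptions the first classes come out as 96, 120, and 152 bytes, so a 100-byte item lands in the 120-byte class and 20 bytes of its chunk are wasted.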

3. Data Item Structure

Each cached object is stored as an item; items in the same slab class are chained into a doubly linked list that the LRU algorithm walks (simplified from the memcached source):

typedef struct _stritem {
    struct _stritem *next;    // next item in the LRU doubly linked list
    struct _stritem *prev;    // previous item in the LRU doubly linked list
    rel_time_t time;          // last access time (for LRU)
    rel_time_t exptime;       // expiration time
    int nbytes;               // length of the value data
    unsigned short refcount;  // reference count
    uint8_t slabs_clsid;      // slab class ID
    uint8_t nkey;             // key length
} item;
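
A minimal sketch of what the next/prev pointers enable: on every access an item is unlinked and relinked at the head, so the tail always holds the least recently used items. One global list stands in here for the per-class heads[]/tails[] arrays that memcached actually keeps:

```c
#include <assert.h>
#include <stddef.h>

typedef struct _stritem {
    struct _stritem *next;   /* toward the tail (less recently used) */
    struct _stritem *prev;   /* toward the head (more recently used) */
    int id;                  /* stand-in for the key/value payload */
} item;

static item *head, *tail;    /* single list; memcached keeps one per class */

static void item_unlink(item *it) {
    if (it->prev) it->prev->next = it->next; else head = it->next;
    if (it->next) it->next->prev = it->prev; else tail = it->prev;
    it->next = it->prev = NULL;
}

static void item_link_head(item *it) {
    it->prev = NULL;
    it->next = head;
    if (head) head->prev = it;
    head = it;
    if (tail == NULL) tail = it;
}

/* called on a successful get: bump the item to the MRU end */
static void item_touch(item *it) {
    item_unlink(it);
    item_link_head(it);
}
```

Linking items a, b, c in that order and then touching a leaves a at the head and b at the tail, i.e. b becomes the next eviction candidate.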

4. Expiration Mechanism

4.1 Lazy Expiration

Memcached uses a lazy expiration strategy: it never proactively scans for and removes expired data; an item's expiration is checked only when the item is accessed (for example, on a get).

Advantages:

  • Saves system resources: No need for extra background threads or scheduled tasks
  • On-demand processing: Expiration is checked only when the data is actually accessed
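
The check performed on the get path can be sketched as below; an item is treated as dead only when a read observes its exptime in the past, with 0 meaning "never expires":

```c
#include <assert.h>

typedef unsigned int rel_time_t;

/* simplified item carrying only the expiry-relevant field */
typedef struct {
    rel_time_t exptime;   /* absolute expiration time; 0 = never expires */
} item;

/* lazy expiration: no background sweep; this runs on access */
static int item_is_expired(const item *it, rel_time_t now) {
    return it->exptime != 0 && it->exptime <= now;
}
```

A get that finds this check true behaves as a miss and reclaims the item; until such an access happens, the expired item keeps occupying its chunk.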

4.2 LRU (Least Recently Used)

When the memory used by Memcached reaches the configured maximum (set with -m) and a new item cannot be allocated from free chunks, the LRU algorithm is triggered to reclaim memory space.

Eviction Execution Steps:

  1. The system scans in reverse from the tail of the corresponding slab class's item list (least recently used first)
  2. Items with refcount == 0 are preferred
  3. If no refcount == 0 item is found, it falls back to items whose last access time is more than 3 hours ago
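
The steps above can be sketched as a victim search over the tail of the list. The scan bound of 50 is an assumption for illustration; real memcached limits the tail search with its own small constant:

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int rel_time_t;

typedef struct _stritem {
    struct _stritem *prev;    /* toward the head (more recently used) */
    rel_time_t time;          /* last access time */
    unsigned short refcount;  /* reference count */
} item;

#define THREE_HOURS (3u * 60 * 60)
#define SEARCH_DEPTH 50       /* assumed bound on the tail scan */

/* walk from the tail (LRU end) toward the head, preferring refcount == 0,
 * remembering the first long-idle item as a fallback */
static item *pick_victim(item *tail, rel_time_t now) {
    item *fallback = NULL;
    int tries = 0;
    for (item *it = tail; it != NULL && tries < SEARCH_DEPTH;
         it = it->prev, tries++) {
        if (it->refcount == 0)
            return it;                    /* step 2: unreferenced item */
        if (fallback == NULL && now - it->time > THREE_HOURS)
            fallback = it;                /* step 3: idle for over 3 hours */
    }
    return fallback;                      /* NULL: nothing evictable yet */
}
```

If every scanned item is still referenced and none has been idle long enough, the search returns NULL and the allocation has to wait or fail.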

5. Error Quick Reference

| Symptom | Root Cause | Fix |
| --- | --- | --- |
| Hit rate normal but occasional latency spikes | -m configured too small, or hot data growth | Increase the -m quota or split the business cache |
| Some keys occasionally missing | LRU eviction preferentially reclaims long-unaccessed items | Adjust cache granularity and access patterns |
| Memory usage looks low but new data cannot be inserted | Slab Allocation: the needed slab class is full while other classes waste space | Re-plan key/value sizes |
| Expired keys still occupy memory long past their expiration time | Lazy Expiration reclaims an item only when it is next accessed | For critical keys, trigger cleanup with a get, or delete explicitly |
| High CAS update failure ratio | Severe write contention on hot keys | Split hot keys into buckets/shards |