TL;DR
- Scenario: a Java service integrating with Memcached needs to understand Spymemcached's thread model, sharding routing, and serialization details
- Conclusion: Spymemcached implements async IO via NIO plus callbacks and uses ketama consistent hashing for sharding
- Output: a complete architecture breakdown, an explanation of the thread and sharding mechanisms, and an error quick reference
Spymemcached Introduction
Spymemcached is a memcached client implemented using NIO.
Protocol Support:
- Text protocol: plain-text commands, easy to debug and read
- Binary protocol: more compact on the wire, higher transmission efficiency
Async Communication Mechanism: uses NIO (non-blocking I/O) for efficient network communication and handles responses through a callback mechanism.
Cluster and Sharding: supports sharding, uses a consistent hashing algorithm to allocate keys, and supports dynamic node addition and removal.
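The callback-on-completion style that spymemcached's async API exposes can be illustrated with only the JDK; the sketch below simulates the cache lookup with `CompletableFuture` (the `asyncGet` here is a stand-in, not the real client API, and the returned value is fabricated for the demo):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncGetSketch {
    // Simulated async cache lookup: the real client would hand the request to
    // its selector thread and complete the future from the IO callback.
    static CompletableFuture<String> asyncGet(String key, ExecutorService io) {
        return CompletableFuture.supplyAsync(() -> "value-for-" + key, io);
    }

    public static void main(String[] args) {
        ExecutorService io = Executors.newSingleThreadExecutor();
        asyncGet("user:42", io)
            .thenAccept(v -> System.out.println("callback got: " + v))
            .join();  // block only for the demo; real callers keep working
        io.shutdown();
    }
}
```

The business thread never blocks on the network: it registers a callback and moves on, which is the core of the NIO-plus-callback design described above.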
Overall Design Architecture
- API Interface Design: Provides synchronous and asynchronous calling methods
- Task Encapsulation Mechanism: Encapsulates requests as independent Task objects
- Routing Partition Strategy: Uses consistent hashing algorithm to determine target node
- Task Queue Management: each connection maintains an independent task queue
- Async IO Processing Flow: a Selector monitors IO events for all connections
- Response Processing Mechanism: matches the original Task by its opaque field, then executes the callback
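The opaque-based matching in the last step can be sketched as a map from request id to pending callback; the class and method names below are illustrative, not spymemcached's internal types:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class OpaqueDispatcher {
    private final AtomicInteger nextOpaque = new AtomicInteger();
    private final Map<Integer, Consumer<byte[]>> pending = new ConcurrentHashMap<>();

    // Business thread: register the callback and get an opaque id
    // to stamp on the outgoing wire request.
    public int register(Consumer<byte[]> callback) {
        int opaque = nextOpaque.incrementAndGet();
        pending.put(opaque, callback);
        return opaque;
    }

    // Selector thread: a response arrived; look up the original
    // task by its opaque id and fire its callback.
    public void onResponse(int opaque, byte[] body) {
        Consumer<byte[]> cb = pending.remove(opaque);
        if (cb != null) cb.accept(body);
    }
}
```

Because responses carry the opaque id back, the client can pipeline many requests on one connection and still pair each response with the right Task.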
Thread Design
Spymemcached has two types of threads: business threads and Selector threads.
Business threads are responsible for:
- Request encapsulation
- Object serialization
- Protocol encapsulation
- Task distribution
Selector threads are responsible for:
- Send processing: Queue polling, data sending, flow control
- Receive processing: Response reading, result notification
- Connection management: Fault detection, automatic recovery, load balancing
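The handoff between the two thread types can be sketched with a `BlockingQueue` standing in for a connection's task queue; this only shows the producer/consumer split, not spymemcached's actual Selector loop:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

public class ThreadSplitSketch {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<byte[]> taskQueue = new LinkedBlockingQueue<>();
        CountDownLatch done = new CountDownLatch(1);

        // "Selector" thread: drains the queue and would write to the socket
        // (the actual send is simulated here with a print).
        Thread ioThread = new Thread(() -> {
            try {
                byte[] frame = taskQueue.take();
                System.out.println("IO thread sent " + frame.length + " bytes");
                done.countDown();
            } catch (InterruptedException ignored) { }
        });
        ioThread.start();

        // Business thread: serializes the request and hands it off
        // to the queue without ever blocking on network IO.
        byte[] frame = "get user:42\r\n".getBytes();
        taskQueue.put(frame);
        done.await();
    }
}
```

The key property is that serialization and protocol encoding happen on the caller's thread, while all socket reads and writes stay on the IO side.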
Sharding Mechanism
Routing Mechanism
Two hashing algorithms:
- arrayMod (array modulo hashing)
  - Calculation: target node index = hash(key) % node count
  - Drawback: adding or removing a node remaps most keys, causing large-scale data migration
- ketama (consistent hashing)
  - Generates 160 virtual nodes for each physical node
  - Builds a hash ring and distributes the virtual nodes evenly across it
  - Advantage: on average only 1/N of the data migrates when a node is added or removed
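Both routing styles fit in a short sketch; the ketama half below uses a `TreeMap` hash ring with MD5-derived points and the 160 virtual nodes per server mentioned above. This is a simplified illustration, not spymemcached's `KetamaNodeLocator`:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class KetamaSketch {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public KetamaSketch(List<String> nodes) throws Exception {
        for (String node : nodes)
            for (int i = 0; i < 160; i++)  // 160 virtual points per physical node
                ring.put(hash(node + "#" + i), node);
    }

    // arrayMod routing: simple, but remaps most keys when the node count changes.
    static String arrayMod(String key, List<String> nodes) throws Exception {
        return nodes.get((int) Math.floorMod(hash(key), (long) nodes.size()));
    }

    // ketama routing: walk clockwise to the first virtual point at or after hash(key).
    public String route(String key) throws Exception {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private static long hash(String s) throws Exception {
        byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
        long h = 0;
        for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xffL);  // fold first 8 digest bytes
        return h;
    }
}
```

Removing one node from the ring only reassigns the keys whose clockwise-next virtual point belonged to that node, which is where the 1/N migration figure comes from.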
Fault Tolerance
Three failover strategies:
- Redistribute (recommended): route the request to the next available node, chosen round-robin or randomly
- Retry: retry up to 3 times, with a linearly increasing retry interval
- Cancel: log the error and throw a ConnectionException
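The Redistribute idea, skipping a dead node and scanning for the next live one, can be sketched as follows (illustrative only, not spymemcached's `FailureMode` implementation):

```java
import java.util.List;
import java.util.Set;

public class RedistributeSketch {
    // Given the preferred node index, return the first node not marked down,
    // scanning round-robin from that index; null if every node is down.
    static String pickAvailable(List<String> nodes, int preferred, Set<String> down) {
        for (int i = 0; i < nodes.size(); i++) {
            String candidate = nodes.get((preferred + i) % nodes.size());
            if (!down.contains(candidate)) return candidate;
        }
        return null;
    }
}
```

Redistribute trades temporary cache misses on the redirected keys for continued availability, which is why it is the recommended default.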
Serialization
Objects are stored in Memcached in binary form. If the serialized length exceeds the 16384-byte compression threshold, GZip compression is applied.
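That store path can be sketched with JDK serialization and GZIP, using the 16384-byte threshold from the text; the flag handling and exact encoding are simplified relative to spymemcached's real transcoders:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.zip.GZIPOutputStream;

public class TranscodeSketch {
    static final int COMPRESSION_THRESHOLD = 16384;  // bytes, per the text

    // Serialize the object; GZip-compress only when the payload exceeds the threshold.
    static byte[] encode(Serializable value) throws IOException {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(raw)) {
            oos.writeObject(value);
        }
        byte[] bytes = raw.toByteArray();
        if (bytes.length <= COMPRESSION_THRESHOLD) return bytes;

        ByteArrayOutputStream zipped = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(zipped)) {
            gz.write(bytes);
        }
        return zipped.toByteArray();
    }
}
```

The threshold check is also why the last row of the error table below matters: every oversized write pays both serialization and compression CPU cost on the business thread.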
Error Quick Reference
| Symptom | Root Cause | Fix |
|---|---|---|
| Many get timeouts under high concurrency | Single node pressure too high, Selector thread can’t keep up | Reduce single node pressure, increase timeout config |
| Cache hit rate drops to 0 after node add/remove | Using arrayMod sharding | Switch to ketama consistent hashing |
| Many connection exceptions after a node fails | Failover strategy misconfigured | Limit max retry count |
| Client process memory continuously grows after QPS increases | Task queue, Future or callback holds large object references | Set reasonable upper limit for queue |
| Some keys occasionally return NOT_FOUND | Inconsistent sharding routing | Fix node list order and weights |
| CPU spikes when writing large objects | Serialization + GZip compression overhead | Adjust compression threshold, avoid super large objects in cache |