TL;DR
- Scenario: you want to study Netflix's EVCache for self-development, but only know it as a "Memcached-based distributed cache"
- Conclusion: break it down into four layers (EVCache client/Rend/Memcached/Mnemonic) to understand each cache layer's responsibilities, its performance ceiling, and the multi-AZ replication model
- Output: a practical framework for understanding the architecture + typical deployment paths + a quick reference for common errors
EVCache
EVCache is a high-performance distributed cache system open-sourced by Netflix. It builds on Memcached's in-memory storage, and its Java client is implemented on top of Spymemcached.
EVCache Name Meaning:
- E: Ephemeral - data is stored with a TTL and expires
- V: Volatile - data is non-persistent and may be lost at any time
- Cache: an in-memory key-value store
Main Features:
- Linear scaling capability through consistent hashing
- Multi-region deployment supports cross-datacenter replication
- Complete monitoring integration
- Smart client routing, automatically handles node failures and network partitions
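The failure handling in the last point can be pictured as a read that falls back across replica copies: if one copy misses or its node is unreachable, the client tries the next. A minimal Python sketch (not the real EVCache client API; plain dicts stand in for remote nodes):

```python
# Sketch of replica-fallback reads (illustrative, not the EVCache client API).
class ReplicaReader:
    def __init__(self, replicas):
        self.replicas = replicas  # dict-like stand-ins for remote cache nodes

    def get(self, key):
        for node in self.replicas:
            try:
                value = node[key]   # a network call in the real client
            except KeyError:
                continue            # miss on this copy: try the next replica
            except ConnectionError:
                continue            # node down or partitioned: fail over
            return value
        return None                 # miss on every copy

# usage: the second replica serves the key after the first one misses
reader = ReplicaReader([{}, {"user:42": "profile-blob"}])
```

A real client would also track node health and temporarily skip known-bad nodes rather than probing them on every read.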
Performance Metrics:
- Peak throughput per cluster: around 200K requests/second
- Global deployment: thousands of memcached server nodes
- Global peak: over 30 million operations per second
- Stored objects: 50-100 billion
- Daily requests: nearly 2 trillion
- Response latency under normal load: 1-5 ms
- 99% of requests complete within 20 ms
- Hit rate: sustained around 99%
Architecture Components
Rend Service
- High-performance proxy service written in Go
- Uses goroutines and channels for lightweight concurrency
- Built-in smart connection pooling
- Speaks both the Memcached binary and text protocols
- Distributes requests across backend nodes via consistent hashing
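For reference, the memcached text protocol that a proxy like Rend must parse and emit frames a write as `set <key> <flags> <exptime> <bytes>\r\n<data>\r\n`. A minimal sketch of that framing (illustrative helper names, not Rend's actual code):

```python
# Sketch of memcached *text* protocol framing; helper names are illustrative.
# Storage command format: set <key> <flags> <exptime> <bytes>\r\n<data>\r\n
def frame_set(key: str, value: bytes, ttl: int = 0, flags: int = 0) -> bytes:
    header = f"set {key} {flags} {ttl} {len(value)}\r\n".encode()
    return header + value + b"\r\n"

# Retrieval command format: get <key>\r\n
def frame_get(key: str) -> bytes:
    return f"get {key}\r\n".encode()
```

The binary protocol carries the same operations in fixed-layout packed headers, which is cheaper to parse; supporting both is what lets Rend sit transparently in front of unmodified memcached.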
Memcached
- In-memory distributed key-value storage system
- All data stored in RAM, microsecond-level response time
- LRU eviction mechanism
- Single node QPS: Up to 500,000 times/second
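The LRU behavior above can be sketched with an ordered map: a read refreshes an entry's recency, and a write past capacity evicts the least recently used entry. This is a simplified model of memcached's per-slab LRU, not its actual implementation:

```python
from collections import OrderedDict

# Simplified model of LRU eviction (memcached actually runs LRU per slab class).
class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)         # mark as most recently used
        return self.items[key]

    def set(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry
```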
Mnemonic
- SSD-backed embedded key-value storage engine
- Built on RocksDB
- Optimized for SSD direct I/O
- Write-ahead log (WAL) for data persistence
- Random read latency < 1 ms
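Mnemonic's role in the read path can be pictured as a two-tier lookup: RAM first, then the SSD store, promoting the value back into RAM on an SSD hit. A conceptual sketch, with plain dicts standing in for both the memory tier and the RocksDB-backed tier:

```python
# Conceptual sketch of a tiered RAM/SSD read path; plain dicts stand in
# for the in-memory tier and the RocksDB-backed SSD tier (Mnemonic).
class TieredStore:
    def __init__(self):
        self.ram = {}   # hot working set
        self.ssd = {}   # stand-in for the SSD-backed engine

    def get(self, key):
        if key in self.ram:
            return self.ram[key]        # memory hit: microsecond path
        if key in self.ssd:
            value = self.ssd[key]       # SSD hit: sub-millisecond path
            self.ram[key] = value       # promote into the memory tier
            return value
        return None                     # full miss
```

The point of the tiering is cost: cold objects can live on much cheaper SSD capacity while the RAM tier keeps the hot set fast.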
Typical Deployment
Single Availability Zone Deployment
Cluster Startup Phase:
- EVCache server instances automatically register themselves with the service registry
- Each instance carries metadata: IP, port, memory capacity, load status, cluster name
- On initialization, the client reads the server list and establishes a pool of long-lived TCP connections
- Keys are placed on a consistent hash ring using virtual nodes
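The virtual-node placement in the last step can be sketched as a consistent hash ring: each physical node is hashed onto the ring many times, and a key is owned by the first vnode clockwise from its hash. The vnode count and hash function below are illustrative choices, not EVCache's exact ones:

```python
import bisect
import hashlib

# Sketch of a consistent hash ring with virtual nodes (illustrative parameters).
class HashRing:
    def __init__(self, nodes, vnodes=100):
        self.ring = []                      # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):
                h = self._hash(f"{node}#{i}")
                self.ring.append((h, node))
        self.ring.sort()

    @staticmethod
    def _hash(s: str) -> int:
        return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        h = self._hash(key)
        # first vnode clockwise from the key's hash; wrap around the ring
        idx = bisect.bisect(self.ring, (h, "")) % len(self.ring)
        return self.ring[idx][1]
```

Virtual nodes smooth out the load imbalance a single hash per node would cause, and they limit key movement when membership changes.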
Multi-Availability Zone Deployment
Cross-AZ Replication Flow:
- Initial write phase: the application issues a SET, and the data is first written to a cache node in the local AZ
- Metadata preparation: the EVCache client generates a replication metadata message containing the key, operation type, timestamp, and source AZ
- Cross-AZ transfer: the Propagator sends the SET request over the internal network to the target region's Replication Agent
- Target region update: the Replication Agent validates the request and applies the same SET to its local cache cluster
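The metadata message in the second step might look like the following sketch. Field names and the JSON encoding are illustrative, not Netflix's actual wire format:

```python
from dataclasses import dataclass, asdict
import json
import time

# Sketch of a replication metadata message (illustrative fields/encoding,
# not the actual EVCache wire format).
@dataclass
class ReplicationEvent:
    key: str
    op: str            # e.g. "SET" or "DELETE"
    timestamp_ms: int  # used to resolve ordering between regions
    source_az: str

def encode(event: ReplicationEvent) -> bytes:
    return json.dumps(asdict(event)).encode()

event = ReplicationEvent("user:42", "SET", int(time.time() * 1000), "us-east-1a")
```

Shipping metadata rather than the full value keeps the cross-region pipe small; the timestamp lets the receiving side discard stale updates that arrive out of order.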
Error Quick Reference
| Symptom | Root Cause | Solution |
|---|---|---|
| Business treats EVCache as strongly consistent persistent storage | EVCache is essentially TTL + volatile memory cache | Use database as data source, cache only for acceleration |
| Cache hit rate far below expectations | Key design without hotspot reuse, TTL/capacity too small | Optimize key design, set reasonable TTL |
| Read/write latency spikes in other zones after AZ failure | Multi-AZ topology design issue | Read locally, cross-region only for replication |
| Cache hit rate significantly drops after adding or removing nodes | Consistent hash virtual node misconfiguration | Adjust virtual node count, use rolling scaling + pre-warm strategy |
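The last row of the table can be made concrete: on a consistent hash ring with enough virtual nodes, adding one node remaps roughly 1/N of the keys, which is why scaling should still be rolling and pre-warmed rather than treated as hit-rate-neutral. A quick measurement sketch (illustrative vnode count and hash function):

```python
import bisect
import hashlib

# Measure how many keys change owner when one node joins a consistent
# hash ring with virtual nodes (illustrative parameters).
def build_ring(nodes, vnodes=200):
    return sorted(
        (int.from_bytes(hashlib.md5(f"{n}#{i}".encode()).digest()[:8], "big"), n)
        for n in nodes for i in range(vnodes)
    )

def owner(ring, key):
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
    return ring[bisect.bisect(ring, (h, "")) % len(ring)][1]

before = build_ring(["n1", "n2", "n3"])
after = build_ring(["n1", "n2", "n3", "n4"])
keys = [f"key:{i}" for i in range(2000)]
moved = sum(owner(before, k) != owner(after, k) for k in keys)
# going from 3 to 4 nodes, roughly a quarter of the keys should remap
```

Every remapped key is a guaranteed miss until it is re-populated, so pre-warming the new node (or scaling in small rolling steps) is what keeps the aggregate hit rate from dipping.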