TL;DR
- Scenario: you want to study Netflix's EVCache for self-development, but only know it as a "Memcached-based distributed cache"
- Conclusion: break it down into four layers (EVCache client/Rend/Memcached/Mnemonic) to understand each cache layer's responsibilities, its performance ceiling, and the multi-AZ replication model
- Output: a practical framework for understanding the architecture + typical deployment paths + a quick reference for common errors
EVCache
EVCache is a high-performance distributed cache system open-sourced by Netflix. It builds on Memcached's in-memory storage, and its Java client is implemented on top of Spymemcached.
EVCache Name Meaning:
- E: Ephemeral - data is stored with a TTL and expires
- V: Volatile - data is non-persistent and may be lost at any time
- Cache: an in-memory key-value store
Main Features:
- Linear scaling capability through consistent hashing
- Multi-region deployment supports cross-datacenter replication
- Complete monitoring integration
- Smart client routing, automatically handles node failures and network partitions
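The failure handling in the last point can be pictured as a read that falls back across replica copies: if one copy misses or its node is unreachable, the client tries the next. A minimal Python sketch (not the real EVCache client API; plain dicts stand in for remote nodes):

```python
# Sketch of replica-fallback reads (illustrative, not the EVCache client API).
class ReplicaReader:
    def __init__(self, replicas):
        self.replicas = replicas  # dict-like stand-ins for remote cache nodes

    def get(self, key):
        for node in self.replicas:
            try:
                value = node[key]   # a network call in the real client
            except KeyError:
                continue            # miss on this copy: try the next replica
            except ConnectionError:
                continue            # node down or partitioned: fail over
            return value
        return None                 # miss on every copy

# usage: the second replica serves the key after the first one misses
reader = ReplicaReader([{}, {"user:42": "profile-blob"}])
```

A real client would also track node health and temporarily skip known-bad nodes rather than probing them on every read.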
Performance Metrics:
- Peak throughput per cluster: around 200K requests/second
- Global deployment: thousands of memcached server nodes
- Global peak: over 30 million operations per second
- Stored objects: 50-100 billion
- Daily requests: nearly 2 trillion
- Response latency under normal load: 1-5 ms
- 99% of requests complete within 20 ms
- Hit rate: sustained around 99%
Architecture Components
Rend Service
- High-performance proxy service written in Go
- Uses goroutines and channels for lightweight concurrency
- Built-in smart connection pooling
- Speaks both the Memcached binary and text protocols
- Distributes requests across backend nodes via consistent hashing
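For reference, the memcached text protocol that a proxy like Rend must parse and emit frames a write as `set <key> <flags> <exptime> <bytes>\r\n<data>\r\n`. A minimal sketch of that framing (illustrative helper names, not Rend's actual code):

```python
# Sketch of memcached *text* protocol framing; helper names are illustrative.
# Storage command format: set <key> <flags> <exptime> <bytes>\r\n<data>\r\n
def frame_set(key: str, value: bytes, ttl: int = 0, flags: int = 0) -> bytes:
    header = f"set {key} {flags} {ttl} {len(value)}\r\n".encode()
    return header + value + b"\r\n"

# Retrieval command format: get <key>\r\n
def frame_get(key: str) -> bytes:
    return f"get {key}\r\n".encode()
```

The binary protocol carries the same operations in fixed-layout packed headers, which is cheaper to parse; supporting both is what lets Rend sit transparently in front of unmodified memcached.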
Memcached
- In-memory distributed key-value storage system
- All data stored in RAM, microsecond-level response time
- LRU eviction mechanism
- Single node QPS: Up to 500,000 times/second
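The LRU behavior above can be sketched with an ordered map: a read refreshes an entry's recency, and a write past capacity evicts the least recently used entry. This is a simplified model of memcached's per-slab LRU, not its actual implementation:

```python
from collections import OrderedDict

# Simplified model of LRU eviction (memcached actually runs LRU per slab class).
class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)         # mark as most recently used
        return self.items[key]

    def set(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry
```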
Mnemonic
- SSD-backed embedded key-value storage engine
- Built on RocksDB
- Optimized for SSD direct I/O
- Write-ahead log (WAL) for data persistence
- Random read latency < 1 ms
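Mnemonic's role in the read path can be pictured as a two-tier lookup: RAM first, then the SSD store, promoting the value back into RAM on an SSD hit. A conceptual sketch, with plain dicts standing in for both the memory tier and the RocksDB-backed tier:

```python
# Conceptual sketch of a tiered RAM/SSD read path; plain dicts stand in
# for the in-memory tier and the RocksDB-backed SSD tier (Mnemonic).
class TieredStore:
    def __init__(self):
        self.ram = {}   # hot working set
        self.ssd = {}   # stand-in for the SSD-backed engine

    def get(self, key):
        if key in self.ram:
            return self.ram[key]        # memory hit: microsecond path
        if key in self.ssd:
            value = self.ssd[key]       # SSD hit: sub-millisecond path
            self.ram[key] = value       # promote into the memory tier
            return value
        return None                     # full miss
```

The point of the tiering is cost: cold objects can live on much cheaper SSD capacity while the RAM tier keeps the hot set fast.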
Typical Deployment
Single Availability Zone Deployment
Cluster Startup Phase:
- EVCache server instances automatically register themselves with the service registry
- Each instance carries metadata: IP, port, memory capacity, load status, cluster name
- On initialization, the client reads the server list and establishes a pool of long-lived TCP connections
- Keys are placed on a consistent hash ring using virtual nodes
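The virtual-node placement in the last step can be sketched as a consistent hash ring: each physical node is hashed onto the ring many times, and a key is owned by the first vnode clockwise from its hash. The vnode count and hash function below are illustrative choices, not EVCache's exact ones:

```python
import bisect
import hashlib

# Sketch of a consistent hash ring with virtual nodes (illustrative parameters).
class HashRing:
    def __init__(self, nodes, vnodes=100):
        self.ring = []                      # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):
                h = self._hash(f"{node}#{i}")
                self.ring.append((h, node))
        self.ring.sort()

    @staticmethod
    def _hash(s: str) -> int:
        return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        h = self._hash(key)
        # first vnode clockwise from the key's hash; wrap around the ring
        idx = bisect.bisect(self.ring, (h, "")) % len(self.ring)
        return self.ring[idx][1]
```

Virtual nodes smooth out the load imbalance a single hash per node would cause, and they limit key movement when membership changes.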
Multi-Availability Zone Deployment
Cross-AZ Replication Flow:
- Initial write phase: the application issues a SET, and the data is first written to a cache node in the local AZ
- Metadata preparation: the EVCache client generates a replication metadata message containing the key, operation type, timestamp, and source AZ
- Cross-AZ transfer: the Propagator sends the SET request over the internal network to the target region's Replication Agent
- Target region update: the Replication Agent validates the request and applies the same SET to its local cache cluster
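The metadata message in the second step might look like the following sketch. Field names and the JSON encoding are illustrative, not Netflix's actual wire format:

```python
from dataclasses import dataclass, asdict
import json
import time

# Sketch of a replication metadata message (illustrative fields/encoding,
# not the actual EVCache wire format).
@dataclass
class ReplicationEvent:
    key: str
    op: str            # e.g. "SET" or "DELETE"
    timestamp_ms: int  # used to resolve ordering between regions
    source_az: str

def encode(event: ReplicationEvent) -> bytes:
    return json.dumps(asdict(event)).encode()

event = ReplicationEvent("user:42", "SET", int(time.time() * 1000), "us-east-1a")
```

Shipping metadata rather than the full value keeps the cross-region pipe small; the timestamp lets the receiving side discard stale updates that arrive out of order.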
Error Quick Reference
| Symptom | Root Cause | Solution |
|---|---|---|
| Business treats EVCache as strongly consistent persistent storage | EVCache is essentially TTL + volatile memory cache | Use database as data source, cache only for acceleration |
| Cache hit rate far below expectations | Key design without hotspot reuse, TTL/capacity too small | Optimize key design, set reasonable TTL |
| Read/write latency spikes in other zones after AZ failure | Multi-AZ topology design issue | Read locally, cross-region only for replication |
| Cache hit rate significantly drops after adding or removing nodes | Consistent hash virtual node misconfiguration | Adjust virtual node count, use rolling scaling + pre-warm strategy |
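The last row of the table can be made concrete: on a consistent hash ring with enough virtual nodes, adding one node remaps roughly 1/N of the keys, which is why scaling should still be rolling and pre-warmed rather than treated as hit-rate-neutral. A quick measurement sketch (illustrative vnode count and hash function):

```python
import bisect
import hashlib

# Measure how many keys change owner when one node joins a consistent
# hash ring with virtual nodes (illustrative parameters).
def build_ring(nodes, vnodes=200):
    return sorted(
        (int.from_bytes(hashlib.md5(f"{n}#{i}".encode()).digest()[:8], "big"), n)
        for n in nodes for i in range(vnodes)
    )

def owner(ring, key):
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
    return ring[bisect.bisect(ring, (h, "")) % len(ring)][1]

before = build_ring(["n1", "n2", "n3"])
after = build_ring(["n1", "n2", "n3", "n4"])
keys = [f"key:{i}" for i in range(2000)]
moved = sum(owner(before, k) != owner(after, k) for k in keys)
# going from 3 to 4 nodes, roughly a quarter of the keys should remap
```

Every remapped key is a guaranteed miss until it is re-populated, so pre-warming the new node (or scaling in small rolling steps) is what keeps the aggregate hit rate from dipping.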