TL;DR

  • Scenario: You want to study Netflix’s EVCache for self-development, but so far only know it as a “Memcached-based distributed cache”
  • Conclusion: Break the system into four layers (EVCache, Rend, Memcached, Mnemonic) to understand each layer’s responsibilities, the performance ceiling, and the multi-AZ replication model
  • Output: A practical framework for understanding the architecture, a typical deployment path, and a quick reference for common errors

EVCache

EVCache is a high-performance distributed cache system open-sourced by Netflix. It is built on Memcached’s in-memory storage architecture, and its client is implemented on top of Spymemcached.

EVCache Name Meaning:

  • E: Ephemeral - data is stored with a TTL
  • V: Volatile - data lives only in memory and is not persisted
  • Cache: in-memory key-value storage

Main Features:

  1. Linear scaling capability through consistent hashing
  2. Multi-region deployment supports cross-datacenter replication
  3. Complete monitoring integration
  4. Smart client routing, automatically handles node failures and network partitions
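
Feature 4 (smart client routing around failed nodes) can be sketched roughly as below; `Replica` and `PickReplica` are hypothetical names for illustration, not the real EVCache client API:

```go
package main

import (
	"errors"
	"fmt"
)

// Replica models one copy of a key in a given availability zone.
type Replica struct {
	Zone    string
	Healthy bool
}

// PickReplica returns the first healthy replica, preferring the local zone,
// mirroring the "read locally, fail over automatically" routing idea.
func PickReplica(replicas []Replica, localZone string) (Replica, error) {
	// First pass: prefer a healthy replica in the caller's own zone.
	for _, r := range replicas {
		if r.Healthy && r.Zone == localZone {
			return r, nil
		}
	}
	// Second pass: fall back to any healthy replica elsewhere.
	for _, r := range replicas {
		if r.Healthy {
			return r, nil
		}
	}
	return Replica{}, errors.New("no healthy replica available")
}

func main() {
	replicas := []Replica{
		{Zone: "us-east-1a", Healthy: false}, // local zone is down
		{Zone: "us-east-1b", Healthy: true},
	}
	if r, err := PickReplica(replicas, "us-east-1a"); err == nil {
		fmt.Println("routed to", r.Zone)
	}
}
```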

Performance Metrics:

  • Single EVCache cluster peak: about 200K requests/second
  • Global deployment: Thousands of memcached server nodes
  • Peak load: over 30 million operations per second
  • Stored objects: 50-100 billion
  • Daily requests: Nearly 2 trillion
  • Normal load response latency: 1-5ms
  • 99% requests complete within 20ms
  • Hit rate: Maintained at 99%

Architecture Components

Rend Service

  • High-performance proxy service written in Go
  • Uses goroutine and channel mechanisms for lightweight concurrent processing
  • Built-in smart connection pooling
  • Supports Memcached binary and text protocols
  • Uses consistent hashing for backend node distribution
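
Rend’s pooling style can be sketched with the idiomatic Go pattern of a buffered-channel connection pool. This is an assumption-laden sketch, not Rend’s actual implementation; `Conn` and `Pool` are hypothetical:

```go
package main

import "fmt"

// Conn stands in for a pooled backend connection (e.g. to a memcached node).
type Conn struct{ ID int }

// Pool hands out connections over a buffered channel: Get blocks while
// the pool is empty, and Put returns a connection for reuse, so the
// channel itself does the bookkeeping without explicit locks.
type Pool struct {
	conns chan *Conn
}

// NewPool pre-creates size connections.
func NewPool(size int) *Pool {
	p := &Pool{conns: make(chan *Conn, size)}
	for i := 0; i < size; i++ {
		p.conns <- &Conn{ID: i}
	}
	return p
}

// Get blocks until a connection is free.
func (p *Pool) Get() *Conn { return <-p.conns }

// Put returns a connection to the pool.
func (p *Pool) Put(c *Conn) { p.conns <- c }

func main() {
	pool := NewPool(2)
	c := pool.Get()
	fmt.Println("borrowed conn", c.ID)
	pool.Put(c)
}
```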

Memcached

  • In-memory distributed key-value storage system
  • All data stored in RAM, microsecond-level response time
  • LRU eviction mechanism
  • Single-node QPS: up to 500,000 requests/second
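
The LRU eviction mechanism can be illustrated with a minimal Go sketch (memcached itself is written in C and uses a more elaborate slab-based LRU; this is only the core idea):

```go
package main

import (
	"container/list"
	"fmt"
)

// LRU is a minimal least-recently-used cache: reads refresh recency,
// and a full cache evicts the entry touched longest ago.
type LRU struct {
	cap   int
	order *list.List               // front = most recently used
	items map[string]*list.Element // key -> list node
}

type kv struct{ key, value string }

func NewLRU(capacity int) *LRU {
	return &LRU{cap: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

func (c *LRU) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el) // touching refreshes recency
	return el.Value.(kv).value, true
}

func (c *LRU) Set(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value = kv{key, value}
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() == c.cap { // full: evict least recently used
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(kv).key)
	}
	c.items[key] = c.order.PushFront(kv{key, value})
}

func main() {
	c := NewLRU(2)
	c.Set("a", "1")
	c.Set("b", "2")
	c.Get("a")      // "a" is now most recently used
	c.Set("c", "3") // cache full, evicts "b"
	_, ok := c.Get("b")
	fmt.Println("b present after eviction:", ok)
}
```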

Mnemonic

  • SSD-based embedded key-value storage engine
  • Deep integration with RocksDB
  • Optimized SSD Direct I/O
  • Supports WAL data persistence
  • Random read latency < 1ms

Typical Deployment

Single Availability Zone Deployment

Cluster Startup Phase:

  1. EVCache server instances automatically register to service registry
  2. Each instance carries metadata: IP, port, memory capacity, load status, cluster name
  3. During initialization the client reads the server list and establishes a pool of long-lived TCP connections
  4. Uses consistent hash ring for virtual node distribution

Multi-Availability Zone Deployment

Cross-AZ Replication Flow:

  1. Initial Write Phase: Application initiates SET operation, data first writes to local AZ cache node
  2. Metadata Replication Preparation: EVCache client generates replication metadata message containing Key, operation type, timestamp, source AZ
  3. Cross-AZ Data Transfer: Propagator sends SET request to target region’s Replication Agent via internal network
  4. Target Region Data Update: Replication Agent verifies request, executes same SET operation on local cache cluster

Error Quick Reference

| Symptom | Root Cause | Solution |
| --- | --- | --- |
| Business treats EVCache as strongly consistent, persistent storage | EVCache is essentially a TTL-bound volatile in-memory cache | Use the database as the source of truth; use the cache only for acceleration |
| Cache hit rate far below expectations | Keys designed without hotspot reuse; TTL or capacity too small | Optimize key design and set a reasonable TTL |
| Read/write latency spikes in other zones after an AZ failure | Multi-AZ topology design issue | Read locally; cross regions only for replication |
| Hit rate drops sharply after adding or removing nodes | Consistent-hash virtual nodes misconfigured | Tune the virtual node count; scale with rolling restarts plus cache pre-warming |