Basic Concepts
Introduction
In July 2000, Professor Eric Brewer of UC Berkeley proposed the CAP conjecture. Two years later, Seth Gilbert and Nancy Lynch of MIT formally proved it. Since then, the CAP theorem has been an established result in distributed computing and has deeply influenced the design of distributed systems.
The CAP theorem states that a distributed system cannot simultaneously satisfy all three of the following basic requirements: Consistency (C), Availability (A), and Partition Tolerance (P). It can satisfy at most two of them at the same time.
- C (Consistency): every read sees the same data across all nodes, or across all replicas, at the same time
- A (Availability): every read and write receives a non-error response, so the system remains usable and service stays normal
- P (Partition Tolerance): the system continues to operate even when node failures or network faults split the cluster into partitions
Core Conclusion
In a distributed system, when a network partition (P) occurs, system designers face a fundamental trade-off: they can preserve either data consistency (C) or system availability (A), but not both. This trade-off has become the golden rule of distributed system design.
The first thing to clarify is that a true distributed system must have the ability to tolerate network partitions (P), which is the essential characteristic of distributed systems. Network partitioning refers to cluster nodes being split into multiple sub-networks that cannot communicate with each other due to network failures. For example, when dedicated lines between data centers are interrupted, or server rack switches fail, network partitioning occurs.
In this situation, the system must make difficult choices:
- Choose to Maintain Consistency (C): The system rejects any request that could lead to data inconsistency, ensuring all nodes see the same data. Bank transfer systems usually take this approach: even if service is temporarily unavailable, account balances remain completely accurate.
- Choose to Maintain Availability (A): The system continues to process requests, but may return different data in different partitions. A typical example is the like function on social media. During network partitioning, different users may see different like counts, but the service remains available.
It is worth noting that the key point of the CAP theorem is:
- Network partitioning is an inevitable reality, especially in cross-region deployments
- It emphasizes that choices must be made when partitioning occurs, not that only two items can be satisfied at any time
- Under normal network conditions, the system can guarantee both C and A
In actual system design, engineers choose different trade-off schemes based on business requirements. For example:
- E-commerce system shopping carts usually prioritize availability (AP systems)
- Financial transaction systems prioritize consistency (CP systems)
- Modern distributed databases like MongoDB or Cassandra provide configurable consistency levels, allowing users to choose flexibly based on scenarios
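Such configurable consistency levels are often quorum-based: with n replicas, a write acknowledged by w nodes and a read that consults r nodes must overlap in at least one up-to-date replica whenever r + w > n. A minimal sketch of that rule (the function name and settings are illustrative, not a real client API):

```python
def is_strongly_consistent(n: int, w: int, r: int) -> bool:
    """Quorum rule: with n replicas, a write acknowledged by w nodes
    and a read consulting r nodes overlap in at least one up-to-date
    replica exactly when r + w > n."""
    return r + w > n

# Cassandra-style tunable settings for a 3-replica cluster:
print(is_strongly_consistent(3, 2, 2))  # QUORUM write + QUORUM read -> True
print(is_strongly_consistent(3, 1, 1))  # ONE write + ONE read -> False
```

Lowering w and r trades consistency for latency and availability, which is exactly the C-versus-A dial these systems expose per request.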
Consistency
Core Concepts of Consistency
Consistency in distributed systems means that after a write operation completes, read operations can immediately obtain the latest data state. When data is distributed across multiple nodes, regardless of which node the data is read from, the same latest result can be obtained. This characteristic is particularly important for scenarios like product information management on e-commerce platforms.
Consistency Goals for Product Information Read/Write
- Write Success Scenario: After the product service successfully writes to the primary database, querying product information from any replica database must retrieve the latest data.
- Example: After modifying the product price to 99 yuan, all users must see 99 yuan when querying
- Write Failure Scenario:
- If the product service fails to write to the primary database, querying the product from any replica database should return a failure or error message
- If the primary database write fails, the system must ensure the replica database does not provide outdated product information
Consistency Implementation Mechanisms
- Primary-Replica Synchronization:
- Write operations are first submitted to the primary database
- The primary database propagates changes to replica databases through log replication (such as MySQL’s binlog) or streaming transmission
- Replica databases apply these changes to stay consistent with the primary database
- Locking Mechanism During Synchronization:
- After writing to the primary database and before replica databases complete synchronization, relevant data records are locked
- Any query request for this data will be blocked until synchronization is complete
- After locks are released, all queries can obtain the latest data
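The synchronization-plus-locking flow above can be sketched as a toy primary-replica store in which a read of a record blocks until replication of that record has completed. This is an illustrative simulation, not a real database mechanism (for simplicity, a read also blocks until the key has been written at least once):

```python
import threading
import time

class PrimaryReplicaStore:
    """Toy model of lock-during-sync strong consistency."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.syncing = set()              # keys currently being replicated
        self.cond = threading.Condition()

    def write(self, key, value, sync_delay=0.05):
        with self.cond:
            self.primary[key] = value
            self.syncing.add(key)         # "lock" the record
        time.sleep(sync_delay)            # simulated replication lag
        with self.cond:
            self.replica[key] = value     # replica applies the change
            self.syncing.discard(key)     # release the "lock"
            self.cond.notify_all()

    def read(self, key):
        with self.cond:
            # Block while the key is unsynced (or not yet written at all).
            while key in self.syncing or key not in self.replica:
                self.cond.wait()
            return self.replica[key]

store = PrimaryReplicaStore()
threading.Thread(target=store.write, args=("price", 99)).start()
print(store.read("price"))  # blocks until replication completes, then 99
```

The blocked read is precisely the availability cost of choosing C: read latency grows with synchronization time.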
Characteristics and Impact of Distributed Consistency
- Performance Characteristics:
- Increased write operation latency: Write operation response time is extended due to waiting for data synchronization to complete
- Typical latency range: From a few milliseconds (same data center) to hundreds of milliseconds (cross-region)
- Resource Locking:
- Modified data is locked during synchronization
- Other read/write operations need to wait for lock release
- Lock duration depends on network conditions and replica processing capability
- Error Handling Mechanism:
- If any replica database cannot complete synchronization (such as network partition), the system returns an error rather than old data
- Clients can receive clear prompts like “data synchronization in progress” or “system busy”
- Trade-offs in Actual Scenarios:
- Strong consistency reduces system throughput
- Suitable for scenarios with extremely high data accuracy requirements (such as inventory management, payment systems)
- For scenarios that can accept brief inconsistency (such as product reviews), eventual consistency models can be used
Availability
Definition and Importance
Availability means the system can provide correct responses to user operations at any time, ensuring no response timeouts or errors occur. In e-commerce systems, the availability of product information read/write is particularly important, directly related to user experience and platform reputation.
Availability Goals for Product Information Read/Write
To achieve high-availability product information read/write, the system needs to meet the following specific requirements:
- Immediate Response: After the database receives a query request, it must return query results within a reasonable time (usually requiring millisecond-level response)
- Zero Errors: The system must not time out (e.g., take longer than 500 ms to respond) or return error responses (e.g., HTTP 404/500 status codes)
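One way to enforce such a response budget is to run the read with a deadline and treat anything slower as unavailable. A hedged sketch using only Python's standard library (the 0.5 s budget mirrors the 500 ms figure above; note this times out the caller, not the underlying work, and the executor still waits for a slow call to finish on shutdown):

```python
import concurrent.futures
import time

def read_with_deadline(fn, timeout=0.5):
    """Run fn; return its result, or None if it misses the deadline."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fn)
        try:
            return future.result(timeout=timeout)
        except concurrent.futures.TimeoutError:
            return None  # budget exceeded: count the read as unavailable

fast_read = lambda: 99                        # responds immediately
slow_read = lambda: (time.sleep(0.7), 99)[1]  # simulated slow replica

print(read_with_deadline(fast_read))  # 99
print(read_with_deadline(slow_read))  # None (missed the 500 ms budget)
```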
Technical Solutions for Achieving High Availability
1. Primary-Replica Database Architecture
- Primary Database Write: All write operations are first completed on the primary database
- Real-time Data Synchronization: Data is synchronized to multiple replica databases in real-time through database replication technology (such as MySQL’s binlog replication)
- Example Scenario: When a merchant modifies a product price, the modification is first completed on the primary library and then automatically synchronized to replicas
2. Avoiding Resource Locking
- Lock-free Mechanism: Use optimistic locking instead of pessimistic locking to avoid long-term locking of database resources
- Read-Write Separation: Write operations only occur on the primary database; read operations are distributed across multiple replica databases
- Impact: This design ensures that even under high concurrency, the database can maintain high response speed
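Optimistic locking is usually implemented as a compare-and-swap on a version column: an update succeeds only if the version read earlier is still current, so no lock is ever held across the round trip. A minimal in-memory sketch (the table layout and function are illustrative, not a real ORM API):

```python
# One in-memory "row" with a version counter, standing in for a table.
table = {"product:1": {"price": 100, "version": 1}}

def update_price(key, new_price, expected_version):
    """Optimistic update: succeed only if nobody changed the row since
    we read it; otherwise the caller should re-read and retry."""
    row = table[key]
    if row["version"] != expected_version:
        return False                  # lost the race: stale version
    row["price"] = new_price
    row["version"] += 1               # bump so concurrent writers fail
    return True

print(update_price("product:1", 99, expected_version=1))  # True
print(update_price("product:1", 98, expected_version=1))  # False: version is now 2
```

In SQL this corresponds to `UPDATE ... SET version = version + 1 WHERE id = ? AND version = ?` and checking the affected-row count.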
3. Eventual Consistency Guarantee
- Old Data Priority: When primary-replica synchronization delays occur, replicas should return slightly old data instead of reporting errors
- Implementation: Set reasonable synchronization timeout thresholds (such as 1 second); return available old data if exceeded
- User Experience: Users may see brief data inconsistency, but the system always remains available
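The "old data priority" rule above can be sketched as a read path that compares the replica's last sync time with a staleness threshold and serves the stale value (flagged) rather than failing. The field names and the 1-second threshold are illustrative:

```python
SYNC_TIMEOUT = 1.0  # seconds of replication lag we tolerate silently

def read_with_fallback(replica, key, now):
    """Return (value, is_stale): availability first, never an error
    merely because the replica is behind the primary."""
    value, synced_at = replica[key]
    is_stale = (now - synced_at) > SYNC_TIMEOUT
    return value, is_stale

# Replica holds price 95, last synchronized at t = 10.0 s.
replica = {"price": (95, 10.0)}

print(read_with_fallback(replica, "price", now=10.5))  # (95, False)
print(read_with_fallback(replica, "price", now=12.0))  # (95, True): stale but served
```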
Partition Tolerance
In distributed systems, partition tolerance refers to the system's ability to continue serving requests when a network partition occurs. A network partition means that network failures have split the cluster's nodes into multiple subgroups that cannot communicate with each other.
Network Partition Scenario Analysis
Common network partition scenarios include:
- Dedicated line interruption between data centers
- Switch/router failure
- Host network configuration errors
- Regional network failures (such as submarine cables being cut)
For example, in a distributed database across data centers, if a certain data center loses network connection with other data centers, the system should still be able to process read/write requests within that data center normally.
Key Technologies for Achieving Partition Tolerance
1. Asynchronous Communication Mechanism
- Implementation:
- Use message queues (such as Kafka/RabbitMQ) for asynchronous data replication
- Use eventual consistency model instead of strong consistency
- Implement Write-Ahead Logs (WAL) to ensure data recoverability
- Advantages:
- Avoid synchronous blocking causing system stagnation
- Automatically retry failed operations after network recovery
- Reduce coupling between nodes
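Sketching the mechanism: a write first appends to a local write-ahead log, then hands the change to a queue, and a background worker applies it to the replica, so the writer never blocks on replication. Here an in-process `queue.Queue` stands in for a real broker such as Kafka or RabbitMQ:

```python
import queue
import threading

wal = []                       # write-ahead log: allows replay after a crash
replication_q = queue.Queue()  # stands in for a message broker
replica = {}

def write(key, value):
    wal.append((key, value))         # 1. log locally first (durability)
    replication_q.put((key, value))  # 2. hand off; return immediately

def replicator():
    """Background worker: drains the queue and applies changes."""
    while True:
        item = replication_q.get()
        if item is None:             # shutdown sentinel
            break
        key, value = item
        replica[key] = value
        replication_q.task_done()

threading.Thread(target=replicator, daemon=True).start()
write("price", 99)           # returns without waiting for the replica
replication_q.join()         # for the demo only: wait until applied
print(replica["price"])      # 99 once replication has caught up
```

During a partition the queue simply accumulates; after the network recovers, the worker drains the backlog, which is the retry behavior described above.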
2. Multi-Node Redundancy Design
- Implementation:
- Adopt multi-replica mechanism (usually 3-5 replicas)
- Deploy replicas across racks/data centers
- Implement automatic fault detection and switching
- Specific Measures:
- Use consistent hashing algorithm for data allocation
- Configure sufficient quorum nodes
- Implement smart routing (such as client load balancing)
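The consistent-hashing step can be sketched as a hash ring: each key is routed to the first node clockwise from the key's hash, so removing a node remaps only the keys that node owned. A minimal ring without virtual nodes (a production implementation would add them for load balance):

```python
import bisect
import hashlib

def stable_hash(key: str) -> int:
    """Deterministic hash (unlike built-in hash(), which is seeded)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes):
        self.ring = sorted((stable_hash(n), n) for n in nodes)
        self.points = [p for p, _ in self.ring]

    def node_for(self, key: str) -> str:
        # First node clockwise from the key's position, wrapping around.
        i = bisect.bisect(self.points, stable_hash(key)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("product:42")
print(owner in {"node-a", "node-b", "node-c"})  # True, and stable across runs
```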
CAP: Choosing 2 of 3
Why can’t CAP be simultaneously satisfied?
Assume a system with two nodes: a user sends a request to N1 to change data, updating the database from V0 to V1. Due to a network disconnect, the database on N2 still holds V0. If a request now arrives at N2, which cannot return the latest value V1, what should it do?
There are two options. One is to answer anyway, returning the stale value V0 to the user. The other is to block and respond only after network communication is restored and N2's data has been updated. Clearly, the former sacrifices consistency, and the latter sacrifices availability.
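The N1/N2 dilemma can be made concrete in a few lines: a CP-style read refuses to answer from the lagging node, while an AP-style read returns the stale V0. A toy model (node and mode names are illustrative):

```python
class Node:
    def __init__(self, name, value, version):
        self.name, self.value, self.version = name, value, version

def read(node, latest_version, mode):
    """CP: raise rather than serve stale data. AP: serve what we have."""
    if node.version < latest_version and mode == "CP":
        raise RuntimeError("partition: refusing to serve stale data")
    return node.value

# N2 missed the V0 -> V1 update because of the partition.
n2 = Node("N2", "V0", version=0)

print(read(n2, latest_version=1, mode="AP"))  # V0: stale but available
try:
    read(n2, latest_version=1, mode="CP")
except RuntimeError as err:
    print(err)  # consistency chosen: the request fails instead
```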
Although this example is simple, it shows that the three CAP properties cannot all be satisfied at once; one must be sacrificed:
- Sacrifice A (Availability), keep CP (Consistency and Partition Tolerance): A system that guarantees consistency and partition tolerance means that in extreme cases, the system is allowed to be inaccessible, sacrificing user experience. Users wait, and service is restored after the system recovers.
- Sacrifice C (Consistency), keep AP (Availability and Partition Tolerance): Most distributed system designs guarantee high availability and partition tolerance but sacrifice consistency
- Sacrifice P (Partition Tolerance), keep CA (Consistency and Availability): If P is sacrificed, then the distributed system is sacrificed, and CAP becomes moot. It can be said that P is a prerequisite for distributed systems, so this situation does not exist.
Design Trade-offs
- CP Systems: Sacrifice availability (A) and focus on consistency. When a partition occurs, the system would rather refuse service than return incorrect data. Examples: HBase, ZooKeeper, MongoDB (under certain configurations)
- AP Systems: Sacrifice consistency (C) and focus on availability. Temporary data inconsistency is allowed and is reconciled after the partition heals. Examples: Cassandra, DynamoDB, CouchDB
- CA Systems: Sacrifice partition tolerance (P). These can exist, in theory, only in non-distributed or single-machine environments; they satisfy C and A while the network is reliable but cannot tolerate network faults
Extended Theories
CAP is a classical model, but there are also subsequent models that are closer to real system needs:
- BASE: Basically Available, Soft state, Eventually consistent
- PACELC: extends CAP by adding the latency-versus-consistency trade-off that applies even when no partition is occurring
Summary
The CAP theorem tells us: a distributed system cannot guarantee consistency, availability, and partition tolerance all at once. Since partition tolerance is mandatory for a distributed system, the real choice when a partition occurs is between C and A.
- When designing systems, clarify business priorities: would you rather have short-term inconsistency or sacrifice some availability?
- This provides important theoretical basis for architects in system selection and trade-off decisions.