Introduction
The Paxos algorithm is a message-passing distributed consensus algorithm proposed by computer scientist Leslie Lamport. This groundbreaking work earned him the 2013 Turing Award. The algorithm solves the problem of how to reach consensus on a value (resolution) in distributed systems, even when some nodes fail or the network is unstable, while ensuring system consistency.
Development History
The Paxos algorithm was first publicly introduced by Lamport in 1998 in the paper “The Part-Time Parliament.” The paper used a unique narrative approach, using a small island in Greece called Paxos as a metaphor, describing in detail the process of passing parliamentary resolutions in Paxos and naming the algorithm accordingly.
In 2001, Lamport realized that academic peers had difficulty understanding his humorous way of expression, so he republished a more concise and direct algorithm description version called “Paxos Made Simple.”
Industry Impact and Applications
Since its introduction, Paxos has dominated the field of distributed consensus algorithms. The algorithm has been widely applied in industry:
- Google System Applications: Chubby (distributed lock service), Megastore (high-availability storage system), Spanner (globally distributed database)
- Open Source System Applications: ZooKeeper (distributed coordination service), MySQL 5.7’s Group Replication
Algorithm Characteristics
The Paxos algorithm, although powerful, is known for being difficult to understand and implement. Its main difficulties are:
- The algorithm description is abstract, requiring deep understanding of distributed system principles
- Implementation needs to handle various edge cases and exceptional situations
- Performance optimization requires consideration of actual network environments
What Problems Does It Solve
Problem Background
In modern distributed systems, data is typically stored with multiple replicas to improve availability and fault tolerance. Typical scenarios include: distributed databases (such as Google Spanner), distributed storage systems (such as HDFS), distributed computing frameworks, etc.
Specific Challenges
- Network Partitioning: Communication between nodes may be interrupted
- Message Delay: Operation instructions arrive at different nodes at inconsistent times
- Node Failure: Some nodes may fail temporarily or permanently
- Concurrent Conflicts: Multiple clients simultaneously initiate modification requests
Solution Details
Paxos guarantees consistency through the following mechanisms:
- Prepare Phase: Nodes propose values and collect majority responses
- Accept Phase: Determine the final value and obtain majority confirmation
- Learn Phase: Propagate the determined value to all replicas
Paxos can guarantee that the system presents a consistent state to the outside world as long as more than half of the nodes are working normally, even when facing network problems and node failures.
Related Concepts
Proposal Mechanism
A proposal is the core concept of the Paxos algorithm. It is an information packet containing two key components:
- Proposal ID: A globally unique identifier, usually composed of a proposal number (typically an incrementing number) and a unique identifier of the proposer (such as server ID). For example: <5, ServerA> represents proposal number 5 initiated by ServerA.
- Value: This is the final data value that the distributed system is expected to reach consensus on.
Role Division in Paxos Algorithm
The Paxos algorithm defines four key roles, each with its own responsibilities:
1. Client
- Is the initiator of external system requests
- Sends operation requests (such as read/write requests) to the distributed system
- Waits for system response and confirmation
2. Proposer
- Core responsibilities: Receive client requests and convert them into proposals, promote the acceptance process of proposals, handle coordination work when proposal conflicts occur
- Workflow:
- Prepare phase: Send prepare requests to Acceptors
- Propose phase: Send formal proposals after receiving responses from majority Acceptors
3. Acceptor
- Key functions: Receive and process Proposer requests, decide whether to accept a proposal, guarantee safety constraints
- Decision rules: Only accept proposals with numbers higher than the promised number, must remember the highest numbered proposal accepted
- Important characteristic: Usually requires a quorum of Acceptors to accept before a proposal can be selected
4. Learner
- Main tasks: Observe and learn the selected Value, propagate the final resolution to the entire system
- Implementation: Can monitor Acceptors’ decisions, or be notified of selection results by the Proposer
These roles in actual systems are usually held by server nodes concurrently. For example, a node may simultaneously play the roles of both Proposer and Acceptor.
Derivation Process
Simplest Solution: Only One Acceptor
In the implementation of the Paxos algorithm, the simplest solution is to use only one Acceptor (there can be multiple Proposers). In this configuration, the system workflow is as follows:
- Any Proposer can submit a proposal to the sole Acceptor
- The Acceptor follows a “first-come-first-accepted” principle, accepting the first proposal it receives
- Once the Acceptor accepts a proposal, that proposal is selected
- The Value contained in that proposal becomes the final selected value for the entire system
The advantage of this solution is simple implementation, ensuring the system will eventually select a value. However, this solution has serious problems:
- Single Point of Failure Risk: If this sole Acceptor crashes, the entire system will completely stop working
- Performance Bottleneck: All requests must pass through a single Acceptor for processing
- Lack of Fault Tolerance: Unable to handle exceptional situations like network partitioning
Therefore, in actual production environments, multiple Acceptors must be used to build a reliable distributed system.
Multiple Acceptors
How to ensure that a Value is selected with multiple Proposers and multiple Acceptors?
P1: An Acceptor must accept the first proposal it receives.
However, if each Proposer respectively proposes different Values and sends them to different Acceptors. According to the constraint just mentioned, Acceptors respectively accept the first proposal they receive, leading to different Values being selected, resulting in inconsistency.
Therefore, we need to add a rule: A proposal requires acceptance from more than half of the Acceptors to be selected.
Based on the above, although we now allow multiple proposals to be selected, we must ensure that all selected proposals have the same Value; otherwise, inconsistency will occur again.
P2: If a proposal with Value v is selected, then every selected proposal with a higher number must also have Value v.
A proposal can only be selected if it is accepted by an Acceptor, so we can rewrite the P2 constraint into a constraint on proposals accepted by Acceptors, P2a.
P2a: If a proposal with Value v is selected, then every proposal with a higher number accepted by an Acceptor must also have Value v.
As long as P2a is satisfied, P2 is satisfied.
However, consider the following situation: Suppose there are 5 Acceptors in total. The Proposer proposes a proposal [M1, V1]. Acceptors 2 through 5 (more than half) accept the proposal. Therefore, for Acceptors 2 through 5 and Proposer 2, they all believe V1 is selected. Acceptor 1 has just recovered from a crash (previously Acceptor 1 did not receive any proposals). At this point, Proposer 1 sends proposal [M2, V2] to Acceptor 1 (V2 is not equal to V1, and M2 is greater than M1). For Acceptor 1, this is the first proposal it has received. According to P1 (an Acceptor must accept the first proposal it receives), Acceptor 1 must accept this proposal, and Acceptor 1 believes V2 is selected.
This results in the following two problems:
- Acceptor 1 believes V2 is selected, while Acceptors 2 through 5 and Proposer 2 believe V1 is selected, resulting in inconsistency
- V1 is selected, but the proposal [M2, V2] with a higher number accepted by Acceptor 1 has Value V2, and V2 != V1. This contradicts P2a
Therefore, we need to strengthen the P2a constraint!
P2a is a constraint on proposals accepted by Acceptors, but proposals are actually proposed by Proposers. So we can impose a constraint on proposals proposed by Proposers to get P2b:
P2b: If a proposal with Value v is selected, then any proposal with a higher number proposed by any Proposer afterward must also have Value v.
From P2b, we can derive P2a, which then derives P2. So, how to ensure that after a proposal with Value v is selected, all proposals with higher numbers proposed by Proposers have Value v?
As long as P2c is satisfied:
P2c: For any Mn and Vn, if proposal [Mn, Vn] is proposed, then there must definitely exist a set S composed of more than half of the Acceptors that satisfies one of the following two conditions:
- Either each Acceptor in S has not accepted any proposal with a number less than Mn
- Or among all proposals with numbers less than Mn approved by all Acceptors in S, the proposal with the highest number has Value Vn
From the above, it can be seen that the process from P1 to P2c is actually a gradual strengthening of a series of conditions. If it is necessary to prove that these conditions can guarantee consistency, then reverse derivation can be performed: P2c => P2b => P2a => P2, and then consistency is guaranteed through P2 and P1.