Distributed Services ACID: Two-Phase Commit (2PC) Protocol

Database Transactions

Basic Characteristics

Atomicity

Atomicity means that all operations in a transaction are executed as an indivisible whole - either all complete successfully or none are executed. If any operation fails during transaction execution, the system rolls back the database to its state before the transaction began. This is like the “all or nothing” principle in bank transfer operations.

Implementation Mechanisms:

Use Transaction Logs to record all operations
Implement rollback operations through Undo logs
Use Two-Phase Commit protocol in distributed systems

Application Example: When a user transfers 100 yuan from account A to account B:

1. Deduct 100 yuan from account A
2. Add 100 yuan to account B

If step 2 fails, the system automatically reverses step 1, ensuring both accounts remain unchanged.
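
The transfer above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production design: the `accounts` table, the `transfer` helper, and the starting balances are assumptions made for the example. The second call fails at the credit step, and the rollback restores account A.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one unit (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            cur = conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            if cur.rowcount != 1:  # the credit step touched no row: fail
                raise ValueError(f"unknown account {dst!r}")
        return True
    except ValueError:
        return False  # the debit has already been rolled back here

# Assumed schema and balances, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 500), ("B", 200)])
conn.commit()

ok = transfer(conn, "A", "B", 100)            # both steps succeed
failed = transfer(conn, "A", "missing", 100)  # step 2 fails, step 1 is undone
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

After both calls, only the first transfer is visible: the failed one leaves no trace in either account.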

Consistency

Consistency ensures that a transaction moves the database from one valid state to another, with all data constraints, rules, and integrity requirements still satisfied afterwards.

Key Elements:

Entity integrity (primary key constraints)
Referential integrity (foreign key constraints)
User-defined business rules (such as account balance cannot be negative)
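
As a small illustration of the business rule above, the sketch below lets the database reject a negative balance through a CHECK constraint. The table layout and the `withdraw` helper are assumptions for the example. A real system might enforce the rule in application code instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE accounts (
        id TEXT PRIMARY KEY,                           -- entity integrity
        balance INTEGER NOT NULL CHECK (balance >= 0)  -- business rule
    )
    """
)
conn.execute("INSERT INTO accounts VALUES ('A', 50)")
conn.commit()

def withdraw(conn, account_id, amount):
    """Return True if the withdrawal keeps the database in a valid state."""
    try:
        with conn:  # rolls back automatically when the constraint fails
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

overdraw = withdraw(conn, "A", 80)   # would leave -30, so it is rejected
allowed = withdraw(conn, "A", 30)    # leaves 20, so it is accepted
balance = conn.execute("SELECT balance FROM accounts WHERE id = 'A'").fetchone()[0]
```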

Isolation

Isolation determines how much concurrent transactions can see of, and be affected by, each other's intermediate changes, so that one transaction's execution is not disturbed by others running at the same time.

Isolation Levels (from low to high):

Read Uncommitted
Read Committed
Repeatable Read
Serializable

Concurrency Problems:

Dirty read: Transaction A reads modifications not yet committed by transaction B
Non-repeatable read: Transaction A reads the same data multiple times, and between those reads transaction B modifies and commits the data, so A sees different values
Phantom read: Transaction A reads the set of rows matching a condition, and in the meantime transaction B inserts or deletes rows matching that condition, so repeating the query returns a different set of rows
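
The relationship between the four levels and the three problems can be written down as a small lookup. The mapping below follows the SQL standard's definitions. Actual databases may be stricter at a given level, so treat it as a reference sketch. The names `Isolation`, `PREVENTED_FROM`, and `possible_anomalies` are made up for this example.

```python
from enum import IntEnum

class Isolation(IntEnum):
    READ_UNCOMMITTED = 0
    READ_COMMITTED = 1
    REPEATABLE_READ = 2
    SERIALIZABLE = 3

# Lowest isolation level at which each problem can no longer occur
# (per the SQL standard's definitions).
PREVENTED_FROM = {
    "dirty read": Isolation.READ_COMMITTED,
    "non-repeatable read": Isolation.REPEATABLE_READ,
    "phantom read": Isolation.SERIALIZABLE,
}

def possible_anomalies(level):
    """List the concurrency problems that can still occur at `level`."""
    return [name for name, safe_from in PREVENTED_FROM.items() if level < safe_from]

for level in Isolation:
    print(f"{level.name}: {possible_anomalies(level) or 'none'}")
```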

Durability

Durability ensures that once a transaction is committed, its changes are permanently saved in the database, even if system failures occur.

Implementation Methods:

Write-Ahead Logging (WAL) mechanism
Regular Checkpoints
Database backup and recovery strategies
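
The core idea of write-ahead logging is that a change is forced to a durable log before it is applied, so the state can be rebuilt by replaying the log after a crash. The sketch below is a toy key-value store written under that assumption. `TinyWAL` and its one-JSON-record-per-line format are hypothetical, and it omits checkpoints and torn-write handling.

```python
import json
import os
import tempfile

class TinyWAL:
    """Toy key-value store that logs every write before applying it."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        self._replay()  # recovery: rebuild in-memory state from the log

    def _replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # 1. Append the change to the log and force it to disk.
        with open(self.path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Only then apply it to the in-memory state.
        self.data[key] = value

log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
store = TinyWAL(log_path)
store.put("A", 400)
store.put("B", 300)

recovered = TinyWAL(log_path)  # simulates a restart after a crash
```

Opening a second `TinyWAL` on the same file stands in for a restart: the new instance has no memory of the first and recovers its state purely from the log.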

Basic Introduction

Distributed transactions follow the same concept as database transactions, so they also need to satisfy the four ACID properties (Atomicity, Consistency, Isolation, Durability). Unlike traditional local transactions, distributed transactions involve multiple independent resources or services that may be spread across different physical nodes and communicate over a network, so they are implemented and behave quite differently.

In local transactions, all operations are completed within the same database instance, and the transaction manager can directly control transaction commit or rollback. In a distributed environment, transactions may span multiple databases, microservices, or heterogeneous systems like message queues. Each participating party has its own transaction manager. This distributed characteristic brings new challenges:

Uncertainty in network communication (such as delays, packet loss)
Each party may use different transaction implementation methods
Independent failures (a single node can fail while the other nodes keep running, which can leave a transaction partially complete)

Common distributed transaction solutions include:

Two-Phase Commit (2PC): The coordinator first asks all participants whether they can commit, then tells them to commit once every participant has confirmed
Three-Phase Commit (3PC): Adds a pre-commit phase on top of 2PC to reduce blocking time
TCC (Try-Confirm-Cancel): Reserves resources in a Try step, then runs business-level Confirm or Cancel logic
Message-based eventual consistency: Uses message queues for asynchronous transaction processing

Two-Phase Commit Protocol (2PC)

2PC (Two-Phase Commit) is a distributed transaction processing mechanism used to ensure that data operations across multiple nodes either all succeed or all fail and roll back. The protocol divides the entire transaction process into two clear phases:

Prepare Phase
- The Coordinator sends prepare requests to all Participants
- Each participant executes transaction operations but does not commit, writing Undo/Redo information to transaction logs
- Participants lock relevant resources to ensure data consistency
- Participants report their preparation results to the coordinator (agree or reject)
Commit Phase
- If all participants agree, the coordinator sends commit instructions
- Participants complete transaction commit and release resources
- If any participant rejects, the coordinator sends rollback instructions
- Participants use transaction logs to roll back operations

Prepare Phase

In this critical prepare phase, the Transaction Coordinator sends Prepare requests to all database nodes participating in the transaction. The specific process is as follows:

Prepare Message Sending: The coordinator sends a Prepare message to each database node participating in the transaction, asking whether it is ready to commit the transaction.
Local Transaction Execution:
- After each participant receives the Prepare message, they execute all operations of the transaction locally
- During execution, necessary database locks are acquired to prevent interference from other transactions
- But the transaction is not truly committed at this time
Log Recording:
- Undo Log: Records the data state before transaction modifications
- Redo Log: Records the data state after transaction modifications
- Logs are written to disk to ensure durability, allowing recovery even if the system crashes
Response Preparation Results:
- After completing the above operations, participants send their responses to the coordinator
- On success, a participant returns a “Ready” response
- On failure, a participant returns an “Abort” response
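
The participant's side of the prepare phase might look like the following sketch. It is an in-memory illustration under simplifying assumptions: the `Participant` class, the balance-delta interface, and the non-negative-balance rule are invented for the example, and the logs are plain dictionaries, whereas a real engine would force them to disk before replying. The new values are held in the Redo log here and are not applied to the committed state until a Commit arrives.

```python
class Participant:
    """Hypothetical participant holding account balances."""

    def __init__(self, balances):
        self.balances = dict(balances)  # committed state
        self.locks = set()              # keys locked by a prepared transaction
        self.undo_log = {}              # key -> value before the transaction
        self.redo_log = {}              # key -> value after the transaction

    def prepare(self, deltas):
        """Handle a Prepare message; return "Ready" or "Abort"."""
        if any(key in self.locks for key in deltas):
            return "Abort"  # another prepared transaction holds a lock
        new_values = {k: self.balances.get(k, 0) + d for k, d in deltas.items()}
        if any(value < 0 for value in new_values.values()):
            return "Abort"  # assumed business rule: no negative balance
        self.locks.update(deltas)  # acquire locks on the touched keys
        for key, value in new_values.items():
            self.undo_log[key] = self.balances.get(key, 0)  # before image
            self.redo_log[key] = value                      # after image
        return "Ready"  # prepared, but not yet committed

node = Participant({"A": 500, "B": 200})
first = node.prepare({"A": -100})     # locks A and logs 500 -> 400
conflict = node.prepare({"A": -50})   # A is still locked
overdraft = node.prepare({"B": -300}) # would make B negative
```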

Commit Phase

The commit phase determines the final outcome of the transaction. It consists of the following steps:

Decision Phase:
- The coordinator collects responses from all participants
- If any participant returns a failure response, or a response times out:
  - Send Rollback instructions to all participants
  - Participants use Undo logs to roll back transactions
- If all participants return “Ready”:
  - Coordinator writes “commit” decision to persistent storage
  - Send Commit instructions to all participants
Execution Phase:
- After participants receive Commit instructions:
  - Use Redo logs to complete transaction persistence
  - Formally commit the transaction
- After participants receive Rollback instructions:
  - Use Undo logs to restore data to pre-transaction state
  - Mark the transaction as rolled back
Resource Release:
- At the end of the transaction, all locks acquired during it must be released
- This includes row locks, table locks, and locks at any other granularity
- After release, other transactions can access related data
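
The coordinator's decision logic described above can be sketched as follows. Everything here is simulated in one process: the `Participant` stub, the `"Timeout"` vote that raises `TimeoutError`, and the list standing in for the coordinator's persistent decision log are all assumptions for illustration. This sketch records the decision before sending it in both the commit and the rollback case.

```python
class Participant:
    """Stub participant whose vote is fixed in advance (hypothetical)."""

    def __init__(self, name, vote="Ready"):
        self.name = name
        self.vote = vote          # "Ready", "Abort", or "Timeout" (simulated)
        self.locks_held = False
        self.outcome = None

    def prepare(self):
        if self.vote == "Timeout":
            raise TimeoutError(f"{self.name} did not answer in time")
        self.locks_held = self.vote == "Ready"
        return self.vote

    def finish(self, decision):
        self.outcome = decision   # Redo on "Commit", Undo on "Rollback"
        self.locks_held = False   # locks are released in both cases


def decide_and_execute(participants, decision_log):
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())
        except TimeoutError:
            votes.append("Abort")  # no answer is treated as a failure
    decision = "Commit" if all(v == "Ready" for v in votes) else "Rollback"
    decision_log.append(decision)  # record the decision before sending it
    for p in participants:
        p.finish(decision)
    return decision


log = []
good = [Participant("db1"), Participant("db2")]
slow = [Participant("db1"), Participant("db2", vote="Timeout")]
committed = decide_and_execute(good, log)
rolled_back = decide_and_execute(slow, log)
```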

2PC Execution Process

Successful Execution

Phase One

Transaction inquiry: The coordinator sends transaction content to all participants, asking if they can execute the transaction commit operation, and starts waiting for responses from participants
Execute transaction (write local Undo/Redo logs)
Each participant reports its response to the transaction inquiry to the coordinator

Phase Two

Send commit request: The coordinator sends a Commit request to all participants
Transaction commit: After participants receive the Commit request, they formally execute the transaction commit operation and release all transaction resources occupied during transaction execution after completing the commit
Feedback transaction commit results: After participants complete the transaction commit, they send an ACK message to the coordinator
Complete transaction: After the coordinator receives ACK messages from all participants, the transaction is completed

Abort Transaction

Phase One

Transaction inquiry: The coordinator sends transaction content to all participants, asking if they can execute the transaction commit operation, and starts waiting for responses from participants
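Execute transaction (write local Undo/Redo logs)
Each participant reports its response to the transaction inquiry to the coordinator; at least one participant responds with “Abort”, or the coordinator times out waiting for a response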

Phase Two

Send rollback request: The coordinator sends a Rollback request to all participants
Transaction rollback: After participants receive the Rollback request, they execute the rollback operation using the Undo information recorded in phase one, and release all resources occupied during transaction execution after completing the rollback
Feedback transaction rollback results: After participants complete the transaction rollback, they send an ACK message to the coordinator
Abort transaction: After the coordinator receives ACKs from all participants, the transaction abort is completed
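
To compare the two flows side by side, the sketch below generates the message sequence for a given set of votes. It models only who sends what to whom. The `run_2pc` function and the `(sender, receiver, message)` tuples are an assumed representation, and timeouts are left out.

```python
def run_2pc(votes):
    """votes maps participant name -> True (can commit) or False (cannot).

    Returns the coordinator's decision and the ordered message trace.
    """
    trace = []
    # Phase one: transaction inquiry and responses.
    for name in votes:
        trace.append(("coordinator", name, "Prepare"))
    for name, vote in votes.items():
        trace.append((name, "coordinator", "Ready" if vote else "Abort"))
    # Phase two: commit only if every participant is ready.
    decision = "Commit" if all(votes.values()) else "Rollback"
    for name in votes:
        trace.append(("coordinator", name, decision))
    for name in votes:
        trace.append((name, "coordinator", "ACK"))
    return decision, trace

success_decision, success_trace = run_2pc({"db1": True, "db2": True})
abort_decision, abort_trace = run_2pc({"db1": True, "db2": False})

for sender, receiver, message in abort_trace:
    print(f"{sender} -> {receiver}: {message}")
```

Both flows exchange the same number of messages. They differ only in the phase-one responses and in whether phase two carries Commit or Rollback.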

2PC Advantages and Disadvantages

Advantages

The principle is simple and the protocol is straightforward to implement

Disadvantages

Synchronous Blocking: The most obvious and most serious problem with the two-phase commit protocol is synchronous blocking. While the protocol runs, every participant in the transaction is blocked: it holds its locks and cannot do other work on those resources while it waits for the other participants to respond. This greatly limits the performance of a distributed system.
Single Point of Failure: The coordinator is critical throughout the two-phase commit process. If the coordinator fails during the commit phase, the protocol cannot make progress. Worse, participants that have already prepared keep their transaction resources locked and cannot complete the transaction until the coordinator recovers.
Data Inconsistency: Suppose the coordinator starts sending commit requests and then a partial network failure occurs, or the coordinator crashes before it has sent all of them, so that only some participants receive the commit request. Those participants commit while the others do not, which leads to serious data inconsistency.
Too Conservative: If a participant fails during the two-phase commit process and the coordinator never receives its response, the coordinator can only rely on its own timeout to decide to abort the transaction. This strategy is too conservative: the protocol has no finer-grained fault tolerance mechanism, so the failure of any single node causes the entire transaction to fail.
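
The single point and data inconsistency problems can be reproduced in a few lines. In this sketch the coordinator "crashes" after delivering Commit to only some participants. The `Participant` class, the `run` function, and the `crash_after` parameter are hypothetical names for the simulation, and no recovery protocol is modelled.

```python
class Participant:
    def __init__(self, name):
        self.name = name
        self.state = "INIT"
        self.locks = set()

    def prepare(self, keys):
        self.locks = set(keys)
        self.state = "READY"  # voted yes: can no longer decide on its own
        return "Ready"

    def on_decision(self, decision):
        self.state = "COMMITTED" if decision == "Commit" else "ABORTED"
        self.locks.clear()

    def can_start(self, key):
        """Another local transaction may proceed only if `key` is unlocked."""
        return key not in self.locks


def run(participants, keys, crash_after=None):
    """Run 2PC; deliver Commit to the first `crash_after` nodes, then crash."""
    for p in participants:
        p.prepare(keys)
    reached = participants if crash_after is None else participants[:crash_after]
    for p in reached:
        p.on_decision("Commit")


nodes = [Participant(n) for n in ("db1", "db2", "db3")]
run(nodes, {"A"}, crash_after=1)   # coordinator dies after the first Commit
states = [p.state for p in nodes]  # db1 committed, db2 and db3 still waiting
```

Here db1 has committed while db2 and db3 remain in the prepared state with their locks held, so `nodes[1].can_start("A")` stays false until some recovery mechanism tells them the outcome.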