Overview

Replication lag between a MySQL master and its replicas is a long-standing concern for database administrators and developers. As data volumes and business demands grow, the traditional single-threaded apply model can no longer keep up. Starting with version 5.6, MySQL supports parallel replication, officially called the enhanced multi-threaded slave (MTS), which is designed to reduce replication lag significantly.

Problems with Traditional Replication Architecture

In the standard master-slave replication architecture, replicas mainly rely on two key threads:

  1. IO Thread: Responsible for receiving binlog events from the master and writing to the replica’s relay log
  2. SQL Thread: Responsible for reading events from the relay log and executing them on the replica

Each of these is a single thread. The master executes writes from many concurrent sessions, but the replica replays them one at a time, so under heavy DML load the SQL Thread easily becomes the bottleneck and replication lag keeps growing.

Parallel Replication Evolution

MySQL 5.6 Initial Implementation

Version 5.6 introduced database-based parallel replication for the first time. Its core idea is:

  • DML operations on different schemas can execute in parallel
  • Configure slave_parallel_workers parameter to set the number of worker threads
  • Main applicable scenarios: When multiple businesses use different schemas

Limitations: Operations within a single schema still execute serially, so the improvement is limited when most writes go to one schema.

MySQL 5.7 Enhancement

Version 5.7 introduced parallel replication based on a logical clock:

  • Introduced slave_parallel_type=LOGICAL_CLOCK configuration
  • Uses group commit information to determine which transactions can execute in parallel
  • Added binlog_group_commit_sync_delay and binlog_group_commit_sync_no_delay_count parameters to control master group commit behavior

Advantages: Transactions within the same schema can also execute in parallel, significantly improving performance.

MySQL 8.0 Optimization

Version 8.0 went a step further with writeset-based parallel replication:

  • Controlled through the binlog_transaction_dependency_tracking parameter, which is set on the master
  • Can select WRITESET or WRITESET_SESSION mode (the default, COMMIT_ORDER, keeps the 5.7 group commit behavior)
  • Automatically identifies non-conflicting transactions to achieve finer-grained parallelism

Characteristics: Parallelism does not depend on how transactions happen to group at commit time, so there is no need to delay commits on the master, and the degree of parallelism is higher.

Application Scenarios and Configuration Suggestions

Applicable Scenarios

  1. OLTP systems with high master write pressure
  2. Report systems requiring low-latency real-time data
  3. Replicas in read-write separation architecture

Configuration Suggestions

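-- MySQL 5.7+ recommended configuration (on the replica unless noted)
slave_parallel_workers=8              -- Set according to CPU core count
slave_parallel_type=LOGICAL_CLOCK     -- Only DATABASE or LOGICAL_CLOCK are valid; WRITESET is not a value of this parameter
binlog_group_commit_sync_delay=10000  -- On the master; microseconds (10000 = 10 ms); a larger value forms bigger groups but adds commit latency
-- MySQL 8.0+: for writeset mode, set binlog_transaction_dependency_tracking=WRITESET on the master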

Performance Considerations

Actual tests show that with proper configuration, parallel replication can:

  • Reduce replication delay by 60%-80%
  • Improve CPU utilization by 30%-50%
  • Leave network bandwidth demand essentially unchanged

However, note that adding too many worker threads can intensify lock contention and actually reduce performance. Adjust the number gradually and test against the actual workload.

5.6 Parallel Replication Principle

MySQL 5.6’s parallel replication is a database-level parallel execution mechanism. Its core idea is to improve replica replication efficiency through multi-threading. Specific implementation principles are as follows:

  1. Multi-database parallel mechanism:

    • When the master has multiple databases, the replica's coordinator maps each database to a worker thread
    • These workers apply transaction changes from different databases in parallel
    • For example, if the master has three databases db1, db2, and db3, up to three workers can process binlog events from these databases at the same time
  2. Implementation characteristics:

    • Within one database, transactions are applied on the replica in the same order as on the master
    • Transactions in the same database still execute serially
    • Transactions in different databases can execute in parallel, so their relative commit order on the replica may differ from the master
    • The system tracks which databases each transaction touches, so dependent transactions never run concurrently
  3. Applicable scenarios:

    • Multi-tenant systems, each tenant uses an independent database
    • Business systems split into different databases by functional modules
    • Multiple independent data marts in data warehouse environments
  4. Performance impact:

    • For single database environments, this mechanism cannot provide performance improvement
    • The more databases, the more obvious the parallel effect
    • Recommended configuration: set slave_parallel_workers to 1.5-2 times the number of databases
  5. Configuration method:

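-- Enable parallel replication with four worker threads
SET GLOBAL slave_parallel_workers = 4;
-- MySQL 5.6 has no slave_parallel_type variable; database-level parallelism is its only mode.
-- On 5.7+, the equivalent setting is: SET GLOBAL slave_parallel_type = 'DATABASE';
-- Restart the SQL thread so the new worker count takes effect
STOP SLAVE SQL_THREAD;
START SLAVE SQL_THREAD;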
  6. Limitations:
    • Cannot handle single large database performance bottlenecks
    • Cross-database transactions still need serial processing
    • Limited support for DDL operations

Although this parallel replication mechanism is simple, it can bring significant replication performance improvement for typical web application architectures (such as each customer having an independent database), reducing replication delay by an average of 30%-50%.
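The database-level dispatch described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the server's actual code: the function name assign_by_database and the least-loaded placement policy are hypothetical. What it shows is the key property of the 5.6 scheme: every transaction for a given database goes to the same worker, so order within a database is preserved while different databases proceed independently.

```python
from collections import defaultdict


def assign_by_database(transactions, num_workers):
    """Map each transaction to a worker so one database always uses one worker.

    transactions: list of (txn_id, database_name) in binlog order.
    Returns {worker_index: [txn_id, ...]} with per-worker order preserved.
    """
    db_to_worker = {}            # database -> worker, fixed once chosen
    queues = defaultdict(list)   # worker -> transactions queued for it
    for txn_id, db in transactions:
        if db not in db_to_worker:
            # Hypothetical policy: a new database goes to the least-loaded worker.
            db_to_worker[db] = min(range(num_workers), key=lambda w: len(queues[w]))
        queues[db_to_worker[db]].append(txn_id)
    return dict(queues)


binlog = [(1, "db1"), (2, "db2"), (3, "db1"), (4, "db3"), (5, "db2")]
plan = assign_by_database(binlog, num_workers=3)
print(plan)

# A single-database workload ends up on one worker, however many exist.
single = assign_by_database([(1, "app"), (2, "app"), (3, "app")], num_workers=4)
print(single)
```

In the second call all three transactions land on worker 0 and the other workers stay idle, which is the single-database limitation listed above.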

5.7 Parallel Replication Principle

MySQL 5.7 implemented parallel replication based on group commit, the first mode that can parallelize work within a single database. Compared to previous versions, 5.7’s parallel replication has the following major improvements:

Working Principles

  1. Group commit mechanism: On the Master side, transactions in the same group commit are written to the binlog with the same last_committed value; these transactions can be replayed in parallel on the Slave
  2. Binary log improvements: The binlog gained last_committed and sequence_number fields that identify which transactions may run in parallel
  3. Coordination mechanism: The Slave’s SQL thread acts as coordinator; it parses these fields and assigns transactions that can execute in parallel to different worker threads
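As a rough illustration of the coordination step, the sketch below groups transactions into "waves" that could be handed to workers together. It is a simplification under stated assumptions: the function name schedule_waves is hypothetical, and the real applier does not work in strict waves but keeps starting new transactions as earlier ones finish. The rule it models is that a transaction may start once every transaction with a sequence_number up to its last_committed has finished.

```python
def schedule_waves(transactions):
    """Group transactions into waves that may be applied in parallel.

    transactions: list of (sequence_number, last_committed) in binlog order.
    A transaction joins the current wave only if everything it depends on
    (sequence_number <= its last_committed) finished before the wave started.
    """
    waves = []
    current = []
    for seq, last_committed in transactions:
        if current and last_committed >= current[0][0]:
            # Depends on a transaction inside the current wave: close the wave.
            waves.append([s for s, _ in current])
            current = []
        current.append((seq, last_committed))
    if current:
        waves.append([s for s, _ in current])
    return waves


# Three transactions committed in one group, then two, then one.
binlog = [(1, 0), (2, 0), (3, 0), (4, 3), (5, 3), (6, 5)]
waves = schedule_waves(binlog)
print(waves)
```

With this input, transactions 1 to 3 form the first wave, 4 and 5 the second, and 6 runs alone, which mirrors how the groups were formed on the Master.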

Core Advantages

  • Preserves commit order when required: With slave_preserve_commit_order=ON, transactions commit on the Slave in the same order as on the Master; without it, transactions in the same group may commit in a different order
  • Breaks through database-level limitations: Unlike 5.6’s database-based parallel replication, 5.7 can apply transactions within the same database in parallel
  • Configurable parallelism: The number of worker threads is set by the slave_parallel_workers parameter; how many of them are busy at once depends on how many transactions the Master grouped together

Practical Application Scenarios

  1. High-concurrency write environments: When Master has large concurrent writes, Slave can maintain near real-time synchronization
  2. Large transaction processing: Multiple large transactions can replay in parallel, significantly improving replication efficiency
  3. Multi-table operation scenarios: Even when operating different tables in the same database, parallel replication can be achieved

Performance Comparison

Tests show that in typical OLTP scenarios:

  • Compared to version 5.6, 5.7 parallel replication performance improved by 5-10 times
  • Delay time reduced by more than 80%
  • Higher resource utilization, CPU multi-core advantages are fully utilized

Configuration Parameters

Key configuration parameters include:

  • slave_parallel_type = LOGICAL_CLOCK
  • slave_parallel_workers = (recommended to set to 2-4 times CPU core count)
  • slave_preserve_commit_order = ON (makes transactions commit in the Master’s order; in 5.7 this also requires log_bin and log_slave_updates to be enabled on the Slave)

This group commit-based parallel replication mechanism enabled MySQL 5.7 to achieve a qualitative leap in master-slave replication performance, providing reliable guarantee for data synchronization in high-concurrency scenarios.

MySQL 5.7 Group Commit Parallel Replication Implementation Mechanism

MySQL 5.7’s parallel replication significantly improved replication performance by introducing a transaction grouping mechanism. Its core implementation principles are as follows:

1. Transaction Grouping Mechanism

  • Master processing: When multiple transactions enter the commit phase simultaneously, the system groups these transactions into one group
  • Binary log marking: When writing to the binary log, a special group commit marker is added for transactions in the same group
  • In-group transaction characteristics: Transactions in the same group satisfy the following conditions:
    • No lock conflicts
    • Do not modify the same data rows
    • Can safely execute in parallel

2. Implementation Principles

  • Prepare phase as the safety check: A transaction that has reached the prepare phase already holds all the row locks it needs, so transactions that are in the prepare phase at the same time cannot conflict with each other
  • Parallel commit judgment: All transactions that are in the prepare phase together are therefore considered safe to commit in parallel
  • Binary log writing: These transactions are written to the binary log as one group

3. Replica Parallel Execution

  • Group information parsing: The replica identifies transaction groups that can execute in parallel by parsing group commit information in the binary log
  • Parallel replay: Transactions in the same group are executed in parallel by different worker threads
  • Execution guarantee: The system ensures that transactions within the group will not:
    • Generate data conflicts
    • Cause deadlocks
    • Violate transaction isolation levels

4. Technical Advantages

Compared with traditional replication solutions, this implementation has the following advantages:

  1. Needs no separate conflict detection: Transactions that group-committed together on the master are already known not to conflict
  2. Simplifies concurrency control: No longer needs complex concurrency algorithms and waiting strategies
  3. Significantly improves performance: Higher parallelism, better resource utilization

5. Application Scenario Examples

  • E-commerce system’s high-concurrency order processing
  • Social media user behavior log recording
  • Financial system’s batch transaction processing

This innovative parallel replication approach represents a significant advancement in MySQL replication technology, providing more efficient replication solutions for high-concurrency scenarios.

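The prepare phase mentioned above comes from InnoDB's commit protocol. With the binlog enabled, InnoDB transactions commit through an internal two-phase commit (2PC) that keeps the redo log and the binlog consistent, which is an important mechanism for ensuring transaction atomicity and durability. The process is divided into two phases: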

  1. Prepare phase:

    • Transaction writes to redo log (marked as prepare state)
    • At this point, the transaction has completed all modification operations but has not yet finally confirmed commit
    • InnoDB ensures redo log is persisted to disk
  2. Commit phase:

    • Transaction writes to binlog (binary log)
    • Marks redo log as commit state
    • Completes transaction commit
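One way to see why the two phases give atomicity is to look at what happens after a crash. The sketch below encodes the recovery rule as it is commonly described for MySQL: a transaction that is only prepared in the redo log is committed if it is fully present in the binlog and rolled back otherwise. The function name recover and the string states are hypothetical, chosen only for illustration.

```python
def recover(redo_state, in_binlog):
    """Decide the fate of a transaction found during crash recovery.

    redo_state: "prepare" or "commit", the last state recorded in the redo log.
    in_binlog:  True if the transaction was fully written to the binlog.
    """
    if redo_state == "commit":
        # Both phases completed before the crash; nothing left to decide.
        return "already committed"
    if redo_state != "prepare":
        raise ValueError(f"unknown redo state: {redo_state!r}")
    # Prepared but not marked committed: the binlog acts as the tie-breaker.
    return "commit" if in_binlog else "rollback"


# Crash after the binlog write but before the redo commit mark.
print(recover("prepare", in_binlog=True))
# Crash after prepare but before the binlog write.
print(recover("prepare", in_binlog=False))
```

Because the binlog is also what replicas receive, this rule keeps the master and its replicas in agreement about which transactions exist.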

In MySQL 5.6, master-slave replication used database-based parallel replication, where transactions from different databases can execute in parallel on the replica. The parallelism of this approach depends on the number of databases in the instance.

To improve replication performance, MySQL 5.7 introduced a new variable, slave_parallel_type, which accepts the following two values:

  • DATABASE (default value):

    • Maintains the same database-based parallel replication as version 5.6
    • Suitable for multi-database environments, such as database sharding scenarios
    • Example: If there are db1, db2, db3 three databases, transactions from these three databases can execute in parallel on the replica
  • LOGICAL_CLOCK:

    • Parallel replication based on group commit
    • Can achieve higher parallelism in single database scenarios
    • Working principle: Identifies transactions on the master that entered the prepare phase simultaneously, these transactions can execute in parallel on the replica
    • Performance advantage: In OLTP scenarios, especially with high-concurrency writes to a single database, can significantly improve replication performance

The choice of which mode to use in practical applications needs to consider:

  • Database architecture (multi-database or single database)
  • Workload characteristics (OLTP or OLAP)
  • Sensitivity to replication delay

In MySQL 5.7 and higher versions, it is recommended to use LOGICAL_CLOCK mode in high-concurrency write scenarios to obtain better replication performance.

How to Know if Transactions are in the Same Group?

MySQL determines whether transactions belong to the same group commit primarily through the following mechanism:

  1. Group commit identifier mechanism:

    • Transactions within the same group share the same group commit identifier
    • This identifier is recorded in the transaction metadata
    • The system determines whether transactions belong to the same group by comparing this identifier
  2. Specific implementation method:

    • When a transaction enters the prepare phase, it records the sequence number of the most recently committed transaction as its last_committed value
    • When it commits, it receives its own sequence_number; both values are written to the binlog together with the transaction
    • During replication, the replica reads these values to determine which transactions belong to the same group

How Does Generated Binlog Tell Slave Which Transactions Can Be Replicated in Parallel?

MySQL marks transactions that can be replicated in parallel in Binlog through the following methods:

  1. GTID event as carrier:

    • Each transaction in the binlog starts with a GTID event (an anonymous one when GTID mode is off)
    • This event carries the transaction's last_committed and sequence_number values
    • Slave determines transaction parallelism relationships from these two values, not from the GTID numbers themselves
  2. Specific marking method:

    • last_committed: the sequence number of the latest transaction that had committed when this transaction entered the prepare phase
    • sequence_number: this transaction's own number in commit order
    • Transactions that share a last_committed value form one group
  3. Communication protocol:

    • Master passes group commit information to Slave through binlog events
    • Slave’s parallel replication threads parse this information
    • Determine how to apply these transactions in parallel based on parsing results

MySQL 5.7 Implementation Details

In MySQL 5.7, group commit information is transmitted as follows:

  1. GTID scheme:

    • With GTID enabled, group commit information is stored in the GTID_LOG_EVENT that precedes each transaction
    • The GTID itself has the form source_id:transaction_id
    • last_committed and sequence_number are separate fields of the event; they are not derived from transaction_id
  2. Anonymous GTID scheme:

    • For scenarios where GTID is not enabled (GTID_MODE=OFF)
    • Introduced ANONYMOUS_GTID_LOG_EVENT event type
    • This event contains group commit information similar to GTID
    • But does not expose GTID identifiers visible to users
  3. Implementation characteristics:

    • Supports group commit information transmission regardless of whether GTID is enabled
    • Ensures backward compatibility
    • Transparent to users, no additional configuration needed
    • Stored in binary form in Binlog, does not occupy much space
  4. Typical application scenarios:

    • Master-slave replication environments
    • Binlog-based recovery scenarios
    • Database migration processes
    • Data acquisition for data synchronization tools

Analyzing the binlog with the mysqlbinlog tool reveals the group commit information. Compared with earlier versions, each transaction in the MySQL 5.7 binlog carries two additional fields, last_committed and sequence_number. sequence_number is the transaction's own commit number; last_committed is the number of the most recent transaction that had committed when this transaction was prepared. Transactions with the same last_committed value belong to one group and can be replayed in parallel.
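To inspect groups in practice, the text printed by mysqlbinlog can be scanned for these two fields. The sketch below is a minimal example: the SAMPLE lines imitate the general shape of mysqlbinlog output for GTID events, but the timestamps, positions, and numbers are invented for illustration, and the helper name group_by_last_committed is hypothetical.

```python
import re
from collections import defaultdict

# Illustrative lines only: values are made up, not taken from a real server.
SAMPLE = """\
#230101 10:00:00 server id 1  end_log_pos 219  GTID  last_committed=0  sequence_number=1
#230101 10:00:00 server id 1  end_log_pos 467  GTID  last_committed=0  sequence_number=2
#230101 10:00:00 server id 1  end_log_pos 715  GTID  last_committed=0  sequence_number=3
#230101 10:00:01 server id 1  end_log_pos 963  GTID  last_committed=3  sequence_number=4
#230101 10:00:01 server id 1  end_log_pos 1211 GTID  last_committed=3  sequence_number=5
"""

# The two fields appear next to each other, separated by whitespace.
PATTERN = re.compile(r"last_committed=(\d+)\s+sequence_number=(\d+)")


def group_by_last_committed(text):
    """Return {last_committed: [sequence_number, ...]} in order of appearance."""
    groups = defaultdict(list)
    for match in PATTERN.finditer(text):
        last_committed, sequence_number = map(int, match.groups())
        groups[last_committed].append(sequence_number)
    return dict(groups)


groups = group_by_last_committed(SAMPLE)
for last_committed, members in groups.items():
    print(f"last_committed={last_committed}: {len(members)} transactions {members}")
```

Large groups indicate that the replica has room to apply transactions in parallel; groups of size one mean the workload is effectively serial under LOGICAL_CLOCK.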

8.0 Parallel Replication Principle

MySQL 8.0 uses writeset-based parallel replication. For each transaction, MySQL stores information about the records it modifies (hash values of their primary keys) in a set, the writeset. When a transaction commits, the hashes of its modified primary keys are compared with those recorded for previously committed transactions to check for row conflicts. This determines the dependency relationships: transactions that do not conflict can be replayed in parallel. Conflict detection therefore works at the row level, which is finer-grained and allows a higher degree of parallelism.
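The dependency calculation can be sketched as follows. This is a simplified model under stated assumptions, not the server's implementation: (table, primary_key) tuples stand in for the row hashes, the history is unbounded (the server limits it, see binlog_transaction_dependency_history_size), and the function name assign_last_committed is hypothetical. Each transaction's last_committed becomes the sequence number of the most recent earlier transaction that touched any of the same rows.

```python
def assign_last_committed(transactions):
    """Derive last_committed for each transaction from row-level write sets.

    transactions: list of write sets in commit order; each write set is a set
    of (table, primary_key) tuples standing in for the row hashes.
    Returns a list of (sequence_number, last_committed).
    """
    history = {}   # row key -> sequence_number of the last transaction that wrote it
    result = []
    for sequence_number, writeset in enumerate(transactions, start=1):
        # Depend on the latest earlier writer of any row in this write set.
        last_committed = max((history.get(row, 0) for row in writeset), default=0)
        for row in writeset:
            history[row] = sequence_number
        result.append((sequence_number, last_committed))
    return result


txns = [
    {("orders", 1)},                  # 1: new row
    {("orders", 2)},                  # 2: different row, independent of 1
    {("orders", 1), ("stock", 7)},    # 3: rewrites orders/1, depends on 1
    {("stock", 8)},                   # 4: untouched row, independent
]
deps = assign_last_committed(txns)
print(deps)
```

Transactions 1, 2, and 4 all get last_committed=0 and can run in parallel even if they were committed one after another on the master, which is exactly what group-commit-based tracking cannot express.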

Semi-Synchronous Replication

Background

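Traditional asynchronous replication may lose data: the master acknowledges a commit without waiting for any replica, so transactions that have not yet reached a replica are lost if the master crashes.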

Principles

  1. Master executes transaction and writes to binlog
  2. Master waits for at least one replica confirmation
  3. After receiving confirmation, master completes transaction commit

Transaction Write Process

  1. InnoDB Redo File Write (Prepare Write)
  2. Binlog File Flush & Sync to Binlog File
  3. Waiting for replica confirmation
  4. InnoDB Redo File Commit (Commit Write)
  5. Return client response

Types of Semi-Synchronous Replication

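  1. After-Commit: Introduced in MySQL 5.5; the master waits for the replica's confirmation after the storage engine commit, so other sessions may see data that no replica has received yet
  2. After-Sync: Introduced in MySQL 5.7, safer; the master waits after the binlog sync but before the storage engine commit, which is the order shown in the write process above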
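The difference between the two types is only where the wait for the replica sits in the commit sequence. The sketch below lists the steps for each wait point; the function name commit_steps and the step labels are hypothetical, while AFTER_SYNC and AFTER_COMMIT are the values of the rpl_semi_sync_master_wait_point variable.

```python
def commit_steps(wait_point):
    """Return the ordered commit steps on the master for a semi-sync wait point."""
    steps = ["redo log prepare", "binlog flush and sync"]
    if wait_point == "AFTER_SYNC":
        # Wait first: the change becomes visible only after a replica has it.
        steps += ["wait for replica ACK", "engine commit"]
    elif wait_point == "AFTER_COMMIT":
        # Commit first: other sessions can read the change during the wait.
        steps += ["engine commit", "wait for replica ACK"]
    else:
        raise ValueError(f"unknown wait point: {wait_point!r}")
    steps.append("return to client")
    return steps


for mode in ("AFTER_SYNC", "AFTER_COMMIT"):
    print(mode, "->", " / ".join(commit_steps(mode)))
```

In both modes the client waits for the acknowledgement; After-Sync additionally prevents other sessions from reading a change that could still be lost in a failover.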