MyCat Sharding Strategies: E-R, Global Tables & Sharding ...

E-R Table Sharding Strategy Explained

1. E-R Relationship Sharding Principle

MyCat’s E-R relationship sharding strategy is an innovative solution designed based on database entity-relationship models. The core idea is to store data of associated parent and child tables in the same data shard.

2. Table Grouping Mechanism

Group tables with association relationships into the same logical group
Ensure associated data rows are always on the same physical node

3. Typical Application Scenarios

Order-Order Detail scenario: Order table as parent, order detail table as child
User-User Extension Information scenario
Product-Product Category scenario

4. Configuration Example

<table name="orders" primaryKey="order_id" dataNode="dn1,dn2" rule="mod-long">
    <childTable name="order_items" joinKey="order_id" parentKey="order_id"/>
</table>

5. Advantage Analysis

Query performance improvement: Join queries do not need cross-node execution
Transaction simplification: Related table operations can complete on the same node
Data consistency: Avoids complexity from distributed transactions

Global Table

Concept and Definition

Global table is a special type of data table in distributed database systems, mainly used for storing dictionary data or configuration data.

Typical Characteristics

Low data change frequency: Very few update operations
Stable data scale: Number of records basically remains constant
Moderate data volume: Single table data volume usually does not exceed 100MB

Solution

MyCat uses data redundancy mechanism for dictionary table join queries:

All shards save complete copies of global table data
Any update operation synchronizes to all shards

Design Principles

Identification criteria: Data change cycle greater than 1 month, record count less than 100,000
Implementation notes: Need to configure automatic synchronization mechanism
Performance optimization suggestions: Establish appropriate indexes for global tables

Shard Node

Shard node refers to the actual database instance where each data shard is stored after splitting a single database table according to specific rules.

Sharding mechanism: Data rows are allocated to different physical databases according to preset sharding rules
Node characteristics: Each shard node is a complete database instance
Typical deployment scenario: E-commerce system order table sharded by user ID

Node Host

Node host refers to the physical server or virtual machine that carries one or more shard nodes.

Relationship between sharding and host: One node host can run multiple shard instances
Concurrency limitation issue: When multiple high-load shards concentrate on the same host, resource bottlenecks easily occur
Load balancing strategy: Shard distribution, read-write separation, weight allocation

Sharding Rules

Importance of Sharding Rules

Even data distribution: Good sharding rules ensure data is evenly distributed across all shard nodes
Query efficiency optimization: Reasonable sharding rules can reduce cross-shard queries
Scalability guarantee: When expansion is needed, good sharding rules support smooth expansion

Common Sharding Rule Types

Hash sharding: Hash calculate the sharding key to determine data distribution
Range sharding: Shard by numeric range
List sharding: Shard by enumerated values
Complex sharding: Combine multiple sharding strategies