E-R Table Sharding Strategy Explained
1. E-R Relationship Sharding Principle
MyCat’s E-R relationship sharding strategy is an innovative solution designed based on database entity-relationship models. The core idea is to store data of associated parent and child tables in the same data shard.
2. Table Grouping Mechanism
- Group tables with association relationships into the same logical group
- Ensure associated data rows are always on the same physical node
3. Typical Application Scenarios
- Order-Order Detail scenario: Order table as parent, order detail table as child
- User-User Extension Information scenario
- Product-Product Category scenario
4. Configuration Example
<table name="orders" primaryKey="order_id" dataNode="dn1,dn2" rule="mod-long">
<childTable name="order_items" joinKey="order_id" parentKey="order_id"/>
</table>
5. Advantage Analysis
- Query performance improvement: Join queries do not need cross-node execution
- Transaction simplification: Related table operations can complete on the same node
- Data consistency: Avoids complexity from distributed transactions
Global Table
Concept and Definition
Global table is a special type of data table in distributed database systems, mainly used for storing dictionary data or configuration data.
Typical Characteristics
- Low data change frequency: Very few update operations
- Stable data scale: Number of records basically remains constant
- Moderate data volume: Single table data volume usually does not exceed 100MB
Solution
MyCat uses data redundancy mechanism for dictionary table join queries:
- All shards save complete copies of global table data
- Any update operation synchronizes to all shards
Design Principles
- Identification criteria: Data change cycle greater than 1 month, record count less than 100,000
- Implementation notes: Need to configure automatic synchronization mechanism
- Performance optimization suggestions: Establish appropriate indexes for global tables
Shard Node
Shard node refers to the actual database instance where each data shard is stored after splitting a single database table according to specific rules.
- Sharding mechanism: Data rows are allocated to different physical databases according to preset sharding rules
- Node characteristics: Each shard node is a complete database instance
- Typical deployment scenario: E-commerce system order table sharded by user ID
Node Host
Node host refers to the physical server or virtual machine that carries one or more shard nodes.
- Relationship between sharding and host: One node host can run multiple shard instances
- Concurrency limitation issue: When multiple high-load shards concentrate on the same host, resource bottlenecks easily occur
- Load balancing strategy: Shard distribution, read-write separation, weight allocation
Sharding Rules
Importance of Sharding Rules
- Even data distribution: Good sharding rules ensure data is evenly distributed across all shard nodes
- Query efficiency optimization: Reasonable sharding rules can reduce cross-shard queries
- Scalability guarantee: When expansion is needed, good sharding rules support smooth expansion
Common Sharding Rule Types
- Hash sharding: Hash calculate the sharding key to determine data distribution
- Range sharding: Shard by numeric range
- List sharding: Shard by enumerated values
- Complex sharding: Combine multiple sharding strategies