MongoDB Sharded Cluster Overview
What is Sharding
Sharding is the core technology MongoDB uses to achieve horizontal scaling, horizontally partitioning large datasets and distributing them across multiple servers.
Components of Sharding
- Shard: Each shard contains a subset of the dataset
- Config Server: Stores cluster metadata
- Mongos: Acts as the entry point for applications
Sharding Strategies
- Range sharding: Data is divided based on value ranges of the shard key
- Hash sharding: Uses hash function to evenly distribute data
- Zone sharding: Defines data distribution rules based on geographic location
Shard Key Details
Selection Criteria
- Cardinality: High-cardinality fields are more suitable
- Write distribution: Avoid hot spot write issues
- Query pattern: Common query conditions should include shard key fields
Shard Key Selection Recommendations
| Scenario | Recommended Shard Key |
|---|---|
| User orders | {user_id: “hashed”} |
| Query many by user | {user_id: “hashed”} |
| Many time range queries | {order_time: 1} |
| Device + time | {device_id:1, ts:1} |
Minimal Runnable Example
Shard Operations
// Execute on mongos
sh.addShard("rs0/rs0a:27018,rs0b:27018,rs0c:27018")
sh.enableSharding("shop")
db = db.getSiblingDB("shop")
db.createCollection("orders")
// Shard key: hashed example
sh.shardCollection("shop.orders", { user_id: "hashed" })
Pre-split + Balancing
sh.splitAt("shop.orders", { user_id: NumberLong(0) })
sh.splitAt("shop.orders", { user_id: NumberLong(1e9) })
sh.startBalancer()
Verify Distribution
sh.status()
db.orders.getShardDistribution()
FAQ
- Can shard key be changed? ≥5.0 can use online reshardCollection
- Cross-shard aggregation slow? Prioritize pushing filter conditions to shard key