MongoDB Sharded Cluster Overview

What is Sharding

Sharding is the core technology MongoDB uses to achieve horizontal scaling, horizontally partitioning large datasets and distributing them across multiple servers.

Components of Sharding

  1. Shard: Each shard contains a subset of the dataset
  2. Config Server: Stores cluster metadata
  3. Mongos: Acts as the entry point for applications

Sharding Strategies

  1. Range sharding: Data is divided based on value ranges of the shard key
  2. Hash sharding: Uses hash function to evenly distribute data
  3. Zone sharding: Defines data distribution rules based on geographic location

Shard Key Details

Selection Criteria

  1. Cardinality: High-cardinality fields are more suitable
  2. Write distribution: Avoid hot spot write issues
  3. Query pattern: Common query conditions should include shard key fields

Shard Key Selection Recommendations

ScenarioRecommended Shard Key
User orders{user_id: “hashed”}
Query many by user{user_id: “hashed”}
Many time range queries{order_time: 1}
Device + time{device_id:1, ts:1}

Minimal Runnable Example

Shard Operations

// Execute on mongos
sh.addShard("rs0/rs0a:27018,rs0b:27018,rs0c:27018")

sh.enableSharding("shop")
db = db.getSiblingDB("shop")
db.createCollection("orders")

// Shard key: hashed example
sh.shardCollection("shop.orders", { user_id: "hashed" })

Pre-split + Balancing

sh.splitAt("shop.orders", { user_id: NumberLong(0) })
sh.splitAt("shop.orders", { user_id: NumberLong(1e9) })
sh.startBalancer()

Verify Distribution

sh.status()
db.orders.getShardDistribution()

FAQ

  • Can shard key be changed? ≥5.0 can use online reshardCollection
  • Cross-shard aggregation slow? Prioritize pushing filter conditions to shard key