TL;DR

  • Scenario: High concurrency / massive small files (trunk) / single disk or RAID
  • Conclusion: Upgrade to v5.04+, increase max_connections and align with nofile; total threads ≈ CPU cores; reduce directory levels; adjust sync parameters based on latency targets.
  • Output: Deployable parameter baseline + version notes + error quick reference.

Core Content

1. max_connections Configuration

  • File: tracker.conf, storage.conf
  • Default: 256
  • Key Change: Since v5.04, connection buffers are pre-allocated incrementally (tracker: 1024 per step, storage: 256 per step) instead of all at once
  • Recommendation: Set to 10240 or higher, and keep it within the system nofile limit

System Limit Settings:

vi /etc/security/limits.conf
* soft nofile 65535
* hard nofile 65535
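
To sanity-check the relationship between max_connections and nofile, the following minimal Python sketch compares a planned max_connections value against the descriptor limit of the current process. The function name fd_headroom and the reserve of 256 descriptors (an allowance for log, binlog, and data file handles) are illustrative assumptions, not FastDFS settings.

```python
import resource


def fd_headroom(max_connections: int, nofile_soft: int, reserve: int = 256) -> int:
    """Return the descriptors left over after max_connections plus a reserve.

    `reserve` is an assumed allowance for non-socket file handles;
    it is not a FastDFS parameter.
    """
    return nofile_soft - (max_connections + reserve)


if __name__ == "__main__":
    # RLIMIT_NOFILE is the per-process open-file limit (Unix only).
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        print("nofile soft limit is unlimited")
    else:
        spare = fd_headroom(10240, soft)
        status = "ok" if spare >= 0 else "raise nofile or lower max_connections"
        print(f"soft={soft} hard={hard} spare={spare}: {status}")
```

Run it as the same user, and from the same kind of session, that starts the FastDFS daemons, since limits.conf changes only take effect for new login sessions.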

2. work_threads Configuration

Recommended Formula:

  • Tracker: work_threads + 1 ≈ CPU cores
  • Storage: work_threads + 1 + (disk_reader_threads + disk_writer_threads) × store_path_count ≈ CPU cores
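
The two formulas can be turned into a small calculator. The sketch below is only an arithmetic illustration of the formulas above; the function names are hypothetical, and the floor of one work thread is an assumption for machines with very few cores.

```python
import os


def storage_thread_total(work_threads: int, disk_reader_threads: int,
                         disk_writer_threads: int, store_path_count: int) -> int:
    """Total storage threads: work_threads + 1 + (readers + writers) * store paths."""
    io_threads = (disk_reader_threads + disk_writer_threads) * store_path_count
    return work_threads + 1 + io_threads


def suggest_storage_work_threads(cpu_cores: int, disk_reader_threads: int,
                                 disk_writer_threads: int, store_path_count: int) -> int:
    """Solve the storage formula for work_threads so the total is about cpu_cores."""
    io_threads = (disk_reader_threads + disk_writer_threads) * store_path_count
    # Assumed floor: never suggest fewer than one work thread.
    return max(1, cpu_cores - 1 - io_threads)


def suggest_tracker_work_threads(cpu_cores: int) -> int:
    """Solve the tracker formula: work_threads + 1 is about cpu_cores."""
    return max(1, cpu_cores - 1)


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    # Example layout: 1 reader and 1 writer per store path, 2 store paths.
    wt = suggest_storage_work_threads(cores, 1, 1, 2)
    print(f"cores={cores} storage work_threads={wt} "
          f"total={storage_thread_total(wt, 1, 1, 2)}")
```

For example, on an 8-core storage node with 1 reader, 1 writer, and 2 store paths, the I/O threads take 4, leaving work_threads = 3 for a total of 8.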

3. subdir_count_per_path

  • Default: 256 (creates 256 × 256 = 65,536 directories)
  • Recommendation: For trunk scenarios with massive small files, reduce to 32 (creates 32 × 32 = 1,024 directories)
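
The directory counts above follow from the two-level layout, as this short Python sketch shows. The function name is hypothetical, and multiplying by the number of store paths is an assumption based on the parameter being a per-path setting.

```python
def leaf_directories(subdir_count_per_path: int, store_path_count: int = 1) -> int:
    """Second-level directories created for a two-level layout.

    Each store path gets subdir_count_per_path first-level directories,
    each holding subdir_count_per_path second-level directories.
    """
    return subdir_count_per_path ** 2 * store_path_count


if __name__ == "__main__":
    for n in (256, 32):
        print(f"subdir_count_per_path={n}: {leaf_directories(n)} directories per store path")
```

Going from 256 to 32 cuts the directory count by a factor of 64, which is where the savings in initialization and traversal time come from.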

4. Disk I/O Threads

Scenario    | Configuration
Single disk | disk_reader_threads = 1, disk_writer_threads = 1
RAID        | Enable disk_rw_separated, appropriately increase read/write threads

5. Sync Parameters

  • sync_binlog_buff_interval: Default 60s; lower to 10-30s to cut sync latency
  • sync_wait_msec: Default 200ms; lower to 50-100ms so new sync tasks are discovered sooner
  • sync_interval: Default 0ms; set to 1-5ms to smooth I/O peaks
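
One way to apply such values consistently across nodes is to patch the configuration text programmatically. The Python sketch below is a hypothetical helper, not a FastDFS tool; it assumes storage.conf uses plain "key = value" lines with "#" comments, and the sample values are picked from the ranges above purely for illustration.

```python
import re

# Matches an uncommented "key = value" or "key=value" line and captures the key.
KEY_RE = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*)\s*=")


def apply_overrides(conf_text: str, overrides: dict) -> str:
    """Replace matching keys in conf_text; append keys that were not present."""
    pending = dict(overrides)
    out = []
    for line in conf_text.splitlines():
        match = KEY_RE.match(line)
        if match and match.group(1) in pending:
            key = match.group(1)
            out.append(f"{key} = {pending.pop(key)}")
        else:
            # Comments, blank lines, and untouched keys pass through unchanged.
            out.append(line)
    for key, value in pending.items():
        out.append(f"{key} = {value}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "# sync settings\nsync_wait_msec = 200\nsync_interval=0\n"
    tuned = apply_overrides(sample, {
        "sync_binlog_buff_interval": 10,  # seconds
        "sync_wait_msec": 50,             # milliseconds
        "sync_interval": 1,               # milliseconds
    })
    print(tuned, end="")
```

Whatever values are chosen, they should be validated under a load test before rollout, since lower intervals trade latency for extra disk and network activity.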

Optimization Checklist

  1. Version Check: Use v5.04+ so connection buffers are allocated incrementally rather than all at once
  2. Connection Count and FD Limit: Set max_connections (5k-10k+) and nofile ≥ max_connections
  3. Threads and CPU: Match total thread count to CPU cores using the formula above
  4. Directory Structure: In trunk scenarios, reduce subdir_count_per_path to about 32
  5. Disk I/O: Single disk = 1 reader thread and 1 writer thread; RAID = enable disk_rw_separated and increase threads within the CPU budget
  6. Sync Parameters: Adjust based on latency tolerance, verify with pressure testing

Error Quick Reference

Symptom                               | Root Cause                           | Fix
"Too many open files" startup error   | nofile/LimitNOFILE too small         | Increase limits.conf, restart
High peak connections, 4xx/5xx        | max_connections too small            | Increase max_connections, align with nofile
High CPU but low QPS                  | Too many work_threads                | Reduce to ≈ CPU cores
Slow initialization/traversal         | subdir_count_per_path too large      | Reduce to about 32
RAID idle but low throughput          | Insufficient read/write threads      | Enable disk_rw_separated, increase threads
High cross-machine sync latency       | sync_binlog_buff_interval too large  | Reduce binlog interval to 10-30s
Disk thrashing, latency spike         | Sync/write too aggressive            | Set sync_interval to 1-5ms
Abnormal initial memory usage         | Old version allocates buffer at once | Upgrade to v5.04+