TL;DR
- Scenario: High concurrency / massive small files (trunk) / single disk or RAID
- Conclusion: Upgrade to v5.04+; raise max_connections and keep the nofile limit above it; keep total threads ≈ CPU cores; reduce subdir_count_per_path; tune sync parameters to your latency targets.
- Output: Deployable parameter baseline + version notes + error quick reference.
Core Content
1. max_connections Configuration
- File: tracker.conf, storage.conf
- Default: 256
- Key Change: Since v5.04, connection buffers are pre-allocated incrementally (tracker: 1024 per allocation step, storage: 256 per allocation step) instead of all at once
- Recommendation: Set to 10240 or higher, and raise the system nofile limit to match
System Limit Settings:
vi /etc/security/limits.conf
* soft nofile 65535
* hard nofile 65535
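To sanity-check the relationship between max_connections and nofile, a minimal sketch follows. The `reserve` of 512 descriptors for logs, binlogs and data files is an assumed safety margin, not a FastDFS constant, and `fd_headroom` and `current_soft_nofile` are hypothetical helper names. Note that the script reads the limit of the process running it, so run it as the same user (and through the same service manager) as the FastDFS daemons.

```python
import resource


def fd_headroom(max_connections, nofile_soft, reserve=512):
    """Descriptors left over after max_connections plus a reserve.

    `reserve` is an assumed margin for non-socket descriptors (logs,
    binlogs, data files). A negative result means nofile is too low.
    """
    return nofile_soft - (max_connections + reserve)


def current_soft_nofile():
    """Soft RLIMIT_NOFILE of the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


if __name__ == "__main__":
    soft = current_soft_nofile()
    if soft == resource.RLIM_INFINITY:
        print("nofile is unlimited")
    else:
        print("headroom for max_connections=10240:", fd_headroom(10240, soft))
```

With the limits.conf values above (65535) and max_connections = 10240, the headroom is comfortably positive; with a typical default soft limit of 1024 it is negative.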
2. work_threads Configuration
Recommended Formula:
- Tracker: work_threads + 1 ≈ CPU cores
- Storage: work_threads + 1 + (disk_reader_threads + disk_writer_threads) × store_path_count ≈ CPU cores
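The two formulas above can be solved for work_threads. The sketch below does only that arithmetic; the function names are hypothetical, and clamping the result to a minimum of 1 is an assumption for small machines where the formula would otherwise yield zero or a negative value.

```python
def tracker_work_threads(cpu_cores):
    """Solve work_threads + 1 ≈ cpu_cores for the tracker."""
    return max(1, cpu_cores - 1)


def storage_work_threads(cpu_cores, disk_reader_threads=1,
                         disk_writer_threads=1, store_path_count=1):
    """Solve the storage formula for work_threads.

    work_threads + 1 + (readers + writers) * store_path_count ≈ cpu_cores
    """
    io_threads = (disk_reader_threads + disk_writer_threads) * store_path_count
    # Clamp to 1 (assumption): the server still needs one worker thread.
    return max(1, cpu_cores - 1 - io_threads)


if __name__ == "__main__":
    # Example: 8 cores, two store paths, one reader and one writer per path.
    print(tracker_work_threads(8))           # 7
    print(storage_work_threads(8, 1, 1, 2))  # 3
```

On an 8-core storage node with two store paths, the disk threads already take 4 cores, which leaves 3 for work_threads.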
3. subdir_count_per_path
- Default: 256 (creates 256 × 256 = 65,536 directories under each store path)
- Recommendation: For trunk scenarios with massive small files, reduce to 32 (creates 32 × 32 = 1,024 directories under each store path)
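The directory counts above follow from the two-level layout: each first-level directory contains the same number of second-level directories. A small sketch of that arithmetic, using a hypothetical helper name and assuming the count is repeated for every store path:

```python
def leaf_dir_count(subdir_count_per_path, store_path_count=1):
    """Second-level data directories created across all store paths."""
    return subdir_count_per_path ** 2 * store_path_count


if __name__ == "__main__":
    print(leaf_dir_count(256))    # 65536
    print(leaf_dir_count(32))     # 1024
    print(leaf_dir_count(32, 2))  # 2048
```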
4. Disk I/O Threads
| Scenario | Configuration |
|---|---|
| Single disk | disk_reader_threads = 1, disk_writer_threads = 1 |
| RAID | Enable disk_rw_separated; raise disk_reader_threads and disk_writer_threads while keeping the total thread count ≈ CPU cores |
5. Sync Parameters
- sync_binlog_buff_interval: Default 60s; lower to 10-30s to cut sync latency
- sync_wait_msec: Default 200ms; lower to 50-100ms so new sync tasks are discovered sooner
- sync_interval: Default 0ms; set to 1-5ms to smooth I/O peaks
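One way to apply these values consistently across nodes is to rewrite the `key = value` lines of storage.conf with a script. The sketch below is illustrative only: `apply_overrides` is a hypothetical helper, it assumes the simple `key = value` line format with `#` comments, and the chosen values (10, 50, 1) are just one point inside the ranges above, to be confirmed by load testing.

```python
import re


def apply_overrides(conf_text, overrides):
    """Return conf_text with the given keys set to new values.

    Existing `key = value` lines are rewritten in place; commented-out
    lines are left alone; keys not found are appended at the end.
    """
    remaining = dict(overrides)
    out = []
    for line in conf_text.splitlines():
        match = re.match(r"^\s*([A-Za-z0-9_]+)\s*=", line)
        if match and match.group(1) in remaining:
            key = match.group(1)
            out.append(f"{key} = {remaining.pop(key)}")
        else:
            out.append(line)
    # Append any keys that were not present in the original text.
    out.extend(f"{key} = {value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "# sync settings\nsync_wait_msec = 200\nsync_interval=0\n"
    print(apply_overrides(sample, {
        "sync_binlog_buff_interval": 10,  # seconds
        "sync_wait_msec": 50,             # milliseconds
        "sync_interval": 1,               # milliseconds
    }))
```

Restart the storage daemon after changing the file so the new values take effect.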
Optimization Checklist
- ✅ Version Check: Use v5.04+ to avoid old version connection buffer issues
- ✅ Connection Count and FD Limit: Set max_connections (5k-10k+) and keep nofile above max_connections, with headroom for file and log descriptors
- ✅ Threads and CPU: Match total thread count to CPU cores using the formula above
- ✅ Directory Structure: In trunk scenarios, reduce subdir_count_per_path to about 32
- ✅ Disk I/O: Single disk = 1 reader thread and 1 writer thread; RAID = increase threads within the CPU-core budget
- ✅ Sync Parameters: Adjust based on latency tolerance, verify with pressure testing
Error Quick Reference
| Symptom | Root Cause | Fix |
|---|---|---|
| "Too many open files" startup error | nofile/LimitNOFILE too small | Raise nofile in limits.conf (or LimitNOFILE for the service), restart |
| High peak connections, 4xx/5xx | max_connections too small | Increase max_connections, align with nofile |
| High CPU but low QPS | Too many work_threads | Reduce work_threads so total threads ≈ CPU cores |
| Slow initialization/traversal | subdir_count_per_path too large | Reduce to about 32 |
| RAID idle but low throughput | Insufficient read/write threads | Enable disk_rw_separated, increase threads |
| High cross-machine sync latency | sync_binlog_buff_interval too large | Reduce binlog interval to 10-30s |
| Disk thrashing, latency spike | Sync/write too aggressive | Set sync_interval to 1-5ms |
| Abnormally high memory usage at startup | Versions before v5.04 pre-allocate all connection buffers at once | Upgrade to v5.04+ |