TL;DR
- Scenario: Coexisting with traditional IBM MQ; the new broker must be open source, operable, scalable, and offer consistency/reliability guarantees
- Conclusion:
  - RabbitMQ: Suits “reliability-first business decoupling”
  - RocketMQ: Suits “transaction/order/delay messages”
  - Kafka: Suits “data pipeline/log/stream processing”
- Output: Selection comparison table + deployment considerations + error quick reference
Middleware Selection Principles
Selection Criteria
- Open Source: Must be open source to ensure transparency and autonomy
- Popularity: Currently trending technology with an active community
- Core Function Requirements:
  - Message transmission reliability
  - Cluster support capability
  - Excellent performance
RabbitMQ
Overview: Based on AMQP protocol, lightweight and easy to deploy.
Advantages
- Lightweight and Efficient:
  - Small installation package (tens of MB)
  - Starts in seconds
  - Docker support
- Flexible Routing:
  - Direct Exchange: Exact routing key match
  - Fanout Exchange: Broadcast to all bound queues
  - Topic Exchange: Wildcard matching on routing keys (`*` matches one word, `#` matches zero or more)
- Multi-language Support:
  - Official support for Java, .NET, Python
  - Community support for 30+ languages
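The three exchange types listed under Flexible Routing can be illustrated with a small in-memory model. This is a sketch of the matching rules only, not RabbitMQ client code: `topic_matches` and `route` are hypothetical helper names, and the queue names and binding keys in the usage lines are invented for illustration.

```python
from typing import List, Tuple


def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if an AMQP-style topic pattern matches a routing key.

    '*' matches exactly one dot-separated word; '#' matches zero or more words.
    """
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i: int, j: int) -> bool:
        if i == len(p):
            # Pattern exhausted: match only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' absorbs zero words (skip it) or one more word (stay on it).
            return match(i + 1, j) or (j < len(k) and match(i, j + 1))
        if j == len(k):
            return False
        if p[i] == "*" or p[i] == k[j]:
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


def route(exchange_type: str, bindings: List[Tuple[str, str]], routing_key: str) -> List[str]:
    """Return the queues that would receive a message.

    bindings is a list of (queue_name, binding_key) pairs (illustrative model).
    """
    if exchange_type == "fanout":
        matched = [q for q, _ in bindings]
    elif exchange_type == "direct":
        matched = [q for q, key in bindings if key == routing_key]
    elif exchange_type == "topic":
        matched = [q for q, key in bindings if topic_matches(key, routing_key)]
    else:
        raise ValueError(f"unsupported exchange type: {exchange_type}")
    # A queue bound several times still receives the message once.
    return list(dict.fromkeys(matched))


# Hypothetical bindings for illustration.
bindings = [("billing", "order.paid"), ("audit", "order.#"), ("eu_ops", "*.created")]
direct_targets = route("direct", bindings, "order.paid")
topic_targets = route("topic", bindings, "order.created")
fanout_targets = route("fanout", bindings, "anything")
```

With these bindings, the direct exchange delivers only to `billing`, the topic exchange delivers `order.created` to `audit` and `eu_ops`, and the fanout exchange ignores the routing key entirely.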
Disadvantages
- Message Backlog Issue: Throughput drops by 50% or more once a queue exceeds 100,000 messages
- Performance Limitations:
  - Persistent messages: ~50,000-80,000 TPS
  - Non-persistent messages: ~100,000-150,000 TPS
- Erlang Tech Stack: Plugin development requires Erlang knowledge
RocketMQ
Overview: Distributed message middleware open-sourced by Alibaba, based on Java.
Typical Application Scenarios
- Ordered message processing (e.g., order status changes)
- Distributed transactional messages (e.g., e-commerce systems)
- Real-time stream computing
- Message push notifications
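Ordered processing of something like order status changes relies on every message for the same order landing in the same queue, where it is consumed sequentially. The following is a minimal in-memory sketch of that idea, not RocketMQ client code: `select_queue` and `dispatch` are hypothetical names, CRC32 is an assumed stand-in hash, and the order IDs are invented. In the RocketMQ Java client the equivalent choice is made through a `MessageQueueSelector`.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple


def select_queue(sharding_key: str, queue_count: int) -> int:
    """Map a sharding key (e.g. an order ID) to a stable queue index."""
    # CRC32 is deterministic across runs, unlike Python's built-in hash().
    return zlib.crc32(sharding_key.encode("utf-8")) % queue_count


def dispatch(events: List[Tuple[str, str]], queue_count: int) -> Dict[int, List[Tuple[str, str]]]:
    """Append each (order_id, status) event to the queue chosen by its order ID."""
    queues: Dict[int, List[Tuple[str, str]]] = defaultdict(list)
    for order_id, status in events:
        queues[select_queue(order_id, queue_count)].append((order_id, status))
    return queues


# Interleaved events for two hypothetical orders.
events = [
    ("order-A", "created"),
    ("order-B", "created"),
    ("order-A", "paid"),
    ("order-B", "paid"),
    ("order-A", "shipped"),
]
queues = dispatch(events, queue_count=4)
order_a_statuses = [s for o, s in queues[select_queue("order-A", 4)] if o == "order-A"]
```

Ordering is guaranteed only within one queue, so the sharding key should be the entity whose events must stay in sequence. Events for different orders may still be processed in any relative order.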
Advantages
- Comprehensive Features:
  - Publish/Subscribe and P2P modes
  - Message ordering, transactional messages, scheduled messages, message tracking
- Developer Friendly:
  - Java implementation, clear source code structure
  - SPI extension mechanism support
- Performance:
  - Average response time < 3ms
  - Single machine: 100,000+ TPS
  - Cluster mode: Throughput on the order of millions of TPS
Disadvantages
- Ecosystem Integration: Limited Prometheus integration
- Operations Complexity: Multi-replica synchronization is complex to configure
Kafka
Overview: Distributed stream processing platform designed for high throughput.
Advantages
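- High Reliability: Multi-replica mechanism protects against data loss when producers use `acks=all` and topics set `min.insync.replicas` appropriately
- Excellent Stability: Validated in production at LinkedIn and Uber; availability up to 99.99%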
- Rich Features: Exactly-once semantics, transactional messages, message replay
- Ecosystem Compatibility: Deep integration with Hadoop, Spark, Flink
- Excellent Performance:
  - Single machine: 100,000+ TPS
  - Cluster mode: Millions of TPS
Technical Features
- Scalable Architecture:
  - Partition-based horizontal scaling
  - Consumer Group mechanism scales consumption linearly, up to one consumer per partition
- Persistent Storage:
  - Default 7-day retention
  - Sequential read/write + page cache technology
- Replica and Fault Tolerance:
  - Multi-replica support (usually 3 replicas)
  - ISR (in-sync replica) mechanism ensures data consistency
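The partition, consumer group, and ISR ideas above can be sketched in a few lines. This is a simplified in-memory model, not Kafka client code: all function names are hypothetical, CRC32 stands in for the murmur2 hash used by Kafka's default partitioner, and the round-robin split is one possible assignment strategy rather than the broker's exact algorithm.

```python
import zlib
from typing import Dict, List


def partition_for(key: str, num_partitions: int) -> int:
    """Pick a partition for a record key; the same key always maps to the same partition."""
    # Assumption: CRC32 as a stand-in hash (Kafka's default partitioner uses murmur2).
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Spread partitions over the members of one consumer group, round-robin."""
    members = sorted(consumers)
    assignment: Dict[str, List[int]] = {member: [] for member in members}
    for partition in range(num_partitions):
        assignment[members[partition % len(members)]].append(partition)
    return assignment


def high_watermark(isr_log_end_offsets: Dict[str, int]) -> int:
    """Offset up to which records are replicated to every in-sync replica.

    Consumers only read records below this offset.
    """
    return min(isr_log_end_offsets.values())


balanced = assign_partitions(6, ["c1", "c2", "c3"])
oversubscribed = assign_partitions(2, ["c1", "c2", "c3"])
hw = high_watermark({"broker-1": 120, "broker-2": 118, "broker-3": 120})
```

In the second assignment `c3` receives no partitions, which is why adding consumers beyond the partition count does not increase consumption throughput. The high watermark of 118 in the last line shows that a lagging in-sync replica holds back what consumers can see.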
Overall Comparison
| Feature | RabbitMQ | RocketMQ | Kafka |
|---|---|---|---|
| Language | Erlang | Java | Java/Scala |
| Protocol | AMQP | Custom | Custom |
| TPS | 50K-150K | 100K+ | 100K-2M |
| Latency | Low | Very Low | Medium |
| Ordering | Supported | Supported | Supported |
| Transaction | Supported | Strong | Supported |
| Ecosystem | Good | Moderate | Excellent |
Selection Suggestions
- Choose RabbitMQ:
  - Reliability matters more than raw performance
  - Complex routing needs
  - Medium throughput requirements (tens of thousands of TPS)
- Choose RocketMQ:
  - Need transactional message support
  - Need ordered messages
  - Finance/trading systems
- Choose Kafka:
  - High throughput requirements (hundreds of thousands of TPS or more)
  - Data pipeline/log processing scenarios
  - Need big data ecosystem integration
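The suggestions above can be condensed into a small decision helper. This is only a sketch of the rules of thumb in this section: the `Requirements` fields, the 200,000 TPS cut-off for "hundreds of thousands", and the order in which the rules are checked are all assumptions made for illustration, and real selection should weigh operations and team experience as well.

```python
from dataclasses import dataclass

# Assumed threshold for "hundreds of thousands of TPS"; adjust to your own sizing.
HIGH_THROUGHPUT_TPS = 200_000


@dataclass
class Requirements:
    peak_tps: int
    needs_transactions: bool = False
    needs_ordering: bool = False
    big_data_integration: bool = False


def suggest_broker(req: Requirements) -> str:
    """Return a broker name following the selection suggestions in this section."""
    # Throughput and ecosystem needs are checked first (assumed precedence).
    if req.big_data_integration or req.peak_tps >= HIGH_THROUGHPUT_TPS:
        return "Kafka"
    if req.needs_transactions or req.needs_ordering:
        return "RocketMQ"
    # Default: reliability-first decoupling at moderate throughput.
    return "RabbitMQ"


decoupling = suggest_broker(Requirements(peak_tps=20_000))
trading = suggest_broker(Requirements(peak_tps=50_000, needs_transactions=True))
log_pipeline = suggest_broker(Requirements(peak_tps=500_000, big_data_integration=True))
```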
Error Quick Reference
RabbitMQ Issues
| Symptom | Root Cause | Fix |
|---|---|---|
| Throughput drops, latency jitter | Queue backlog, memory pressure | Check queue depth; reduce backlog by splitting queues and applying a message TTL with a dead-letter queue (DLQ) |
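The TTL plus DLQ fix in the table above is usually applied through optional queue arguments at declaration time. The sketch below only builds that argument dictionary; `queue_arguments_with_dlq` is a hypothetical helper, and the exchange name, routing key, and limits are invented values. The `x-` keys themselves are standard RabbitMQ queue arguments.

```python
from typing import Any, Dict, Optional


def queue_arguments_with_dlq(
    ttl_ms: int,
    dead_letter_exchange: str,
    dead_letter_routing_key: Optional[str] = None,
    max_length: Optional[int] = None,
) -> Dict[str, Any]:
    """Build RabbitMQ queue arguments that expire old messages into a DLQ."""
    if ttl_ms < 0:
        raise ValueError("ttl_ms must be non-negative")
    args: Dict[str, Any] = {
        "x-message-ttl": ttl_ms,                         # per-queue message TTL in milliseconds
        "x-dead-letter-exchange": dead_letter_exchange,  # where expired/rejected messages go
    }
    if dead_letter_routing_key is not None:
        args["x-dead-letter-routing-key"] = dead_letter_routing_key
    if max_length is not None:
        args["x-max-length"] = max_length                # cap queue depth to bound backlog
    return args


# Hypothetical values: 60 s TTL, dead-letter exchange "orders.dlx".
args = queue_arguments_with_dlq(60_000, "orders.dlx", "orders.expired", max_length=100_000)

# With the pika client, these would be passed when declaring the queue, e.g.:
#   channel.queue_declare(queue="orders", durable=True, arguments=args)
```

Queue arguments are fixed at declaration, so changing them on an existing queue generally means using a policy or recreating the queue.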
RocketMQ Issues
| Symptom | Root Cause | Fix |
|---|---|---|
| Send timeout | Inconsistent NameServer routing data; Broker under load | Ensure NameServer HA; tune Broker parameters |
Kafka Issues
| Symptom | Root Cause | Fix |
|---|---|---|
| Consumer lag rises | Too few partitions; slow downstream processing | Increase partitions; optimize batch processing; reduce rebalance frequency |
| ISR shrinks | Broker I/O bottleneck | Upgrade disk/network; adjust replica parameters |
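Consumer lag in the table above is the gap between each partition's log end offset and the group's committed offset. A minimal sketch of that arithmetic follows; `consumer_lag` is a hypothetical helper, the offsets are invented, and treating a partition without a committed offset as starting from 0 is an assumption (the real starting point depends on the consumer's `auto.offset.reset` setting).

```python
from typing import Dict


def consumer_lag(log_end_offsets: Dict[int, int], committed_offsets: Dict[int, int]) -> Dict[int, int]:
    """Return per-partition lag: log end offset minus committed offset."""
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        # Assumption: a partition with no committed offset counts from offset 0.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end_offset - committed, 0)
    return lag


# Invented offsets for a three-partition topic.
lag = consumer_lag(
    log_end_offsets={0: 1_000, 1: 2_500, 2: 400},
    committed_offsets={0: 990, 1: 2_500},
)
total_lag = sum(lag.values())
```

Lag concentrated on one partition usually points to a hot key or a slow consumer instance, while lag that grows evenly across all partitions suggests that consumption capacity as a whole is insufficient.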