This is article 52 in the Big Data series, introducing Kafka’s core architecture and high-throughput principles.
What is Kafka
Kafka is a distributed, partitioned, multi-replica publish-subscribe messaging system, originally developed at LinkedIn and later donated to Apache. It processes massive data volumes with millisecond latency while guaranteeing durable storage, making it one of the core pieces of message middleware in the big data ecosystem.
Core features:
- Partitioning: Messages distributed across nodes, naturally supports horizontal scaling
- Multi-replica strategy: ISR (In-Sync Replica) mechanism ensures reliability
- Persistent storage: segmented log with index files gives near-constant-time access, supports TB-level data
- Zero-copy optimization: Combined with batch sending, significantly improves throughput
Core Design Goals
Efficiency
Kafka uses a segmented log plus index file storage model, so locating any message takes near-constant time regardless of total data volume. Even with TB-level data accumulated in a Topic, latency remains at the millisecond level.
High Throughput
A single machine can achieve 100,000+ messages per second, relying on three key techniques:
| Technology | Principle |
|---|---|
| Batch Processing | Producer merges multiple messages into one batch, reducing network round trips |
| Sequential Write | Append-only log; sequential disk writes approach the throughput of random memory access |
| Zero-Copy | Uses sendfile system call, data flows directly from page cache to NIC, skipping user-space copy |
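As a sketch of how batching cuts network round trips, here is a toy producer buffer loosely modeled on the real producer settings batch.size and linger.ms (the setting names are Kafka's; the buffering logic is simplified for illustration):

```python
import time

class BatchingProducer:
    """Toy model of producer-side batching: messages accumulate until
    the batch is full (batch.size) or a timeout elapses (linger.ms)."""

    def __init__(self, batch_size=3, linger_ms=50):
        self.batch_size = batch_size
        self.linger_s = linger_ms / 1000.0
        self.buffer = []
        self.first_append = None
        self.sent_batches = []  # each entry models one network round trip

    def send(self, msg):
        if not self.buffer:
            self.first_append = time.monotonic()
        self.buffer.append(msg)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def maybe_linger_flush(self):
        # Called periodically: flush a partial batch once linger time passes
        if self.buffer and time.monotonic() - self.first_append >= self.linger_s:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sent_batches.append(list(self.buffer))
            self.buffer.clear()

p = BatchingProducer(batch_size=3)
for i in range(7):
    p.send(f"m{i}")
p.flush()  # flush the final partial batch
print(len(p.sent_batches))  # 7 messages cost only 3 round trips
```

Tuning the real producer is the same trade: a larger batch.size or longer linger.ms means fewer round trips at the cost of slightly higher latency for individual messages.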
Partition Ordering
Messages within the same Partition are strictly ordered; order across Partitions is not guaranteed. Two partitioning strategies are supported:
- Hash Partitioning: Route by message Key hash, same Key always goes to same Partition
- Round-Robin Partitioning: Default when no Key, evenly distributes across all Partitions
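Both strategies can be sketched in a few lines. Note that real Kafka clients hash keys with murmur2; CRC32 stands in here only to keep the sketch deterministic:

```python
import zlib
from itertools import count
from typing import Optional

NUM_PARTITIONS = 6
_rr = count()  # round-robin counter for keyless messages

def partition_for(key: Optional[bytes]) -> int:
    """Simplified Kafka partitioner: hash the key when present,
    otherwise rotate round-robin across all partitions.
    (Real clients hash keys with murmur2; CRC32 is a stand-in.)"""
    if key is not None:
        return zlib.crc32(key) % NUM_PARTITIONS
    return next(_rr) % NUM_PARTITIONS

# Same key always lands on the same partition, preserving per-key order
assert partition_for(b"user-42") == partition_for(b"user-42")
# Keyless messages spread evenly
print([partition_for(None) for _ in range(6)])  # [0, 1, 2, 3, 4, 5]
```

(Recent client versions replace pure round-robin for keyless messages with a sticky partitioner, which fills one batch before switching partitions to get better batching.)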
Elastic Scaling
Nodes can be added online; the triggered Partition rebalancing redistributes load without interrupting service.
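The effect of adding a Broker can be sketched with a simplified round-robin layout planner (real reassignment also migrates replica data online; this only shows how the target layout shifts):

```python
from typing import Dict, List

def assign(partitions: int, brokers: List[str]) -> Dict[str, List[int]]:
    """Round-robin layout of a Topic's partitions across brokers,
    a simplified stand-in for Kafka's reassignment planning."""
    layout: Dict[str, List[int]] = {b: [] for b in brokers}
    for p in range(partitions):
        layout[brokers[p % len(brokers)]].append(p)
    return layout

before = assign(6, ["broker-1", "broker-2"])
after = assign(6, ["broker-1", "broker-2", "broker-3"])
print(before)  # {'broker-1': [0, 2, 4], 'broker-2': [1, 3, 5]}
print(after)   # {'broker-1': [0, 3], 'broker-2': [1, 4], 'broker-3': [2, 5]}
```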
Message Model
Kafka supports two consumption modes, abstracted through Consumer Group:
| Mode | Description | Use Cases |
|---|---|---|
| Queue (Point-to-Point) | Only one consumer in same Group processes each message | Task queues, load balancing |
| Topic (Publish-Subscribe) | Different Groups all receive the same message | Broadcast notifications, multi-system sync |
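The two modes can be illustrated with a toy dispatcher. Note that real Kafka gets queue semantics by assigning each Partition to exactly one member of a group; this sketch picks a member per message for brevity:

```python
from collections import defaultdict
from itertools import count
from typing import Dict, List

def deliver(message: str, groups: Dict[str, List[str]], counters) -> Dict[str, str]:
    """Every group receives its own copy of the message (pub-sub across
    groups); within a group only one member gets it (queue semantics).
    Member selection here is simple round-robin per group."""
    received = {}
    for group, members in groups.items():
        idx = next(counters[group]) % len(members)
        received[group] = members[idx]
    return received

groups = {"billing": ["b1", "b2"], "audit": ["a1"]}
counters = defaultdict(count)  # one rotation counter per group
print(deliver("order-created", groups, counters))  # {'billing': 'b1', 'audit': 'a1'}
print(deliver("order-paid", groups, counters))     # {'billing': 'b2', 'audit': 'a1'}
```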
Consumption uses a pull model: consumers fetch messages actively and control their own consumption rate, so they cannot be overwhelmed by a pushing broker; the cost is handling empty polls, which is mitigated by long polling via fetch.max.wait.ms.
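The pull-plus-long-poll behavior can be sketched with a blocking queue (fetch.max.wait.ms is the real consumer setting; everything else here is a simplified stand-in):

```python
import queue

fetch_max_wait_s = 0.05  # stand-in for fetch.max.wait.ms

def poll(records: "queue.Queue") -> list:
    """Simplified long poll: block up to fetch_max_wait_s for at least
    one record instead of spinning on empty fetches, then drain
    whatever else is already available."""
    try:
        first = records.get(timeout=fetch_max_wait_s)
    except queue.Empty:
        return []  # empty poll after the wait budget expires
    batch = [first]
    while True:
        try:
            batch.append(records.get_nowait())
        except queue.Empty:
            return batch

q = queue.Queue()
print(poll(q))  # [] after waiting ~50 ms, not a tight busy loop
for m in ("m1", "m2"):
    q.put(m)
print(poll(q))  # ['m1', 'm2']
```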
Message Structure
Each Kafka message contains:
| Field | Description |
|---|---|
| Key | Optional, used for partition routing and ordering |
| Value | Actual message body |
| Timestamp | Message creation or write time, used for sorting and expiration |
| Offset | Unique position identifier within Partition, monotonically increasing |
| Headers | Optional metadata key-value pairs |
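The table above maps naturally onto a small record container (an illustrative class, not the client library's actual record type):

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass(frozen=True)
class Record:
    """The message fields from the table above."""
    value: bytes
    key: Optional[bytes] = None
    timestamp_ms: int = 0
    offset: int = -1                      # assigned by the broker on append
    headers: Dict[str, bytes] = field(default_factory=dict)

r = Record(value=b'{"event": "click"}', key=b"user-42",
           timestamp_ms=1_700_000_000_000,
           headers={"trace-id": b"abc123"})
print(r.key, r.offset)  # b'user-42' -1
```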
Core APIs
- Producer API → publish message streams to Topics
- Consumer API → subscribe to Topics and process their records
- Streams API → transform input streams into output streams (real-time computation)
- Connector API → connect Kafka with external systems (databases, HDFS, etc.)
Architecture Components
Topic and Partition
- Topic: Logical classification of messages, like a database table
- Partition: Physical shard of Topic, each Partition is an ordered, immutable message log
- Multiple Partitions of a Topic spread across different Brokers, enabling parallel read/write
Broker and Controller
- Broker: Single service node in Kafka cluster, responsible for storing and forwarding messages
- Controller: Automatically elected from active Brokers, responsible for Partition allocation and Broker monitoring
- Each Partition has exactly one Leader Broker, handling all read/write requests; others are Followers, only for synchronization
ISR Mechanism
ISR (In-Sync Replica) is the set of replicas synchronized with the Leader:
Leader writes message
↓
Followers in ISR pull synchronously
↓
All ISR replicas acknowledge, message considered "committed" (when acks=all)
- A Follower lagging beyond replica.lag.time.max.ms is removed from the ISR
- When the Leader fails, only replicas in the ISR are eligible to be elected as the new Leader
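The ISR shrink rule can be sketched directly (replica.lag.time.max.ms is the real broker setting; the bookkeeping here is simplified):

```python
from typing import Dict, Set

replica_lag_time_max_ms = 30_000  # stand-in for replica.lag.time.max.ms

def shrink_isr(isr: Set[str], last_caught_up_ms: Dict[str, int],
               now_ms: int) -> Set[str]:
    """Drop any replica that has not caught up to the Leader's log end
    within replica.lag.time.max.ms (simplified bookkeeping)."""
    return {r for r in isr
            if now_ms - last_caught_up_ms[r] <= replica_lag_time_max_ms}

isr = {"leader", "f1", "f2"}
caught_up = {"leader": 100_000, "f1": 95_000, "f2": 40_000}  # f2 stalled 60 s ago
print(sorted(shrink_isr(isr, caught_up, now_ms=100_000)))  # ['f1', 'leader']
```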
Recommended Message Format: Apache Avro
Compared to JSON, Apache Avro encodes far more compactly; compared to Protobuf, its schema-evolution rules and Schema Registry integration make it a better fit for big data pipelines:
- Compact binary encoding, typically saving 50%+ storage space versus JSON
- Schema evolution: forward/backward compatible; adding or removing fields (with defaults) does not break consumers
- Schema Registry: centralized schema version management, with Producers and Consumers resolving schemas automatically
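As a hedged illustration of schema evolution, the Avro schema below adds an optional email field with a default, so readers using the older schema (without email) and newer readers decoding old records both keep working (the record and field names are invented for this example):

```json
{
  "type": "record",
  "name": "UserEvent",
  "namespace": "com.example.events",
  "fields": [
    {"name": "user_id", "type": "long"},
    {"name": "action", "type": "string"},
    {"name": "email", "type": ["null", "string"], "default": null}
  ]
}
```

The compatibility hinges on the default: Avro's schema resolution fills in null when a new reader meets an old record, and old readers simply ignore the extra field.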
Typical Use Cases
| Scenario | Description |
|---|---|
| Log Aggregation | Collect Nginx, application logs, write to HDFS/ES |
| Message Middleware | Replace traditional MQ, support higher throughput and longer retention |
| User Behavior Tracking | Real-time collection of tracking data, power recommendation systems |
| Operations Monitoring | Metrics data flow, integrate with Prometheus/Grafana |
| Stream Computing | Integrate with Spark Streaming / Flink, build real-time data pipelines |
Performance Metrics
- Single node supports thousands of concurrent client connections
- 7×24 continuous operation
- Millisecond-level end-to-end processing latency
- TB-level message storage without performance degradation
- Multi-replica zero data loss guarantee
Summary
Kafka’s high throughput comes from sequential writes, zero-copy, and batch processing; its high availability from multi-replica Partitions and the ISR mechanism; its elastic scaling from Partition-based horizontal sharding. Understanding these three dimensions is the foundation for deeper study of Kafka operations and tuning.