This is article 53 in the Big Data series, deeply dissecting the working principles and collaboration flow of Kafka’s three core components.
Overall Architecture Review
Kafka is a distributed publish-subscribe messaging system with three core roles:
Producer (message producer)
↓ Publish messages
Broker Cluster (message storage & routing)
↓ Consume messages
Consumer Group (consumer group)
The three are decoupled through Topic + Partition: the Producer only publishes, the Consumer only subscribes, and the Broker handles persistence and routing in between.
Producer: Message Producer
Basic Responsibilities
Producer is responsible for creating messages and publishing to specified Topics. Sending process:
- Serialize message (Key/Value)
- Select target Partition based on partitioning strategy
- Append the message to the local buffer (RecordAccumulator)
- A background Sender thread sends batches to the target Broker
Partitioning Strategy
| Strategy | Trigger Condition | Effect |
|---|---|---|
| Explicit Partition | Explicitly specify partition at send | Precise routing |
| Hash Partitioning | Message with Key | Same Key always writes to same Partition, guarantees local order |
| Round-Robin | No Key, default | Even distribution, maximizes throughput |
| Custom Partitioner | Implement Partitioner interface | Business custom routing logic |
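The key-hash strategy in the table above can be sketched as follows. Note this is a simplified illustration: the real DefaultPartitioner applies murmur2 to the serialized key bytes, whereas this sketch uses String.hashCode() as a stand-in; the class and method names are hypothetical.

```java
// Simplified sketch of key-based partition selection.
// The real DefaultPartitioner hashes the *serialized* key bytes with
// murmur2; String.hashCode() here is only an illustrative stand-in.
public class PartitionSketch {
    static int partitionForKey(String key, int numPartitions) {
        // Mask off the sign bit so the result is always non-negative
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int p1 = partitionForKey("order-1001", 3);
        int p2 = partitionForKey("order-1001", 3);
        // The same key always maps to the same partition,
        // which is what guarantees per-key local ordering
        System.out.println(p1 == p2); // true
    }
}
```

Because the hash depends on the partition count, changing the number of partitions re-shuffles keys, which is why Kafka's per-key ordering guarantee assumes a fixed partition count.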
ACK Confirmation Mechanism
The acks parameter controls the trade-off between message reliability and throughput:
| acks Value | Semantics | Use Cases |
|---|---|---|
| 0 | No waiting for any acknowledgment, fire-and-forget | Log collection, tolerates minor loss |
| 1 | Leader write succeeds | Default, balances reliability and performance |
| all (-1) | All ISR replicas have written | Finance, orders, and other core data |
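A typical high-reliability producer configuration combining the settings above might look like this. The property keys are Kafka's real producer config names; the bootstrap address and values are illustrative.

```java
import java.util.Properties;

// Illustrative producer reliability settings; bootstrap.servers is
// an assumed address, and the values are examples, not requirements.
public class AcksConfig {
    static Properties reliableProducerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed address
        props.put("acks", "all");                // wait for all ISR replicas
        props.put("retries", "3");               // retry transient failures
        props.put("enable.idempotence", "true"); // no duplicates on retry
        return props;
    }
}
```

Enabling idempotence together with acks=all gives exactly-once semantics per partition on the producer side; with acks=0 or 1, retries could otherwise introduce duplicates.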
Async Send Example
```java
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer",
        "org.apache.kafka.common.serialization.IntegerSerializer");
props.put("value.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");
KafkaProducer<Integer, String> producer = new KafkaProducer<>(props);

// Record targeting topic "my_topic", partition 0, key 42
ProducerRecord<Integer, String> record =
        new ProducerRecord<>("my_topic", 0, 42, "hello kafka");

// Async send; the callback runs once the broker acknowledges
producer.send(record, (metadata, exception) -> {
    if (exception == null) {
        System.out.println("Sent to partition " + metadata.partition()
                + " at offset " + metadata.offset());
    } else {
        exception.printStackTrace();
    }
});
producer.close();
```
Broker: Message Storage Node
Core Responsibilities
A Broker is the basic unit of a Kafka cluster. Each Broker is an independent JVM process responsible for:
- Receiving Producer written messages, persisting to local disk
- Responding to Consumer pull requests
- Maintaining Partition Leader/Follower roles
- Coordinating cluster metadata with ZooKeeper (or KRaft)
Physical Storage of Partition
Each Partition corresponds to a directory on disk, composed of multiple Segment files:
/var/kafka-logs/my_topic-0/
├── 00000000000000000000.log ← Message data file
├── 00000000000000000000.index ← Offset sparse index
└── 00000000000000000000.timeindex ← Timestamp index
- The filename is the 20-digit zero-padded starting Offset of that Segment
- Writes are append-only; reads binary-search the sparse index to locate an Offset
- A new Segment is rolled by default when the current one exceeds the size limit (log.segment.bytes, default 1 GB) or the age limit (log.roll.hours, default 168 hours, i.e. 7 days)
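The filename convention and the sparse-index lookup can be sketched in a few lines. The TreeMap here stands in for the .index file (offset → physical file position); class and method names are illustrative, not Kafka internals.

```java
import java.util.TreeMap;

// Sketch: how a segment's filename encodes its base offset, and how a
// sparse offset index is searched to locate a record. The TreeMap
// stands in for the .index file (offset -> physical file position).
public class SegmentSketch {
    static String segmentFileName(long baseOffset) {
        // 20-digit zero-padded starting offset, e.g. 00000000000000000000.log
        return String.format("%020d.log", baseOffset);
    }

    // floorEntry performs the "binary search for the nearest index
    // entry <= target offset"; the reader then scans forward in the .log
    static long filePositionFor(TreeMap<Long, Long> sparseIndex, long offset) {
        return sparseIndex.floorEntry(offset).getValue();
    }

    public static void main(String[] args) {
        TreeMap<Long, Long> index = new TreeMap<>();
        index.put(0L, 0L);       // offset 0 lives at file position 0
        index.put(100L, 4096L);  // offset 100 lives at file position 4096
        // Looking up offset 150 lands on the entry for 100, position 4096
        System.out.println(filePositionFor(index, 150L)); // 4096
    }
}
```

Because the index is sparse (one entry per few KB of log, not per message), it stays small enough to memory-map while still bounding the forward scan.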
Leader and Follower
Partition Leader (on one Broker)
← Receives Producer writes
→ Serves Consumer reads
→ Tracks Follower sync progress and advances the high watermark
Partition Follower (on other Brokers)
→ Pulls new messages from the Leader to catch up
→ Stays in the ISR list while caught up
→ Eligible for election from the ISR as the new Leader if the Leader fails
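ISR membership is essentially a lag check: a follower stays in sync while it has caught up to the leader within replica.lag.time.max.ms (30 seconds by default). A minimal sketch of that rule, with illustrative names:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch of ISR membership: a follower remains in the ISR while its
// last full catch-up happened within replica.lag.time.max.ms.
// Variable and method names are illustrative, not Kafka internals.
public class IsrSketch {
    static List<String> inSyncReplicas(Map<String, Long> lastCaughtUpMs,
                                       long nowMs, long maxLagMs) {
        return lastCaughtUpMs.entrySet().stream()
                .filter(e -> nowMs - e.getValue() <= maxLagMs) // within lag bound
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }
}
```

A follower that falls out of this set is removed from the ISR; with acks=all, the leader then only waits on the remaining in-sync replicas, which is why a shrinking ISR trades durability for availability.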
Controller Role
Exactly one Broker in the cluster acts as the Controller, elected through ZooKeeper (or the Raft quorum in KRaft mode):
- Monitor Broker liveness
- Responsible for Partition Leader election
- Manage Partition ISR list changes
- Coordinate Partition reallocation when new Broker joins
Consumer: Message Consumer
Consumer Group Mechanism
Kafka implements two consumption semantics through Consumer Group:
Within the same Group (point-to-point semantics):
Partition 1 → Consumer A
Partition 2 → Consumer B
Partition 3 → Consumer C
Across different Groups (publish-subscribe semantics):
Group A → consumes the full Topic independently
Group B → consumes the full Topic independently
Key rules:
- Each Partition in same Consumer Group can only be consumed by one Consumer
- When Consumer count exceeds Partition count, excess Consumers idle
- Consumer exit/join triggers Rebalance, reassigning Partitions
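The exclusive-assignment rules above can be sketched as a simple partition-to-consumer mapping. This is a round-robin-style illustration, not Kafka's actual RangeAssignor/StickyAssignor implementations; the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of partition assignment inside one consumer group: each
// partition goes to exactly one consumer; when there are more
// consumers than partitions, the extra consumers receive nothing.
public class AssignSketch {
    static List<List<Integer>> assign(int numPartitions, int numConsumers) {
        List<List<Integer>> result = new ArrayList<>();
        for (int c = 0; c < numConsumers; c++) {
            result.add(new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            result.get(p % numConsumers).add(p); // round-robin distribution
        }
        return result;
    }

    public static void main(String[] args) {
        // 3 partitions, 4 consumers: the 4th consumer sits idle
        System.out.println(assign(3, 4)); // [[0], [1], [2], []]
    }
}
```

This also shows why the Partition count caps consumption parallelism: past numPartitions consumers, adding instances only adds idle members.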
Offset Management
Consumer tracks consumption progress through Offset:
| Method | Description |
|---|---|
| Auto commit | enable.auto.commit=true, auto commits every auto.commit.interval.ms |
| Manual sync commit | consumer.commitSync(), ensure commit after successful consumption, avoid message loss |
| Manual async commit | consumer.commitAsync(), higher performance, no retry on failure |
Offset is stored in Kafka internal Topic __consumer_offsets (Kafka 0.9+ no longer depends on ZooKeeper).
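Which __consumer_offsets partition holds a given group's offsets is determined by hashing the group id. A sketch of that mapping, assuming the default of 50 partitions (offsets.topic.num.partitions); the class and method names are illustrative:

```java
// Sketch: a consumer group's offsets always land in one fixed
// partition of __consumer_offsets, chosen by hashing the group id.
// 50 is the default value of offsets.topic.num.partitions.
public class OffsetsTopicSketch {
    static int offsetsPartitionFor(String groupId, int numOffsetsPartitions) {
        // Mask the sign bit so the modulo result is non-negative
        return (groupId.hashCode() & 0x7fffffff) % numOffsetsPartitions;
    }

    public static void main(String[] args) {
        // The same group id is always routed to the same partition,
        // so the broker leading that partition acts as the group coordinator
        System.out.println(offsetsPartitionFor("my-group", 50));
    }
}
```

Because the mapping is deterministic, the Broker leading that offsets partition serves as the group's coordinator, and all commits from the group go to one place.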
Consumption Position Reset
Control consumer behavior when first starting or Offset lost via auto.offset.reset:
- earliest: Start from the earliest message in the Partition
- latest (default): Only consume messages produced after startup
- none: Throw an exception when no committed Offset exists
Rebalance Triggers
Rebalance triggers:
1. Consumer joins Group (new instance comes online)
2. Consumer leaves Group (crash or normal shutdown)
3. Consumer timeout without heartbeat (session.timeout.ms)
4. Topic Partition count changes
All consumers pause during Rebalance. Production should properly configure session.timeout.ms and max.poll.interval.ms to reduce unnecessary Rebalances.
Full Message Flow
1. Producer serializes message, selects Partition
2. Producer batch sends to target Broker (Leader)
3. Leader Broker writes to local log (.log file)
4. Follower actively pulls new messages from Leader, writes to local replica
5. After all ISR replicas confirm write, message "committed"
6. Consumer initiates Poll request to Leader, carrying current Offset
7. Leader returns message batch starting from that Offset
8. Consumer commits new Offset after processing
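The eight steps above can be walked through with a small in-memory model: produce appends to the log, the high watermark advances once replicas confirm, and a consumer polls from its committed offset. Everything here is illustrative; real Kafka does this over the network against on-disk logs.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory walk-through of the full message flow. Names are
// illustrative, not Kafka classes.
public class FlowSketch {
    final List<String> log = new ArrayList<>(); // the partition's .log file
    long highWatermark = 0;                     // offset of first uncommitted record
    long consumerOffset = 0;                    // consumer's current position

    long append(String msg) {           // steps 1-3: produce + leader write
        log.add(msg);
        return log.size() - 1;          // offset assigned to the new record
    }

    void replicasConfirm() {            // steps 4-5: ISR catches up
        highWatermark = log.size();     // everything so far is now "committed"
    }

    List<String> poll() {               // steps 6-7: fetch from current offset
        List<String> batch = new ArrayList<>(
                log.subList((int) consumerOffset, (int) highWatermark));
        consumerOffset = highWatermark; // step 8: commit the new offset
        return batch;
    }

    static List<String> demo() {
        FlowSketch f = new FlowSketch();
        f.append("hello");
        f.append("kafka");
        f.replicasConfirm();
        return f.poll();                // only committed messages are visible
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [hello, kafka]
    }
}
```

Note that poll() reads only up to the high watermark: messages written to the Leader but not yet confirmed by the ISR are invisible to consumers, which is what prevents reading data that could be lost in a Leader failover.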
Summary
- Producer partitioning strategy and ACK mechanism determine message routing and reliability
- Broker Segment storage and ISR mechanism ensure high-performance persistence and high availability
- Consumer Group achieves parallel consumption through Partition exclusive assignment, Offset management controls consumption semantics
- The three collaborate to form Kafka’s complete closed loop of high throughput, low latency, and high availability