This is article 60 in the Big Data series. It takes a deep dive into the Kafka Consumer side, covering the consumption flow, Consumer Group partition assignment, the heartbeat mechanism, Rebalance triggers, and parameter tuning.
Consumer Group and Partition Assignment
Kafka consumers join a Consumer Group via `group.id`. Within the same Group, each Partition is assigned to exactly one Consumer, which naturally achieves load balancing.
Using 4 partitions (4P) as an example, here is the assignment for different Consumer counts:
| Scenario | Assignment |
|---|---|
| 4P, 1C | Single Consumer consumes all 4 partitions |
| 4P, 2C | Each Consumer assigned 2 partitions |
| 4P, 4C | Each Consumer assigned 1 partition (optimal) |
| 4P, 5C | 4 Consumers each consume 1 partition, 5th idle |
Recommendation: the Consumer count should not exceed the partition count; otherwise the extra Consumers sit idle and waste resources.
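To make the table above concrete, here is a minimal sketch of range-style assignment for a single topic. It mimics the behavior of RangeAssignor; the class and method names are our own, and this is an illustration, not Kafka's actual implementation:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RangeAssignSketch {
    // Assign `partitionCount` partitions of one topic across consumers,
    // range-style: each consumer gets a contiguous block; the first
    // (partitionCount % consumers) consumers get one extra partition.
    static Map<String, List<Integer>> assign(int partitionCount, List<String> consumers) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int per = partitionCount / consumers.size();
        int extra = partitionCount % consumers.size();
        int next = 0;
        for (int i = 0; i < consumers.size(); i++) {
            int take = per + (i < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int j = 0; j < take; j++) owned.add(next++);
            result.put(consumers.get(i), owned);
        }
        return result;
    }

    public static void main(String[] args) {
        // 4P, 2C: each consumer gets 2 partitions
        System.out.println(assign(4, List.of("c1", "c2")));
        // 4P, 5C: c5 gets no partitions (idle)
        System.out.println(assign(4, List.of("c1", "c2", "c3", "c4", "c5")));
    }
}
```

Running this reproduces the 4P/2C and 4P/5C rows of the table: with five consumers, the fifth receives an empty assignment.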
Message Consumption Flow
Kafka uses a Pull model: the Consumer actively pulls messages from the Broker rather than having the Broker push them. The benefit is that the consumption rate is controlled by the Consumer, so a slow Consumer cannot be overwhelmed by a Broker pushing too fast.
The consumption process has three stages:
- Partition assignment: the Group Coordinator assigns partitions to each Consumer based on the Consumer count and the assignment strategy (RangeAssignor / RoundRobinAssignor / StickyAssignor).
- Pull messages: the Consumer calls `poll()` to pull a batch of messages, at most `max.poll.records` per pull (default 500).
- Offset commit: after processing the messages, the Consumer commits the offset, which is recorded to the internal topic `__consumer_offsets`. Both auto commit (`enable.auto.commit=true`) and manual commit (`commitSync()` / `commitAsync()`) are supported.
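The pull-and-commit stages above can be sketched without a broker. The following self-contained simulation (the class and method names are hypothetical; there is no kafka-clients dependency) pulls at most `max.poll.records` messages per "poll" and advances the committed offset only after the batch has been processed:

```java
import java.util.ArrayList;
import java.util.List;

public class PollLoopSketch {
    // Simulated partition log and committed offset for one (group, topic, partition).
    final List<String> log;
    long committedOffset = 0;   // next offset to read, as recorded in __consumer_offsets
    final int maxPollRecords;   // mirrors max.poll.records

    PollLoopSketch(List<String> log, int maxPollRecords) {
        this.log = log;
        this.maxPollRecords = maxPollRecords;
    }

    // Stage 2: pull a batch starting at the committed offset.
    List<String> poll() {
        int from = (int) committedOffset;
        int to = Math.min(log.size(), from + maxPollRecords);
        return new ArrayList<>(log.subList(from, to));
    }

    // Stage 3: commit only after the whole batch is processed.
    void commitSync(int processedCount) {
        committedOffset += processedCount;
    }

    public static void main(String[] args) {
        PollLoopSketch c = new PollLoopSketch(List.of("m0", "m1", "m2", "m3", "m4"), 2);
        while (true) {
            List<String> batch = c.poll();
            if (batch.isEmpty()) break;
            batch.forEach(m -> System.out.println("processed " + m)); // process first
            c.commitSync(batch.size());                               // then commit
        }
        System.out.println("committed offset = " + c.committedOffset); // 5
    }
}
```

Because the commit happens after processing, a crash mid-batch means the batch is pulled again on restart, which is exactly the at-least-once behavior manual commit gives you.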
Heartbeat Mechanism
The Consumer notifies the Group Coordinator that it is still alive via heartbeats:
Consumer ──── heartbeat ────► Group Coordinator
◄──── response ───────────
Since Kafka 0.10.1, the heartbeat runs on a thread separate from the business thread (the `poll()` thread), so even if message processing inside the poll loop takes a long time, heartbeats are still sent normally.
Core Parameters
| Parameter | Default | Description |
|---|---|---|
| `session.timeout.ms` | 45000 ms | The Coordinator considers the Consumer offline if no heartbeat is received within this time, and triggers a Rebalance |
| `heartbeat.interval.ms` | 3000 ms | How often the Consumer sends heartbeats; must be < `session.timeout.ms` |
| `max.poll.interval.ms` | 300000 ms (5 minutes) | Max interval between `poll()` calls; exceeding it means the consumer is considered "stuck" and triggers a Rebalance |
| `request.timeout.ms` | 30000 ms | Timeout waiting for a Broker response |
Recommended configuration principles:
- Set `session.timeout.ms` to 3-5 times `heartbeat.interval.ms`, leaving a sufficient tolerance window.
- If the consumption logic takes a long time (e.g., batch writes to a database), appropriately increase `max.poll.interval.ms` or decrease `max.poll.records` to avoid unnecessary Rebalances caused by processing timeouts.
Rebalance Flow
Rebalance (partition reassignment) is triggered in these situations:
- A Consumer joins or leaves the Group
- A Consumer's heartbeat times out (`session.timeout.ms` exceeded)
- The interval between `poll()` calls exceeds `max.poll.interval.ms`
- The Topic's partition count changes
Rebalance execution process:
1. All Consumers pause consumption
2. Consumers release their currently held partitions
3. The Group Coordinator reassigns partitions according to the strategy
4. Each Consumer receives its new partitions and continues from the last committed offset
Because consumption is paused while a Rebalance executes, every Rebalance adds consumption delay. In production, minimize unnecessary Rebalances and keep the Consumer count stable.
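Step 4 of the flow is also why commit discipline matters: after a Rebalance, the new owner of a partition resumes from the last committed offset, so anything processed but not yet committed is delivered again. A minimal simulation (hypothetical names, not Kafka code):

```java
import java.util.List;

public class ResumeAfterRebalance {
    // After a rebalance, the partition's new owner starts reading at
    // exactly the last committed offset.
    static List<String> resume(List<String> log, long committedOffset) {
        return log.subList((int) committedOffset, log.size());
    }

    public static void main(String[] args) {
        List<String> log = List.of("m0", "m1", "m2", "m3");
        // The old owner processed m0..m2 but had only committed offset 2
        // (through m1) before being kicked out of the group.
        long committed = 2;
        // The new owner re-consumes m2 even though it was processed once:
        System.out.println(resume(log, committed)); // [m2, m3]
    }
}
```

This duplicate delivery is the at-least-once trade-off: consumers should make processing idempotent rather than assume each message arrives exactly once across Rebalances.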
Offset Management
Offsets are stored in the Kafka internal topic `__consumer_offsets`; each commit record has the format:
GroupId + Topic + Partition → Offset
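The commit-record format can be modeled as a map keyed by (GroupId, Topic, Partition). This sketch only illustrates the key structure and the fact that later commits overwrite earlier ones; it is not the actual on-disk format of `__consumer_offsets`:

```java
import java.util.HashMap;
import java.util.Map;

public class OffsetStoreSketch {
    // Key mirrors the commit-record format: GroupId + Topic + Partition -> Offset.
    record GroupTopicPartition(String groupId, String topic, int partition) {}

    final Map<GroupTopicPartition, Long> offsets = new HashMap<>();

    void commit(String groupId, String topic, int partition, long offset) {
        offsets.put(new GroupTopicPartition(groupId, topic, partition), offset);
    }

    long committed(String groupId, String topic, int partition) {
        return offsets.getOrDefault(new GroupTopicPartition(groupId, topic, partition), 0L);
    }

    public static void main(String[] args) {
        OffsetStoreSketch store = new OffsetStoreSketch();
        store.commit("group-a", "wzk_topic_test", 0, 42L);
        store.commit("group-a", "wzk_topic_test", 0, 43L); // later commit overwrites
        System.out.println(store.committed("group-a", "wzk_topic_test", 0)); // 43
    }
}
```

Because the key includes the GroupId, two different Consumer Groups reading the same topic keep fully independent offsets.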
Auto commit (enabled by default):

```yaml
spring:
  kafka:
    consumer:
      enable-auto-commit: true
      auto-commit-interval: 5000  # Auto commit every 5 seconds
```
Auto commit carries a message-loss risk: the offset can be committed after a batch is pulled but before processing completes, so if the process crashes at that point, the unprocessed messages are skipped on restart and effectively lost.
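The loss scenario can be reproduced in a few lines: the offset is committed right after the pull, the process "crashes" mid-batch, and on restart the unprocessed messages are skipped. This is a pure simulation with made-up names, not Kafka behavior code:

```java
import java.util.ArrayList;
import java.util.List;

public class AutoCommitLossSketch {
    // Simulate: pull a batch, auto-commit the offset immediately, crash after
    // processing only `processedBeforeCrash` messages, then restart from the
    // committed offset. Returns everything that was ever actually processed.
    static List<String> run(List<String> log, int batchSize, int processedBeforeCrash) {
        List<String> processed = new ArrayList<>();
        long committed = batchSize;                      // offset committed right after the pull
        for (int i = 0; i < processedBeforeCrash; i++) { // crash interrupts processing here
            processed.add(log.get(i));
        }
        for (int i = (int) committed; i < log.size(); i++) { // restart at the committed offset
            processed.add(log.get(i));
        }
        return processed;
    }

    public static void main(String[] args) {
        // m1 and m2 were pulled and "committed" but never processed -> lost.
        System.out.println(run(List.of("m0", "m1", "m2", "m3"), 3, 1)); // [m0, m3]
    }
}
```

With manual commit, `committed` would only advance after processing, so the restart would re-deliver m1 and m2 instead of skipping them.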
Manual commit (recommended for production):

```java
@KafkaListener(topics = "wzk_topic_test")
public void consume(ConsumerRecord<String, String> record, Acknowledgment ack) {
    try {
        // Process the message
        process(record);
        // Commit manually only after successful processing
        ack.acknowledge();
    } catch (Exception e) {
        // Processing failed: don't commit the offset, so the message is re-consumed later
        log.error("Consumption failed", e);
    }
}
```
To enable manual commit in Spring-Kafka, configure the `ack-mode`:

```yaml
spring:
  kafka:
    listener:
      ack-mode: manual
```
Summary
| Focus | Recommended Practice |
|---|---|
| Consumer count | Not exceed partition count |
| Rebalance frequency | Keep Consumer count stable, reasonably set timeout parameters |
| Message loss prevention | Use manual offset commit |
| Consumption stuck detection | Appropriately reduce `max.poll.interval.ms`, or increase consumption throughput |
| Heartbeat parameters | `session.timeout.ms` = 3-5 × `heartbeat.interval.ms` |
The core goal of Kafka Consumer tuning: stable consumption, no message loss, and as few Rebalances as possible.