This is article 60 in the Big Data series. It takes a deep look at the Kafka Consumer side: the consumption flow, Consumer Group partition assignment, the heartbeat mechanism, Rebalance triggers, and parameter tuning.

Consumer Group and Partition Assignment

Kafka consumers that share the same group.id belong to the same Consumer Group. Within a Group, each partition is assigned to exactly one Consumer, which naturally achieves load balancing.

Using 4 partitions (4P) as an example, here is how assignment changes with the Consumer count:

| Scenario | Assignment |
| --- | --- |
| 4P, 1C | The single Consumer consumes all 4 partitions |
| 4P, 2C | Each Consumer is assigned 2 partitions |
| 4P, 4C | Each Consumer is assigned 1 partition (optimal) |
| 4P, 5C | 4 Consumers consume 1 partition each; the 5th sits idle |

Recommended: the Consumer count should not exceed the partition count; otherwise the surplus Consumers sit idle and waste resources.
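The assignment arithmetic above can be sketched as a minimal range-style distribution. This is a simplified illustration of the idea, not the actual RangeAssignor implementation; all names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

public class RangeAssignSketch {
    // Distribute `partitions` among `consumers` range-style for a single
    // topic: earlier consumers receive the extra partitions when the counts
    // do not divide evenly; surplus consumers receive nothing.
    static List<List<Integer>> assign(int partitions, int consumers) {
        List<List<Integer>> result = new ArrayList<>();
        int perConsumer = partitions / consumers;
        int extra = partitions % consumers;
        int next = 0;
        for (int c = 0; c < consumers; c++) {
            int count = perConsumer + (c < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int i = 0; i < count; i++) owned.add(next++);
            result.add(owned);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(assign(4, 2)); // [[0, 1], [2, 3]]
        System.out.println(assign(4, 5)); // fifth consumer ends up with []
    }
}
```

Running it for the 4P/5C scenario shows the fifth Consumer receiving an empty assignment, which is exactly the idle-Consumer case from the table.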

Message Consumption Flow

Kafka uses a Pull model: the Consumer actively pulls messages from the Broker rather than having the Broker push them. The benefit is that consumption speed is controlled by the Consumer, so a slow Consumer is never overwhelmed by a fast Broker.

The consumption process has three stages:

  1. Partition assignment: the Group Coordinator assigns partitions to each Consumer based on the Consumer count and the assignment strategy (RangeAssignor / RoundRobinAssignor / StickyAssignor).
  2. Pulling messages: the Consumer calls poll() to pull a batch of messages, at most max.poll.records per call (default 500).
  3. Offset commit: after processing the messages, the Consumer commits the offset, which is recorded to the internal topic __consumer_offsets. Both auto commit (enable.auto.commit=true) and manual commit (commitSync() / commitAsync()) are supported.
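The Consumer-side settings involved in these stages can be sketched with a plain java.util.Properties block. The keys are the official Kafka client property names; the broker address and group id are placeholders:

```java
import java.util.Properties;

public class ConsumerConfigSketch {
    static Properties consumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("group.id", "demo-group");              // placeholder group id
        // Stage 2: cap the batch size returned by a single poll()
        props.put("max.poll.records", "500");
        // Stage 3: disable auto commit so the application commits manually
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps().getProperty("max.poll.records")); // 500
    }
}
```

This Properties object is what you would pass to a KafkaConsumer constructor in a real client.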

Heartbeat Mechanism

A Consumer notifies the Group Coordinator that it is still alive via periodic heartbeats:

Consumer ──── heartbeat ────► Group Coordinator
Consumer ◄─── response ────── Group Coordinator

The heartbeat thread is separate from the business thread (the one calling poll()), so even if message processing between poll() calls takes a long time, heartbeats are still sent normally.
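The Coordinator's liveness decision boils down to a timeout check, which can be sketched as follows. This is an illustration of the rule only, using the default session.timeout.ms, not Kafka's actual code:

```java
public class HeartbeatCheckSketch {
    static final long SESSION_TIMEOUT_MS = 45_000; // session.timeout.ms default

    // Coordinator-side rule: a Consumer is considered alive as long as the
    // gap since its last heartbeat stays within session.timeout.ms.
    static boolean isAlive(long lastHeartbeatMs, long nowMs) {
        return nowMs - lastHeartbeatMs <= SESSION_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        System.out.println(isAlive(0, 30_000)); // true: within the window
        System.out.println(isAlive(0, 60_000)); // false: a Rebalance would trigger
    }
}
```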

Core Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| session.timeout.ms | 45000 ms | If the Coordinator receives no heartbeat within this window, it considers the Consumer offline and triggers a Rebalance |
| heartbeat.interval.ms | 3000 ms | How often the Consumer sends heartbeats; must be less than session.timeout.ms |
| max.poll.interval.ms | 300000 ms (5 minutes) | Maximum allowed interval between poll() calls; exceeding it marks the consumer as "stuck" and triggers a Rebalance |
| request.timeout.ms | 30000 ms | Timeout waiting for a Broker response |

Recommended configuration principles:

  • Set session.timeout.ms to 3-5 times heartbeat.interval.ms to leave a sufficient tolerance window.
  • If the consumption logic is slow (e.g., batch writes to a database), increase max.poll.interval.ms or decrease max.poll.records appropriately to avoid unnecessary Rebalances caused by processing timeouts.
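Under these principles, a tuned Spring-Kafka configuration might look like the following. The values are illustrative choices, not universal recommendations:

```yaml
spring:
  kafka:
    consumer:
      max-poll-records: 200              # smaller batches finish within the interval
      properties:
        session.timeout.ms: 15000        # 5x the heartbeat interval below
        heartbeat.interval.ms: 3000
        max.poll.interval.ms: 600000     # 10 minutes, for slow batch writes
```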

Rebalance Flow

A Rebalance (partition reassignment) is triggered in these situations:

  • Consumer joins or leaves Group
  • Consumer heartbeat timeout (session.timeout.ms exceeded)
  • poll() call interval exceeds max.poll.interval.ms
  • Topic partition count changes

Rebalance execution process:

1. All Consumers pause consumption
2. Release currently held partitions
3. Group Coordinator reassigns partitions based on strategy
4. Consumer gets new partitions, continues from last committed offset

While a Rebalance is in progress, consumption is paused, which causes consumption delay. In production, minimize unnecessary Rebalances and keep the Consumer count stable.
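The four steps above can be sketched as a toy reassignment: revoke everything, then redistribute across the surviving members. This is an illustration of eager rebalancing only, not the Coordinator's actual protocol; all names are hypothetical:

```java
import java.util.*;

public class RebalanceSketch {
    // Eager rebalance: release all partitions (step 2), then assign
    // partitions 0..partitions-1 across the sorted member list (step 3).
    static Map<String, List<Integer>> rebalance(Set<String> consumers, int partitions) {
        List<String> members = new ArrayList<>(consumers);
        Collections.sort(members); // deterministic order, like member-id ordering
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String m : members) assignment.put(m, new ArrayList<>()); // all released
        for (int p = 0; p < partitions; p++) {                          // reassigned
            assignment.get(members.get(p % members.size())).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // consumer "c" joins a group that previously had only "a" and "b"
        System.out.println(rebalance(new TreeSet<>(List.of("a", "b", "c")), 4));
        // {a=[0, 3], b=[1], c=[2]}
    }
}
```

After the reassignment, each Consumer resumes from the last committed offset of its new partitions (step 4).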

Offset Management

Offsets are stored in the Kafka internal topic __consumer_offsets; each commit record has the format:

GroupId + Topic + Partition → Offset

Auto commit (enabled by default):

spring:
  kafka:
    consumer:
      enable-auto-commit: true
      auto-commit-interval: 5000  # Auto commit every 5 seconds

Auto commit carries a message loss risk: the offset may be committed after messages are pulled but before processing completes. If the process crashes in that window, the unprocessed messages are skipped on restart and effectively lost.
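This loss window can be demonstrated with a toy simulation: the offset is committed right after the pull, the process "crashes" after handling only the first message, and the restarted consumer resumes past the unprocessed ones. All names are hypothetical; this is not the Kafka client:

```java
import java.util.ArrayList;
import java.util.List;

public class AutoCommitLossSketch {
    // Simulate: poll() returns the whole batch, auto commit records the end
    // offset immediately, then the process dies after handling only the first
    // message. Returns what actually got processed across a crash + restart.
    static List<String> processedAfterCrash(List<String> partitionLog) {
        List<String> processed = new ArrayList<>();
        long committed = partitionLog.size();   // auto commit: offset = end of batch
        processed.add(partitionLog.get(0));     // crash after the first message
        // restart: resume from the committed offset -> nothing left to read
        for (long i = committed; i < partitionLog.size(); i++) {
            processed.add(partitionLog.get((int) i));
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(processedAfterCrash(List.of("m0", "m1", "m2"))); // [m0]
    }
}
```

Only m0 is ever processed; m1 and m2 were acknowledged by the premature commit and never handled.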

Manual commit (recommended for production):

@KafkaListener(topics = "wzk_topic_test")
public void consume(ConsumerRecord<String, String> record, Acknowledgment ack) {
    try {
        // Process message
        process(record);
        // Manual commit after successful processing
        ack.acknowledge();
    } catch (Exception e) {
        // Processing failed: do not commit the offset; the message will be
        // re-consumed after a restart or Rebalance
        log.error("Consumption failed", e);
    }
}

To enable manual commit in Spring-Kafka, configure ack-mode:

spring:
  kafka:
    listener:
      ack-mode: manual

Summary

| Focus | Recommended Practice |
| --- | --- |
| Consumer count | Do not exceed the partition count |
| Rebalance frequency | Keep the Consumer count stable; set timeout parameters sensibly |
| Message loss prevention | Use manual offset commit |
| Detecting stuck consumers | Lower max.poll.interval.ms appropriately, or raise consumption throughput |
| Heartbeat parameters | session.timeout.ms = 3-5 × heartbeat.interval.ms |

The core goal of Kafka Consumer tuning: stable consumption, no message loss, and as few Rebalances as possible.