This is article 53 in the Big Data series, deeply dissecting the working principles and collaboration flow of Kafka’s three core components.

Overall Architecture Review

Kafka is a distributed publish-subscribe messaging system with three core roles:

Producer (message producer)
    ↓ publishes messages
Broker Cluster (message storage & routing)
    ↓ delivers messages
Consumer Group (group of consumers)

The three are decoupled through Topic + Partition: the Producer only publishes, the Consumer only subscribes, and the Broker handles persistence and routing in between.

Producer: Message Producer

Basic Responsibilities

The Producer is responsible for creating messages and publishing them to specified Topics. The sending process:

  1. Serialize the message (Key/Value)
  2. Select the target Partition based on the partitioning strategy
  3. Append the message to a local buffer (RecordAccumulator)
  4. A background Sender thread sends batches to the target Broker
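The steps above map to a handful of client settings. Below is a minimal configuration sketch; the broker address is a placeholder, while the serializer classes and batching keys are standard kafka-clients settings:

```java
import java.util.Properties;

public class ProducerConfigSketch {
    // Config covering step 1 (serializers) and steps 3-4 (batching).
    public static Properties build() {
        Properties props = new Properties();
        // Placeholder broker address
        props.put("bootstrap.servers", "localhost:9092");
        // Step 1: serialize Key/Value
        props.put("key.serializer", "org.apache.kafka.common.serialization.IntegerSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Steps 3-4: RecordAccumulator batching; the Sender flushes a batch
        // when it reaches 16 KB or has waited 5 ms, whichever comes first
        props.put("batch.size", "16384");
        props.put("linger.ms", "5");
        return props;
    }
}
```

Larger `batch.size` and `linger.ms` values trade a little latency for better throughput, since more records ride in each network request.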

Partitioning Strategy

| Strategy | Trigger Condition | Effect |
| --- | --- | --- |
| Explicit partition | Partition specified explicitly at send time | Precise routing |
| Hash partitioning | Message has a Key | Same Key always writes to the same Partition, guaranteeing per-partition order |
| Round-robin | No Key (default) | Even distribution, maximizes throughput |
| Custom partitioner | Implement the Partitioner interface | Business-specific routing logic |
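The keyed strategy can be sketched as follows. Note this is a simplified stand-in: the real client hashes the serialized key bytes with murmur2, while hashCode() is used here only to illustrate the same-key-same-partition guarantee:

```java
public class HashPartitionSketch {
    // Simplified stand-in for Kafka's keyed partitioning. The actual client
    // computes murmur2(serializedKeyBytes) % numPartitions; hashCode() is
    // used here purely for illustration.
    public static int partitionFor(Object key, int numPartitions) {
        if (key == null) {
            throw new IllegalArgumentException("keyless records fall back to round-robin");
        }
        // Mask the sign bit so the result is always a valid partition index
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

Because the function is deterministic, every record with the same key lands in the same partition, which is what gives Kafka its per-key ordering guarantee.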

ACK Confirmation Mechanism

acks parameter controls message reliability vs throughput trade-off:

| acks Value | Semantics | Use Cases |
| --- | --- | --- |
| 0 | No waiting for any acknowledgment (fire-and-forget) | Log collection, tolerates minor loss |
| 1 | Leader write succeeds | Default; balances reliability and performance |
| all (-1) | All ISR replicas write | Finance, orders, and other core data |
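A durability-first configuration for the last row might look like this; all three keys are standard producer settings:

```java
import java.util.Properties;

public class DurableProducerConfig {
    // Durability-first settings matching the "finance, orders" row above
    public static Properties build() {
        Properties props = new Properties();
        props.put("acks", "all");                // wait for every ISR replica
        props.put("retries", "2147483647");      // retry transient failures indefinitely
        props.put("enable.idempotence", "true"); // retries cannot introduce duplicates
        return props;
    }
}
```

With `acks=0` or `acks=1` you would drop the last two settings and accept the loss window in exchange for lower latency.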

Async Send Example

// (topic, partition, key, value) — partition 0 is chosen explicitly here
ProducerRecord<Integer, String> record =
    new ProducerRecord<>("my_topic", 0, 42, "hello kafka");

// Async send, handle result via callback
producer.send(record, (metadata, exception) -> {
    if (exception == null) {
        System.out.println("Sent to partition " + metadata.partition()
            + " at offset " + metadata.offset());
    } else {
        exception.printStackTrace();
    }
});

Broker: Message Storage Node

Core Responsibilities

The Broker is the basic unit of a Kafka cluster. Each Broker is an independent JVM process responsible for:

  • Receiving Producer written messages, persisting to local disk
  • Responding to Consumer pull requests
  • Maintaining Partition Leader/Follower roles
  • Coordinating cluster metadata with ZooKeeper (or KRaft)

Physical Storage of Partition

Each Partition corresponds to a directory on disk, composed of multiple Segment files:

/var/kafka-logs/my_topic-0/
├── 00000000000000000000.log     ← Message data file
├── 00000000000000000000.index   ← Offset sparse index
└── 00000000000000000000.timeindex ← Timestamp index

  • Filename is the 20-digit zero-padded starting Offset of that Segment
  • Writes are append-only; reads use binary search over the sparse index to locate an Offset
  • By default, a new Segment is rolled when the current one exceeds 1 GB or is 7 days old
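The .index lookup can be sketched as a floor binary search over sparse (offset, position) pairs followed by a forward scan in the .log file; the array layout here is illustrative, not Kafka's on-disk format:

```java
public class SparseIndexSketch {
    // entries[i][0] = relative offset, entries[i][1] = byte position in the .log
    // file; sorted by offset, with only every N-th message indexed (sparse).
    public static long floorPosition(long[][] entries, long targetOffset) {
        int lo = 0, hi = entries.length - 1, best = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (entries[mid][0] <= targetOffset) {
                best = mid;      // candidate: greatest indexed offset <= target
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        // The reader seeks to this byte position and scans forward in the
        // .log file until it reaches the exact target offset.
        return entries[best][1];
    }
}
```

The sparseness is the trade-off: the index stays small enough to memory-map, at the cost of a short sequential scan after each lookup.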

Leader and Follower

Partition Leader (on some Broker)
    ← Receives Producer writes
    → Responds to Consumer reads
    ← Serves Follower fetch requests (replication is pull-based)

Partition Follower (on other Brokers)
    → Pulls messages from Leader, catches up
    → Enters ISR list
    → When Leader fails, elected from ISR as new Leader
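The idea behind ISR membership (governed by replica.lag.time.max.ms) can be sketched as a lag-window check; the method and data shapes here are illustrative, not Kafka's internals:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class IsrSketch {
    // A follower stays in the ISR if it has caught up to the Leader's log end
    // within the lag window; otherwise it is dropped until it catches up again.
    public static List<String> inSyncReplicas(Map<String, Long> lastCaughtUpMs,
                                              long nowMs, long maxLagMs) {
        List<String> isr = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastCaughtUpMs.entrySet()) {
            if (nowMs - e.getValue() <= maxLagMs) {
                isr.add(e.getKey());
            }
        }
        return isr;
    }
}
```

This is why `acks=all` does not mean "all replicas": it means all replicas currently in the ISR, so a persistently slow follower cannot stall producers forever.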

Controller Role

Exactly one Broker in the cluster serves as the Controller, elected through ZooKeeper (or the KRaft quorum in ZooKeeper-less deployments):

  • Monitors Broker liveness
  • Handles Partition Leader election
  • Manages Partition ISR list changes
  • Coordinates Partition reassignment when a new Broker joins

Consumer: Message Consumer

Consumer Group Mechanism

Kafka implements two consumption semantics through Consumer Group:

Within the same Group:          Between different Groups:
  Partition 1 → Consumer A        Group A → consumes independently
  Partition 2 → Consumer B        Group B → consumes independently
  Partition 3 → Consumer C      (publish-subscribe semantics)
(point-to-point semantics)

Key rules:

  • Within a Consumer Group, each Partition is consumed by exactly one Consumer
  • When the Consumer count exceeds the Partition count, the excess Consumers sit idle
  • A Consumer joining or leaving triggers a Rebalance, which reassigns Partitions
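The exclusive-assignment rule can be sketched as a round-robin assignor (Kafka ships pluggable assignors such as range, round-robin, and sticky); note how extra consumers end up with no partitions:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AssignmentSketch {
    // Round-robin sketch: each partition goes to exactly one consumer;
    // with more consumers than partitions, the extras get an empty list (idle).
    public static Map<String, List<Integer>> assign(List<String> consumers,
                                                    int numPartitions) {
        Map<String, List<Integer>> result = new HashMap<>();
        for (String c : consumers) {
            result.put(c, new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            result.get(consumers.get(p % consumers.size())).add(p);
        }
        return result;
    }
}
```

With 4 consumers and 3 partitions, one consumer receives nothing, which is why partition count caps the useful parallelism of a group.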

Offset Management

Consumer tracks consumption progress through Offset:

| Method | Description |
| --- | --- |
| Auto commit | enable.auto.commit=true; commits automatically every auto.commit.interval.ms |
| Manual sync commit | consumer.commitSync(); commit after successful processing to avoid message loss |
| Manual async commit | consumer.commitAsync(); higher throughput, but no retry on failure |

Offsets are stored in the Kafka internal Topic __consumer_offsets (since Kafka 0.9, no longer in ZooKeeper).

Consumption Position Reset

auto.offset.reset controls consumer behavior on first start or when the committed Offset is lost:

  • earliest: start from the earliest message in the Partition
  • latest (default): consume only messages produced after startup
  • none: throw an exception when no committed Offset exists
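A consumer configuration combining manual commits with an explicit reset policy might look like this; the group name is a placeholder, while the keys are standard consumer settings:

```java
import java.util.Properties;

public class ConsumerConfigSketch {
    public static Properties build() {
        Properties props = new Properties();
        props.put("group.id", "my-group");          // placeholder group name
        props.put("enable.auto.commit", "false");   // commit manually after processing
        props.put("auto.offset.reset", "earliest"); // replay from the beginning
                                                    // when no committed offset exists
        return props;
    }
}
```

Pairing `enable.auto.commit=false` with commitSync() after processing gives at-least-once delivery: a crash before the commit replays the batch rather than losing it.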

Rebalance Triggers

A Rebalance is triggered when:
  1. A Consumer joins the Group (new instance comes online)
  2. A Consumer leaves the Group (crash or clean shutdown)
  3. A Consumer misses heartbeats beyond session.timeout.ms
  4. The Topic's Partition count changes

All consumers in the group pause during a Rebalance. In production, tune session.timeout.ms and max.poll.interval.ms appropriately to reduce unnecessary Rebalances.

Full Message Flow

1. Producer serializes message, selects Partition
2. Producer batch sends to target Broker (Leader)
3. Leader Broker writes to local log (.log file)
4. Follower actively pulls new messages from Leader, writes to local replica
5. Once all ISR replicas confirm the write, the message is considered "committed"
6. Consumer initiates Poll request to Leader, carrying current Offset
7. Leader returns message batch starting from that Offset
8. Consumer commits new Offset after processing

Summary

  • Producer partitioning strategy and ACK mechanism determine message routing and reliability
  • Broker Segment storage and ISR mechanism ensure high-performance persistence and high availability
  • Consumer Group achieves parallel consumption through Partition exclusive assignment, Offset management controls consumption semantics
  • The three collaborate to form Kafka’s complete closed loop of high throughput, low latency, and high availability