This is article 18 in the Big Data series. It demonstrates the complete Source → Channel → Sink data flow by building the simplest possible Flume case.

Complete illustrated version: CSDN Original | Juejin

Case Goal

Build the simplest possible Flume Agent from three components:

  • Source: netcat source, listens on a TCP port and receives text sent via telnet
  • Channel: memory channel, buffers Events in JVM memory
  • Sink: logger sink, prints Event content to the console

This is the standard "Hello World" scenario for verifying a Flume installation.

Component Description

Component | Type | Features
netcat source | netcat | Listens on a TCP port; each line of input becomes one Event
memory channel | memory | High performance, but data may be lost if the process crashes
logger sink | logger | Writes Events to the console log; for debugging only
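The three components can be pictured as a tiny pipeline. The sketch below is a toy model in Python, not Flume's actual API: a netcat-like source turns lines into events, a bounded in-memory queue plays the role of the memory channel, and a logger-like sink prints each event.

```python
from queue import Queue

# Toy model of the Source -> Channel -> Sink flow (not Flume's real API).

class MemoryChannel:
    def __init__(self, capacity: int):
        # Bounded buffer, analogous to the 'capacity' setting in the config.
        self.q = Queue(maxsize=capacity)

    def put(self, event: bytes):
        self.q.put(event)

    def take(self) -> bytes:
        return self.q.get()

def netcat_like_source(lines, channel):
    # Each input line becomes one Event; the body is the raw bytes.
    for line in lines:
        channel.put(line.encode("utf-8"))

def logger_like_sink(channel, count):
    # Consume Events and print them, like the logger sink does.
    for _ in range(count):
        body = channel.take()
        print("Event: { body:", body.decode("utf-8"), "}")

channel = MemoryChannel(capacity=10000)
netcat_like_source(["hello flume", "big data test"], channel)
logger_like_sink(channel, count=2)
```

The point of the model is only the decoupling: the source never talks to the sink directly, it only hands Events to the channel.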

Configuration File

Create the directory and the config file /opt/wzk/flume_test/flume-netcat-logger.conf:

mkdir -p /opt/wzk/flume_test

Configuration file content:

# Declare three component names for Agent
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source config: monitor TCP port 8888
a1.sources.r1.type = netcat
a1.sources.r1.bind = h122.wzk.icu
a1.sources.r1.port = 8888

# Channel config: memory buffer
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 100

# Sink config: output to console log
a1.sinks.k1.type = logger

# Binding: a Source can feed multiple Channels (plural 'channels'),
# but a Sink reads from exactly one Channel (singular 'channel')
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Key parameter explanations:

  • capacity: Maximum number of Events the Channel can buffer (10000)
  • transactionCapacity: Maximum number of Events per transaction (100); must not exceed capacity
  • bind: Hostname or IP address the netcat source listens on

Start Flume Agent

Confirm that port 8888 is not already in use:

lsof -i:8888
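If lsof is unavailable, the same check can be done from Python. The helper below is a hypothetical convenience, not part of Flume; it simply attempts a TCP connection to the port:

```python
import socket

def port_in_use(host: str, port: int) -> bool:
    # Attempt a TCP connection; success means something is already listening.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1)
        return s.connect_ex((host, port)) == 0

# Check the port Flume is about to use.
print(port_in_use("127.0.0.1", 8888))
```

A False result means the port is free and the Agent can bind to it.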

Start the Agent, specifying the Agent name and the config file:

$FLUME_HOME/bin/flume-ng agent \
  --conf $FLUME_HOME/conf \
  --name a1 \
  --conf-file /opt/wzk/flume_test/flume-netcat-logger.conf \
  -Dflume.root.logger=INFO,console

Parameter explanations:

  • --conf: Directory containing Flume's own configuration files (e.g. logging settings)
  • --name a1: Must match the Agent name used in the config file
  • --conf-file: Path to the Agent config file
  • -Dflume.root.logger=INFO,console: Sets the log level to INFO and sends log output to the console

Send Test Data

Install telnet client (if not installed):

sudo apt install telnet

In another terminal, connect to the port Flume is listening on:

telnet h122.wzk.icu 8888

After the connection succeeds, type any text and press Enter, for example:

hello flume
big data test
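If telnet is not installed, the same lines can be sent with a small Python socket client. This is only a sketch; the host and port are the ones from the config above:

```python
import socket

def send_lines(host: str, port: int, lines):
    # The netcat source treats each newline-terminated line as one Event.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((host, port))
        for line in lines:
            s.sendall(line.encode("utf-8") + b"\n")

# Usage, matching the telnet session above:
# send_lines("h122.wzk.icu", 8888, ["hello flume", "big data test"])
```

Each call opens one connection, sends all lines, and closes it; Flume produces one Event per line either way.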

Observe Output

Go back to the terminal where Flume was started; you should see log output similar to:

INFO sink.LoggerSink: Event: { headers:{} body: 68 65 6C 6C 6F 20 66 6C 75 6D 65  hello flume }
INFO sink.LoggerSink: Event: { headers:{} body: 62 69 67 20 64 61 74 61 20 74 65  big data te }

The Event body is printed both as hexadecimal bytes and as readable text, which confirms that the data passed through the complete Source → Channel → Sink pipeline. Note that the logger sink truncates long bodies in the log, which is why the second line appears cut off.
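The hex dump is just the UTF-8 bytes of the input, which is easy to verify by decoding the hex string copied from the first log line:

```python
# Hex bytes from the first logged Event body.
hex_body = "68 65 6C 6C 6F 20 66 6C 75 6D 65"

# bytes.fromhex ignores ASCII whitespace between byte pairs.
text = bytes.fromhex(hex_body).decode("utf-8")
print(text)  # -> hello flume
```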

Summary

This case verifies Flume's basic working model: the Source receives data and packages it as Events, the Channel buffers them, and the Sink consumes Events and writes them out. Subsequent cases will replace the logger sink with an HDFS Sink to implement real log collection to disk.