Apache Flink is a distributed stream processing framework widely used for real-time computing. In production environments, Flink supports multiple deployment modes. This article introduces three mainstream deployment methods (Local, Standalone, and YARN) to help you choose the one that fits your requirements.

Flink supports three main installation modes, each suitable for different scenarios:

Mode       | Characteristics                        | Use Cases
Local      | Single machine, works out of the box   | Development/debugging, learning
Standalone | Flink's native cluster management      | Small to medium-scale production
YARN       | Resource management by Hadoop YARN     | Large-scale production

1. Local Mode

Local mode is the simplest way to run Flink. It requires no extra configuration, which makes it suitable for getting started quickly and for development and debugging.

Main Characteristics:

  • Works out of the box, no extra configuration needed
  • Runs entirely on one machine (in a single JVM when started from an IDE)
  • Supports debugging in a local IDE
  • Suitable for learning Flink's basic features

Use Cases:

  • Local development testing
  • Small-scale data verification
  • Learning Flink API

Startup:

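cd /opt/servers/flink-1.11.1/bin
# With the default configuration this starts one JobManager and one TaskManager on the local machine
./start-cluster.sh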
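
Once the startup script returns, one way to confirm that the daemons are alive is to look for them in the output of jps (a JDK tool). The sketch below assumes the process names StandaloneSessionClusterEntrypoint (JobManager) and TaskManagerRunner (TaskManager), which is what this Flink version is expected to show; flink_procs is a hypothetical helper name.

```shell
# List Flink daemons visible to jps; prints nothing if none are running
# (or if jps itself is not installed).
flink_procs() {
  jps 2>/dev/null | grep -E 'StandaloneSessionClusterEntrypoint|TaskManagerRunner'
}

if flink_procs; then
  echo "Flink daemons are running"
else
  echo "No Flink daemons found"
fi
```

To shut the local instance down again, run ./stop-cluster.sh from the same bin directory.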

2. Standalone Mode

Standalone mode is Flink's native cluster management approach. It does not depend on an external resource manager and is suitable for small to medium-scale production environments.

Main Characteristics:

  • Flink provides its own cluster management
  • Consists of JobManager and TaskManager processes
  • Requires manual deployment and configuration
  • Resource allocation is handled by Flink itself

Advantages:

  • Simple to use, with no extra resource management system required
  • Self-contained, with no dependency on external systems
  • Low scheduling overhead, since no external resource manager is involved

Disadvantages:

  • Does not support dynamic resource scaling
  • Large clusters become complex to manage
  • Lacks automated resource allocation

3. YARN Mode

YARN mode delegates resource management to Hadoop YARN and is suitable for large-scale production environments.

Main Characteristics:

  • Resources are scheduled centrally by YARN
  • Supports dynamic resource allocation
  • Can coexist with other YARN applications

Running Modes:

  • Session Mode: Creates a long-running cluster whose resources are shared by multiple jobs
  • Per-Job Mode: Each job gets its own resources, which are released when the job completes
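
The two running modes map to two different command lines. The sketch below only prints them (a dry run) so they can be reviewed before use. It assumes a working Hadoop client on the submitting machine, and the memory sizes and slot count are illustrative values, not recommendations. session_cmd and per_job_cmd are hypothetical helper names.

```shell
FLINK_HOME=/opt/servers/flink-1.11.1

# Session mode: start a detached (-d), long-running YARN session with
# JobManager memory (-jm), TaskManager memory (-tm) and slots per TaskManager (-s).
session_cmd() {
  echo "$FLINK_HOME/bin/yarn-session.sh -jm 1024m -tm 2048m -s 2 -d"
}

# Per-Job mode: each submission brings up its own YARN application;
# -yjm / -ytm set JobManager / TaskManager memory for that job only.
per_job_cmd() {
  echo "$FLINK_HOME/bin/flink run -m yarn-cluster -yjm 1024m -ytm 2048m $FLINK_HOME/examples/streaming/WordCount.jar"
}

session_cmd
per_job_cmd
```

In session mode the cluster keeps its containers until the session is stopped, while a Per-Job submission returns its containers to YARN when the job ends.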

Advantages:

  • High resource utilization
  • Seamless integration with Hadoop ecosystem
  • Supports dynamic scaling

Cluster Planning

This article uses three machines for deployment:

Hostname     | Configuration  | Role
h121.wzk.icu | 2 cores, 4 GB  | JobManager + TaskManager
h122.wzk.icu | 2 cores, 4 GB  | TaskManager
h123.wzk.icu | 2 cores, 2 GB  | TaskManager

Environment Preparation

Java Environment Configuration

Ensure all nodes have JAVA_HOME configured:

echo $JAVA_HOME
# Output example: /usr/local/java/jdk1.8.0_301
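
A slightly stricter check than echoing the variable is to confirm that it actually points at a Java installation. The following is a minimal sketch; check_java_home is a hypothetical helper name, and it would need to be run on each node.

```shell
# Verify that JAVA_HOME is set and contains an executable bin/java.
check_java_home() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set"
    return 1
  fi
  if [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "No java executable under $JAVA_HOME"
    return 1
  fi
  echo "JAVA_HOME OK: $JAVA_HOME"
}

check_java_home || true
```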

SSH Passwordless Login

Configure passwordless SSH login among the three machines:

# Execute on h121 node
ssh-keygen -t rsa
ssh-copy-id h121.wzk.icu
ssh-copy-id h122.wzk.icu
ssh-copy-id h123.wzk.icu
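
Before moving on, it can be worth confirming that the keys were accepted. The sketch below uses the ssh option BatchMode=yes, which makes ssh fail instead of prompting for a password, so a remaining password prompt shows up as FAILED. check_ssh and check_hosts are hypothetical helper names.

```shell
# Report whether a host accepts key-based login without any prompt.
check_ssh() {
  host=$1
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
    echo "$host: ok"
  else
    echo "$host: FAILED"
  fi
}

# Check every host passed as an argument.
check_hosts() {
  for h in "$@"; do
    check_ssh "$h"
  done
}

# Run on h121:
# check_hosts h121.wzk.icu h122.wzk.icu h123.wzk.icu
```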

Download and Installation

This article uses the stable release Flink 1.11.1:

cd /opt/software/
wget https://archive.apache.org/dist/flink/flink-1.11.1/flink-1.11.1-bin-scala_2.12.tgz
tar -zxvf flink-1.11.1-bin-scala_2.12.tgz
mv flink-1.11.1 /opt/servers/

Copy the installation directory to the other nodes:

scp -r /opt/servers/flink-1.11.1 h122.wzk.icu:/opt/servers/
scp -r /opt/servers/flink-1.11.1 h123.wzk.icu:/opt/servers/

Standalone Mode Configuration

Edit the Flink configuration file conf/flink-conf.yaml:

# Address of the JobManager
jobmanager.rpc.address: h121.wzk.icu

# Number of task slots offered by each TaskManager
taskmanager.numberOfTaskSlots: 2
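
If you prefer to script these edits rather than make them by hand, the following sketch sets a key in a flat "key: value" file, replacing the line if the key is already present and appending it otherwise. set_conf is a hypothetical helper, and the demonstration runs on a scratch file rather than the real conf/flink-conf.yaml.

```shell
# Set "key: value" in a flat YAML file: replace the line if the key exists,
# append it otherwise.
set_conf() {
  key=$1; value=$2; file=$3
  tmp=$(mktemp)
  awk -v k="$key" -v v="$value" '
    index($0, k ":") == 1 { print k ": " v; found = 1; next }  # existing key: overwrite
    { print }                                                  # any other line: keep
    END { if (!found) print k ": " v }                         # missing key: append
  ' "$file" > "$tmp"
  cat "$tmp" > "$file"   # write back in place so file permissions are preserved
  rm -f "$tmp"
}

# Demonstration on a scratch file standing in for conf/flink-conf.yaml
conf=$(mktemp)
printf '%s\n' 'jobmanager.rpc.address: localhost' 'taskmanager.numberOfTaskSlots: 1' > "$conf"
set_conf jobmanager.rpc.address h121.wzk.icu "$conf"
set_conf taskmanager.numberOfTaskSlots 2 "$conf"
cat "$conf"
rm -f "$conf"
```

Commented-out lines (those starting with #) are left untouched, because only lines that begin with the key itself are replaced.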

workers File

Configure TaskManager node list:

cd /opt/servers/flink-1.11.1/conf
vim workers

Write the following:

h121.wzk.icu
h122.wzk.icu
h123.wzk.icu
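
The workers file and the slot setting together determine how much parallelism the cluster can offer: three TaskManagers with two slots each give six slots in total. A small sketch of that arithmetic, with total_slots as a hypothetical helper that counts the non-empty, non-comment lines of a workers file:

```shell
# Total slots = number of worker entries x slots per TaskManager.
total_slots() {
  workers_file=$1
  slots_per_tm=$2
  # Count lines that are neither blank nor comments.
  n=$(grep -cvE '^[[:space:]]*(#|$)' "$workers_file")
  echo $(( n * slots_per_tm ))
}

# Demonstration with a scratch copy of the workers file shown above
tmp=$(mktemp)
printf '%s\n' h121.wzk.icu h122.wzk.icu h123.wzk.icu > "$tmp"
total_slots "$tmp" 2   # prints 6
rm -f "$tmp"
```

A job whose parallelism exceeds this number cannot be scheduled in full on this cluster.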

masters File

Configure JobManager node:

vim masters

Write the following:

h121.wzk.icu:8081
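
Because the installation directory was copied to the other nodes before these files were edited, the updated conf directory on h121 still has to reach h122 and h123. The sketch below prints the scp commands for review instead of running them; sync_conf_cmds is a hypothetical helper name.

```shell
FLINK_HOME=/opt/servers/flink-1.11.1

# Print one scp command per target host that copies the local conf
# directory over the remote one.
sync_conf_cmds() {
  for host in "$@"; do
    echo "scp -r $FLINK_HOME/conf $host:$FLINK_HOME/"
  done
}

sync_conf_cmds h122.wzk.icu h123.wzk.icu
```

When the printed commands look right, they can be executed by piping the output to sh.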

Start Cluster

Once the same configuration is present on all three nodes, start the Flink cluster from h121:

cd /opt/servers/flink-1.11.1/bin/
./start-cluster.sh

After a successful startup, open the Flink Web UI:

http://h121.wzk.icu:8081/#/overview
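
The same port also serves Flink's REST API, so the cluster state can be checked from a script. The sketch below pulls a numeric field out of the JSON returned by the /overview endpoint using only grep. overview_field is a hypothetical helper, and the sample string is a hand-written stand-in shaped like the response for this three-node plan, not captured output.

```shell
# Extract a numeric field from a JSON document read on standard input.
overview_field() {
  grep -oE "\"$1\"[[:space:]]*:[[:space:]]*[0-9]+" | grep -oE '[0-9]+$'
}

# Hand-written sample: 3 TaskManagers with 2 slots each
sample='{"taskmanagers":3,"slots-total":6,"slots-available":6}'
echo "$sample" | overview_field taskmanagers   # prints 3
echo "$sample" | overview_field slots-total    # prints 6

# Against a live cluster:
# curl -s http://h121.wzk.icu:8081/overview | overview_field slots-total
```

If fewer TaskManagers are reported than there are entries in the workers file, the logs on the missing node are the first place to look.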

Run Test

Flink ships with a set of example programs. Run the WordCount example:

cd /opt/servers/flink-1.11.1/bin
./flink run ../examples/streaming/WordCount.jar

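When run without arguments, the example counts the words of a built-in default data set and prints the result to standard output, which for a cluster job ends up in the TaskManager's .out log file. To count the words of your own file (for example a wc.input file), pass it with --input and specify a result path with --output.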

Deployment Mode Selection Suggestions

Choose the deployment mode that matches your actual business needs:

  • Learning/Development: Choose Local mode for a quick start
  • Small to Medium-scale Production: Choose Standalone mode for its simple deployment
  • Large-scale Production: Choose YARN mode for its high resource utilization

Standalone mode is suitable for clusters with fewer than 10 nodes, fixed resources, and stable job loads. For production environments that need frequent resource adjustments, YARN or Kubernetes deployment is recommended.