Flink Installation & Deployment: Local, Standalone, YARN ...

Apache Flink is a distributed stream processing framework widely used for real-time computing. In production, Flink supports multiple deployment modes. This article introduces the three mainstream deployment methods, Local, Standalone, and YARN, to help you choose the right approach for your requirements.

Flink Supported Installation Modes

Flink supports three main installation modes, each suitable for different scenarios:

Mode	Characteristics	Use Cases
Local	Single machine, out-of-box	Development/debugging, learning
Standalone	Flink native cluster management	Medium/small-scale production
YARN	Based on Hadoop YARN resource management	Large-scale production

1. Local Mode

Local mode is the simplest way to run Flink, requiring no extra configuration, suitable for quick start and development debugging.

Main Characteristics:

No extra configuration needed, out-of-box
Runs on a single machine (inside one JVM when a job is launched directly from an IDE)
Supports local IDE debugging
Suitable for learning Flink basic features

Use Cases:

Local development testing
Small-scale data verification
Learning Flink API

Startup:

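cd /opt/servers/flink-1.11.1/bin
# Flink 1.11.1 does not ship start-local.sh; with the default configuration,
# start-cluster.sh starts a JobManager and one TaskManager on localhost
./start-cluster.sh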

2. Standalone Mode

Standalone mode is Flink’s native cluster management approach, not depending on external resource managers, suitable for medium/small-scale production environments.

Main Characteristics:

Flink provides its own cluster management
Contains JobManager and TaskManager
Requires manual deployment and configuration
Resource allocation done by Flink

Advantages:

Simple and easy to use, no extra resource management system needed
Strong independence, no dependency on external systems
Low latency, no external resource scheduling involved

Disadvantages:

Does not support dynamic resource scaling
Complex management of large-scale clusters
Lacks automated resource allocation

3. YARN Mode

YARN mode uses Hadoop YARN for resource management, suitable for large-scale production environments.

Main Characteristics:

Resources uniformly scheduled by YARN
Supports dynamic resource allocation
Can coexist with other YARN applications

Running Modes:

Session Mode: Creates long-running cluster, multiple jobs share resources
Per-Job Mode: Each job independently allocated resources, released after completion
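
To make the difference between the two running modes concrete, here is a minimal sketch that prints (but does not run) the command line each mode typically uses with the Flink 1.11 CLI. The helper name `yarn_cmd` is hypothetical, the paths are relative to the Flink installation directory, and a working Hadoop client environment on the submitting machine is assumed.

```shell
# yarn_cmd MODE JAR: print the command line for the given YARN running mode.
# Hypothetical helper for illustration; it only echoes, it does not submit anything.
yarn_cmd() {
    case "$1" in
        session)
            # Start a detached, long-running session, then submit the job into it.
            echo "./bin/yarn-session.sh -d && ./bin/flink run $2"
            ;;
        per-job)
            # -m yarn-cluster makes the client bring up a dedicated cluster for this job.
            echo "./bin/flink run -m yarn-cluster $2"
            ;;
        *)
            echo "unknown mode: $1" >&2
            return 1
            ;;
    esac
}

# Example: yarn_cmd per-job ./examples/streaming/WordCount.jar
```

In Session mode the cluster keeps running after the job finishes, so it has to be stopped separately. In Per-Job mode the resources are returned to YARN when the job ends.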

Advantages:

High resource utilization
Seamless integration with Hadoop ecosystem
Supports dynamic scaling

Cluster Planning

This article uses three machines for deployment:

Hostname	Configuration	Role
h121.wzk.icu	2C4G	JobManager + TaskManager
h122.wzk.icu	2C4G	TaskManager
h123.wzk.icu	2C2G	TaskManager

Environment Preparation

Java Environment Configuration

Ensure all nodes have JAVA_HOME configured:

echo $JAVA_HOME
# Output example: /usr/local/java/jdk1.8.0_301
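
Printing the variable only shows that it is set, not that it points at a working JDK. A slightly stricter check might look like the sketch below; the function name `check_java_home` is an assumption made for this example.

```shell
# check_java_home DIR: succeed only if DIR is non-empty and contains an executable bin/java.
check_java_home() {
    if [ -z "$1" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$1/bin/java" ]; then
        echo "no executable java found under $1/bin" >&2
        return 1
    fi
    echo "JAVA_HOME looks usable: $1"
}

# Example: check_java_home "$JAVA_HOME"
```

Run it on every node, since each node starts its own JVM.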

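SSH Passwordless Login

start-cluster.sh starts the TaskManagers on the worker nodes over SSH, so h121 needs passwordless SSH access to all three machines (including itself):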

# Execute on h121 node
ssh-keygen -t rsa
ssh-copy-id h121.wzk.icu
ssh-copy-id h122.wzk.icu
ssh-copy-id h123.wzk.icu
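
Before going further it can be worth confirming that the keys were actually accepted. The following is a minimal sketch: `check_ssh` and the `SSH_CMD` override are names invented for this example, and `BatchMode=yes` is used so that ssh fails instead of falling back to a password prompt.

```shell
# Hosts from the cluster plan above.
HOSTS="h121.wzk.icu h122.wzk.icu h123.wzk.icu"

# check_ssh HOST...: print one status line per host; return non-zero if any host fails.
# SSH_CMD is a hypothetical override hook (defaults to ssh) so the loop can be dry-run.
check_ssh() {
    failed=0
    for host in "$@"; do
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "ok   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}

# Example (run on h121): check_ssh $HOSTS
```

Any host reported as FAIL would still ask for a password, which would cause start-cluster.sh to stall or fail for that node.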

Download and Installation

Choose stable version Flink 1.11.1:

cd /opt/software/
wget https://archive.apache.org/dist/flink/flink-1.11.1/flink-1.11.1-bin-scala_2.12.tgz
tar -zxvf flink-1.11.1-bin-scala_2.12.tgz
mv flink-1.11.1 /opt/servers/

Sync installation package to all nodes:

scp -r /opt/servers/flink-1.11.1 h122.wzk.icu:/opt/servers/
scp -r /opt/servers/flink-1.11.1 h123.wzk.icu:/opt/servers/

Standalone Mode Configuration

flink-conf.yaml

Modify Flink configuration file:

# jobmanager.rpc.address configures JobManager address
jobmanager.rpc.address: h121.wzk.icu

# Configure TaskManager Slot count
taskmanager.numberOfTaskSlots: 2
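
After editing, a quick way to confirm the values Flink will read is to print them back from the file. This is a small sketch with a hypothetical helper, `conf_value`; it only handles simple top-level `key: value` lines like the two above and is not a general YAML parser.

```shell
# conf_value FILE KEY: print the value of a top-level "key: value" line.
# Comment lines never match because their first field is not exactly KEY.
conf_value() {
    awk -v key="$2" -F': *' '$1 == key { print $2 }' "$1"
}

# Example:
#   conf_value /opt/servers/flink-1.11.1/conf/flink-conf.yaml jobmanager.rpc.address
```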

workers File

Configure TaskManager node list:

cd /opt/servers/flink-1.11.1/conf
vim workers

Write the following:

h121.wzk.icu
h122.wzk.icu
h123.wzk.icu
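
Each line in the workers file starts one TaskManager, and each TaskManager offers taskmanager.numberOfTaskSlots slots, so the cluster's total slot count is the product of the two. The sketch below computes it from the file; `total_slots` is a name chosen for this example.

```shell
# total_slots WORKERS_FILE SLOTS_PER_TM: count the non-blank lines in the workers
# file and multiply by the per-TaskManager slot count.
total_slots() {
    workers=$(grep -c -v '^[[:space:]]*$' "$1")
    echo $(( workers * $2 ))
}

# Example: total_slots /opt/servers/flink-1.11.1/conf/workers 2
```

With the three hosts above and two slots per TaskManager, this gives six slots, which is the upper bound on the total parallelism the cluster can run at once.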

masters File

Configure JobManager node:

vim masters

Write the following:

h121.wzk.icu:8081

Start Cluster

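Every node reads its own conf directory, so make sure the edited flink-conf.yaml, workers, and masters files are also present on h122 and h123 (for example by repeating the scp step above) before starting. Then start the Flink cluster from h121: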

cd /opt/servers/flink-1.11.1/bin/
./start-cluster.sh

After successful startup, access Flink Web UI:

http://h121.wzk.icu:8081/#/overview
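
The Web UI is backed by Flink's REST API on the same port, so the cluster state can also be checked from the command line. The sketch below extracts the `slots-total` field from the JSON returned by the `/overview` endpoint; the helper name `slots_total` is an assumption, and the sed expression is a shortcut for this one numeric field rather than a real JSON parser.

```shell
# slots_total: read the JSON body of GET /overview on stdin and print "slots-total".
slots_total() {
    sed -n 's/.*"slots-total"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Example (needs a running cluster and curl):
#   curl -s http://h121.wzk.icu:8081/overview | slots_total
```

If all three TaskManagers registered with two slots each, the value should match the six slots planned above; a lower number usually means a TaskManager failed to start or could not reach the JobManager.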

Run Test

Flink provides rich example programs. Run WordCount example:

cd /opt/servers/flink-1.11.1/bin
./flink run ../examples/streaming/WordCount.jar

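Run without arguments, the example counts words in its built-in sample text and prints the result to standard output, which ends up in a TaskManager .out file under the log directory (also viewable in the Web UI). Pass --input <path> to read from a file and --output <path> to write the counts to a file instead.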

Deployment Mode Selection Suggestions

Choose appropriate deployment mode based on actual business needs:

Learning/Development: Choose Local mode for quick start
Medium/Small-scale Production: Choose Standalone mode, simple deployment
Large-scale Production: Choose YARN mode, high resource utilization

Standalone mode is suitable for scenarios with fewer than 10 nodes, fixed resources, and stable job loads. For production environments that require frequent resource adjustments, YARN or Kubernetes deployment is recommended.
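
The rule of thumb above can be written down as a tiny decision helper. This is only a sketch of this article's guidance: the function name `suggest_mode` is hypothetical, and the 10-node threshold is the article's suggestion rather than a hard limit in Flink.

```shell
# suggest_mode NODES NEEDS_SCALING: print local, standalone, or yarn.
# NODES is the number of machines; NEEDS_SCALING is "yes" or "no".
suggest_mode() {
    if [ "$1" -le 1 ]; then
        echo local
    elif [ "$1" -lt 10 ] && [ "$2" = no ]; then
        echo standalone
    else
        echo yarn
    fi
}

# Example: suggest_mode 3 no
```

For the three-node cluster in this article with fixed resources, the helper suggests standalone, which is the mode configured above.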
