Apache Flink is a distributed stream processing framework widely used for real-time computing. In production environments, Flink supports multiple deployment modes. This article introduces three mainstream deployment methods (Local, Standalone, and YARN) to help you choose the approach that fits your requirements.
Flink Supported Installation Modes
Flink supports three main installation modes, each suitable for different scenarios:
| Mode | Characteristics | Use Cases |
|---|---|---|
| Local | Single machine, out-of-box | Development/debugging, learning |
| Standalone | Flink native cluster management | Medium/small-scale production |
| YARN | Based on Hadoop YARN resource management | Large-scale production |
1. Local Mode
Local mode is the simplest way to run Flink. It requires no extra configuration, which makes it suitable for getting started quickly and for development and debugging.
Main Characteristics:
- No extra configuration needed; works out of the box
- Runs on a single machine (in a single JVM when a job is launched from an IDE)
- Supports local IDE debugging
- Suitable for learning Flink's basic features
Use Cases:
- Local development testing
- Small-scale data verification
- Learning Flink API
Startup:
cd /opt/servers/flink-1.11.1/bin
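# With the default configuration, start-cluster.sh starts a JobManager and one TaskManager on this machine
./start-cluster.sh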
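After startup, one way to confirm that the daemons are alive is to look at the output of `jps`. The sketch below assumes the process names used by Flink 1.11 (`StandaloneSessionClusterEntrypoint` for the JobManager and `TaskManagerRunner` for the TaskManager); the helper name `flink_daemons_up` is hypothetical.

```shell
# Reads `jps` output on stdin and succeeds only if both Flink daemons appear.
# Assumption: process names as in Flink 1.11; other versions may differ.
flink_daemons_up() {
  jps_output=$(cat)
  printf '%s\n' "$jps_output" | grep -q 'StandaloneSessionClusterEntrypoint' &&
    printf '%s\n' "$jps_output" | grep -q 'TaskManagerRunner'
}

# Usage (on the machine where Flink was started):
#   jps | flink_daemons_up && echo "Flink is running"
```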
2. Standalone Mode
Standalone mode is Flink’s native cluster management approach. It does not depend on an external resource manager and suits small and medium-scale production environments.
Main Characteristics:
- Flink provides its own cluster management
- Contains JobManager and TaskManager
- Requires manual deployment and configuration
- Resource allocation done by Flink
Advantages:
- Simple and easy to use, no extra resource management system needed
- Strong independence, no dependency on external systems
- Low latency, no external resource scheduling involved
Disadvantages:
- Does not support dynamic resource scaling
- Complex management of large-scale clusters
- Lacks automated resource allocation
3. YARN Mode
YARN mode uses Hadoop YARN for resource management, suitable for large-scale production environments.
Main Characteristics:
- Resources uniformly scheduled by YARN
- Supports dynamic resource allocation
- Can coexist with other YARN applications
Running Modes:
- Session Mode: Creates a long-running cluster whose resources are shared by multiple jobs
- Per-Job Mode: Each job is allocated its own resources, which are released when the job completes
Advantages:
- High resource utilization
- Seamless integration with Hadoop ecosystem
- Supports dynamic scaling
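As a rough sketch of how the two running modes differ on the command line, the wrappers below call `yarn-session.sh` for Session mode and `flink run -m yarn-cluster` for Per-Job mode. The function names are hypothetical, the `-s 2` slot count is only an illustration, and the sketch assumes a working Hadoop client; on Flink 1.11 the Hadoop classpath usually has to be exported first.

```shell
# Assumption: Flink is installed under this path, as elsewhere in this article.
FLINK_HOME="${FLINK_HOME:-/opt/servers/flink-1.11.1}"

# Typically required on Flink 1.11 before using YARN:
#   export HADOOP_CLASSPATH=$(hadoop classpath)

# Session mode: start one long-running, detached (-d) Flink cluster on YARN
# with 2 slots per TaskManager (-s 2). Jobs submitted later share it.
start_yarn_session() {
  "$FLINK_HOME/bin/yarn-session.sh" -s 2 -d
}

# Per-Job mode: YARN brings up a dedicated cluster for this one job and
# tears it down when the job finishes.
submit_per_job() {
  "$FLINK_HOME/bin/flink" run -m yarn-cluster "$@"
}

# Usage:
#   start_yarn_session
#   submit_per_job "$FLINK_HOME/examples/streaming/WordCount.jar"
```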
Cluster Planning
This article uses three machines for deployment:
| Hostname | Configuration | Role |
|---|---|---|
| h121.wzk.icu | 2C4G | JobManager + TaskManager |
| h122.wzk.icu | 2C4G | TaskManager |
| h123.wzk.icu | 2C2G | TaskManager |
Environment Preparation
Java Environment Configuration
Ensure all nodes have JAVA_HOME configured:
echo $JAVA_HOME
# Output example: /usr/local/java/jdk1.8.0_301
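Printing the variable shows its value but not whether it points at a usable JDK. A minimal sketch of a stricter check follows; the function name `check_java_home` is hypothetical.

```shell
# Succeeds only if JAVA_HOME is set and contains an executable bin/java;
# otherwise prints a short diagnostic to stderr and fails.
check_java_home() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "No executable found at $JAVA_HOME/bin/java" >&2
    return 1
  fi
  echo "JAVA_HOME OK: $JAVA_HOME"
}

# Usage (run on every node):
#   check_java_home
```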
SSH Passwordless Login
Configure passwordless SSH login between the three machines:
# Execute on h121 node
ssh-keygen -t rsa
ssh-copy-id h121.wzk.icu
ssh-copy-id h122.wzk.icu
ssh-copy-id h123.wzk.icu
Download and Installation
This article uses the stable release Flink 1.11.1:
cd /opt/software/
wget https://archive.apache.org/dist/flink/flink-1.11.1/flink-1.11.1-bin-scala_2.12.tgz
tar -zxvf flink-1.11.1-bin-scala_2.12.tgz
mv flink-1.11.1 /opt/servers/
Sync the installation directory to the other nodes:
scp -r /opt/servers/flink-1.11.1 h122.wzk.icu:/opt/servers/
scp -r /opt/servers/flink-1.11.1 h123.wzk.icu:/opt/servers/
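With more than a couple of nodes, the copy step is easier to manage as a loop. The sketch below reads hostnames from a plain text file with one host per line (the file and the function name `sync_flink` are assumptions, not part of Flink) and supports a `DRY_RUN=1` switch that only prints the commands.

```shell
# Copy a Flink directory to every host in a host list, skipping the local host.
# Arguments: source directory, host list file, name of the local host.
sync_flink() {
  src_dir="$1"; hosts_file="$2"; local_host="$3"
  while IFS= read -r host || [ -n "$host" ]; do
    # Skip blank lines and the machine we are copying from.
    [ -z "$host" ] && continue
    [ "$host" = "$local_host" ] && continue
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "scp -r $src_dir $host:$(dirname "$src_dir")/"
    else
      # </dev/null keeps scp from consuming the host list on stdin.
      scp -r "$src_dir" "$host:$(dirname "$src_dir")/" < /dev/null
    fi
  done < "$hosts_file"
}

# Usage (hosts.txt is a hypothetical file listing the three hosts):
#   DRY_RUN=1 sync_flink /opt/servers/flink-1.11.1 hosts.txt h121.wzk.icu
```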
Standalone Mode Configuration
flink-conf.yaml
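Modify the Flink configuration file conf/flink-conf.yaml. Each TaskManager reads its own local copy, so apply the same settings on every node (or re-sync the conf directory after editing):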
# jobmanager.rpc.address configures JobManager address
jobmanager.rpc.address: h121.wzk.icu
# Configure TaskManager Slot count
taskmanager.numberOfTaskSlots: 2
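If the file is edited by script rather than by hand, something along these lines may help. It is only a sketch: `set_flink_conf` is a hypothetical helper that handles simple top-level `key: value` lines and assumes the value contains no `|` or `&` characters.

```shell
# Set "key: value" in a Flink YAML config file: rewrite the line if the key is
# already present (uncommented), otherwise append it at the end.
set_flink_conf() {
  conf_file="$1"; key="$2"; value="$3"
  if grep -q "^${key}:" "$conf_file"; then
    # -i.bak keeps a backup copy and works with both GNU and BSD sed.
    sed -i.bak "s|^${key}:.*|${key}: ${value}|" "$conf_file"
  else
    printf '%s: %s\n' "$key" "$value" >> "$conf_file"
  fi
}

# Usage:
#   conf=/opt/servers/flink-1.11.1/conf/flink-conf.yaml
#   set_flink_conf "$conf" jobmanager.rpc.address h121.wzk.icu
#   set_flink_conf "$conf" taskmanager.numberOfTaskSlots 2
```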
workers File
Configure TaskManager node list:
cd /opt/servers/flink-1.11.1/conf
vim workers
Write the following:
h121.wzk.icu
h122.wzk.icu
h123.wzk.icu
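Each host in the workers file starts one TaskManager, and each TaskManager offers `taskmanager.numberOfTaskSlots` slots, so the cluster's total slot count is the product of the two. A small sketch of that arithmetic (the helper name `total_slots` is hypothetical):

```shell
# Total slots = (number of hosts in the workers file) x (slots per TaskManager).
total_slots() {
  workers_file="$1"; slots_per_tm="$2"
  # Count non-empty lines; each one corresponds to a TaskManager.
  tm_count=$(grep -c . "$workers_file")
  echo $((tm_count * slots_per_tm))
}

# Usage: with the three hosts above and 2 slots each, this prints 6.
#   total_slots /opt/servers/flink-1.11.1/conf/workers 2
```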
masters File
Configure JobManager node:
vim masters
Write the following:
h121.wzk.icu:8081
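The two files can also be written non-interactively, which is convenient when the setup is scripted. A minimal sketch, with `write_cluster_files` as a hypothetical helper name:

```shell
# Write the masters and workers files into a Flink conf directory.
# Arguments: conf directory, JobManager host:port, then one worker host per argument.
write_cluster_files() {
  conf_dir="$1"; master="$2"; shift 2
  printf '%s\n' "$master" > "$conf_dir/masters"
  # printf repeats the format for every remaining argument: one host per line.
  printf '%s\n' "$@" > "$conf_dir/workers"
}

# Usage:
#   write_cluster_files /opt/servers/flink-1.11.1/conf h121.wzk.icu:8081 \
#     h121.wzk.icu h122.wzk.icu h123.wzk.icu
```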
Start Cluster
After configuration, start the Flink cluster:
cd /opt/servers/flink-1.11.1/bin/
./start-cluster.sh
After a successful startup, open the Flink Web UI:
http://h121.wzk.icu:8081/#/overview
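The same information shown on the overview page is available from Flink's REST endpoint `/overview`, which is handy for scripted checks. The sketch below pulls a single numeric field out of that JSON with `sed`; `overview_field` is a hypothetical helper, and a JSON tool such as `jq` would be more robust if it is installed.

```shell
# Extract a numeric field (for example taskmanagers or slots-total) from the
# JSON returned by the /overview REST endpoint, read from stdin.
overview_field() {
  field="$1"
  sed -n "s/.*\"${field}\"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p"
}

# Usage: for the cluster planned in this article one would expect
# 3 TaskManagers and 6 slots in total.
#   curl -s http://h121.wzk.icu:8081/overview | overview_field taskmanagers
#   curl -s http://h121.wzk.icu:8081/overview | overview_field slots-total
```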
Run Test
Flink ships with a number of example programs. Run the WordCount example:
cd /opt/servers/flink-1.11.1/bin
./flink run ../examples/streaming/WordCount.jar
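Without arguments, the example counts the words of a built-in sample text and prints the result to standard output (on a cluster, this ends up in the TaskManager's .out log file). To count words in your own file, such as wc.input, pass its path with --input, and optionally write the result to a file with --output.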
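A thin wrapper makes it easy to run the example with or without file arguments. This is a sketch only: `run_wordcount` is a hypothetical name, the paths in the usage lines are placeholders, and on a multi-node cluster the input path must be readable from the TaskManagers.

```shell
# Submit the streaming WordCount example; any extra arguments are passed
# through to the example program (for instance --input and --output).
run_wordcount() {
  flink_home="$1"; shift
  "$flink_home/bin/flink" run "$flink_home/examples/streaming/WordCount.jar" "$@"
}

# Usage:
#   run_wordcount /opt/servers/flink-1.11.1
#   run_wordcount /opt/servers/flink-1.11.1 --input /tmp/wc.input --output /tmp/wc.out
```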
Deployment Mode Selection Suggestions
Choose a deployment mode based on actual business needs:
- Learning/Development: Choose Local mode for a quick start
- Medium/Small-scale Production: Choose Standalone mode for its simple deployment
- Large-scale Production: Choose YARN mode for its high resource utilization
Standalone mode is suitable for scenarios with fewer than 10 nodes, fixed resources, and stable job loads. For production environments that require frequent resource adjustments, YARN or Kubernetes deployment is recommended.