Article Overview
This article walks through the complete process of deploying Flink on YARN: environment variables, yarn-site.xml settings, service start/stop, requesting resources, and submitting jobs.
Environment Variable Configuration
Add the following to /etc/profile, then reload it with source /etc/profile:
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_CLASSPATH=`hadoop classpath`
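Before moving on, it can help to confirm that the three variables are actually visible in the current shell. The following is a minimal sketch; the function name check_flink_yarn_env is hypothetical and not part of Hadoop or Flink.

```shell
# Hypothetical helper: report which of the named variables are unset or empty.
# Returns 0 only when every variable passed as an argument has a value.
check_flink_yarn_env() {
    status=0
    for name in "$@"; do
        # Look up the value of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            status=1
        else
            echo "ok: $name=$value"
        fi
    done
    return "$status"
}

# Usage, after reloading /etc/profile:
#   check_flink_yarn_env HADOOP_CONF_DIR YARN_CONF_DIR HADOOP_CLASSPATH
```

A "missing" line usually means the profile was not reloaded, or HADOOP_HOME itself is not set.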
yarn-site.xml Configuration
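Add the following properties to yarn-site.xml. The two memory-check settings stop the NodeManager from killing containers that exceed their physical or virtual memory limits: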
<!-- YARN Flink related -->
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>h123.wzk.icu:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>h123.wzk.icu:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>h123.wzk.icu:8031</value>
</property>
Sync Configuration
Keep the configuration identical on all three nodes (h121, h122, h123). rsync is one way to synchronize it.
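One possible way to do the synchronization is sketched below. It assumes passwordless SSH between the nodes and that the configuration lives in /opt/servers/hadoop-2.9.2/etc/hadoop (inferred from the install path used in this article). The helper name sync_conf_commands is hypothetical. It only prints the rsync commands so they can be reviewed before running.

```shell
# Assumed configuration directory; adjust it to match your HADOOP_CONF_DIR.
CONF_DIR=/opt/servers/hadoop-2.9.2/etc/hadoop

# Hypothetical helper: print one rsync command per peer node.
# $1 is the node holding the edited files; the remaining arguments are all nodes.
sync_conf_commands() {
    src_host=$1
    shift
    for host in "$@"; do
        # Skip the node we are copying from.
        if [ "$host" != "$src_host" ]; then
            echo "rsync -av $CONF_DIR/ $host:$CONF_DIR/"
        fi
    done
}

# Print the commands for copying from h121 to the other two nodes.
sync_conf_commands h121 h121 h122 h123
```

After checking the printed commands, they can be executed by piping the output to sh.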
Service Management
Stop Services
# Stop Hadoop
cd /opt/servers/hadoop-2.9.2/sbin
stop-all.sh
# Stop YARN (execute on h123 node)
stop-yarn.sh
# Stop Flink (execute on h121 node)
./stop-cluster.sh
Start Services
# Start Hadoop (h121 node)
start-all.sh
# Start YARN (h123 node)
start-yarn.sh
Apply for Resources
yarn-session.sh Usage
./yarn-session.sh -n 2 -tm 800 -s 1 -d
Parameter description:
-n: Number of containers to request (TaskManager count), here 2
-tm: Memory size of each TaskManager, in MB
-s: Number of slots per TaskManager
-d: Run in the background (detached)
Note: Although -n 2 requests two TaskManagers, three containers are actually allocated, because the ApplicationMaster and JobManager occupy one extra container.
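The arithmetic behind this note can be sketched as follows. The function name session_totals is hypothetical. It takes the -n and -s values and derives the container count described above, plus the total number of slots (TaskManagers multiplied by slots per TaskManager).

```shell
# Hypothetical helper: derive session totals from yarn-session.sh arguments.
# $1 = TaskManager count (-n), $2 = slots per TaskManager (-s)
session_totals() {
    taskmanagers=$1
    slots_per_tm=$2
    # One extra container hosts the ApplicationMaster and JobManager.
    echo "containers=$(( taskmanagers + 1 )) slots=$(( taskmanagers * slots_per_tm ))"
}

session_totals 2 1   # prints: containers=3 slots=2
```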
Submit for Execution
Method 1: Session Mode
Start a session to reserve resources first, then submit jobs to it:
./yarn-session.sh -n 2 -tm 800 -s 1 -d
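Once the session is running, the job itself is submitted with flink run from Flink's bin directory. The sketch below assumes the same WordCount.jar path used elsewhere in this article and that the client locates the running session on its own, which is the usual behavior when the session was started from the same machine and user. The wrapper submit_to_session is hypothetical and only prints the command unless RUN=1 is set.

```shell
# Hypothetical dry-run wrapper around `flink run`.
# Prints the submit command by default; set RUN=1 to actually execute it.
submit_to_session() {
    jar=$1
    cmd="./flink run $jar"
    if [ "${RUN:-0}" = "1" ]; then
        $cmd
    else
        echo "$cmd"
    fi
}

# Jar path taken from this article's direct-submission example.
submit_to_session /opt/wzk//WordCount.jar
```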
Method 2: Direct Submission
./flink run -m yarn-cluster -yn 2 -yjm 1024 -ytm 1024 /opt/wzk//WordCount.jar
Parameter description:
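-m: JobManager address; the value yarn-cluster tells Flink to start a dedicated cluster on YARN for this job
-yn: TaskManager count
-yjm: JobManager memory size, in MB
-ytm: Memory size of each TaskManager, in MB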
Stop yarn-cluster
yarn application -kill application_xxxxxxxxx
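The application ID passed to -kill can be looked up with yarn application -list. As a rough sketch, the hypothetical filter below reads that command's output on standard input and keeps only the ID column. It assumes the ID is the first whitespace-separated field on each application row.

```shell
# Hypothetical filter: extract application IDs from `yarn application -list` output.
# Reads the listing on stdin and prints one ID per line.
extract_app_ids() {
    awk '$1 ~ /^application_[0-9]+_[0-9]+$/ { print $1 }'
}

# Usage on a cluster node:
#   yarn application -list | extract_app_ids
```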
Configuration Points Summary
| Configuration | Description |
|---|---|
| HADOOP_CONF_DIR | Hadoop configuration directory |
| YARN_CONF_DIR | YARN configuration directory |
| HADOOP_CLASSPATH | Hadoop classpath |
| yarn.nodemanager.pmem-check-enabled | Disable physical memory check |
| yarn.nodemanager.vmem-check-enabled | Disable virtual memory check |
| yarn.resourcemanager.address | ResourceManager address |
The steps above cover the complete deployment of Flink on YARN: environment preparation, configuration changes, service start/stop, resource requests, and job submission.