Installation Modes

1. Local Mode

  • Definition: Single-machine mode, suited to development and debugging
  • Features:
    • Works out of the box with no additional configuration
    • Runs entirely within a single JVM process
    • Supports debugging from a local IDE (e.g., IntelliJ IDEA)
  • Use Cases: Local development and testing, small-scale data validation, learning Flink basics

2. Standalone Mode

  • Definition: Flink’s built-in cluster manager, running on machines you provision yourself
  • Features: The cluster is deployed and managed manually; it consists of a JobManager and one or more TaskManagers, and Flink allocates resources itself
  • Advantages: No dependency on an external resource manager; relatively simple deployment
  • Use Cases: Small to medium production environments, setups with little dependency on the Hadoop ecosystem, quick Flink cluster setup

3. YARN Mode

  • Definition: Flink runs on Hadoop YARN, which acts as the resource manager
  • Features: Compute resources are managed centrally by YARN; supports dynamic resource allocation; can coexist with other YARN applications
  • Running Modes: Session mode (one long-running cluster shared by multiple jobs), Per-Job mode (a dedicated cluster is allocated for each job)
  • Advantages: High resource utilization, seamless integration with the Hadoop ecosystem, suitable for large-scale production

Cluster Planning

  • h121: 2 CPU cores, 4 GB RAM
  • h122: 2 CPU cores, 4 GB RAM
  • h123: 2 CPU cores, 2 GB RAM

Download & Installation

Selected version: Flink 1.11.1

cd /opt/software/
wget https://archive.apache.org/dist/flink/flink-1.11.1/flink-1.11.1-bin-scala_2.12.tgz
tar -zxvf flink-1.11.1-bin-scala_2.12.tgz
mv flink-1.11.1 ../servers/

Standalone Mode Deployment

Edit conf/flink-conf.yaml under the Flink installation directory and set:

jobmanager.rpc.address: h121.wzk.icu
taskmanager.numberOfTaskSlots: 2
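
The two settings above can also be applied from a script rather than by hand. The following is a minimal sketch, assuming the configuration file lives at /opt/servers/flink-1.11.1/conf/flink-conf.yaml (derived from the install path used earlier) and that each key sits on its own top-level line. set_flink_conf is a hypothetical helper, not something shipped with Flink.

```shell
# set_flink_conf KEY VALUE FILE
# Hypothetical helper (not part of Flink): replaces any existing top-level
# "KEY: ..." line in FILE with "KEY: VALUE", or appends it if absent.
# The body runs in a subshell so its variables do not leak into the caller.
set_flink_conf() (
  key="$1"
  value="$2"
  file="$3"
  tmp="$(mktemp)"
  # Keep every line that does not start with "KEY:" (comments are preserved).
  awk -v k="$key" 'index($0, k ":") != 1' "$file" > "$tmp"
  # Append the new setting.
  printf '%s: %s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the original file permissions are kept.
  cat "$tmp" > "$file"
  rm -f "$tmp"
)

# Assumed location of the config file; override with FLINK_CONF if it differs.
CONF="${FLINK_CONF:-/opt/servers/flink-1.11.1/conf/flink-conf.yaml}"
if [ -f "$CONF" ]; then
  set_flink_conf jobmanager.rpc.address h121.wzk.icu "$CONF"
  set_flink_conf taskmanager.numberOfTaskSlots 2 "$CONF"
fi
```

Because the helper removes the old line before appending, running it twice leaves a single entry per key, which makes it safe to rerun.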

Workers

List the three nodes in conf/workers, one hostname per line:

h121.wzk.icu
h122.wzk.icu
h123.wzk.icu

Master

In conf/masters, set the JobManager host and web UI port:

h121.wzk.icu:8081
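
Every node needs the same Flink installation and configuration before the cluster is started. The sketch below writes the two host files and then prints (without running) one rsync command per remote node. It assumes the same install path on every node and working SSH access from h121; write_cluster_files and print_sync_commands are hypothetical helpers, not part of Flink.

```shell
# write_cluster_files CONF_DIR
# Hypothetical helper: writes the workers and masters files for this cluster.
write_cluster_files() (
  conf_dir="$1"
  printf '%s\n' h121.wzk.icu h122.wzk.icu h123.wzk.icu > "$conf_dir/workers"
  printf '%s\n' h121.wzk.icu:8081 > "$conf_dir/masters"
)

# print_sync_commands FLINK_HOME THIS_HOST
# Hypothetical helper: prints an rsync command for each host in conf/workers
# other than THIS_HOST. Nothing is copied; review the output, then run it.
print_sync_commands() (
  flink_home="$1"
  this_host="$2"
  while IFS= read -r host; do
    [ -n "$host" ] || continue                  # skip blank lines
    [ "$host" = "$this_host" ] && continue      # skip the local node
    printf 'rsync -az %s/ %s:%s/\n' "$flink_home" "$host" "$flink_home"
  done < "$flink_home/conf/workers"
)

# Assumed install path; override with FLINK_HOME if it differs.
FLINK_HOME="${FLINK_HOME:-/opt/servers/flink-1.11.1}"
if [ -d "$FLINK_HOME/conf" ]; then
  write_cluster_files "$FLINK_HOME/conf"
  print_sync_commands "$FLINK_HOME" h121.wzk.icu
fi
```

Printing the commands instead of executing them is a deliberate choice here: it lets you check the target hosts and paths before anything is overwritten on h122 and h123.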

Service Startup

cd /opt/servers/flink-1.11.1/bin/
./start-cluster.sh

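start-cluster.sh starts the JobManager and then connects over SSH to each host in the workers list to start a TaskManager, so passwordless SSH from h121 to all three nodes is required.

Once the cluster is up, open the web UI: http://h121.wzk.icu:8081/#/overview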
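
A quick sanity check is to compare the slot total shown in the web UI with what the configuration implies: number of worker hosts times slots per TaskManager. The sketch below computes that number from the config directory, assuming the files are named workers and flink-conf.yaml and falling back to Flink's default of 1 slot when the key is not set. expected_slots is a hypothetical helper, not part of Flink.

```shell
# expected_slots CONF_DIR
# Hypothetical helper: prints workers x taskmanager.numberOfTaskSlots.
expected_slots() (
  conf_dir="$1"
  # Count non-empty, non-comment lines in the workers file.
  workers="$(awk 'NF && $1 !~ /^#/ {n++} END {print n+0}' "$conf_dir/workers")"
  # Read the slot count; use 1 (Flink's default) if the key is missing.
  slots="$(awk -F': *' '$1 == "taskmanager.numberOfTaskSlots" {v = $2}
           END {print (v == "" ? 1 : v + 0)}' "$conf_dir/flink-conf.yaml")"
  echo $((workers * slots))
)

# Assumed config directory; override with FLINK_CONF_DIR if it differs.
CONF_DIR="${FLINK_CONF_DIR:-/opt/servers/flink-1.11.1/conf}"
if [ -f "$CONF_DIR/workers" ] && [ -f "$CONF_DIR/flink-conf.yaml" ]; then
  echo "Expected total slots: $(expected_slots "$CONF_DIR")"
fi
```

For the cluster described here (three workers, two slots each) the expected value is 6. If curl is available, running curl -s http://h121.wzk.icu:8081/overview should return JSON whose slots-total field matches this number once every TaskManager has registered.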

Test

cd /opt/servers/flink-1.11.1/bin
./flink run ../examples/streaming/WordCount.jar

Run without arguments, the example uses built-in sample text and prints the word counts to standard output. On a cluster, that output ends up in the TaskManager's .out file under the log directory and can also be viewed from the TaskManager page of the web UI.

Features, Advantages & Disadvantages

Advantages

  • Simple to use: No additional resource management system is needed, and configuration is relatively simple
  • Self-contained: Runs independently, without external systems such as YARN or Kubernetes
  • Low scheduling overhead: No external resource scheduler is involved, so job submission incurs relatively little delay

Disadvantages

  • Poor resource elasticity: Resources cannot be scaled up or down dynamically; TaskManagers must be added or removed by hand
  • Complex management: Manually managing multiple JobManagers and TaskManagers becomes complex in large clusters
  • Fewer advanced features: Compared with YARN or Kubernetes deployments, it lacks automated resource allocation, dynamic scaling, and similar capabilities

Use Cases

  • Development and testing
  • Small clusters (fewer than 10 nodes)
  • Edge computing (industrial IoT gateways, vehicle-mounted devices, etc.)

Scalability & Limitations

  • Limited scalability: The cluster can be expanded within a fixed pool of resources, but it struggles with large-scale or dynamically changing workloads
  • Adaptability: A poor fit for scenarios that require frequent resource adjustments, but stable and reliable when resources are fixed and job load is relatively steady