Installation Modes
1. Local Mode
- Definition: Single-machine running mode, suitable for development and debugging
- Features:
- No additional configuration needed; works out of the box
- Runs entirely inside a single JVM process
- Supports local IDE debugging (e.g., IntelliJ IDEA)
- Use Cases: Local development testing, small-scale data validation, learning Flink basics
2. Standalone Mode
- Definition: Flink’s built-in cluster mode, which needs no external resource manager
- Features: The cluster is deployed and managed by hand; it consists of a JobManager and one or more TaskManagers, and Flink itself allocates resources
- Advantages: No external resource manager dependency, relatively simple deployment
- Use Cases: Small to medium production environments, scenarios with little dependency on the Hadoop ecosystem, quick Flink cluster setup
3. YARN Mode
- Definition: Resource management mode based on Hadoop YARN
- Features: Computing resources are managed centrally by YARN, dynamic resource allocation is supported, and Flink can coexist with other YARN applications
- Running Modes: Session mode (long-running cluster), Per-Job mode (resource allocation per job)
- Advantages: High resource utilization, seamless integration with Hadoop ecosystem, suitable for large-scale production
Cluster Planning
- h121: 2 CPU cores, 4 GB RAM
- h122: 2 CPU cores, 4 GB RAM
- h123: 2 CPU cores, 2 GB RAM
Download & Installation
Selected version: Flink 1.11.1
cd /opt/software/
wget https://archive.apache.org/dist/flink/flink-1.11.1/flink-1.11.1-bin-scala_2.12.tgz
tar -zxvf flink-1.11.1-bin-scala_2.12.tgz
mv flink-1.11.1 ../servers/
Standalone Mode Deployment
flink-conf.yaml
Set the following keys (the file is in the conf/ directory of the Flink installation):
jobmanager.rpc.address: h121.wzk.icu
taskmanager.numberOfTaskSlots: 2
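The two settings above can also be applied from a script. This is a minimal sketch, assuming the install path used in this guide and that each key appears at most once, uncommented, at the start of a line. `set_flink_conf` is a hypothetical helper, not something shipped with Flink.

```shell
# set_flink_conf KEY VALUE FILE
# Hypothetical helper: replace an existing "KEY: ..." line, or append one.
set_flink_conf() {
  local key="$1"
  local value="$2"
  local file="$3"
  if grep -q "^${key}:" "$file"; then
    # Rewrite the existing line via a temp file (portable across sed variants).
    sed "s|^${key}:.*|${key}: ${value}|" "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
  else
    # Key not present yet: append it.
    printf '%s: %s\n' "$key" "$value" >> "$file"
  fi
}

# Assumed location, derived from the install steps in this guide.
FLINK_CONF="${FLINK_CONF:-/opt/servers/flink-1.11.1/conf/flink-conf.yaml}"
if [ -f "$FLINK_CONF" ]; then
  set_flink_conf jobmanager.rpc.address h121.wzk.icu "$FLINK_CONF"
  set_flink_conf taskmanager.numberOfTaskSlots 2 "$FLINK_CONF"
fi
```

The script does nothing if the file is missing, so it is safe to run before the archive is unpacked. Point `FLINK_CONF` at a different file to try it on a copy first.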
Workers
List the three nodes in conf/workers, one hostname per line:
h121.wzk.icu
h122.wzk.icu
h123.wzk.icu
Master
Set conf/masters to the JobManager host and web UI port:
h121.wzk.icu:8081
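start-cluster.sh launches the TaskManagers on the hosts listed in conf/workers over SSH, so every node needs the same Flink directory and the same configuration before startup. The sketch below prints one rsync command per remote worker and only executes them when `RUN=1` is set. It assumes passwordless SSH from h121, rsync on all nodes, and an existing /opt/servers directory on each target. `sync_flink` is a hypothetical helper name.

```shell
# sync_flink FLINK_HOME WORKERS_FILE LOCAL_HOST
# Hypothetical helper: copy the Flink directory to every worker except LOCAL_HOST.
# Dry run by default; set RUN=1 to actually execute rsync.
sync_flink() {
  local flink_home="$1"
  local workers_file="$2"
  local local_host="$3"
  local host
  while IFS= read -r host || [ -n "$host" ]; do
    # Skip blank lines, comment lines, and the node we are copying from.
    case "$host" in
      ''|'#'*|"$local_host") continue ;;
    esac
    if [ "${RUN:-0}" = "1" ]; then
      # </dev/null keeps rsync/ssh from consuming the loop's input.
      rsync -az "$flink_home/" "$host:$flink_home/" </dev/null
    else
      echo "rsync -az $flink_home/ $host:$flink_home/"
    fi
  done < "$workers_file"
}
```

For the cluster in this guide, `sync_flink /opt/servers/flink-1.11.1 /opt/servers/flink-1.11.1/conf/workers h121.wzk.icu` prints the commands for h122 and h123. Rerun it with `RUN=1` once the output looks right.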
Service Startup
cd /opt/servers/flink-1.11.1/bin/
./start-cluster.sh
Once the cluster is up, open the web UI: http://h121.wzk.icu:8081/#/overview
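A quick sanity check after startup is to compare the slot count the cluster reports with the count the configuration implies: three workers with 2 slots each should give 6. The sketch below computes the expected number from conf/workers and, separately, fetches the cluster summary from the REST endpoint `/overview`. As far as I know the response includes a `slots-total` field, but treat that as an assumption to verify. Both function names are hypothetical.

```shell
# expected_slots WORKERS_FILE SLOTS_PER_TM
# Hypothetical helper: count hosts (ignoring blank and comment lines)
# and multiply by taskmanager.numberOfTaskSlots.
expected_slots() {
  local workers_file="$1"
  local slots_per_tm="$2"
  local tm_count
  tm_count=$(grep -cvE '^[[:space:]]*(#|$)' "$workers_file" || true)
  echo $((tm_count * slots_per_tm))
}

# cluster_overview HOST:PORT
# Hypothetical helper: print the JSON summary served by the JobManager.
# Not called automatically; it needs a running cluster and curl.
cluster_overview() {
  curl -s --max-time 5 "http://$1/overview"
}
```

Usage on h121: `expected_slots /opt/servers/flink-1.11.1/conf/workers 2` gives the configured total, and `cluster_overview h121.wzk.icu:8081` shows what the cluster reports. A lower reported total usually means a TaskManager did not start on one of the nodes.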
Test
cd /opt/servers/flink-1.11.1/bin
./flink run ../examples/streaming/WordCount.jar
Features, Advantages & Disadvantages
Advantages
- Simple and easy to use: No additional resource management system needed, relatively simple configuration
- Strong independence: Does not depend on external systems, can run independently without YARN, Kubernetes, etc.
- Low scheduling overhead: No external resource scheduler sits between job submission and execution, so job startup latency is relatively low
Disadvantages
- Poor resource elasticity: Resources do not grow or shrink automatically; TaskManagers must be added or removed by hand
- Complex management: Manually managing many JobManagers and TaskManagers becomes complex in large clusters
- Lacks advanced features: Compared to YARN or Kubernetes deployments, it lacks automated resource allocation, dynamic scaling, and similar features
Use Cases
- Development and testing
- Small clusters (fewer than 10 nodes)
- Edge computing (industrial IoT gateways, vehicle-mounted devices, etc.)
Scalability & Limitations
- Limited scalability: The cluster can be expanded manually within a fixed pool of machines, but it handles large-scale or rapidly changing workloads poorly
- Adaptability: Not well suited to scenarios that require frequent resource adjustments, but provides stable, reliable service when resources are fixed and the job load is relatively steady
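To make the fixed-resource limit concrete: with default slot sharing, a job needs as many free slots as its highest operator parallelism. The cluster in this guide offers 3 TaskManagers × 2 slots = 6 slots, so a job with parallelism 8 cannot be fully scheduled until another TaskManager is started by hand. A small sketch of that arithmetic follows; `taskmanagers_needed` is a hypothetical helper.

```shell
# taskmanagers_needed PARALLELISM SLOTS_PER_TM
# Hypothetical helper: ceiling division, i.e. the smallest number of
# TaskManagers whose combined slots cover the requested parallelism.
taskmanagers_needed() {
  local parallelism="$1"
  local slots_per_tm="$2"
  echo $(( (parallelism + slots_per_tm - 1) / slots_per_tm ))
}

taskmanagers_needed 6 2   # fits the 3 existing TaskManagers
taskmanagers_needed 8 2   # needs a 4th TaskManager
```

In Standalone mode the extra TaskManager is not provisioned automatically. It has to be started manually on a node that has the same installation, for example with `./taskmanager.sh start` from the bin directory.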