Flink adopts the classic Master/Slave architecture, which is common in big data processing systems. Below is a detailed introduction to Flink's core roles and their functions:

1. JobManager (Master Node)

As the cluster manager, the JobManager has these key responsibilities:

1.1 Task Coordination and Management
  • Task Scheduling: Decomposes user-submitted jobs into specific Tasks and assigns them to TaskManagers for execution. For example, when a user submits a stream processing job, the JobManager decomposes it into Source, Transformation, and Sink operations and maps those to specific Tasks
  • Checkpoint Coordination: Periodically triggers the Checkpoint mechanism to ensure stream processing consistency. For example, under Exactly-Once semantics, the JobManager coordinates all TaskManagers to take consistent state snapshots
  • Fault Recovery: When a TaskManager fails, the JobManager recovers task state from the latest checkpoint and reschedules the affected tasks
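
The checkpoint/recovery interplay described above can be sketched in a few lines. This is a conceptual illustration only, not Flink's actual implementation (which uses barrier alignment and pluggable state backends); all names here are made up for the example:

```python
# Conceptual sketch: a coordinator periodically snapshots worker state and
# restores from the latest completed checkpoint after a failure.

class CheckpointCoordinator:
    def __init__(self, workers):
        self.workers = workers          # worker_id -> current state (a dict)
        self.checkpoints = []           # list of (checkpoint_id, snapshot)

    def trigger_checkpoint(self):
        """Record a consistent snapshot of every worker's state."""
        snapshot = {wid: dict(state) for wid, state in self.workers.items()}
        self.checkpoints.append((len(self.checkpoints) + 1, snapshot))

    def recover(self):
        """Restore all workers from the latest completed checkpoint."""
        if not self.checkpoints:
            return
        _, snapshot = self.checkpoints[-1]
        for wid, state in snapshot.items():
            self.workers[wid] = dict(state)

coord = CheckpointCoordinator({"tm-1": {"count": 0}})
coord.workers["tm-1"]["count"] = 5
coord.trigger_checkpoint()              # snapshot taken at count=5
coord.workers["tm-1"]["count"] = 9      # progress after the checkpoint...
coord.recover()                         # ...is lost on failure: back to 5
print(coord.workers["tm-1"]["count"])   # 5
```

The key point the sketch shows: any progress made after the last completed checkpoint is replayed/redone after recovery, which is why checkpoint frequency is a trade-off between overhead and recovery time.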
1.2 High Availability Design
  • Leader-Follower Mode: In production environments, multiple JobManager instances are usually configured:
    • One acts as Leader: handles all client requests and task management
    • The others act as Standbys: stay synchronized with the Leader and take over through an election when the Leader fails
  • Election Mechanism: Leader election is usually based on ZooKeeper, ensuring fast recovery from a single point of failure
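
The failover behavior can be illustrated with a tiny model of ZooKeeper-style sequential election (lowest live sequence number wins). This is only a sketch of the idea; real ZooKeeper uses ephemeral sequential znodes and watches, and the names below are invented:

```python
# Illustrative leader election: each JobManager registers with an increasing
# sequence number; the lowest live number is the Leader. When the Leader
# dies, the next-lowest Standby takes over automatically.

def elect_leader(candidates):
    """candidates: name -> registration sequence number; returns leader name."""
    return min(candidates, key=candidates.get)

jms = {"jm-a": 1, "jm-b": 2, "jm-c": 3}
leader = elect_leader(jms)       # jm-a holds the lowest sequence number
del jms[leader]                  # simulate Leader failure
new_leader = elect_leader(jms)   # a Standby takes over
print(leader, new_leader)        # jm-a jm-b
```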
1.3 Application Reception

The JobManager supports two main application submission methods:

  • Jar Package: The user packages the complete Flink application into a Jar file and submits it
  • JobGraph: The user directly submits a pre-built task topology (JobGraph)

For example, submitting via the command line:

./bin/flink run -c com.example.MyJob /path/to/myjob.jar

2. TaskManager (Slave Node)

The TaskManager, also called the Worker, is the node that actually executes tasks:

  • Each TaskManager is a JVM process containing a certain number of Task Slots; tasks execute in one or more threads within it
  • It receives tasks from the JobManager, deploys and starts them, and processes the data received from upstream
  • It handles data buffering, network transmission, and other low-level operations
  • On startup, the TaskManager registers its resource information (Slot count) with the ResourceManager
  • It regularly sends heartbeats to the JobManager, reporting its status and resource usage
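
The registration step can be sketched as a registry of slot capacity that the scheduler later consults. This is a conceptual illustration, not Flink's actual registration protocol; all identifiers are invented:

```python
# Illustrative TaskManager registration: on startup each TaskManager reports
# its slot count, and the cluster's free capacity is the sum of free slots.

registry = {}   # task_manager_id -> {"slots": total, "free": available}

def register(tm_id, slots):
    """Record a newly started TaskManager and its Slot capacity."""
    registry[tm_id] = {"slots": slots, "free": slots}

register("tm-1", 4)
register("tm-2", 2)
total_free = sum(tm["free"] for tm in registry.values())
print(total_free)   # 6
```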

Roles and Functions

  • Task Execution: The TaskManager is the core component that executes distributed data processing tasks in a Flink cluster. It receives tasks distributed by the JobManager, executes the specific computations, and passes results back to the JobManager or on to the next processing node
  • Resource Management: The TaskManager manages the computing resources allocated to it (CPU, memory). In Flink, each TaskManager has one or more Slots, and each Slot can execute one parallel subtask (SubTask). The Slot is the basic unit of resource scheduling for Flink tasks
  • Data Exchange and Caching: The TaskManager handles data exchange between tasks, such as Shuffle operations, and caches data to improve computing efficiency

Startup and Running

  • Register with the JobManager: When a TaskManager starts, it registers with the JobManager and reports its available resources (available memory and Slot count). The JobManager uses this information for task scheduling and resource allocation
  • Execute Tasks: When the JobManager assigns tasks to a TaskManager, the TaskManager starts the corresponding Tasks and continuously monitors their execution status. After a task completes, the TaskManager reports the result to the JobManager
  • Fault Handling: The TaskManager has some fault recovery capability. If a fault occurs during task execution, the TaskManager reports it to the JobManager, which reassigns tasks as needed

Communication Mechanism

  • Network Communication: TaskManagers communicate with each other and with the JobManager over the network to exchange intermediate result data. Flink provides an efficient network stack to support low-latency, high-throughput distributed data stream processing
  • RPC and Heartbeat Mechanism: The TaskManager and JobManager interact via RPC (Remote Procedure Call), and the heartbeat mechanism tracks each TaskManager's health. If the JobManager does not receive a TaskManager's heartbeat for a period of time, it may consider that TaskManager unavailable and trigger the fault recovery process
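
The timeout side of the heartbeat mechanism boils down to comparing "time since last heartbeat" against a threshold. The following is a minimal sketch of that idea (the timeout value and all names are illustrative, not Flink's configuration):

```python
# Conceptual heartbeat-timeout check: the JobManager records the last
# heartbeat timestamp per TaskManager and marks any TaskManager that has
# been silent for longer than the timeout as unavailable.

TIMEOUT = 10.0  # seconds; illustrative value

def find_dead(last_heartbeat, now, timeout=TIMEOUT):
    """Return TaskManagers whose last heartbeat is older than the timeout."""
    return sorted(tm for tm, t in last_heartbeat.items() if now - t > timeout)

heartbeats = {"tm-1": 100.0, "tm-2": 88.0}   # last heartbeat timestamps
dead = find_dead(heartbeats, now=101.0)      # tm-2 has been silent for 13s
print(dead)   # ['tm-2']
```

In practice the timeout must be chosen larger than normal network jitter and GC pauses, or healthy TaskManagers will be declared dead and their tasks needlessly restarted.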

Monitoring and Logging

  • Monitoring: Flink provides multiple ways to monitor a TaskManager's running status, such as the Web UI, log files, and the Metrics system. Administrators can use these tools to monitor each TaskManager's resource usage, task execution progress, and performance bottlenecks
  • Logging: The TaskManager writes detailed log files describing task execution status and errors. These logs are very important for troubleshooting and tuning the system

3. ResourceManager

For different environments and resource providers (YARN, Kubernetes, standalone deployment), Flink provides different ResourceManager implementations. Its role is to manage Flink's units of processing resources (Slots)

Roles and Functions

  • Resource Management: The ResourceManager manages the entire cluster's computing resources, including CPU, memory, and network resources. It receives resource requests from the JobManager, then schedules and allocates these resources, starting the necessary number of TaskManager instances
  • Resource Request and Allocation: When a Flink application starts, the JobManager requests the required resources from the ResourceManager. The ResourceManager allocates existing TaskManager instances or starts new ones, based on the cluster's resource status, to meet these needs
  • Resource Reclamation: After a task completes, the ResourceManager reclaims and releases its resources so they can be reused by other tasks
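
The "allocate existing slots first, start new TaskManagers only when capacity runs out" logic can be sketched as follows. This is an illustrative model, not the real ResourceManager; the slots-per-TaskManager constant and all names are assumptions for the example:

```python
# Illustrative slot allocation: fill free Slots on registered TaskManagers
# first; start new TaskManagers only for the shortfall.

SLOTS_PER_NEW_TM = 2   # illustrative default capacity of a new TaskManager

def allocate(free_slots, requested):
    """free_slots: tm_id -> free slot count. Returns (plan, new_tms_needed)."""
    plan, remaining = {}, requested
    for tm, free in free_slots.items():
        if remaining == 0:
            break
        take = min(free, remaining)
        plan[tm] = take
        remaining -= take
    # start new TaskManagers for whatever could not be placed
    new_tms = -(-remaining // SLOTS_PER_NEW_TM)  # ceiling division
    return plan, new_tms

plan, new_tms = allocate({"tm-1": 2, "tm-2": 1}, requested=6)
print(plan, new_tms)   # {'tm-1': 2, 'tm-2': 1} 2
```

Here 6 slots are requested, 3 are available on existing TaskManagers, and the remaining 3 require ceil(3/2) = 2 new TaskManager instances.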

Collaboration with JobManager

  • Resource Scheduling: The JobManager generates a task plan based on the job's parallelism and resource requirements and sends these requirements to the ResourceManager, which decides how to allocate resources based on the cluster's resource situation
  • Starting TaskManagers: If the available TaskManagers in the cluster cannot meet the JobManager's needs, the ResourceManager starts new TaskManager instances to process the tasks. This is usually done through the integrated resource management platform (YARN, Kubernetes, or Mesos)

Resource Monitoring

  • Resource Usage Monitoring: The ResourceManager monitors the entire cluster's resource usage, including CPU, memory, and network bandwidth utilization. This monitoring data helps administrators optimize resource allocation and scheduling strategies
  • Logs and Metrics: The ResourceManager writes detailed log files recording resource request, allocation, and reclamation operations. In addition, Flink provides multiple monitoring tools to view the ResourceManager's running status and resource usage in real time

4. Dispatcher

Its role is to provide a REST interface for submitting applications for execution. Once an application is submitted, the Dispatcher starts a JobManager and hands the application over to it. The Dispatcher also starts a Web UI to provide information about job execution. Note: some application submission methods do not use the Dispatcher

Roles and Functions

  • Job Submission and Scheduling: The Dispatcher receives job submission requests from clients. Each time a job is submitted, the Dispatcher starts a new JobManager instance to manage that job's execution. This design ensures isolation between jobs, preventing one job's failure from affecting others
  • Multi-Job Management: The Dispatcher can manage multiple jobs simultaneously, each with its own independent JobManager instance. The Dispatcher monitors these jobs' status and reclaims resources after a job completes or fails
  • REST Interface: The Dispatcher provides a RESTful interface, allowing users to submit, query, and manage jobs through HTTP requests. This makes Flink easier to integrate with other systems and simplifies automated job scheduling

Relationship with JobManager

  • Independent JobManagers: In architectures where the Dispatcher is used, each submitted job starts an independent JobManager instance. The benefit is that each job runs in isolation, improving cluster stability and robustness
  • Task Scheduling: After receiving a job submission request, the Dispatcher first decides how to allocate resources and starts the corresponding JobManager; that JobManager then manages and schedules the execution of specific tasks on TaskManagers

Architecture and Component Interaction

  • Resource Management Interaction: The Dispatcher does not manage cluster resources directly; it relies on the ResourceManager to provide and schedule the required resources. When a job is submitted, the Dispatcher asks the ResourceManager to start JobManager and TaskManager instances
  • Client Interaction: The Dispatcher is the entry point for clients to submit jobs. Clients communicate with the Dispatcher via the REST API to submit jobs, cancel jobs, or query job status. The Dispatcher forwards these requests to the corresponding JobManagers
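
The per-job isolation property can be modeled in a few lines. This is a conceptual sketch only (the real Dispatcher speaks REST and manages actual JobManager processes); all names and statuses below are invented for the example:

```python
# Illustrative Dispatcher model: each submitted job gets its own JobManager
# entry, so one job's failure never affects another job's manager.

import itertools

class Dispatcher:
    def __init__(self):
        self._ids = itertools.count(1)
        self.job_managers = {}   # job_id -> {"name": ..., "status": ...}

    def submit(self, job_name):
        """Start a dedicated JobManager for this job; return the job id."""
        job_id = next(self._ids)
        self.job_managers[job_id] = {"name": job_name, "status": "RUNNING"}
        return job_id

    def status(self, job_id):
        return self.job_managers[job_id]["status"]

d = Dispatcher()
a = d.submit("wordcount")
b = d.submit("fraud-detection")
d.job_managers[a]["status"] = "FAILED"   # job A fails in isolation...
print(d.status(a), d.status(b))          # FAILED RUNNING
```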

Relationship Between Components

Flink Runtime Architecture

The basic building blocks of a Flink program are streams and transformations (note: the DataSet used in Flink's DataSet API is also internally a stream). Conceptually, a stream is a (possibly endless) flow of data records, and a transformation takes one or more streams as input and produces one or more output streams.

A Flink application is structured around three important components: Source, Transformation, and Sink.

Source

The data source: it defines where Flink loads data from. Flink has roughly four types of Sources for stream and batch processing:

  • Source based on local collections
  • Source based on files
  • Source based on network sockets
  • Custom sources (Apache Kafka, RabbitMQ, etc.)

Transformation

The various data transformation operations, also called operators, include Map, FlatMap, Filter, KeyBy, Reduce, Window, etc. They transform the data into the desired form.

Sink

The receiver: it defines where Flink sends the transformed data, i.e., the output destination of the result data. Flink has several common Sinks:

  • Write to files
  • Print out
  • Write to sockets
  • Custom Sinks (Apache Kafka, RabbitMQ, MySQL, Elasticsearch, HDFS, etc.)
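
The Source -> Transformation -> Sink structure can be sketched with plain Python generators. This illustrates the dataflow model only, not the Flink API: the source yields records, transformations are lazy, and the sink drives the pipeline by consuming it.

```python
# Conceptual dataflow pipeline: Source -> FlatMap -> Filter -> Sink.

def source():                      # Source: a collection-based source
    yield from ["a b", "b c", "c"]

def flat_map(stream, fn):          # Transformation: one record -> many
    for record in stream:
        yield from fn(record)

def filter_(stream, pred):         # Transformation: keep matching records
    for record in stream:
        if pred(record):
            yield record

def sink(stream):                  # Sink: collect results (stand-in for
    return list(stream)            # writing to a file, socket, Kafka, ...)

words = flat_map(source(), str.split)
result = sink(filter_(words, lambda w: w != "b"))
print(result)   # ['a', 'c', 'c']
```

Note that nothing is computed until the sink consumes the stream, which mirrors how a Flink program only runs when execution is triggered.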

Task and SubTask

  • A Task is a collection of multiple functions in the same stage, similar to a TaskSet in Spark
  • A SubTask is the smallest execution unit in Flink. It is an instance of a Java class, with properties and methods, that carries out specific computation logic. For example, when a map operation executes in a distributed setting, multiple threads run it simultaneously, and each executing thread is one SubTask
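
The "one logical Task, several parallel SubTask threads" idea can be modeled directly. This is an illustrative sketch, not Flink's scheduler; the round-robin partitioning and all names are assumptions for the example:

```python
# Illustrative SubTasks: one map Task run as `parallelism` threads, each
# thread (a SubTask) processing its own partition of the input.

import threading

def run_map_task(data, parallelism, fn):
    """Split `data` round-robin into `parallelism` partitions; one thread each."""
    partitions = [data[i::parallelism] for i in range(parallelism)]
    results = [None] * parallelism

    def subtask(index):   # each running thread is one SubTask
        results[index] = [fn(x) for x in partitions[index]]

    threads = [threading.Thread(target=subtask, args=(i,))
               for i in range(parallelism)]
    for t in threads: t.start()
    for t in threads: t.join()
    return results

out = run_map_task([1, 2, 3, 4], parallelism=2, fn=lambda x: x * 10)
print(out)   # [[10, 30], [20, 40]]
```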

OperatorChain

All operations in Flink are called Operators. When the client submits a job, it optimizes the Operators: Operators that can be merged are fused into a single Operator. The merged Operator becomes an OperatorChain, which is effectively an execution chain; each execution chain runs in an independent thread in a TaskManager.

During execution, the tasks in an application continuously exchange data. To use network resources effectively and improve throughput, Flink buffers data during task-to-task transmission.
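
The fusion step of operator chaining amounts to composing adjacent operators into one function, so a record flows through the whole chain in a single thread with no intermediate buffering. A minimal sketch of that idea (Flink's actual chaining rules also consider parallelism and data exchange patterns, which this ignores):

```python
# Conceptual operator chaining: fuse consecutive operators into one
# function; each record passes through all of them in one call.

from functools import reduce

def chain(*operators):
    """Return a single fused operator equivalent to running each in order."""
    return lambda record: reduce(lambda r, op: op(r), operators, record)

# map -> map -> map, merged into a single OperatorChain
fused = chain(lambda x: x + 1, lambda x: x * 2, str)
print(fused(3))   # 8
```

Chaining removes per-record thread hand-offs and serialization between the fused operators, which is why it is one of Flink's most effective built-in optimizations.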

Task Slot and Slot Sharing

Task Slot is also written as Task-Slot, and Slot Sharing as Slot-Sharing

Each TaskManager is a JVM process that can execute one or more subtasks in different threads. To control how many Tasks a Worker can accept, the Worker uses Task Slots (a Worker has at least one Task Slot)

Task Slot

Each Task Slot represents a fixed-size subset of the resources owned by the TaskManager. Generally, the number of slots allocated equals the number of CPU cores; e.g., with 6 cores, allocate 6 slots. Flink divides the process memory among the Slots. Assuming a TaskManager has 3 Slots, each Slot occupies 1/3 of the memory (evenly distributed).

Benefits of dividing memory into separate Slots:

  • The maximum number of tasks the TaskManager can execute concurrently is bounded (3 in this example), because it cannot exceed the Slot count
  • Each Slot has its own exclusive memory space, so multiple different jobs can run in one TaskManager without affecting each other
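
The even division is simple arithmetic; the following sketch just makes the numbers concrete (the 3072 MB figure is an invented example, not a Flink default):

```python
# Illustrative even memory division: a TaskManager's memory is split
# equally across its Slots.

def memory_per_slot(total_memory_mb, num_slots):
    """Memory (MB) each Slot gets when memory is divided evenly."""
    return total_memory_mb / num_slots

print(memory_per_slot(3072, 3))   # 1024.0  -> each of 3 Slots gets 1/3
```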

Slot Sharing

By default, Flink allows subtasks (e.g., map[1], map[2], keyby[1], keyby[2]) to share Slots, even if they are subtasks of different tasks, as long as they come from the same job. As a result, one Slot may hold an entire pipeline of the job.
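
A consequence of slot sharing is that the number of slots a job needs drops from the sum of all operator parallelisms to the job's highest operator parallelism, since each slot can hold one subtask of every task in the pipeline. A small sketch of that accounting (illustrative parallelism numbers):

```python
# Slots needed with vs. without slot sharing: with sharing, one slot holds
# one subtask per task, so the job needs max(parallelism); without sharing,
# every subtask needs its own slot, so it needs sum(parallelism).

def slots_needed(parallelism_per_task, sharing=True):
    """parallelism_per_task: task name -> parallelism, e.g. {'map': 2}."""
    if sharing:
        return max(parallelism_per_task.values())
    return sum(parallelism_per_task.values())

job = {"source": 1, "map": 2, "keyby": 2, "sink": 1}
print(slots_needed(job), slots_needed(job, sharing=False))   # 2 6
```

This is why, with slot sharing enabled, a cluster only needs as many slots as the highest parallelism used in the job.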