Flink adopts the classic Master/Slave architecture, which is common in big data processing systems. Below is a detailed introduction to Flink's core roles and their functions:

1. JobManager (Master Node)

As the cluster manager, the JobManager has these key responsibilities:

1.1 Task Coordination and Management
  • Task Scheduling: Decomposes user-submitted jobs into specific Tasks and assigns them to TaskManagers for execution. For example, when a user submits a stream processing job, the JobManager decomposes it into Source, Transformation, and Sink operations and maps those to specific Tasks
  • Checkpoint Coordination: Periodically triggers the Checkpoint mechanism to ensure stream processing consistency. For example, under Exactly-Once semantics, the JobManager coordinates all TaskManagers to take consistent state snapshots
  • Fault Recovery: When a TaskManager fails, the JobManager recovers task state from the latest checkpoint and reschedules the affected tasks
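
The checkpoint/recovery interplay described above can be sketched in a few lines. This is a conceptual illustration only, not Flink's actual implementation (which uses barrier alignment and pluggable state backends); all names here are made up for the example:

```python
# Conceptual sketch: a coordinator periodically snapshots worker state and
# restores from the latest completed checkpoint after a failure.

class CheckpointCoordinator:
    def __init__(self, workers):
        self.workers = workers          # worker_id -> current state (a dict)
        self.checkpoints = []           # list of (checkpoint_id, snapshot)

    def trigger_checkpoint(self):
        """Record a consistent snapshot of every worker's state."""
        snapshot = {wid: dict(state) for wid, state in self.workers.items()}
        self.checkpoints.append((len(self.checkpoints) + 1, snapshot))

    def recover(self):
        """Restore all workers from the latest completed checkpoint."""
        if not self.checkpoints:
            return
        _, snapshot = self.checkpoints[-1]
        for wid, state in snapshot.items():
            self.workers[wid] = dict(state)

coord = CheckpointCoordinator({"tm-1": {"count": 0}})
coord.workers["tm-1"]["count"] = 5
coord.trigger_checkpoint()              # snapshot taken at count=5
coord.workers["tm-1"]["count"] = 9      # progress after the checkpoint...
coord.recover()                         # ...is lost on failure: back to 5
print(coord.workers["tm-1"]["count"])   # 5
```

The key point the sketch shows: any progress made after the last completed checkpoint is replayed/redone after recovery, which is why checkpoint frequency is a trade-off between overhead and recovery time.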
1.2 High Availability Design
  • Leader-Follower Mode: In production environments, multiple JobManager instances are usually configured:
    • One acts as Leader: handles all client requests and task management
    • The others act as Standbys: stay synchronized with the Leader and take over through an election when the Leader fails
  • Election Mechanism: Leader election is usually based on ZooKeeper, ensuring fast recovery from a single point of failure
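
The failover behavior can be illustrated with a tiny model of ZooKeeper-style sequential election (lowest live sequence number wins). This is only a sketch of the idea; real ZooKeeper uses ephemeral sequential znodes and watches, and the names below are invented:

```python
# Illustrative leader election: each JobManager registers with an increasing
# sequence number; the lowest live number is the Leader. When the Leader
# dies, the next-lowest Standby takes over automatically.

def elect_leader(candidates):
    """candidates: name -> registration sequence number; returns leader name."""
    return min(candidates, key=candidates.get)

jms = {"jm-a": 1, "jm-b": 2, "jm-c": 3}
leader = elect_leader(jms)       # jm-a holds the lowest sequence number
del jms[leader]                  # simulate Leader failure
new_leader = elect_leader(jms)   # a Standby takes over
print(leader, new_leader)        # jm-a jm-b
```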
1.3 Application Reception

The JobManager supports two main application submission methods:

  • Jar Package: The user packages the complete Flink application into a Jar file and submits it
  • JobGraph: The user directly submits a pre-built task topology (JobGraph)

For example, submitting via the command line:

./bin/flink run -c com.example.MyJob /path/to/myjob.jar

2. TaskManager (Slave Node)

The TaskManager, also called the Worker, is the node that actually executes tasks:

  • Each TaskManager is a JVM process containing a certain number of Task Slots; tasks execute in one or more threads within it
  • It receives tasks from the JobManager, deploys and starts them, and processes the data received from upstream
  • It handles data buffering, network transmission, and other low-level operations
  • On startup, the TaskManager registers its resource information (Slot count) with the ResourceManager
  • It regularly sends heartbeats to the JobManager, reporting its status and resource usage
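
The registration step can be sketched as a registry of slot capacity that the scheduler later consults. This is a conceptual illustration, not Flink's actual registration protocol; all identifiers are invented:

```python
# Illustrative TaskManager registration: on startup each TaskManager reports
# its slot count, and the cluster's free capacity is the sum of free slots.

registry = {}   # task_manager_id -> {"slots": total, "free": available}

def register(tm_id, slots):
    """Record a newly started TaskManager and its Slot capacity."""
    registry[tm_id] = {"slots": slots, "free": slots}

register("tm-1", 4)
register("tm-2", 2)
total_free = sum(tm["free"] for tm in registry.values())
print(total_free)   # 6
```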

Roles and Functions

  • Task Execution: The TaskManager is the core component that executes distributed data processing tasks in a Flink cluster. It receives tasks distributed by the JobManager, executes the specific computations, and passes results back to the JobManager or on to the next processing node
  • Resource Management: The TaskManager manages the computing resources allocated to it (CPU, memory). In Flink, each TaskManager has one or more Slots, and each Slot can execute one parallel subtask (SubTask). The Slot is the basic unit of resource scheduling for Flink tasks
  • Data Exchange and Caching: The TaskManager handles data exchange between tasks, such as Shuffle operations, and caches data to improve computing efficiency

Startup and Running

  • Register with the JobManager: When a TaskManager starts, it registers with the JobManager and reports its available resources (available memory and Slot count). The JobManager uses this information for task scheduling and resource allocation
  • Execute Tasks: When the JobManager assigns tasks to a TaskManager, the TaskManager starts the corresponding Tasks and continuously monitors their execution status. After a task completes, the TaskManager reports the result to the JobManager
  • Fault Handling: The TaskManager has some fault recovery capability. If a fault occurs during task execution, the TaskManager reports it to the JobManager, which reassigns tasks as needed

Communication Mechanism

  • Network Communication: TaskManagers communicate with each other and with the JobManager over the network to exchange intermediate result data. Flink provides an efficient network stack to support low-latency, high-throughput distributed data stream processing
  • RPC and Heartbeat Mechanism: The TaskManager and JobManager interact via RPC (Remote Procedure Call), and the heartbeat mechanism tracks each TaskManager's health. If the JobManager does not receive a TaskManager's heartbeat for a period of time, it may consider that TaskManager unavailable and trigger the fault recovery process
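
The timeout side of the heartbeat mechanism boils down to comparing "time since last heartbeat" against a threshold. The following is a minimal sketch of that idea (the timeout value and all names are illustrative, not Flink's configuration):

```python
# Conceptual heartbeat-timeout check: the JobManager records the last
# heartbeat timestamp per TaskManager and marks any TaskManager that has
# been silent for longer than the timeout as unavailable.

TIMEOUT = 10.0  # seconds; illustrative value

def find_dead(last_heartbeat, now, timeout=TIMEOUT):
    """Return TaskManagers whose last heartbeat is older than the timeout."""
    return sorted(tm for tm, t in last_heartbeat.items() if now - t > timeout)

heartbeats = {"tm-1": 100.0, "tm-2": 88.0}   # last heartbeat timestamps
dead = find_dead(heartbeats, now=101.0)      # tm-2 has been silent for 13s
print(dead)   # ['tm-2']
```

In practice the timeout must be chosen larger than normal network jitter and GC pauses, or healthy TaskManagers will be declared dead and their tasks needlessly restarted.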

Monitoring and Logging

  • Monitoring: Flink provides multiple ways to monitor a TaskManager's running status, such as the Web UI, log files, and the Metrics system. Administrators can use these tools to monitor each TaskManager's resource usage, task execution progress, and performance bottlenecks
  • Logging: The TaskManager writes detailed log files describing task execution status and errors. These logs are very important for troubleshooting and tuning the system

3. ResourceManager

For different environments and resource providers (YARN, Kubernetes, standalone deployment), Flink provides different ResourceManager implementations. Its role is to manage Flink's units of processing resources (Slots)

Roles and Functions

  • Resource Management: The ResourceManager manages the entire cluster's computing resources, including CPU, memory, and network resources. It receives resource requests from the JobManager, then schedules and allocates these resources, starting the necessary number of TaskManager instances
  • Resource Request and Allocation: When a Flink application starts, the JobManager requests the required resources from the ResourceManager. The ResourceManager allocates existing TaskManager instances or starts new ones, based on the cluster's resource status, to meet these needs
  • Resource Reclamation: After a task completes, the ResourceManager reclaims and releases its resources so they can be reused by other tasks
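
The "allocate existing slots first, start new TaskManagers only when capacity runs out" logic can be sketched as follows. This is an illustrative model, not the real ResourceManager; the slots-per-TaskManager constant and all names are assumptions for the example:

```python
# Illustrative slot allocation: fill free Slots on registered TaskManagers
# first; start new TaskManagers only for the shortfall.

SLOTS_PER_NEW_TM = 2   # illustrative default capacity of a new TaskManager

def allocate(free_slots, requested):
    """free_slots: tm_id -> free slot count. Returns (plan, new_tms_needed)."""
    plan, remaining = {}, requested
    for tm, free in free_slots.items():
        if remaining == 0:
            break
        take = min(free, remaining)
        plan[tm] = take
        remaining -= take
    # start new TaskManagers for whatever could not be placed
    new_tms = -(-remaining // SLOTS_PER_NEW_TM)  # ceiling division
    return plan, new_tms

plan, new_tms = allocate({"tm-1": 2, "tm-2": 1}, requested=6)
print(plan, new_tms)   # {'tm-1': 2, 'tm-2': 1} 2
```

Here 6 slots are requested, 3 are available on existing TaskManagers, and the remaining 3 require ceil(3/2) = 2 new TaskManager instances.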

Collaboration with JobManager

  • Resource Scheduling: The JobManager generates a task plan based on the job's parallelism and resource requirements and sends these requirements to the ResourceManager, which decides how to allocate resources based on the cluster's resource situation
  • Starting TaskManagers: If the available TaskManagers in the cluster cannot meet the JobManager's needs, the ResourceManager starts new TaskManager instances to process the tasks. This is usually done through the integrated resource management platform (YARN, Kubernetes, or Mesos)

Resource Monitoring

  • Resource Usage Monitoring: The ResourceManager monitors the entire cluster's resource usage, including CPU, memory, and network bandwidth utilization. This monitoring data helps administrators optimize resource allocation and scheduling strategies
  • Logs and Metrics: The ResourceManager writes detailed log files recording resource request, allocation, and reclamation operations. In addition, Flink provides multiple monitoring tools to view the ResourceManager's running status and resource usage in real time

4. Dispatcher

Its role is to provide a REST interface for submitting applications for execution. Once an application is submitted, the Dispatcher starts a JobManager and hands the application over to it. The Dispatcher also starts a Web UI to provide information about job execution. Note: some application submission methods do not use the Dispatcher

Roles and Functions

  • Job Submission and Scheduling: The Dispatcher receives job submission requests from clients. Each time a job is submitted, the Dispatcher starts a new JobManager instance to manage that job's execution. This design ensures isolation between jobs, preventing one job's failure from affecting others
  • Multi-Job Management: The Dispatcher can manage multiple jobs simultaneously, each with its own independent JobManager instance. The Dispatcher monitors these jobs' status and reclaims resources after a job completes or fails
  • REST Interface: The Dispatcher provides a RESTful interface, allowing users to submit, query, and manage jobs through HTTP requests. This makes Flink easier to integrate with other systems and simplifies automated job scheduling

Relationship with JobManager

  • Independent JobManagers: In architectures where the Dispatcher is used, each submitted job starts an independent JobManager instance. The benefit is that each job runs in isolation, improving cluster stability and robustness
  • Task Scheduling: After receiving a job submission request, the Dispatcher first decides how to allocate resources and starts the corresponding JobManager; that JobManager then manages and schedules the execution of specific tasks on TaskManagers

Architecture and Component Interaction

  • Resource Management Interaction: The Dispatcher does not manage cluster resources directly; it relies on the ResourceManager to provide and schedule the required resources. When a job is submitted, the Dispatcher asks the ResourceManager to start JobManager and TaskManager instances
  • Client Interaction: The Dispatcher is the entry point for clients to submit jobs. Clients communicate with the Dispatcher via the REST API to submit jobs, cancel jobs, or query job status. The Dispatcher forwards these requests to the corresponding JobManagers
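
The per-job isolation property can be modeled in a few lines. This is a conceptual sketch only (the real Dispatcher speaks REST and manages actual JobManager processes); all names and statuses below are invented for the example:

```python
# Illustrative Dispatcher model: each submitted job gets its own JobManager
# entry, so one job's failure never affects another job's manager.

import itertools

class Dispatcher:
    def __init__(self):
        self._ids = itertools.count(1)
        self.job_managers = {}   # job_id -> {"name": ..., "status": ...}

    def submit(self, job_name):
        """Start a dedicated JobManager for this job; return the job id."""
        job_id = next(self._ids)
        self.job_managers[job_id] = {"name": job_name, "status": "RUNNING"}
        return job_id

    def status(self, job_id):
        return self.job_managers[job_id]["status"]

d = Dispatcher()
a = d.submit("wordcount")
b = d.submit("fraud-detection")
d.job_managers[a]["status"] = "FAILED"   # job A fails in isolation...
print(d.status(a), d.status(b))          # FAILED RUNNING
```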

Relationship Between Components

Flink Runtime Architecture

The basic building blocks of a Flink program are streams and transformations (note: the DataSet used in Flink's DataSet API is also internally a stream). Conceptually, a stream is a (possibly endless) flow of data records, and a transformation takes one or more streams as input and produces one or more output streams.

A Flink application is structured around three important components: Source, Transformation, and Sink.

Source

The data source: it defines where Flink loads data from. Flink has roughly four types of Sources for stream and batch processing:

  • Source based on local collections
  • Source based on files
  • Source based on network sockets
  • Custom sources (Apache Kafka, RabbitMQ, etc.)

Transformation

The various data transformation operations, also called operators, include Map, FlatMap, Filter, KeyBy, Reduce, Window, etc. They transform the data into the desired form.

Sink

The receiver: it defines where Flink sends the transformed data, i.e., the output destination of the result data. Flink has several common Sinks:

  • Write to files
  • Print out
  • Write to sockets
  • Custom Sinks (Apache Kafka, RabbitMQ, MySQL, Elasticsearch, HDFS, etc.)
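
The Source -> Transformation -> Sink structure can be sketched with plain Python generators. This illustrates the dataflow model only, not the Flink API: the source yields records, transformations are lazy, and the sink drives the pipeline by consuming it.

```python
# Conceptual dataflow pipeline: Source -> FlatMap -> Filter -> Sink.

def source():                      # Source: a collection-based source
    yield from ["a b", "b c", "c"]

def flat_map(stream, fn):          # Transformation: one record -> many
    for record in stream:
        yield from fn(record)

def filter_(stream, pred):         # Transformation: keep matching records
    for record in stream:
        if pred(record):
            yield record

def sink(stream):                  # Sink: collect results (stand-in for
    return list(stream)            # writing to a file, socket, Kafka, ...)

words = flat_map(source(), str.split)
result = sink(filter_(words, lambda w: w != "b"))
print(result)   # ['a', 'c', 'c']
```

Note that nothing is computed until the sink consumes the stream, which mirrors how a Flink program only runs when execution is triggered.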

Task and SubTask

  • A Task is a collection of multiple functions in the same stage, similar to a TaskSet in Spark
  • A SubTask is the smallest execution unit in Flink. It is an instance of a Java class, with properties and methods, that carries out specific computation logic. For example, when a map operation executes in a distributed setting, multiple threads run it simultaneously, and each executing thread is one SubTask
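
The "one logical Task, several parallel SubTask threads" idea can be modeled directly. This is an illustrative sketch, not Flink's scheduler; the round-robin partitioning and all names are assumptions for the example:

```python
# Illustrative SubTasks: one map Task run as `parallelism` threads, each
# thread (a SubTask) processing its own partition of the input.

import threading

def run_map_task(data, parallelism, fn):
    """Split `data` round-robin into `parallelism` partitions; one thread each."""
    partitions = [data[i::parallelism] for i in range(parallelism)]
    results = [None] * parallelism

    def subtask(index):   # each running thread is one SubTask
        results[index] = [fn(x) for x in partitions[index]]

    threads = [threading.Thread(target=subtask, args=(i,))
               for i in range(parallelism)]
    for t in threads: t.start()
    for t in threads: t.join()
    return results

out = run_map_task([1, 2, 3, 4], parallelism=2, fn=lambda x: x * 10)
print(out)   # [[10, 30], [20, 40]]
```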

OperatorChain

All operations in Flink are called Operators. When the client submits a job, it optimizes the Operators: Operators that can be merged are fused into a single Operator. The merged Operator becomes an OperatorChain, which is effectively an execution chain; each execution chain runs in an independent thread in a TaskManager.

During execution, the tasks in an application continuously exchange data. To use network resources effectively and improve throughput, Flink buffers data during task-to-task transmission.
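
The fusion step of operator chaining amounts to composing adjacent operators into one function, so a record flows through the whole chain in a single thread with no intermediate buffering. A minimal sketch of that idea (Flink's actual chaining rules also consider parallelism and data exchange patterns, which this ignores):

```python
# Conceptual operator chaining: fuse consecutive operators into one
# function; each record passes through all of them in one call.

from functools import reduce

def chain(*operators):
    """Return a single fused operator equivalent to running each in order."""
    return lambda record: reduce(lambda r, op: op(r), operators, record)

# map -> map -> map, merged into a single OperatorChain
fused = chain(lambda x: x + 1, lambda x: x * 2, str)
print(fused(3))   # 8
```

Chaining removes per-record thread hand-offs and serialization between the fused operators, which is why it is one of Flink's most effective built-in optimizations.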

Task Slot and Slot Sharing

Task Slot is also written as Task-Slot, and Slot Sharing as Slot-Sharing

Each TaskManager is a JVM process that can execute one or more subtasks in different threads. To control how many Tasks a Worker can accept, the Worker uses Task Slots (a Worker has at least one Task Slot)

Task Slot

Each Task Slot represents a fixed-size subset of the resources owned by the TaskManager. Generally, the number of slots allocated equals the number of CPU cores; e.g., with 6 cores, allocate 6 slots. Flink divides the process memory among the Slots. Assuming a TaskManager has 3 Slots, each Slot occupies 1/3 of the memory (evenly distributed).

Benefits of dividing memory into separate Slots:

  • The maximum number of tasks the TaskManager can execute concurrently is bounded (3 in this example), because it cannot exceed the Slot count
  • Each Slot has its own exclusive memory space, so multiple different jobs can run in one TaskManager without affecting each other
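
The even division is simple arithmetic; the following sketch just makes the numbers concrete (the 3072 MB figure is an invented example, not a Flink default):

```python
# Illustrative even memory division: a TaskManager's memory is split
# equally across its Slots.

def memory_per_slot(total_memory_mb, num_slots):
    """Memory (MB) each Slot gets when memory is divided evenly."""
    return total_memory_mb / num_slots

print(memory_per_slot(3072, 3))   # 1024.0  -> each of 3 Slots gets 1/3
```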

Slot Sharing

By default, Flink allows subtasks (e.g., map[1], map[2], keyby[1], keyby[2]) to share Slots, even if they are subtasks of different tasks, as long as they come from the same job. As a result, one Slot may hold an entire pipeline of the job.
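
A consequence of slot sharing is that the number of slots a job needs drops from the sum of all operator parallelisms to the job's highest operator parallelism, since each slot can hold one subtask of every task in the pipeline. A small sketch of that accounting (illustrative parallelism numbers):

```python
# Slots needed with vs. without slot sharing: with sharing, one slot holds
# one subtask per task, so the job needs max(parallelism); without sharing,
# every subtask needs its own slot, so it needs sum(parallelism).

def slots_needed(parallelism_per_task, sharing=True):
    """parallelism_per_task: task name -> parallelism, e.g. {'map': 2}."""
    if sharing:
        return max(parallelism_per_task.values())
    return sum(parallelism_per_task.values())

job = {"source": 1, "map": 2, "keyby": 2, "sink": 1}
print(slots_needed(job), slots_needed(job, sharing=False))   # 2 6
```

This is why, with slot sharing enabled, a cluster only needs as many slots as the highest parallelism used in the job.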