This article describes Flink's three StateBackend implementations and how Operator State is managed.
Core content:
- Three StateBackends: MemoryStateBackend (working state on the TaskManager JVM heap, Checkpoint snapshots kept in JobManager memory), FsStateBackend (working state in TaskManager memory, Checkpoint stored in a file system), RocksDBStateBackend (state stored in a local RocksDB instance, supports very large state)
- Using managed Operator State: access non-keyed state by implementing the CheckpointedFunction or ListCheckpointed interface
- State redistribution modes: even-split redistribution (the state list is divided evenly across the parallel operator instances), union redistribution (every parallel operator instance receives the complete state list)
- In-memory data structures: ListState is implemented by PartitionableListState (backed by an ArrayList), BroadcastState is implemented by HeapBroadcastState (backed by a HashMap)
- Configuration: set per job through StreamExecutionEnvironment, or set the cluster-wide default with the state.backend key in flink-conf.yaml
- Checkpoint configuration: enableCheckpointing(), setCheckpointingMode(), setMinPauseBetweenCheckpoints(), and other key parameters
Using managed Operator State:
- CheckpointedFunction interface description
- ListCheckpointed interface description
- Even-split redistribution mode
- Union redistribution mode
- StateBackend storage mechanism (MemoryStateBackend, FsStateBackend, RocksDBStateBackend)
- Configure StateBackend
- Enable Checkpoint
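
The difference between the two redistribution modes is easiest to see on a rescale, for example when a job is restored with a different parallelism. The following is a minimal sketch in plain Java and does not use the Flink API: the class RedistributionSketch and its methods evenSplit and union are hypothetical names chosen for illustration, ordinary lists stand in for ListState, and the contiguous chunking is a simplified model rather than Flink's actual repartitioning code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: plain lists stand in for operator ListState.
public class RedistributionSketch {

    // Merge the state lists of all old parallel instances into one list.
    private static <T> List<T> flatten(List<List<T>> oldState) {
        List<T> all = new ArrayList<>();
        for (List<T> part : oldState) {
            all.addAll(part);
        }
        return all;
    }

    // Even-split: every element ends up in exactly one new instance,
    // and the sizes of the new sublists differ by at most one.
    public static <T> List<List<T>> evenSplit(List<List<T>> oldState, int newParallelism) {
        List<T> all = flatten(oldState);
        int base = all.size() / newParallelism;
        int extra = all.size() % newParallelism;
        List<List<T>> result = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < newParallelism; i++) {
            // The first `extra` instances take one additional element.
            int size = base + (i < extra ? 1 : 0);
            result.add(new ArrayList<>(all.subList(start, start + size)));
            start += size;
        }
        return result;
    }

    // Union: every new instance receives a copy of the complete list
    // and decides for itself which elements to keep.
    public static <T> List<List<T>> union(List<List<T>> oldState, int newParallelism) {
        List<T> all = flatten(oldState);
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>(all));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two old instances holding 3 and 2 elements, rescaled to 3 instances.
        List<List<Integer>> old = Arrays.asList(Arrays.asList(1, 2, 3), Arrays.asList(4, 5));
        System.out.println("even-split: " + evenSplit(old, 3));
        System.out.println("union:      " + union(old, 3));
    }
}
```

Under union redistribution the total restored state grows with the parallelism, since each instance holds the full list, so it suits cases where every instance genuinely needs to see all elements before choosing its share.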