State Storage Methods:
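- MemoryStateBackend: Keeps working state on the TaskManager's JVM heap and sends checkpoint snapshots to the JobManager's memory. Fast but size-limited: by default each individual state may be at most 5 MB, and a task's state must also fit within the Akka frame size (10 MB by default). Suitable for local development and debugging with small states.
- FsStateBackend: Keeps working state in TaskManager memory and writes checkpoint snapshots to a file system such as HDFS. Suitable for large states, long windows, and high-availability setups.
- RocksDBStateBackend: Keeps working state in an embedded RocksDB database on the TaskManager's local disk. Supports very large states and long windows, and is the only one of the three backends that supports incremental checkpoints. State size is limited by available disk space rather than memory.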
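
The idea behind incremental checkpoints can be illustrated with a toy model in plain Java. This is a sketch under stated assumptions, not Flink or RocksDB code: the class name, the per-key dirty set, and the in-memory maps are all hypothetical. RocksDB-based incremental checkpoints operate on the database's on-disk files rather than on individual keys, but the principle is the same: persist only what changed since the previous checkpoint.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model, not Flink or RocksDB code: tracks which keys changed since the
// last checkpoint so that an incremental checkpoint can persist only those.
class IncrementalCheckpointSketch {
    private final Map<String, Long> state = new HashMap<>();
    private final Set<String> dirtyKeys = new HashSet<>();

    // Update the working state and remember that this key changed.
    void put(String key, long value) {
        state.put(key, value);
        dirtyKeys.add(key);
    }

    // Full checkpoint: copy every entry, regardless of what changed.
    Map<String, Long> fullCheckpoint() {
        dirtyKeys.clear();
        return new HashMap<>(state);
    }

    // Incremental checkpoint: copy only entries written since the last checkpoint.
    Map<String, Long> incrementalCheckpoint() {
        Map<String, Long> delta = new HashMap<>();
        for (String key : dirtyKeys) {
            delta.put(key, state.get(key));
        }
        dirtyKeys.clear();
        return delta;
    }
}
```

With a large state and few changes between checkpoints, the incremental snapshot stays small while a full snapshot grows with the total state size.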
Keyed State vs Operator State:
- Operator State: Bound to one parallel operator instance (e.g., a Kafka consumer keeping its partition offsets)
- Keyed State: Available only on a KeyedStream; each state value is logically bound to a <parallel-operator-instance, key> pair
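
The difference in scope can be sketched with a small plain-Java model. This is not Flink code: the class, field, and method names are hypothetical and only illustrate that a parallel instance holds one operator-state value overall but one keyed-state value per key.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model, not Flink code: one parallel operator instance (subtask) holding
// a single operator-state value plus a separate keyed-state value per key.
class SubtaskStateSketch {
    // Operator state: one value for the whole subtask, e.g. a source offset.
    private long operatorOffset = 0L;
    // Keyed state: one value per key routed to this subtask.
    private final Map<String, Long> countPerKey = new HashMap<>();

    // Process one record: advance the subtask-wide offset and the per-key count.
    void process(String key) {
        operatorOffset++;
        countPerKey.merge(key, 1L, Long::sum);
    }

    long operatorOffset() {
        return operatorOffset;
    }

    long keyedCount(String key) {
        return countPerKey.getOrDefault(key, 0L);
    }
}
```

After processing records with keys "a", "b", "a", the subtask-wide offset is 3, while the per-key counts are 2 for "a" and 1 for "b".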
Managed State Types:
- ValueState, ListState, ReducingState, AggregatingState, MapState, and FoldingState (deprecated)
TTL (Time-To-Live) Features:
- Available since Flink 1.6.0
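- Expiry is based on processing time only; event-time TTL is not supported
- The TTL configuration is not stored in checkpoints or savepoints; it only controls how the currently running job treats its state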
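
Processing-time expiry can be modeled in a few lines of plain Java. This is a sketch, not Flink code: the class name, the injectable clock, and the choices to refresh the timestamp on every write and to hide expired values on read are assumptions made for illustration. In Flink itself, TTL is configured through a StateTtlConfig that is enabled on the state descriptor.

```java
import java.util.function.LongSupplier;

// Toy model, not Flink code: a single value that expires a fixed time after
// its last write, measured on a processing-time clock.
class TtlValueSketch<T> {
    private final long ttlMillis;
    private final LongSupplier clock; // processing-time clock; injectable for tests (hypothetical hook)
    private T value;
    private long lastWriteMillis;

    TtlValueSketch(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Store a value and restart its time-to-live from the current clock reading.
    void update(T newValue) {
        value = newValue;
        lastWriteMillis = clock.getAsLong();
    }

    // Return the value, or null once the TTL has elapsed since the last write.
    T value() {
        if (value != null && clock.getAsLong() - lastWriteMillis >= ttlMillis) {
            value = null; // drop the expired value on access
        }
        return value;
    }
}
```

In production use the clock would be System::currentTimeMillis. Because expiry depends on the wall clock of the machine doing the processing, replaying the same events later can expire different entries, which is the practical consequence of TTL being tied to processing time.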