State Storage Methods:

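  • MemoryStateBackend: Keeps working state on the TaskManager’s Java heap and sends checkpoint snapshots to the JobManager’s memory. Fast but limited: each individual state is capped at 5 MB by default, and a snapshot cannot exceed the Akka frame size (10 MB by default). Suitable for local development and debugging with small state.
  • FsStateBackend: Keeps working state on the TaskManager’s Java heap and writes checkpoint snapshots to a filesystem such as HDFS. Suitable for large state, long windows, and high-availability setups; state size is still bounded by TaskManager memory.
  • RocksDBStateBackend: Keeps working state in an embedded RocksDB database on local disk and writes checkpoint snapshots to a filesystem. Supports very large state and long windows, and is the only one of the three backends that supports incremental checkpoints. State size is limited by disk space rather than memory.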

Keyed State vs Operator State:

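  • Operator State: Bound to one parallel operator instance and shared by every record that instance processes (e.g., the Kafka consumer keeps its partition offsets as operator state).
  • Keyed State: Available only on a KeyedStream; logically bound to <parallel-operator-instance, key>, so each key has its own state and a function sees only the state of the key it is currently processing.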
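
The difference in scope can be illustrated with a small toy model in plain Java. This is only a sketch: StateScopeSketch, Subtask, subtaskFor, and the other names are hypothetical and are not Flink classes, and the routing is simplified (it is assumed here that a key maps straight to a subtask by hash, whereas Flink hashes keys into key groups first).

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these types are Flink classes.
class StateScopeSketch {

    // One parallel instance (subtask) of an operator.
    static final class Subtask {
        // Operator state: a single value per subtask, whatever the record key is.
        long recordsSeen = 0;
        // Keyed state: one entry per key that is routed to this subtask.
        final Map<String, Long> countPerKey = new HashMap<>();
    }

    // Simplified routing (assumption): hash of the key modulo the parallelism.
    static int subtaskFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Feeds the records through `parallelism` subtasks and returns their state.
    static Subtask[] run(int parallelism, String... keys) {
        Subtask[] subtasks = new Subtask[parallelism];
        for (int i = 0; i < parallelism; i++) {
            subtasks[i] = new Subtask();
        }
        for (String key : keys) {
            Subtask s = subtasks[subtaskFor(key, parallelism)];
            s.recordsSeen++;                       // operator-scoped update
            s.countPerKey.merge(key, 1L, Long::sum); // key-scoped update
        }
        return subtasks;
    }

    // Looks up the keyed state for one key on the subtask that owns it.
    static long countFor(Subtask[] subtasks, String key) {
        Subtask owner = subtasks[subtaskFor(key, subtasks.length)];
        return owner.countPerKey.getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        Subtask[] subtasks = run(2, "a", "b", "a", "c", "a");
        System.out.println("keyed count for a: " + countFor(subtasks, "a"));
        for (int i = 0; i < subtasks.length; i++) {
            System.out.println("subtask " + i + " recordsSeen: " + subtasks[i].recordsSeen);
        }
    }
}
```

Because a given key always lands on the same subtask, its count lives in exactly one place, while recordsSeen mixes all keys handled by that subtask.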

Managed State Types:

  • ValueState, ListState, ReducingState, AggregatingState, MapState, and FoldingState (deprecated since Flink 1.4 in favor of AggregatingState)
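
As a rough illustration of how these types differ, the sketch below models the state held for a single key using plain Java fields and collections. It does not use the Flink API: StateTypesSketch and its members are hypothetical names, and the choice of max as the reduce function and average as the aggregate is an assumption made only for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Toy model only: plain Java stand-ins for the state of ONE key.
class StateTypesSketch {
    Long value;                                    // like ValueState: one value, overwritten
    final List<Long> list = new ArrayList<>();     // like ListState: elements are appended
    Long reduced;                                  // like ReducingState: folded, same in/out type
    final long[] sumAndCount = new long[2];        // like AggregatingState: separate accumulator
    final Map<String, Long> map = new HashMap<>(); // like MapState: entries under user-chosen keys

    // Assumed reduce function for the example: keep the maximum.
    private final BinaryOperator<Long> reduceFn = Long::max;

    // Applies one incoming element to every kind of state.
    void add(String tag, long x) {
        value = x;
        list.add(x);
        if (reduced == null) {
            reduced = x;
        } else {
            reduced = reduceFn.apply(reduced, x);
        }
        sumAndCount[0] += x;
        sumAndCount[1]++;
        map.merge(tag, x, Long::sum);
    }

    // Result of the aggregate: the output type (double) differs from the input type (long).
    double average() {
        return sumAndCount[1] == 0 ? 0.0 : (double) sumAndCount[0] / sumAndCount[1];
    }

    public static void main(String[] args) {
        StateTypesSketch s = new StateTypesSketch();
        s.add("x", 4);
        s.add("y", 10);
        s.add("x", 1);
        System.out.println("value=" + s.value + " list=" + s.list + " reduced=" + s.reduced
                + " average=" + s.average() + " map=" + s.map);
    }
}
```

The main contrast is between ReducingState, whose result has the same type as its inputs, and AggregatingState, which keeps an accumulator and may return a different type.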

TTL (Time-To-Live) Features:

  • Available since Flink 1.6.0
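  • Only supports processing time; event-time TTL is not available
  • The TTL configuration is not stored in checkpoints or savepoints; it applies only as configured in the currently running job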
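
The last two points can be sketched with a toy model in plain Java. This is not Flink's TTL API: TtlSketch, Snapshot, and restore are hypothetical names, and the model assumes that the last-update timestamp is stored next to the value while the TTL duration comes from the running job. A clock is injected so the behavior can be checked deterministically.

```java
import java.util.function.LongSupplier;

// Toy model only: not Flink's TTL implementation.
class TtlSketch {

    // What a snapshot carries in this model: value and timestamp, but no TTL duration.
    static final class Snapshot {
        final String value;
        final long lastUpdateMs;

        Snapshot(String value, long lastUpdateMs) {
            this.value = value;
            this.lastUpdateMs = lastUpdateMs;
        }
    }

    private final LongSupplier processingTimeMs; // wall-clock style time, not event time
    private final long ttlMs;                    // supplied by the running job
    private String value;
    private long lastUpdateMs;

    TtlSketch(LongSupplier processingTimeMs, long ttlMs) {
        this.processingTimeMs = processingTimeMs;
        this.ttlMs = ttlMs;
    }

    // Writing a value refreshes its timestamp.
    void update(String newValue) {
        value = newValue;
        lastUpdateMs = processingTimeMs.getAsLong();
    }

    // Returns null if nothing is stored or the stored value has outlived the TTL.
    String get() {
        if (value == null) {
            return null;
        }
        boolean expired = processingTimeMs.getAsLong() - lastUpdateMs >= ttlMs;
        return expired ? null : value;
    }

    Snapshot snapshot() {
        return new Snapshot(value, lastUpdateMs);
    }

    // Restoring takes the TTL from the new job, not from the snapshot.
    static TtlSketch restore(Snapshot snap, LongSupplier clock, long ttlMs) {
        TtlSketch restored = new TtlSketch(clock, ttlMs);
        restored.value = snap.value;
        restored.lastUpdateMs = snap.lastUpdateMs;
        return restored;
    }

    public static void main(String[] args) {
        long[] now = {0L};
        TtlSketch state = new TtlSketch(() -> now[0], 1000L);
        state.update("v");
        now[0] = 500L;
        Snapshot snap = state.snapshot();
        System.out.println("original job, ttl 1000 ms: " + state.get());
        System.out.println("restored with ttl 400 ms: " + restore(snap, () -> now[0], 400L).get());
    }
}
```

In this model the same snapshot yields a live value under one TTL setting and an expired one under another, which is the practical consequence of the configuration living outside the snapshot.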