Broadcast State Basic Concept: All parallel task instances maintain state data as persistent internal state, rather than sending it to other stream processors via broadcast. This design allows each instance to independently process state data while coordinating with events from the broadcast stream.
Applicable Scenarios:
- Low throughput data stream associated with high throughput data stream processing
- Application scenarios that need dynamic adjustment of processing logic
Typical Case: Real-time fraud detection system — trading data stream as high-throughput main data stream, rule update stream as low-throughput broadcast stream, processing engine can dynamically load latest fraud detection rules without stopping or restarting the processing pipeline.
Code Implementation Key Points:
- Use
MapStateDescriptorto define broadcast state - Use
.broadcast()method to broadcast pattern stream to all downstream operators - Use
KeyedBroadcastProcessFunctionto handle connection between keyed stream and broadcast stream
Core Methods:
processBroadcastElement(): Called for each record in broadcast stream, responsible for handling updated data in broadcast stream (like latest pattern rules)processElement(): Called for each record in keyed stream, can only read broadcast state, execute pattern matching detection
Running Test Results:
Match failed: 1003
Match success: 1001
Match failed: 1002
3> (1001,icu.wzk.BroadCastDemo$MyPattern@6d1e6dc7)