Tag: Stream Processing
16 articles
Flink CEP: Complex Event Processing & Pattern Matching
A detailed explanation of Flink CEP: pattern sequences, individual patterns, combined patterns, match skip strategies, and practical cases.
Flink Memory Management: Network Buffer, State Backend & Memory Model
A detailed explanation of the Flink memory model: Network Buffer Pool, Task Heap, State Backend memory allocation, GC tuning, and backpressure handling.
Big Data 99 - Flink Parallelism: Operator Chaining, Slot and Resource Scheduling
A detailed explanation of Flink parallelism: Operator Chaining, Slot allocation strategy, parallelism settings, and resource scheduling principles.
Big Data 96 - Flink Broadcast State: BroadcastState Practice and Rule Updates
An explanation of Flink Broadcast State: BroadcastState principles, dynamic rule updates, state partitioning and memory management, demonstrating broadcast stream and non-broadc...
Big Data 97 - Flink State Backend: State Storage and Performance Optimization
A detailed explanation of the Flink State Backend: choosing between HashMapStateBackend and EmbeddedRocksDBStateBackend, memory configuration, and performance tuning.
Big Data 95 - Flink State and Checkpoint: State Management, Fault Tolerance and Savepoints
An explanation of stateful computation in Flink: Keyed State, Operator State, Checkpoint configuration, Savepoint backup and recovery, and production practices.
Big Data 93 - Flink Streaming Introduction: DataStream API and Program Structure
This is article 93 in the Big Data series, introducing Flink DataStream API core concepts and program structure.
Flink Window and Watermark: Time Windows, Tumbling/Sliding/Session
A comprehensive analysis of Flink's Window mechanism: tumbling windows, sliding windows, session windows, Watermark principles and generation strategies, late data processing...
Big Data 90 - Apache Flink Introduction: Unified Stream-Batch Real-Time Computing
Systematic introduction to Apache Flink's origin, core features, and architecture components: JobManager, TaskManager, Dispatcher responsibilities, unified stream-batch p...
Big Data 89 - Spark Streaming with Kafka: Receiver vs Direct Mode
This is article 89 in the Big Data series, an in-depth comparison of the two core modes for integrating Spark Streaming with Kafka, with a focus on Direct mode in production.
Big Data 87 - Spark DStream Transformation Operators: map, reduceByKey and transform
A systematic review of Spark Streaming's stateless DStream transformation operators and the advanced transform operation, demonstrating three implementation approaches for blac...
Spark Streaming Window Operations & State Tracking: updateStateByKey & mapWithState
In-depth explanation of Spark Streaming stateful computing: window operation parameter configuration, reduceByKeyAndWindow hot word statistics, updateStateByKey full-stat...
Spark Streaming Introduction: From DStream to Structured Streaming
This is article 85 in the Big Data series, introducing the architecture and evolution background of Spark's two generations of streaming frameworks.
Spark Streaming Data Sources: File Stream, Socket, RDD Queue
A comprehensive explanation of three basic Spark Streaming data sources: file stream directory monitoring, Socket TCP ingestion, and RDD queue streams for test simulation.
Spark RDD Deep Dive: Five Key Features
This is article 69 in the Big Data series, an in-depth analysis of RDD, Spark's core data abstraction, covering its five key features and design principles.
From MapReduce to Spark: Big Data Computing Evolution
A systematic overview of the evolution of big data processing engines from MapReduce to Spark to Flink, analyzing Spark's in-memory computing model, unified ecosystem and core compon...