Blog
Technical exploration and thoughts · 655 articles
Spark Streaming Kafka Consumption: Offset Acquisition, St...
When Spark Streaming integrates with Kafka, offset management is key to ensuring the continuity and consistency of data processing. An offset marks a message's position in...
Spark Streaming Integration with Kafka: Offset Management...
An offset marks a message's position within a Kafka partition. Managing offsets properly makes at-least-once or even exactly-once processing semantics achievable. By persisting offsets, an application can resume ...
Spark Streaming Stateful Transformations: Window Operatio...
Window operations combine data from multiple batches over a longer time range, controlled by a window length and a slide duration. Examples demonstrate reduceByWindow...
Spark Streaming Integration with Kafka: Receiver and Dire...
This article introduces two ways to integrate Spark Streaming with Kafka: the Receiver approach and the Direct approach. The Receiver approach uses an Executor-based Receiver to...
Redis Advanced Data Types: Bitmap, Geo and Stream
A deep dive into three advanced Redis data types: Bitmap, Geo (GeoHash, Z-order curve, Base32 encoding), and Stream, with common commands and practical examples.
Redis Pub/Sub: Mechanism, Weak Transaction and Risks
A detailed explanation of how Redis Pub/Sub works, its three weak-transaction flaws (no persistence, no acknowledgment, no retry), and alternatives for production use.
Redis Single Node and Cluster Installation
Install Redis 6.2.9 from source on Ubuntu, configure redis.conf for daemon mode, start redis-server, and verify the connection with redis-cli.
Redis Five Data Types: Complete Command Reference and Pra...
A comprehensive explanation of the five Redis data types: String, List, Set, Sorted Set, and Hash. Includes common commands, underlying characteristics, and typical usage scenarios with practical examples.
HBase Java API: Complete CRUD Code with Table Creation, I...
Use the HBase Java Client API to implement table creation, insert, delete, Get queries, full table scans, and range scans. Includes complete Maven dependencies and runnable code examples covering all com...
Redis Introduction: Features and Architecture
An introduction to Redis as an in-memory data structure store and key-value database, with a comparison to traditional databases and typical use cases.
HBase Cluster Deployment and High Availability Configuration
A complete HBase distributed cluster deployment: configuring RegionServers on multiple nodes, HMaster high availability, and ZooKeeper integration for coordination, with start/stop scripts and verificati...
HBase Shell CRUD Operations and Data Model
HBase Shell commands for creating tables and running Put/Get/Scan/Delete operations, with an explanation of the HBase data model through practical examples.
HBase Overall Architecture: HMaster, HRegionServer and Da...
A comprehensive analysis of the overall architecture of the HBase distributed database, including ZooKeeper coordination, the HMaster management node, HRegionServer data nodes, the Region storage unit, and four-dimensio...
HBase Single Node Configuration: hbase-env and hbase-site...
Configure a single-node HBase environment step by step, explain the key parameters in hbase-env.sh and hbase-site.xml, and complete the integration with Hadoop HDFS and a ZooKeeper cluster.
ZooKeeper Leader Election and ZAB Protocol Principles
A deep dive into ZooKeeper's Leader election mechanism and the ZAB (ZooKeeper Atomic Broadcast) protocol, covering the initial election process, the three phases of message broadcast, the fault recovery strategy, and p...
ZooKeeper Distributed Lock Java Implementation Details
Implement a distributed lock based on ZooKeeper ephemeral sequential nodes, with complete Java code covering lock competition, predecessor node monitoring, CountDownLatch synchronization, and recurs...
ZooKeeper Watcher Principle and Command Line Practice Guide
A complete analysis of the Watcher registration, trigger, and notification flow across the client, WatchManager, and ZooKeeper server, plus zkCli command-line practice demonstrating node CRUD and monitoring.
ZooKeeper Java API Practice: Node CRUD and Monitoring
Use the ZkClient library to operate ZooKeeper from Java code, with complete practical examples of session establishment, persistent node CRUD, child node change monitoring, and data change monitoring.
ZooKeeper Cluster Configuration Details and Startup Verif...
A deep dive into the core zoo.cfg parameters and the myid file configuration rules, with a demonstration of 3-node cluster startup and verification of the Leader election result.
ZooKeeper ZNode Data Structure and Watcher Mechanism Details
A deep dive into ZooKeeper's four ZNode types, the structure of the ZXID transaction ID, and the principles and practice of the one-time-trigger Watcher mechanism.