Blog
Technical exploration and thoughts · 655 articles
Kafka Producer Interceptor & Interceptor Chain
An introduction to the Kafka 0.10 Producer interceptor mechanism, covering the onSend and onAcknowledgement interception points, interceptor chain execution order, and error isolation, with complete custom int...
Kafka Consumer: Consumption Flow, Heartbeat & Parameter T...
A detailed explanation of the Kafka Consumer Group consumption model, partition assignment strategies, the heartbeat keep-alive mechanism, and tuning practices for key parameters like session.timeout.ms, heart...
Apache Kudu Docker Quick Deployment: 3 Master/5 TServer P...
A quick Docker Compose deployment of Apache Kudu on an Ubuntu 22.04 cloud host, covering the Kudu Master and Tablet Server components,...
Java Access to Apache Kudu: Table Creation to CRUD (Includin...
Using the Java client (kudu-client 1.4.0) to connect to Apache Kudu with multiple Masters (example ports 7051/7151/7251) and complete the full process of table creation, insert,...
Apache Kudu: Real-time Write + OLAP Architecture, Perform...
Apache Kudu versions and ecosystem integration in 2025: the latest release, Kudu 1.18.0 (2025/07), brings a segmented LRU Block Cache and RocksDB-based metadata...
Apache Kudu Architecture & Practice: RowSet, Partition & ...
Apache Kudu's Master/TabletServer architecture, the RowSet (MemRowSet/DiskRowSet) write/read path, MVCC, and the role of Raft consensus in replication and failover; provides...
ClickHouse MergeTree Partition/TTL, Materialized View, AL...
ClickHouse basics and operations practice on a real cluster (h121/h122/h123), demonstrating the complete process from connection to database/table creation,...
Kafka Producer Message Sending Flow & Core Parameters
An in-depth analysis of the complete Kafka Producer sending chain: initialization, message interception, serialization, partition routing, buffered batch sending, and ACK confirmation, with key parameter tuning ...
Kafka Serialization & Partitioning: Custom Implementation
A deep dive into Kafka message serialization and partition routing, with complete code for a custom Serializer and Partitioner that enable precise message routing and efficient transmission.
ClickHouse Replica Deep Dive: ReplicatedMergeTree + ZooKe...
The full ClickHouse replication chain: ZK/Keeper preparation, macros configuration, consistent table creation with ON CLUSTER, write deduplication & replication mechanism,...
ClickHouse Sharding × Replica × Distributed: ReplicatedMe...
ClickHouse sharding × replica × Distributed architecture: built on ReplicatedMergeTree + Distributed, using ON CLUSTER for one-step table creation on 3-shard ×...
ClickHouse MergeTree Best Practices: Replacing Deduplicat...
ClickHouse's two lightweight aggregation engines, ReplacingMergeTree and SummingMergeTree, illustrated with minimal runnable examples (MRE) and comparative queries,...
ClickHouse CollapsingMergeTree & External Data Sources: H...
ClickHouse external data source engine guide: DDL templates, key parameters and read/write pipelines for ENGINE=HDFS, ENGINE=MySQL, ENGINE=Kafka, and distributed table configurations.
ClickHouse MergeTree Practical Guide: Partition, Sparse I...
Key ClickHouse MergeTree mechanisms: batched writes form parts, background merges (with two part formats, Compact and Wide), ORDER BY defines the sparse primary index,...
ClickHouse MergeTree Deep Dive: Partition Pruning × Spars...
ClickHouse MergeTree storage and query path: column files (*.bin), sparse primary index (primary.idx), mark files (.mrk/.mrk2) and index_granularity...
Kafka Operations: Shell Commands & Java Client Examples
Covers daily Kafka operations: daemon startup, Shell commands for topic management, and Java client programming (complete Producer/Consumer code) with key configuration parameters and ConsumerRebalance...
Spring Boot Integration with Kafka
A detailed guide to integrating Kafka into Spring Boot projects, including dependency configuration, sync/async message sending with KafkaTemplate, and complete @KafkaListener consumption practice.
Spark Distributed Environment Setup
A step-by-step setup of an Apache Spark distributed computing environment, covering download and extraction, environment variable configuration, core configuration changes in slaves/spark-env.sh, and complete multi...
ClickHouse Cluster Connectivity Self-Check & Data Types G...
Using a three-node cluster (h121/122/123) as an example, first complete a cluster connectivity self-check: system.clusters validation → ON CLUSTER create...
ClickHouse Table Engines: TinyLog/Log/StripeLog/Memory/Me...
A survey of the ClickHouse table engines TinyLog, Log, StripeLog, Memory, and Merge: principles, applicable scenarios, and pitfalls, with reproducible minimum...