Blog

Technical exploration and thoughts · 655 articles

Spark Streaming Introduction: From DStream to Structured ...

Introduces Spark's two generations of real-time computing frameworks: the architecture and limitations of the DStream micro-batch processing model, and how Structured Streaming solves EventTime process...

Spark Streaming Data Sources: File Stream, Socket, RDD Qu...

Explains the three basic Spark Streaming data sources: file streams that monitor a directory, Socket TCP ingestion, and RDD queue streams for simulating input in tests, with complete Scala code exam...

MyBatis Deep Dive - Level 1 Cache, Code Testing, and Sour...

A detailed look at how the MyBatis level 1 cache works, with code tests, invalidation scenarios, and source code analysis. The level 1 cache is enabled by default in MyBatis with SqlSession-lev...

MyBatis Level 2 Cache - Testing and Source Code Analysis

A detailed look at how the MyBatis level 2 cache works, with enablement configuration, code tests, and source code analysis. The level 2 cache is scoped to the Mapper namespace, and multiple SqlSessions...

Grafana 11.3.0 Installation & Startup: YUM Install RPM, s...

For ops engineers and developers still running CentOS/RHEL (or compatible distributions) in 2026: Grafana 11.3.0 (grafana-enterprise-11.3.0-1.x86_64.rpm) direct YUM...

Data Warehouse Introduction: Four Characteristics, OLTP v...

A 2026 engineering-practice guide to the core concepts and implementation concerns of data warehouses: starting from enterprise data silos, it explains four...

Prometheus 2.53.2 Installation & Configuration Practice: ...

A reusable deployment process for Prometheus 2.53.2 (still common in existing environments in 2025/2026): download and extract the binary on monitoring...

Prometheus Node Exporter 1.8.2 + Pushgateway 1.10.0: Down...

A common Prometheus monitoring deployment: install node_exporter-1.8.2 on Rocky Linux to expose host metrics, integrate it with the Prometheus scrape config, and visualize the results in Grafana dashboards.

sklearn KMeans Key Attributes & Evaluation: cluster_cente...

A 2026 guide to the three most commonly used objects in scikit-learn (sklearn) KMeans: cluster_centers_ (cluster centers), inertia_ (within-cluster sum of squares),...

KMeans n_clusters Selection: Silhouette Score Practice + ...

How to select n_clusters for KMeans: calculate silhouette_score and silhouette_samples for candidate cluster counts (e.g., 2/4/6/8), then determine the optimal k by...

SparkSQL Statements: DataFrame Operations, SQL Queries & ...

Comprehensive guide to SparkSQL core usage including DataFrame API operations, SQL query syntax, lateral view explode, and Hive integration via enableHiveSupport for metadata and table operations.

SparkSQL Kernel: Five Join Strategies & Catalyst Optimize...

Deep dive into the selection conditions and use cases of SparkSQL's five Join execution strategies (BHJ, SHJ, SMJ, Cartesian, BNLJ), along with the complete processing flow of the Catalyst optimizer from SQL ...

Python Hand-Written K-Means Clustering on Iris Dataset: F...

Python K-Means clustering implementation: using NumPy broadcasting to compute squared Euclidean distance (distEclud), initializing centroids via uniform...

K-Means Clustering Practice: Self-Implemented Algorithm V...

An engineering workflow for K-Means clustering that is 'verifiable, reproducible, and debuggable': first use the 2D testSet dataset for algorithm verification...

Scikit-Learn Logistic Regression Implementation: max_iter...

When using Logistic Regression in Scikit-Learn, max_iter caps the number of solver iterations, which affects whether the model converges and how accurate it is. If training doesn't...

K-Means Clustering Guide: From Unsupervised Concepts to I...

Introduces the K-Means clustering algorithm, compares supervised and unsupervised learning (whether labels Y are needed), and covers engineering applications in customer...

Deep Understanding of Logistic Regression & Gradient Desc...

Logistic Regression (LR) is an important classification algorithm in machine learning, widely used in binary classification tasks like sentiment analysis,...

How to Implement Logistic Regression in Scikit-Learn and ...

As C gradually increases, regularization gets weaker and model performance on both the training and test sets trends upward, until around C=0.8, training...

SparkSQL Core Abstractions: RDD, DataFrame, Dataset & Spa...

In-depth comparison of the features and use cases of Spark's three data abstractions (RDD, DataFrame, Dataset), an introduction to SparkSession as the unified entry point, and a demonstration of conversion methods between...

SparkSQL Operators: Transformation & Action Operations

A systematic review of SparkSQL Transformation and Action operators, covering select, filter, join, groupBy, and union operations, with practical test cases demonstrating usage and performance optimizat...