Tag: sklearn
13 articles
sklearn KMeans Key Attributes & Evaluation: cluster_cente...
A 2026 guide to scikit-learn (sklearn) KMeans that explains the three most commonly used objects: cluster_centers_ (cluster centers), inertia_ (within-cluster sum of squares),...
KMeans n_clusters Selection: Silhouette Score Practice + ...
How to select n_clusters for KMeans: compute silhouette_score and silhouette_samples for candidate cluster counts (e.g., 2/4/6/8), then determine the optimal k by...
Python Hand-Written K-Means Clustering on Iris Dataset: F...
A Python K-Means clustering implementation: using NumPy broadcasting to compute squared Euclidean distances (distEclud), initializing centroids via uniform...
K-Means Clustering Practice: Self-Implemented Algorithm V...
An engineering workflow for K-Means clustering that is 'verifiable, reproducible, and debuggable': first use the 2D testSet dataset for algorithm verification...
Scikit-Learn Logistic Regression Implementation: max_iter...
When using LogisticRegression in scikit-learn, max_iter caps the number of solver iterations, which affects whether the model converges and, in turn, its accuracy. If training doesn't...
K-Means Clustering Guide: From Unsupervised Concepts to I...
An introduction to the K-Means clustering algorithm, comparing supervised and unsupervised learning (whether labels Y are needed), with engineering applications in customer...
How to Implement Logistic Regression in Scikit-Learn and ...
As C gradually increases, regularization strength decreases and performance on both the training and test sets trends upward, until around C=0.8, training...
How to Handle Multicollinearity: Common Problems & Soluti...
How to handle multicollinearity in the least squares method when using scikit-learn for linear regression. Multicollinearity may cause instability in regression...
sklearn Decision Tree Pruning Parameters: max_depth/min_s...
Common parameters for decision tree pruning (pre-pruning) in engineering: max_depth, min_samples_leaf, min_samples_split, max_features, min_impurity_decrease...
Confusion Matrix to ROC: Complete Review of Imbalanced Bi...
Confusion matrix (TP, FP, FN, TN) with unified metrics: Accuracy, Precision, Recall (Sensitivity), F1 Measure, ROC curve, AUC value, and practical business interpretation for classification models.
Decision Tree from Split to Pruning: Information Gain/Gai...
The complete chain from 'split' to 'pruning': explains why decision trees usually use a greedy algorithm that reaches a 'local optimum', and the differences in splitting criteria between...
sklearn Decision Tree Practice: criterion, Graphviz Visua...
The complete flow of DecisionTreeClassifier on the load_wine dataset, from data splitting and model evaluation to decision tree visualization (2026 version). Focus on...
scikit-learn KNN Practice: KNeighborsClassifier, kneighbo...
From the unified API (fit/predict/transform/score) to using kneighbors to find the K nearest neighbors of test samples, then using a learning curve/parameter curve to select...