Tag: MLlib

8 articles

Spark MLlib GBDT Algorithm: Gradient Boosting Principles and Applications

This article introduces the principles and applications of the gradient boosting decision tree (GBDT) algorithm.

Spark MLlib Ensemble Learning: Random Forest, Bagging and Boosting Methods

This article systematically introduces ensemble learning methods in machine learning.

Spark MLlib Decision Tree Pruning: Pre-pruning, Post-pruning Principles and Practice

This article systematically introduces the principles of decision tree pre-pruning and post-pruning, and compares the core differences between three mainstream algorithms.

Spark MLlib Decision Tree: Classification Principles, Gini/Entropy and Practice

This article introduces the basic concepts and classification principles of decision trees.

Big Data 272 - Spark MLlib Logistic Regression: Basics, Input Function, Sigmoid & Loss

This article introduces the basic principles and application scenarios of logistic regression, and its implementation in Spark MLlib.

Big Data 271 - Spark MLlib Linear Regression: Scenarios, Loss Function & Optimization

Linear regression uses regression equations to model relationships between independent and dependent variables.

Big Data 271 - Spark MLlib Logistic Regression: Sigmoid, Loss Function & Diabetes Prediction Case

Despite having "regression" in its name, logistic regression is a classification model in machine learning.

Spark MLlib Linear Regression: Scenarios, Loss Function and Optimization

Linear regression is an analytical method that uses regression equations to model the relationship between one or more independent variables and a dependent variable.