
Understanding Gradient Boosting: A Data Scientist’s Guide
Gradient boosting machines (GBMs) are one of the most significant advances in machine learning and data science, enabling us as practitioners to use ensembles of models to tackle many domain-specific problems. While this tool is widely available in Python packages like scikit-learn and xgboost, as data scientists we should always look into the theory and mathematics behind the model instead of using it as a black box. In this blog, we will dive into the following areas:
- The key concepts underlying GBM
- A step-by-step walkthrough to recreate GBM
- Pros and cons
Fundamentals of Gradient Boosting
1. Weak learners and ensemble learning
Weak learners and ensemble learning are the two key concepts that make gradient boosting work. A weak learner is a model that is only slightly better than random guessing. Combined with many others, weak learners can form a robust ensemble model that makes accurate predictions.
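To make this concrete, here is a minimal sketch in scikit-learn. The toy dataset and hyperparameters are my own illustrative choices, not from any particular benchmark: a single depth-1 decision tree (a "stump") is a classic weak learner, while a gradient-boosted ensemble of hundreds of such stumps scores noticeably better.

```python
# A minimal sketch of the weak-learner idea using scikit-learn.
# The synthetic dataset and hyperparameters are illustrative assumptions.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A single depth-1 tree captures only one split, so it is barely
# better than predicting the mean -- a weak learner.
stump = DecisionTreeRegressor(max_depth=1).fit(X_train, y_train)
print("weak learner R^2:", stump.score(X_test, y_test))

# Gradient boosting combines hundreds of such stumps sequentially,
# each new stump correcting the errors of the ensemble so far.
gbm = GradientBoostingRegressor(n_estimators=300, max_depth=1, learning_rate=0.1)
gbm.fit(X_train, y_train)
print("ensemble R^2:", gbm.score(X_test, y_test))
```

On data like this, the ensemble's test score should substantially exceed the single stump's, which is the whole point of boosting weak learners.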
Too wordy, too complicated? Okay, imagine we are playing a 10,000-piece jigsaw puzzle with two other friends. (They must be excellent friends to sign up for this.) Each of us is responsible for piecing together one of the…