Image by Midjourney


Understanding Gradient Boosting: A Data Scientist’s Guide

Louis Chan · Published in TDS Archive · 10 min read · Feb 7, 2023

The gradient boosting machine (GBM) is one of the most significant advances in machine learning and data science, enabling practitioners to tackle many domain-specific problems with ensembles of models. While the technique is readily available in Python packages like scikit-learn and xgboost (a quick sketch of that off-the-shelf usage follows the list below), as data scientists we should understand the theory and mathematics behind the model rather than treating it as a black box. In this blog, we will dive into the following areas:

  • The core concepts behind GBM
  • A step-by-step walkthrough of recreating GBM
  • Pros and cons
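
As a taste of that off-the-shelf usage, here is a minimal sketch using scikit-learn's public API; the synthetic dataset and parameter values are purely illustrative:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Illustrative synthetic regression data; any tabular dataset works the same way.
X, y = make_regression(n_samples=1_000, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# The "black box": a gradient boosting machine with common starter settings.
gbm = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=3)
gbm.fit(X_train, y_train)

print(f"R^2 on held-out data: {gbm.score(X_test, y_test):.3f}")
```

Three lines of fitting code is all it takes, which is exactly why it is worth opening the box.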

Fundamentals of Gradient Boosting

1. Weak learners and ensemble learning

Weak learners and ensemble learning are the two key concepts that make gradient boosting work. A weak learner is a model that performs only slightly better than random guessing. Combined with many others, weak learners can form a robust ensemble model that makes accurate predictions.
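
To make the idea concrete, here is a minimal sketch (the data and parameter values are illustrative) contrasting a single weak learner, a depth-1 decision tree or "stump", with a boosted ensemble of 200 such stumps:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative synthetic classification data.
X, y = make_classification(n_samples=2_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A single weak learner: a depth-1 tree ("stump"), barely better than guessing.
stump = DecisionTreeClassifier(max_depth=1).fit(X_train, y_train)

# An ensemble of 200 such stumps, each one correcting the errors of the last.
ensemble = GradientBoostingClassifier(n_estimators=200, max_depth=1).fit(X_train, y_train)

print(f"single stump accuracy:   {stump.score(X_test, y_test):.3f}")
print(f"stump ensemble accuracy: {ensemble.score(X_test, y_test):.3f}")
```

On most datasets the ensemble scores far higher, even though each individual stump on its own is barely better than a coin flip.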

Too wordy, too complicated

Okay, imagine we are playing a 10,000-piece jigsaw puzzle with two other friends. (They must be excellent friends to sign up for this.) Each of us is responsible for piecing one of the…
