🌲

XGBoost and Its Trees for Regression

Jun 27, 2024

Introduction

  • Presenter: Josh Starmer
  • Topic: XGBoost Basics - Part 1 (Regression Trees)
  • Assumes familiarity with gradient boost for regression and regularization.
  • Split into three parts: Regression (Part 1), Classification (Part 2), Mathematical Details (Part 3).

XGBoost Overview

  • XGBoost: a comprehensive ML algorithm with many components.
  • Despite complexity, individual parts are simple to understand.
  • Designed for large, complex datasets (example used is simple for explanation).

Initial Steps in XGBoost

  1. Initial Prediction: Default is 0.5 for both regression and classification.
  2. Residual Calculation: Differences between the observed values and the predicted values (a sketch follows below).
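
A minimal sketch of these two steps, assuming a small made-up dataset whose observed values are chosen only so that the residuals match the ones quoted later in these notes:

```python
# Hypothetical observed values, chosen so the residuals match those
# quoted later in these notes.
observed = [-10.0, 7.0, 8.0, -7.0]

initial_prediction = 0.5  # XGBoost's default first prediction

# Residual = observed value - predicted value
residuals = [y - initial_prediction for y in observed]
print(residuals)  # [-10.5, 6.5, 7.5, -7.5]
```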

Building XGBoost Trees for Regression

Key Differences from Gradient Boost

  • Uses its own kind of regression tree (XGBoost trees)
  • Each tree starts as a single leaf containing all of the residuals

Calculating Similarity Score

  • Formula: Similarity Score = (sum of residuals)² / (number of residuals + lambda), where lambda is the regularization parameter
  • Example (with lambda = 0):
    • Positive and negative residuals partially cancel when summed
    • Root residuals (7.5, -7.5, 6.5, -10.5): sum = -4, so Similarity Score = (-4)² / 4 = 4 (see the sketch below)
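
A small sketch of the similarity score calculation, using the root residuals from the example (lambda = 0):

```python
def similarity_score(residuals, lam=0.0):
    # Similarity Score = (sum of residuals)^2 / (number of residuals + lambda)
    return sum(residuals) ** 2 / (len(residuals) + lam)

root_residuals = [7.5, -7.5, 6.5, -10.5]
print(similarity_score(root_residuals, lam=0.0))  # (-4)^2 / 4 = 4.0
```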

Splitting Nodes

  • Process: Split the observations into two leaves using candidate dosage thresholds
  • Example: Threshold Dosage < 15
    • Left leaf: the single residual -10.5 -> Similarity = 110.25
    • Right leaf: residuals 6.5, 7.5, -7.5 mostly cancel out -> Similarity = 14.08

Gain Calculation

  • Formula: Gain = Similarity(left leaf) + Similarity(right leaf) - Similarity(root)
  • Compare gains for different thresholds to choose the best split
    • Example: Dosage < 15 has highest gain (120.33)
  • Continue splitting each node, keeping the threshold with the highest gain (see the sketch below)
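
A sketch of the gain comparison across candidate thresholds; the residuals come from the notes above, while the dosage values and the 22.5 / 30 thresholds are assumed for illustration:

```python
def similarity_score(residuals, lam=0.0):
    return sum(residuals) ** 2 / (len(residuals) + lam)

def gain(left, right, parent, lam=0.0):
    # Gain = Similarity(left) + Similarity(right) - Similarity(parent)
    return (similarity_score(left, lam)
            + similarity_score(right, lam)
            - similarity_score(parent, lam))

# Hypothetical (dosage, residual) pairs; the residuals match these notes,
# the dosage values are assumed for illustration.
data = [(10, -10.5), (20, 6.5), (25, 7.5), (35, -7.5)]
parent = [r for _, r in data]

for threshold in (15, 22.5, 30):  # candidate splits between adjacent dosages
    left = [r for d, r in data if d < threshold]
    right = [r for d, r in data if d >= threshold]
    print(f"Dosage < {threshold}: gain = {gain(left, right, parent):.2f}")
# Dosage < 15 gives the highest gain (120.33), so it becomes the first split.
```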

Pruning Trees

  • Pruning Process:
    1. Choose a tree-complexity threshold gamma (e.g., 130)
    2. Starting from the lowest branch, calculate Gain - gamma
    3. If the result is negative, remove the branch and test the branch above it; if it is positive, keep the branch and stop pruning
    4. Higher branches are only tested when the branch below them was removed (see the sketch below)
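
A rough sketch of that bottom-up check with gamma = 130; the branch gains listed are assumed example values, not figures quoted in these notes:

```python
gamma = 130  # user-chosen tree-complexity threshold

# Hypothetical gains, listed from the lowest branch up toward the root;
# the values are assumed for illustration.
branch_gains = [140.17, 120.33]

for g in branch_gains:
    if g - gamma < 0:
        print(f"{g} - {gamma} < 0: remove this branch, then test the branch above")
    else:
        print(f"{g} - {gamma} >= 0: keep this branch and stop pruning")
        break
# Because the lowest branch is kept, the branch above it is never tested,
# even though its gain (120.33) is below gamma.
```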

Regularization Parameter (Lambda)

  • Alters similarity scores and gain calculations, affecting pruning and tree structure.
  • Example: Lambda = 1 shrinks every similarity score (and therefore every gain), making pruning more aggressive (see the sketch below).
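
A small sketch of that effect on a single-residual leaf (the -10.5 leaf from the split example):

```python
def similarity_score(residuals, lam=0.0):
    return sum(residuals) ** 2 / (len(residuals) + lam)

leaf = [-10.5]  # a leaf holding a single residual
print(similarity_score(leaf, lam=0))  # 110.25
print(similarity_score(leaf, lam=1))  # 55.125

# With lambda = 1 every similarity score shrinks (leaves with few residuals
# shrink the most), so gains shrink, gain - gamma turns negative more often,
# and more branches get pruned.
```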

Output Values

  • Formula: Output Value = sum of residuals / (number of residuals + lambda)
  • Lambda pulls output values toward 0, reducing sensitivity to individual observations and outliers
  • Each leaf is assigned an output value, which is used to adjust the predictions (sketched below)
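
A sketch of the output value calculation for the same single-residual leaf, with and without regularization:

```python
def output_value(residuals, lam=0.0):
    # Output Value = sum of residuals / (number of residuals + lambda)
    return sum(residuals) / (len(residuals) + lam)

leaf = [-10.5]  # a leaf holding one extreme residual
print(output_value(leaf, lam=0))  # -10.5
print(output_value(leaf, lam=1))  # -5.25: lambda pulls the output toward 0,
# damping the influence of a single extreme observation.
```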

Predictions and Iterative Process

  • New prediction = initial prediction (0.5) + learning rate (eta, default 0.3) × the tree's output value
  • Residual Update: the new residuals are smaller than the originals
  • Repeat, building new trees on the updated residuals, until the residuals are very small or the maximum number of trees is reached (see the sketch below)
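
A sketch of one round of the prediction update, assuming an observation with observed value -10 that falls into the leaf with output value -10.5:

```python
eta = 0.3                 # learning rate, XGBoost's default
initial_prediction = 0.5

# Hypothetical observation that landed in the leaf with output value -10.5
observed = -10.0
leaf_output = -10.5

new_prediction = initial_prediction + eta * leaf_output
new_residual = observed - new_prediction
print(new_prediction)  # -2.65
print(new_residual)    # -7.35: smaller in magnitude than the original -10.5
```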

Summary

  • Steps to building XGBoost trees: calculate similarity scores -> split on the threshold with the highest gain -> prune using gamma -> calculate output values -> adjust predictions iteratively.
  • Increasing lambda makes pruning more likely and reduces tree complexity.
  • Values used here: Lambda = 0 in the examples, learning rate (eta) = 0.3 (the default), and gamma must be chosen by the user.
  • Part 2 will cover classification trees.

Conclusion

  • Encouragement to support StatQuest and subscribe for more content.