Ensemble Learning Methods

Jul 11, 2024

Interview Questions

  • What is ensemble learning?
  • What are some examples of ensemble learning?
  • What are boosting and bagging?
  • What are the advantages of bagging and boosting?
  • What are the differences between bagging and boosting?
  • Why do boosting models perform well?
  • Explain stacking.

Overview

  • Ensemble learning: Combining multiple base learners (often weak learners) to create a stronger model with better predictive performance.
  • General properties:
    • Reduces overfitting
    • More robust to data variability
    • Uses voting (classification) or averaging (regression) to combine predictions; a quick voting sketch follows this list.
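
As a minimal illustration of the voting idea, here is a sketch using scikit-learn's VotingClassifier; the models, dataset, and parameters are illustrative assumptions, not part of the original notes (for regression, VotingRegressor plays the averaging role):

```python
# Illustrative voting ensemble: three different classifiers vote on each
# prediction (scikit-learn assumed; data and models are arbitrary choices).
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hard voting: each base model casts one vote; the majority class wins.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="hard",
)
ensemble.fit(X, y)
print("Training accuracy:", ensemble.score(X, y))
```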

Bagging (Bootstrap Aggregation)

  • Process:
    1. Create bootstrap samples from the training data.
    2. Train a model on each bootstrap sample.
    3. Combine predictions using voting (classification) or averaging (regression).
  • Example: Random Forest (see the sketch after this list)
    • Combines many decision trees trained on bootstrap samples, using majority voting (classification) or averaging (regression).
    • Individual deep trees have high variance and low bias; averaging across trees reduces the variance.
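
A minimal sketch of bagging with scikit-learn, comparing a generic bagged ensemble of decision trees to a Random Forest; the dataset and hyperparameters here are arbitrary choices for illustration, and the `estimator=` keyword assumes scikit-learn 1.2 or later:

```python
# Minimal bagging sketch (scikit-learn >= 1.2 assumed for the `estimator`
# keyword; dataset and hyperparameters are illustrative).
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Generic bagging: each tree is trained on a bootstrap sample of the
# training data, and predictions are combined by majority vote.
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=100,
    bootstrap=True,
    random_state=42,
).fit(X_train, y_train)

# Random Forest: bagging over trees, plus a random feature subset
# considered at each split to further decorrelate the trees.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)

print("Bagging accuracy:      ", bagging.score(X_test, y_test))
print("Random Forest accuracy:", forest.score(X_test, y_test))
```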

Boosting

  • Process:
    1. Train weak learners sequentially.
    2. Each learner focuses on correcting the errors of the previous ones, e.g. by up-weighting misclassified examples (AdaBoost) or by fitting the residual errors (gradient boosting).
    3. Final prediction is a weighted result of all learners.
  • Example: Gradient Boosted Trees (see the sketch after this list)
    • Trains shallow trees sequentially; each new tree fits the residual errors (the negative gradient of the loss) left by the current ensemble.
    • Individual weak learners have high bias and low variance; adding learners sequentially reduces the ensemble's bias.
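
A minimal gradient boosting sketch with scikit-learn's GradientBoostingClassifier; the dataset and settings are illustrative assumptions, not tuned values:

```python
# Minimal gradient boosting sketch (scikit-learn assumed; dataset and
# settings are illustrative, not tuned).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Shallow trees (weak learners) are fit one after another; each new tree
# fits the residual errors (gradient of the loss) of the current ensemble,
# and its contribution is shrunk by the learning rate.
gbt = GradientBoostingClassifier(
    n_estimators=200,   # number of sequential weak learners
    learning_rate=0.1,  # shrinkage applied to each tree's contribution
    max_depth=3,        # keep individual trees weak (high bias)
    random_state=0,
)
gbt.fit(X_train, y_train)
print("Gradient boosting accuracy:", gbt.score(X_test, y_test))
```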

Differences Between Bagging and Boosting

  • Training
    • Bagging: Independent, parallel training of learners.
    • Boosting: Sequential, dependent training of learners.
  • Bias-Variance Tradeoff
    • Bagging: Reduces variance; works best with complex, low-bias models (e.g., deep decision trees).
    • Boosting: Reduces bias; works best with simple, high-bias weak models (e.g., shallow trees or stumps).

Stacking

  • Combines outputs of base learners with a meta-learner.
  • Two-level process:
    1. Train individual base learners.
    2. Train a meta-learner on the base learners' predictions (ideally out-of-fold predictions, to avoid leaking training labels into the meta-features).
  • Example Configuration (sketched in code at the end of this section):
    • Base Learners: Random Forest, Support Vector Machines
    • Meta-Learner: Logistic Regression
  • Pros:
    • Can outperform the best base learner by combining strengths.
  • Cons:
    • Computationally expensive to train.
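
A sketch of the stacking configuration above using scikit-learn's StackingClassifier; the dataset and parameters are illustrative assumptions. The `cv` argument makes the meta-learner train on out-of-fold base predictions, which is what keeps stacking from overfitting to the base learners' training-set outputs:

```python
# Sketch of the stacking configuration above using StackingClassifier
# (scikit-learn assumed; dataset and parameters are illustrative).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=1)),
        ("svm", SVC(probability=True, random_state=1)),
    ],
    final_estimator=LogisticRegression(),  # meta-learner on base predictions
    cv=5,  # meta-learner is trained on out-of-fold base predictions
)
stack.fit(X_train, y_train)
print("Stacking accuracy:", stack.score(X_test, y_test))
```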