Q&A: 40 Questions
Ensemble Methods — Q&A
Bagging, boosting, gradient boosting, and ensemble learning strategies.
Bagging (Bootstrap Aggregating): Interview Q&A
1
What is bagging in machine learning?
⚡ Beginner
Answer: Bagging (bootstrap aggregating) trains multiple models on bootstrapped samples of the training data and combines their predictions, by averaging for regression or voting for classification.
2
What is the main goal of bagging?
⚡ Beginner
Answer: The main goal is to reduce variance and improve stability of high-variance models like decision trees.
3
What is a bootstrap sample?
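⚡ Beginner
Answer: A bootstrap sample is obtained by sampling with replacement from the original dataset, usually to the same size. Some rows therefore appear more than once, and on average about 36.8% of the original rows are left out of any given sample.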
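The sampling step can be sketched with the standard library alone. This is a minimal illustration, not a library API: the helper name `bootstrap_sample` and the toy dataset are assumptions made for the example.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) items from data uniformly at random, with replacement."""
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]


rng = random.Random(0)           # fixed seed so the run is repeatable
data = list(range(1000))         # toy dataset: row ids 0..999
sample = bootstrap_sample(data, rng)

# Fraction of distinct original rows that made it into the sample.
# For large n this tends toward 1 - 1/e, roughly 0.632.
unique_fraction = len(set(sample)) / len(data)
print(len(sample), round(unique_fraction, 3))
```

The rows that never get drawn are the ones later used for out-of-bag evaluation.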
4
Why does averaging predictions reduce variance?
📊 Intermediate
Answer: If individual model errors are not perfectly correlated, averaging cancels some noise, lowering overall variance.
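To make the role of correlation concrete, here is a small sketch under a simplifying assumption that is not in the original answer: every model's prediction has the same variance `sigma2` and every pair of models has the same correlation `rho`. Under that assumption the variance of the average is `rho * sigma2 + (1 - rho) * sigma2 / n_models`.

```python
def variance_of_average(sigma2, rho, n_models):
    """Variance of the mean of n_models predictions.

    Assumes (for illustration) identical variance sigma2 for each model
    and identical pairwise correlation rho between any two models.
    """
    return rho * sigma2 + (1.0 - rho) * sigma2 / n_models


for rho in (0.0, 0.5, 1.0):
    print(rho, variance_of_average(1.0, rho, n_models=100))
```

With `rho = 0` the variance shrinks like `1 / n_models`; with `rho = 1` averaging gains nothing. The first term does not depend on the number of models, which is why adding estimators shows diminishing returns and why decorrelating the base learners matters.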
5
Is bagging more effective for high-bias or high-variance models?
📊 Intermediate
Answer: Bagging is most effective for high-variance, low-bias models, such as deep decision trees.
6
How do bagging and random forests relate?
📊 Intermediate
Answer: A random forest is essentially bagging of decision trees with additional feature-level randomness at each split.
7
How are predictions combined in bagging for regression and classification?
⚡ Beginner
Answer: For regression you average predictions; for classification you usually take a majority vote.
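The two aggregation rules are short enough to write out. The function names below are hypothetical, chosen only for this sketch.

```python
from collections import Counter
from statistics import mean


def aggregate_regression(predictions):
    """Average the numeric predictions of the base models."""
    return mean(predictions)


def aggregate_classification(predictions):
    """Return the most frequent class label among the base models.

    On a tie, Counter.most_common keeps the label that was seen first.
    """
    return Counter(predictions).most_common(1)[0][0]


print(aggregate_regression([2.0, 3.0, 4.0]))
print(aggregate_classification(["spam", "ham", "spam"]))
```

Some implementations average predicted class probabilities instead of counting hard votes; the sketch shows only the plain majority vote described above.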
8
What is out-of-bag (OOB) evaluation in bagging?
🔥 Advanced
Answer: OOB evaluation uses samples not included in a model’s bootstrap set as validation data, giving an internal estimate of performance.
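The bookkeeping behind an OOB estimate can be sketched as follows. Everything here is an assumption made for illustration: the base learner is a toy 1-nearest-neighbour regressor on 1-D inputs, and `fit_1nn` and `oob_mse` are hypothetical helper names, not part of any library.

```python
import random


def fit_1nn(xs, ys):
    """Toy base learner: predict the target of the nearest training point."""
    pairs = list(zip(xs, ys))

    def predict(x):
        return min(pairs, key=lambda p: abs(p[0] - x))[1]

    return predict


def oob_mse(xs, ys, n_estimators, rng):
    """Out-of-bag mean squared error of a bagged ensemble."""
    n = len(xs)
    sums = [0.0] * n    # sum of OOB predictions per row
    counts = [0] * n    # number of models for which the row was OOB
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap indices
        in_bag = set(idx)
        model = fit_1nn([xs[i] for i in idx], [ys[i] for i in idx])
        for i in range(n):
            if i not in in_bag:                      # row unseen by this model
                sums[i] += model(xs[i])
                counts[i] += 1
    scored = [i for i in range(n) if counts[i] > 0]
    return sum((sums[i] / counts[i] - ys[i]) ** 2 for i in scored) / len(scored)


rng = random.Random(0)
xs = [i / 50 for i in range(100)]
ys = [x * x + rng.gauss(0, 0.1) for x in xs]   # noisy quadratic
print(oob_mse(xs, ys, n_estimators=25, rng=rng))
```

Each row is scored only by the models that never saw it during training, so the estimate behaves like a built-in validation score.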
9
Does bagging increase or decrease bias?
📊 Intermediate
Answer: Bagging typically keeps bias about the same and reduces variance; bias may increase slightly in some cases.
10
What types of base learners are commonly used in bagging?
⚡ Beginner
Answer: Decision trees are most common, but other unstable models can also benefit from bagging.
11
Does bagging help for linear models like logistic regression?
🔥 Advanced
Answer: Usually not much, because such models are already stable (low variance, higher bias); averaging reduces variance, not bias, so it brings little gain.
12
How does bagging compare to boosting in terms of bias and variance?
🔥 Advanced
Answer: Bagging mainly reduces variance, while boosting aims to reduce bias by focusing on hard examples.
13
What is the main computational cost of bagging?
📊 Intermediate
Answer: Training many models increases training time and memory usage, though training can be parallelized.
14
How does the number of estimators affect a bagging ensemble?
📊 Intermediate
Answer: More estimators generally reduce variance and improve stability up to a point, but with diminishing returns and higher cost.
15
How does bagging interact with overfitting?
🔥 Advanced
Answer: Bagging allows you to overfit individual base learners (like deep trees) and then reduce overfitting via averaging.
16
What is an example of a pure bagging algorithm?
⚡ Beginner
Answer: The BaggingClassifier in scikit-learn wrapping decision trees is a classic example.
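A minimal usage sketch, assuming scikit-learn is installed. The synthetic dataset and the hyperparameter values are arbitrary choices for the example, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic tabular data, used only for illustration.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

bagging = BaggingClassifier(
    DecisionTreeClassifier(),   # base learner, passed positionally
    n_estimators=50,            # number of bootstrapped trees
    oob_score=True,             # also compute the out-of-bag accuracy
    random_state=0,
)
bagging.fit(X_train, y_train)

test_accuracy = bagging.score(X_test, y_test)
print("OOB accuracy: ", bagging.oob_score_)
print("Test accuracy:", test_accuracy)
```

The base learner is passed positionally because the keyword for that argument has been renamed across scikit-learn releases.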
17
Does bagging require independent base learners?
📊 Intermediate
Answer: They need not be independent, but the more decorrelated they are, the more variance reduction you get from averaging.
18
How do you evaluate a bagging model efficiently?
🔥 Advanced
Answer: You can use OOB estimates instead of a separate validation set, plus standard cross-validation for confirmation.
19
Give a real-world use case where bagging is effective.
⚡ Beginner
Answer: Bagged decision trees, and random forests in particular, perform well on tabular business data such as credit scoring, churn prediction, and risk modeling.
20
What is the key message to remember about bagging?
⚡ Beginner
Answer: Bagging is a simple but powerful ensemble strategy for taming unstable models by trading extra computation for lower variance and better generalization.
XGBoost (Extreme Gradient Boosting): Interview Q&A
21
What is XGBoost in one sentence?
⚡ Beginner
Answer: XGBoost is a highly optimized implementation of gradient boosted decision trees with additional regularization and engineering improvements.
22
Why is XGBoost popular in ML competitions?
⚡ Beginner
Answer: Because it handles tabular data extremely well, offers strong performance out of the box, and is highly tunable and efficient.
23
What objective does XGBoost optimize?
🔥 Advanced
Answer: It minimizes a regularized loss function: training loss plus penalties on model complexity (e.g., number of leaves, leaf weights).
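As a rough sketch of how these penalties enter tree building: with a second-order approximation of the loss, a leaf holding gradient sum `G` and Hessian sum `H` gets the weight `-G / (H + lambda)`, and a candidate split is scored by how much it improves the regularized objective minus the per-leaf penalty `gamma`. The helper names below are hypothetical, and the sketch covers only the L2 penalty on leaf weights (no L1 term).

```python
def leaf_weight(G, H, reg_lambda=1.0):
    """Optimal leaf weight under an L2 penalty reg_lambda on leaf weights."""
    return -G / (H + reg_lambda)


def leaf_score(G, H, reg_lambda):
    """Structure score of one leaf; higher means a larger loss reduction."""
    return G * G / (H + reg_lambda)


def split_gain(G_left, H_left, G_right, H_right, reg_lambda=1.0, gamma=0.0):
    """Objective improvement from splitting one leaf into two.

    gamma is the penalty paid for the extra leaf; a split is only
    worthwhile when the returned gain is positive.
    """
    parent = leaf_score(G_left + G_right, H_left + H_right, reg_lambda)
    children = (leaf_score(G_left, H_left, reg_lambda)
                + leaf_score(G_right, H_right, reg_lambda))
    return 0.5 * (children - parent) - gamma


print(leaf_weight(G=-4.0, H=3.0, reg_lambda=1.0))            # 1.0
print(split_gain(-4.0, 3.0, 4.0, 3.0, reg_lambda=1.0))       # 4.0
print(split_gain(-4.0, 3.0, 4.0, 3.0, reg_lambda=1.0, gamma=5.0))  # -1.0
```

Raising `reg_lambda` shrinks leaf weights toward zero, and raising `gamma` makes marginal splits unprofitable, which is how the two penalties limit model complexity.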
24
Which key hyperparameters control tree complexity in XGBoost?
📊 Intermediate
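Answer: The main ones are max_depth, min_child_weight, and gamma (minimum split loss). subsample and colsample_bytree are usually tuned alongside them; they add row and feature sampling rather than limiting tree size directly.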
25
What does the learning rate (eta) do in XGBoost?
📊 Intermediate
Answer: It scales each tree’s contribution; smaller eta means slower learning but often better generalization when combined with more trees.
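The shrinkage mechanic can be shown with a toy booster written from scratch. This is not XGBoost's implementation: it is a sketch assuming squared-error loss, 1-D inputs, and depth-1 trees (stumps), and `fit_stump`, `boost`, and `mse` are hypothetical helpers.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split on 1-D inputs under squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:          # keep both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def boost(xs, ys, n_rounds, eta):
    """Fit stumps to residuals; each stump's output is scaled by eta."""
    preds = [0.0] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        preds = [p + eta * tree(x) for p, x in zip(preds, xs)]
    return preds


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


xs = [i / 10 for i in range(20)]
ys = [x * x for x in xs]

fast = boost(xs, ys, n_rounds=5, eta=1.0)
slow = boost(xs, ys, n_rounds=5, eta=0.1)
slow_many = boost(xs, ys, n_rounds=200, eta=0.1)
print(mse(ys, fast), mse(ys, slow), mse(ys, slow_many))
```

For the same number of rounds the small step size has fit far less of the signal, so it needs more trees to catch up. The sketch tracks training error only; the generalization benefit of a small `eta` has to be checked on held-out data.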
26
How does XGBoost use regularization compared to classic gradient boosting?
🔥 Advanced
Answer: XGBoost adds L1/L2 penalties on leaf weights and explicit tree complexity penalties, giving more control over overfitting.
27
What is the role of subsample and colsample_bytree?
🔥 Advanced
Answer: They randomly sample rows (subsample) and features (colsample_bytree), adding randomization to reduce overfitting and speed up training.
28
What is early stopping and how is it used with XGBoost?
📊 Intermediate
Answer: Early stopping stops adding new trees when validation performance hasn’t improved for a set number of rounds, preventing overfitting.
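The stopping rule itself is independent of any library and can be sketched directly. The function name and the list of validation losses are made up for the example; lower loss is assumed to be better.

```python
def early_stopping_round(val_losses, patience):
    """Return (best_round, last_round_run) for per-round validation losses.

    Training stops once `patience` consecutive rounds have passed
    without improving on the best loss seen so far.
    """
    best_loss = float("inf")
    best_round = -1
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_round = loss, i
        elif i - best_round >= patience:
            return best_round, i
    return best_round, len(val_losses) - 1


losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
print(early_stopping_round(losses, patience=3))    # (2, 5)
print(early_stopping_round(losses, patience=10))   # (6, 6)
```

The first call stops at round 5 and never sees the later improvement at round 6, which shows the trade-off in choosing the patience: too small and training stops on a temporary plateau, too large and extra trees are trained for nothing.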
29
Can XGBoost handle missing values natively?
📊 Intermediate
Answer: Yes, XGBoost can learn default directions for missing values at each split, so explicit imputation is often unnecessary.
30
What is tree_method in XGBoost and why does it matter?
🔥 Advanced
Answer: tree_method selects the algorithm for building trees (e.g., exact, approx, hist); histogram-based methods are faster and scale better.
31
How does XGBoost support different loss functions?
📊 Intermediate
Answer: It uses a general gradient boosting framework, allowing various objectives like logistic, squared error, ranking losses, etc.
32
What are some common evaluation metrics used with XGBoost?
⚡ Beginner
Answer: Metrics depend on task: logloss, auc for classification, rmse, mae for regression, and custom metrics as needed.
33
How does XGBoost compute feature importance?
📊 Intermediate
Answer: It can report importance based on gain, cover, or weight (how often a feature is used in splits), though permutation importance is often more robust.
34
When tuning XGBoost, which parameters do you usually start with?
📊 Intermediate
Answer: A common approach: first tune max_depth and min_child_weight, then subsample and colsample_bytree, and finally eta together with the number of trees.
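One way to write that order down is as a starting parameter dictionary plus a staged search plan. The parameter names are XGBoost's own; the values and the candidate grids are illustrative assumptions, not tuned recommendations, and `tuning_stages` is a hypothetical structure for this sketch.

```python
# Starting point; every value here is a placeholder to be tuned.
params = {
    "objective": "binary:logistic",
    "eval_metric": "logloss",
    "tree_method": "hist",
    "max_depth": 6,
    "min_child_weight": 1,
    "gamma": 0.0,
    "subsample": 0.8,
    "colsample_bytree": 0.8,
    "eta": 0.1,
    "lambda": 1.0,   # L2 penalty on leaf weights
    "alpha": 0.0,    # L1 penalty on leaf weights
}

# Tune one group at a time, keeping earlier winners fixed.
tuning_stages = [
    {"max_depth": [3, 5, 7, 9], "min_child_weight": [1, 3, 5]},
    {"subsample": [0.6, 0.8, 1.0], "colsample_bytree": [0.6, 0.8, 1.0]},
    {"eta": [0.3, 0.1, 0.05, 0.01]},   # lower eta usually needs more trees
]

for stage_number, grid in enumerate(tuning_stages, start=1):
    print(stage_number, sorted(grid))
```

The number of trees is typically left to early stopping on a validation set rather than searched as a grid value.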
35
Is XGBoost well-suited for sparse input data?
📊 Intermediate
Answer: Yes, it has efficient support for sparse matrices and is commonly used with one-hot encoded features.
36
How does XGBoost compare to random forests?
🔥 Advanced
Answer: Random forests use bagging of full-depth trees (variance reduction), while XGBoost uses boosting of shallow trees (bias reduction) and often achieves higher accuracy with more tuning.
37
When might XGBoost not be the best choice?
📊 Intermediate
Answer: It may not be ideal for very high-dimensional sparse text, image or sequence data, where linear models or deep learning often work better.
38
How do you handle class imbalance in XGBoost?
📊 Intermediate
Answer: Use scale_pos_weight, adjust eval metrics, and possibly resample data or tweak decision thresholds.
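A commonly suggested starting value for scale_pos_weight is the ratio of negative to positive examples. The helper below is a hypothetical convenience function that assumes binary labels coded as 0 and 1.

```python
def suggested_scale_pos_weight(labels):
    """Ratio of negative to positive examples for 0/1 labels."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0:
        raise ValueError("no positive examples in labels")
    return n_neg / n_pos


labels = [0] * 90 + [1] * 10
print(suggested_scale_pos_weight(labels))   # 9.0
```

Treat the ratio as a first guess: reweighting changes the predicted probabilities, so the decision threshold and the evaluation metric should be revisited after applying it.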
39
Give a real-world use case where XGBoost excels.
⚡ Beginner
Answer: XGBoost is widely used for credit scoring, click-through rate prediction, and Kaggle competition-winning solutions on tabular data.
40
What is the key message to remember about XGBoost?
⚡ Beginner
Answer: XGBoost is a powerful, regularized gradient boosting engine—understanding its tree parameters, learning rate and regularization terms lets you tackle many real-world ML problems effectively.