Questions tagged [boosting]
121 questions
30
votes
1 answer
AdaBoost vs Gradient Boosting
How is AdaBoost different from a gradient boosting algorithm, since both of them use a boosting technique?
I could not figure out the actual difference between these two algorithms from a theoretical point of view.
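For concreteness, here is a minimal scikit-learn sketch (dataset and settings are illustrative, not from the question) fitting both ensembles side by side; the short version is that AdaBoost reweights training samples between rounds, while gradient boosting fits each new tree to the gradient of a differentiable loss.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# AdaBoost: upweights misclassified samples each round
ada = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
# Gradient boosting: each tree fits the negative gradient (residuals) of the loss
gbm = GradientBoostingClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(ada.score(X_te, y_te), gbm.score(X_te, y_te))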
CodeMaster GoGo
- 808
- 1
- 7
- 15
19
votes
5 answers
How to make LightGBM suppress output?
I have tried for a while to figure out how to "shut up" LightGBM. In particular, I would like to suppress the output of LightGBM during training (i.e. the feedback on the boosting steps).
My model:
params = {
    'objective': 'regression',
    …
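One approach that typically silences LightGBM (a sketch assuming the standard lightgbm training API; option names have shifted between versions, so treat this as a starting point):
import lightgbm as lgb

params = {
    'objective': 'regression',
    'verbosity': -1,  # suppress library-level info/warning messages
}
# log_evaluation(period=0) disables per-iteration evaluation printouts;
# older versions used verbose_eval=False in lgb.train instead.
# train_set is assumed to be an existing lgb.Dataset.
model = lgb.train(params, train_set, callbacks=[lgb.log_evaluation(period=0)])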
Peter
- 7,896
- 5
- 23
- 50
13
votes
1 answer
AdaBoost implementation and tuning for a high-dimensional feature space in R
I am trying to apply the AdaBoost.M1 algorithm (trees as base learners) to a data set with a large feature space (~20,000 features) and ~100 samples in R. There exist a variety of packages for this purpose: adabag, ada, and gbm.…
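The question concerns R, but purely as an illustration of the setup (wide data, few samples, stumps as base learners), here is a scikit-learn analogue; it is not a stand-in for any of the R packages named above:
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# ~100 samples in a wide feature space (2,000 columns here, for speed)
X, y = make_classification(n_samples=100, n_features=2000, n_informative=20, random_state=0)

# AdaBoost over decision stumps, as in AdaBoost.M1;
# scikit-learn < 1.2 calls this argument base_estimator instead of estimator
clf = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1),
                         n_estimators=200, random_state=0).fit(X, y)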
AfBM
- 131
- 3
11
votes
4 answers
Can Boosted Trees predict below the minimum value of the training label?
I am using Gradient Boosted Trees (with CatBoost) for a regression task. Can GB trees predict a label that is below the minimum (or above the maximum) seen in training?
For instance, if the minimum value the label took is 10, would…
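This is easy to probe empirically; a sketch with scikit-learn's GradientBoostingRegressor standing in for CatBoost (synthetic data, illustrative only):
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(500, 1))
y = 3 * X.ravel() + 10  # labels span roughly [10, 40]

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Query far outside the training range of the feature: predictions stay
# within the training label range, since tree leaves hold constant values
# and cannot extrapolate beyond what was seen.
print(model.predict(np.array([[-100.0], [1000.0]])), y.min(), y.max())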
Yairh
- 129
- 1
- 5
10
votes
2 answers
What is a good interpretation of this 'learning curve' plot?
I read about the validation_curve and how to interpret it to detect over- or underfitting, but how can I interpret the plot when the plotted quantity is the error, like this:
The X-axis is "Nº of examples of training"
The red line is the train error
The green…
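For reference, a plot of this shape is typically produced along these lines (a sketch using scikit-learn's learning_curve; the estimator and sizes are assumptions):
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=1000, random_state=0)
sizes, train_scores, val_scores = learning_curve(
    GradientBoostingClassifier(), X, y, cv=5,
    train_sizes=np.linspace(0.1, 1.0, 5))

# Convert accuracy into error so the axes match the question:
train_error = 1 - train_scores.mean(axis=1)
val_error = 1 - val_scores.mean(axis=1)
# A persistent gap between the two error curves suggests overfitting;
# two high, converged curves suggest underfitting.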
Tlaloc-ES
- 337
- 1
- 7
9
votes
1 answer
What is meant by Distributed for a gradient boosting library?
I am checking out the XGBoost documentation, and it states that XGBoost is an optimized distributed gradient boosting library.
What is meant by "distributed"?
Have a nice day
Tommaso Bendinelli
- 275
- 1
- 9
7
votes
1 answer
How to extract trees in XGBoost?
I want to extract each tree so that I can feed it any data and see the output. I can find the number of trees like this:
# dump each tree of the fitted booster as text
dump_list = xg_clas.get_booster().get_dump()
num_t = len(dump_list)
print("Number of Trees=", num_t)
xgb.plot_tree(xg_clas,…
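To actually run an individual tree on data, rather than just dumping its text, one option is per-iteration prediction (assumes a reasonably recent xgboost; iteration_range arrived around version 1.4, older releases used ntree_limit):
import xgboost as xgb

booster = xg_clas.get_booster()  # the fitted classifier from the question
dmat = xgb.DMatrix(X)            # X: whatever data you want to feed in

# Margin contribution of boosting iteration i alone (note: one iteration
# may contain several trees in the multiclass case)
i = 0
out = booster.predict(dmat, iteration_range=(i, i + 1), output_margin=True)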
J.Smith
- 468
- 4
- 16
6
votes
1 answer
On gradient boosting and types of encodings
I am having a look at this material and I found the following statement:
For this class of models [Gradient Boosting Machine algorithms] [...] it is both safe and significantly more computationally efficient to use an arbitrary integer encoding…
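As a concrete reading of the quoted claim, a sketch of arbitrary integer (ordinal) encoding feeding a tree-based GBM (scikit-learn names; it assumes all columns are categorical):
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OrdinalEncoder

# Arbitrary integer codes are workable for trees: enough successive
# threshold splits can carve out any subset of codes, and the encoding
# avoids the column blow-up of one-hot encoding.
model = make_pipeline(
    OrdinalEncoder(handle_unknown='use_encoded_value', unknown_value=-1),
    HistGradientBoostingClassifier())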
carlo_sguera
- 161
- 3
6
votes
1 answer
Boosting with highly correlated features
I have a conceptual question. My understanding is that Random Forest can be applied even when features are (highly) correlated. This is because, with bagging, the influence of a few highly correlated features is moderated, since each feature only…
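One way to see the effect the question is circling is to duplicate a feature and inspect how importance gets shared (a synthetic sketch, not from the question):
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)
X_dup = np.hstack([X, X[:, [0]]])  # column 5 is a perfect copy of column 0

clf = GradientBoostingClassifier(random_state=0).fit(X_dup, y)
# With perfectly correlated features the learned importance is split
# arbitrarily between the copies, while predictions are unaffected.
print(clf.feature_importances_)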
Peter
- 7,896
- 5
- 23
- 50
6
votes
1 answer
Extracting encoded features after CatBoost
I have a dataset containing numerical as well as categorical variables.
After fitting a CatBoostClassifier to my dataset, I want to extract the entire feature set, with the categorical variables encoded by whatever method the classifier decided to…
Aishwarya A R
- 239
- 2
- 7
6
votes
5 answers
GridSearch without CV
I create a Random Forest and a Gradient Boosting Regressor by using GridSearchCV. For the Gradient Boosting Regressor it takes too long. But I need to know the best parameters for the models, so I am wondering whether there is a GridSearch…
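A common workaround is to grid over parameters by hand against a single held-out validation set (a sketch with scikit-learn's ParameterGrid; X and y are assumed to exist, and the grid values are illustrative):
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import ParameterGrid, train_test_split

X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

grid = {'n_estimators': [100, 300], 'max_depth': [2, 4], 'learning_rate': [0.05, 0.1]}
best_score, best_params = float('inf'), None
for p in ParameterGrid(grid):
    m = GradientBoostingRegressor(random_state=0, **p).fit(X_tr, y_tr)
    score = mean_squared_error(y_val, m.predict(X_val))
    if score < best_score:
        best_score, best_params = score, p
print(best_params)
Alternatively, GridSearchCV accepts a PredefinedSplit (or a single train/validation split passed as cv), which gives the same effect without cross-validation.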
ml_learner
- 357
- 1
- 5
- 11
6
votes
1 answer
How can I prevent this model from learning more (or less)? :)))
As you can see, GradientBoostingClassifier overfits with more training examples.
These are my parameter for the model:
{'learning_rate': 0.1, 'loss': 'deviance', 'max_depth': 6, 'max_features': 0.3, 'min_samples_leaf': 80, 'n_estimators': 300}
What…
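The usual knobs for reining in a gradient boosting model, sketched with illustrative (untuned) values:
from sklearn.ensemble import GradientBoostingClassifier

clf = GradientBoostingClassifier(
    learning_rate=0.05,       # smaller steps = stronger regularization
    max_depth=3,              # shallower trees than the current 6
    subsample=0.8,            # stochastic gradient boosting
    n_estimators=1000,
    validation_fraction=0.1,  # together with n_iter_no_change,
    n_iter_no_change=20,      # stops adding trees once validation stalls
    random_state=0)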
parvij
- 791
- 5
- 18
6
votes
3 answers
How does XGBoost's exact greedy split-finding algorithm determine candidate split values for different feature types?
Based on the paper by Chen & Guestrin (2016)
"XGBoost: A Scalable Tree Boosting System", XGBoost's "exact split finding algorithm enumerates over all the possible splits on all the features to find the best split" (page 3). Thus, my understanding…
tvl
- 71
- 5
5
votes
1 answer
Can Boosting and Bagging be applied to heterogeneous algorithms?
Stacking can be achieved with heterogeneous algorithms such as RF, SVM, and KNN. However, can such heterogeneity be achieved in Bagging or Boosting? For example, in Boosting, instead of using RF in all the iterations, could we use different…
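For boosting, in principle yes: the AdaBoost weight update does not require the same base learner in every round, only that each learner accepts sample weights. A minimal hand-rolled sketch alternating learner types (illustrative only; y assumed to be in {-1, +1}):
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

def heterogeneous_adaboost(X, y, n_rounds=10):
    n = len(y)
    w = np.full(n, 1.0 / n)
    bases = [lambda: DecisionTreeClassifier(max_depth=1),
             lambda: LogisticRegression(max_iter=1000)]
    learners, alphas = [], []
    for t in range(n_rounds):
        clf = bases[t % 2]().fit(X, y, sample_weight=w)  # alternate learners
        pred = clf.predict(X)
        err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)  # this learner's vote weight
        w *= np.exp(-alpha * y * pred)         # upweight the mistakes
        w /= w.sum()
        learners.append(clf)
        alphas.append(alpha)
    return learners, alphas
Bagging is even simpler in this respect: each bootstrap sample can be fit with a different algorithm, since the resamples are independent of one another.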
Ahmad Bilal
- 177
- 6
5
votes
3 answers
Understanding Weighted learning in Ensemble Classifiers
I'm currently studying boosting techniques in machine learning, and I understand that in algorithms like AdaBoost, each training sample is given a weight depending on whether it was misclassified by the previous model in…
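Numerically, the standard AdaBoost reweighting step looks like this (a toy sketch; the weights and misclassification mask are made up):
import numpy as np

w = np.full(5, 0.2)  # five samples, uniform initial weights
miss = np.array([False, True, False, False, True])  # which were misclassified

err = w[miss].sum()                    # weighted error = 0.4
alpha = 0.5 * np.log((1 - err) / err)  # ~0.203: this learner's say
w = w * np.exp(np.where(miss, alpha, -alpha))
w /= w.sum()
print(w)  # misclassified samples now carry more weight: [0.167 0.25 0.167 0.167 0.25]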
AnonymousMe
- 235
- 3
- 8