1

I read the paper but found nothing talking about how to implement incremental learning.

Can someone share some basic or deep knowledge? not in coding way.

I know how to write code snippet to train incrementally.

When new data comes in, how to train incrementally if I use XGBRegressor? Reserve the old trees and train new data with new trees?

I found nothing talking about this in detail

code_worker
  • 173
  • 1
  • 1
  • 10

1 Answers1

3

In the fit method, parameter xgb_model can be specified to continue training an old model.

https://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.XGBRegressor.fit

Edit: Yes, as I understand it, the old trees are preserved in this method. The new trees will fit the residuals of whatever data you pass: you could continue with only the new data, which will first be run through the existing trees and have their residuals fit by new trees; or continue with all the data, which will do the above but also use the old data when splitting (probably better, so that the model doesn't lose its fit to the old data, but maybe your use case cares about the new data substantially more?). In some cases, it may be better to just retrain from scratch (if the existing trees do a poor job on the new data).

Ben Reiniger
  • 12,855
  • 3
  • 20
  • 63