
I implemented a custom objective and metric for an xgboost regression. To check that I am doing this correctly, I started with a quadratic loss. The implementation seems to work well, but I cannot reproduce the results of the standard "reg:squarederror" objective.

Question:

I wonder whether my current approach is correct (especially the implementation of the first- and second-order gradients). If so, what could be a possible reason for the difference?

Gradient and Hessian are defined as:

grad <- 2*(preds-labels) 
hess <- rep(2, length(labels))
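
These follow from differentiating the quadratic loss with respect to the prediction:

$$\frac{\partial}{\partial preds}(preds - labels)^2 = 2(preds - labels)$$
$$\frac{\partial^2}{\partial preds^2}(preds - labels)^2 = 2$$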

Minimal example (in R):

library(ISLR)
library(xgboost)
library(tidyverse)
library(Metrics)

Data

df = ISLR::Hitters %>%
  select(Salary, AtBat, Hits, HmRun, Runs, RBI, Walks, Years, CAtBat, CHits,
         CHmRun, CRuns, CRBI, CWalks, PutOuts, Assists, Errors)
df = df[complete.cases(df), ]
train = df[1:150, ]
test = df[151:nrow(df), ]

XGBoost Matrix

dtrain <- xgb.DMatrix(data = as.matrix(train[, -1]), label = as.matrix(train[, 1]))
dtest <- xgb.DMatrix(data = as.matrix(test[, -1]), label = as.matrix(test[, 1]))
watchlist <- list(eval = dtest)

Custom objective function (squared error)

myobjective <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  grad <- 2 * (preds - labels)
  hess <- rep(2, length(labels))
  return(list(grad = grad, hess = hess))
}
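
As a quick sanity check (my addition, not part of the original setup), the analytic gradient can be compared against a central finite difference of the loss; the preds and labels values below are arbitrary:

# Illustrative check: analytic gradient vs. finite-difference approximation
loss <- function(preds, labels) (preds - labels)^2
preds  <- c(100, 200, 300)   # arbitrary example values
labels <- c(110, 190, 310)
eps <- 1e-6
num_grad <- (loss(preds + eps, labels) - loss(preds - eps, labels)) / (2 * eps)
max(abs(num_grad - 2 * (preds - labels)))  # ~0 if the gradient is correct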

Custom Metric

evalerror <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  u <- (preds - labels)^2
  err <- sqrt(sum(u) / length(u))
  return(list(metric = "MyError", value = err))
}
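
As an aside, this metric is just RMSE; a quick way to confirm that, reusing the Metrics package loaded above (the example values are mine):

# Illustrative check: the custom metric reduces to RMSE
preds  <- c(100, 200, 300)   # arbitrary example values
labels <- c(110, 190, 310)
sqrt(sum((preds - labels)^2) / length(labels))  # custom metric
Metrics::rmse(labels, preds)                    # same value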

Model Parameter

param1 <- list(booster = 'gbtree'
               , learning_rate = 0.1
               , objective = myobjective
               , eval_metric = evalerror
               , set.seed = 2020)  # note: set.seed is not an xgboost parameter; it does not seed R here

Train Model

xgb1 <- xgb.train(params = param1
                  , data = dtrain
                  , nrounds = 500
                  , watchlist
                  , maximize = FALSE
                  , early_stopping_rounds = 5)

Predict

pred1 = predict(xgb1, dtest)
mae1 = mae(test$Salary, pred1)

XGB Model with standard loss/metric

Model Parameter

param2 <- list(booster = 'gbtree'
               , learning_rate = 0.1
               , objective = "reg:squarederror"
               , set.seed = 2020)

Train Model

xgb2 <- xgb.train(params = param2
                  , data = dtrain
                  , nrounds = 500
                  , watchlist
                  , maximize = FALSE
                  , early_stopping_rounds = 5)

Predict

pred2 = predict(xgb2, dtest)
mae2 = mae(test$Salary, pred2)

Results:

  • The custom objective yields a slightly better test result (MAE = 199.6) than the standard objective (MAE = 203.3).

  • During boosting, the RMSE tends to be lower with the custom objective.

For the custom objective the RMSE is:

[1] eval-MyError:599.490030 
[2] eval-MyError:560.677996 
[3] eval-MyError:527.867686
[4] eval-MyError:498.216760 
[5] eval-MyError:472.167415 
...

For the standard objective the RMSE is:

[1] eval-rmse:598.144775 
[2] eval-rmse:562.479431 
[3] eval-rmse:529.981079 
[4] eval-rmse:501.730103 
[5] eval-rmse:479.081329 

1 Answer


I have a suggestion.
The methodology is correct, but the problem comes from the definition of your functions: they do not match the loss that "reg:squarederror" uses internally, so they produce a different gradient and hessian. The metric is also not correct. You must use:

Objective:

$$f(preds, labels) = \frac{1}{2}(preds - labels)^2$$
$$grad = preds - labels$$
$$hess = 1$$
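
The constant factor matters even though it cancels in the ratio $grad/hess$: xgboost computes each leaf weight as
$$w^* = -\frac{\sum_i g_i}{\sum_i h_i + \lambda}$$
so with the default regularization $\lambda = 1$ (and the default min_child_weight, which is compared against the sum of hessians), scaling both grad and hess by 2 changes the leaf weights and hence the trees that are grown.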

Metric:

$$err = \frac{1}{n}\sum_{i=1}^{n}(preds_i - labels_i)^2$$

My suggestions:


# Custom objective function (squared error)
myobjective <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  grad <- (preds-labels)    
  hess <- rep(1, length(labels))                
  return(list(grad = grad, hess = hess))
}

Custom Metric

evalerror <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  err <- (preds - labels)^2
  return(list(metric = "MyError", value = mean(err)))
}
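
To verify, you can retrain with these corrected functions (reusing dtrain, dtest, watchlist, and the mae call from the question; the names param3/xgb3/pred3 are mine):

# Retrain with the corrected objective and metric, then compare
param3 <- list(booster = 'gbtree'
               , learning_rate = 0.1
               , objective = myobjective
               , eval_metric = evalerror)

xgb3 <- xgb.train(params = param3
                  , data = dtrain
                  , nrounds = 500
                  , watchlist
                  , maximize = FALSE
                  , early_stopping_rounds = 5)

pred3 <- predict(xgb3, dtest)
mae(test$Salary, pred3)  # should now track the reg:squarederror model closely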

I get the same results using these functions.
More code on loss/gradient customization is available on my GitHub: https://www.github.com/kipedene/Custom_objectif