Customising Objective Function in R

I was wondering if there are R packages that allow the user to customise the loss function.

For example, suppose a random forest package like ranger has a loss function that minimises OOB MSE. Is it somehow possible to customise this into a model that minimises negative log-likelihood instead?

I would appreciate it if someone knew of examples/code for doing this on a toy dataset.

Topic: r, machine-learning

Category: Data Science


Using XGBoost it is relatively easy to plug in a custom loss function, and there are also quite a lot of objectives already implemented.

It would look something like:

library(ISLR)
library(xgboost)
library(tidyverse)
library(Metrics)

# Data: Hitters from ISLR, Salary plus numeric predictors
df <- ISLR::Hitters %>%
  select(Salary, AtBat, Hits, HmRun, Runs, RBI, Walks, Years, CAtBat,
         CHits, CHmRun, CRuns, CRBI, CWalks, PutOuts, Assists, Errors)
df <- df[complete.cases(df), ]
train <- df[1:150, ]
test <- df[151:nrow(df), ]

# XGBoost matrices (first column, Salary, is the label)
dtrain <- xgb.DMatrix(data = as.matrix(train[, -1]), label = as.matrix(train[, 1]))
dtest <- xgb.DMatrix(data = as.matrix(test[, -1]), label = as.matrix(test[, 1]))
watchlist <- list(eval = dtest)

# Custom objective function (pseudo-Huber loss, a smooth approximation to Huber)
# Reference for gradients: https://stackoverflow.com/questions/45006341/xgboost-how-to-use-mae-as-objective-function

myobjective <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  d <- preds - labels             # residuals
  h <- 5                          # delta parameter of the pseudo-Huber loss
  scale <- 1 + (d / h)^2
  scale_sqrt <- sqrt(scale)
  grad <- d / scale_sqrt          # first derivative of the loss w.r.t. preds
  hess <- 1 / scale / scale_sqrt  # second derivative
  return(list(grad = grad, hess = hess))
}
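
# A minimal sketch for the negative log-likelihood case asked about in the
# question: assuming a Gaussian likelihood with fixed unit variance, the NLL
# per observation is 0.5 * (preds - labels)^2 up to constants, so a custom
# objective (hypothetical name nllobjective) reduces to:
nllobjective <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  grad <- preds - labels          # derivative of 0.5 * (preds - labels)^2
  hess <- rep(1, length(preds))   # constant second derivative
  return(list(grad = grad, hess = hess))
}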

# Custom evaluation metric (here: RMSE)
evalerror <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  u <- (preds - labels)^2
  err <- sqrt(sum(u) / length(u))
  return(list(metric = "MyError", value = err))
}
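
# A matching sketch: mean Gaussian NLL (up to constants) as an evaluation
# metric, to pair with nllobjective above (hypothetical name nllerror):
nllerror <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  err <- mean(0.5 * (preds - labels)^2)
  return(list(metric = "NLL", value = err))
}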

# Model Parameter
param <- list(booster = 'gbtree'
               , learning_rate = 0.1
               , objective = myobjective 
               , eval_metric = evalerror
               , set.seed = 2020)

# Train Model
xgb <- xgb.train(params = param
                 , data = dtrain
                 , nrounds = 500
                 , watchlist = watchlist
                 , maximize = FALSE
                 , early_stopping_rounds = 5
                 , verbose = 1)

# Predict on the test set and evaluate
pred <- predict(xgb, dtest)
test_mae <- mae(test$Salary, pred)  # Metrics::mae
print(test_mae)

I have done this before, though not with ranger. For packages that are open source and carry an appropriate license, I have changed the package code itself to alter the objective for my model. Depending on the package, that code might be in R, C, Python, or Java.

I have also used xgboost for this. xgboost allows you to customise the objective function: you supply the gradient and the Hessian derived from your loss. You can call xgboost from R. The example I worked from is in Python; I am not sure whether they support an R customised loss function, but writing one function in Python might not be hard for your team.

I have not done this myself, but you can also use TensorFlow in R and write a customised loss function.
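
For instance, with the keras R package a custom loss is just an R function of y_true and y_pred built from backend operations. A minimal sketch, again treating the Gaussian negative log-likelihood with unit variance (which reduces to MSE up to constants); the network architecture and names here are purely illustrative:

library(keras)

# Custom loss: Gaussian NLL with fixed unit variance
# (equals 0.5 * squared error up to additive constants)
gaussian_nll <- function(y_true, y_pred) {
  k_mean(0.5 * k_square(y_true - y_pred))
}

# Illustrative regression network for 16 numeric predictors
model <- keras_model_sequential() %>%
  layer_dense(units = 16, activation = "relu", input_shape = 16) %>%
  layer_dense(units = 1)

# Pass the custom loss function directly to compile()
model %>% compile(optimizer = "adam", loss = gaussian_nll)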
