Optimizing for MAE degrades the MAE metric
I have trained a LightGBM regression model, optimizing RMSE and using RMSE as the evaluation metric:
from lightgbm import LGBMRegressor
model = LGBMRegressor(objective="regression", n_estimators=500, n_jobs=8)
model.fit(X_train, y_train, eval_metric="rmse", eval_set=[(X_train, y_train), (X_test, y_test)], early_stopping_rounds=20)
The model keeps improving over the 500 iterations. Here are the MAE values I obtain:
MAE on train : 1.080571 MAE on test : 1.258383
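A minimal sketch of the kind of scoring behind these numbers, assuming scikit-learn's mean_absolute_error applied to the fitted model's predictions (the exact scoring code is an assumption on my part):

from sklearn.metrics import mean_absolute_error

# Score the fitted model with MAE on both splits; predict() uses the best
# iteration found by early stopping when early stopping was enabled in fit()
print("MAE on train :", mean_absolute_error(y_train, model.predict(X_train)))
print("MAE on test  :", mean_absolute_error(y_test, model.predict(X_test)))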
But the metric I'm really interested in is MAE, so I decided to optimize it directly (and to use it as the evaluation metric):
model = LGBMRegressor(objective="regression_l1", n_estimators=500, n_jobs=8)
model.fit(X_train, y_train, eval_metric="mae", eval_set=[(X_train, y_train), (X_test, y_test)], early_stopping_rounds=20)
Against all odds, the MAE gets worse on both train and test:
MAE on train : 1.277689 MAE on test : 1.285950
When I look at the training logs, the model seems stuck in a local minimum and stops improving after about 100 trees. Do you think the problem is linked to the non-differentiability of MAE?
Here are the learning curves: [learning-curve plots omitted]
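To make the non-differentiability point concrete: the gradient of |y_pred - y_true| with respect to the prediction is just the sign of the residual, and the true second derivative is zero away from the kink. A minimal sketch of a custom L1-style objective for the scikit-learn API that makes these gradients explicit (the function l1_objective is purely illustrative and not exactly equivalent to the built-in "regression_l1", which additionally adjusts leaf outputs):

import numpy as np
from lightgbm import LGBMRegressor

def l1_objective(y_true, y_pred):
    residual = y_pred - y_true
    grad = np.sign(residual)       # gradient of |y_pred - y_true|: only the sign, no magnitude
    hess = np.ones_like(residual)  # true second derivative is 0 almost everywhere; a constant stand-in is used
    return grad, hess

model = LGBMRegressor(objective=l1_objective, n_estimators=500, n_jobs=8)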