Why is gridsearchCV.best_estimator_.score giving me r2_score even if I mentioned MAE as my main scoring metric?

Question

Echo

February 16, 2022, 23:47

I have a Lasso regression model, wrapped in RFE and tuned with GridSearchCV, defined as follows:

import sklearn
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import PolynomialFeatures
from sklearn.preprocessing import scale
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.metrics import r2_score

folds = KFold(n_splits = 5, shuffle = True, random_state = 100)

# specify range of hyperparameters
hyper_params = [{
                'n_features_to_select': [0.25, 0.5, 0.75, 1.0],
                'estimator__alpha': [0.2, 0.5, 0.7, 1, 1.2]}]

scoring_list = ['explained_variance','neg_mean_absolute_error','r2']

# specify model
lm = Lasso()
#lm.fit(x_train,y_train)
rfe = RFE(lm)             

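# set up GridSearchCV(); refit names the metric used to pick the best candidate
model_cv = GridSearchCV(estimator = rfe,
                        param_grid = hyper_params,
                        scoring = scoring_list,
                        cv = folds,
                        verbose = 3,
                        return_train_score = True,
                        refit = 'neg_mean_absolute_error')

# run the search (x_train and y_train are the training split)
model_cv.fit(x_train, y_train)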

The best estimator was found to be

RFE(estimator=Lasso(alpha=0.2), n_features_to_select=0.5)

with a best cross-validated score of 3.513 (MAE).

I wanted to use the best estimator to score my test dataset:

model_cv.best_estimator_.score(x_test,y_test)

which gives 0.6548.

To find out which metric this is, I computed the scores manually from the predictions:

from sklearn.metrics import r2_score, mean_absolute_error

y_pred = model_cv.best_estimator_.predict(x_test)
mean_absolute_error(y_test, y_pred)  # gives 3.4804479077256256
r2_score(y_test, y_pred)  # gives 0.6548

This shows that model_cv.best_estimator_.score returns the r2_score. My question is: why does it return the r2_score when the refit parameter is neg_mean_absolute_error?

I have not included toy data because the question is data-agnostic.

Topic lasso gridsearchcv score regression scikit-learn

Category Data Science

Multivac · Accepted Answer · February 16, 2022, 23:47

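This is the default behavior for any scikit-learn regressor, and as far as I know, it cannot be changed on the estimator itself. model_cv.best_estimator_ is just the refitted RFE object, and RFE forwards score to the wrapped Lasso. That method knows nothing about the scoring or refit arguments passed to GridSearchCV, which only control how candidates are evaluated and ranked during the search.

So the score method returns $R^2$ for regressors and accuracy for classifiers.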
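As a minimal sketch of that default, the snippet below compares score against r2_score for a regressor and against accuracy_score for a classifier. The synthetic data from make_regression and make_classification and the choice of LogisticRegression are my own assumptions for illustration, not part of the question.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.metrics import accuracy_score, r2_score

# Synthetic data, used only for illustration (not the asker's dataset).
X_reg, y_reg = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)
X_clf, y_clf = make_classification(n_samples=200, n_features=10, random_state=0)

reg = Lasso(alpha=0.2).fit(X_reg, y_reg)
clf = LogisticRegression(max_iter=1000).fit(X_clf, y_clf)

# Regressor: .score() is the coefficient of determination, R^2.
reg_score = reg.score(X_reg, y_reg)
reg_r2 = r2_score(y_reg, reg.predict(X_reg))

# Classifier: .score() is the mean accuracy.
clf_score = clf.score(X_clf, y_clf)
clf_acc = accuracy_score(y_clf, clf.predict(X_clf))

print("regressor :", reg_score, reg_r2)
print("classifier:", clf_score, clf_acc)
```

Each pair of printed values should agree, since score is defined in terms of the same metric.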

If you want to evaluate the best estimator with MAE you simply have to do:

from sklearn.metrics import mean_absolute_error

mean_absolute_error(y_test, model_cv.best_estimator_.predict(x_test))
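If you would rather get the MAE from a score call, one option is to call score on the GridSearchCV object instead of on best_estimator_. With multi-metric scoring and refit set to a scorer name, the search object scores with that scorer. Here is a minimal sketch, assuming synthetic data and a smaller grid than in the question (both are illustrative choices). Note that the value comes back negated, because the scorer is neg_mean_absolute_error.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso
from sklearn.metrics import get_scorer, mean_absolute_error, r2_score
from sklearn.model_selection import GridSearchCV, KFold, train_test_split

# Synthetic stand-in for the asker's data (assumption, for illustration only).
X, y = make_regression(n_samples=300, n_features=12, n_informative=6,
                       noise=15.0, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

model_cv = GridSearchCV(
    estimator=RFE(Lasso()),
    param_grid={"n_features_to_select": [0.5, 1.0],
                "estimator__alpha": [0.2, 1.0]},
    scoring=["neg_mean_absolute_error", "r2"],
    refit="neg_mean_absolute_error",
    cv=KFold(n_splits=5, shuffle=True, random_state=100),
)
model_cv.fit(x_train, y_train)

y_pred = model_cv.predict(x_test)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

# The search object scores with the refit scorer: negated MAE.
search_score = model_cv.score(x_test, y_test)

# The refitted estimator falls back to the regressor default: R^2.
inner_score = model_cv.best_estimator_.score(x_test, y_test)

# A named scorer can also be applied to the best estimator directly.
mae_scorer = get_scorer("neg_mean_absolute_error")
scorer_score = mae_scorer(model_cv.best_estimator_, x_test, y_test)

print("MAE:", mae, "| model_cv.score:", search_score, "| scorer:", scorer_score)
print("R^2:", r2, "| best_estimator_.score:", inner_score)
```

Flip the sign of model_cv.score(x_test, y_test) to recover the MAE itself.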

Hope it helps!
