Why do I get a ValueError for an SVR model with RFE, but only when using a pipeline?
I am running five different regression models to find the one that best predicts a single variable. I am using leave-one-out cross-validation (LOOCV) and RFE to find the best predicting features.
Four of the five models are running fine, but I am running into issues with the SVR. This is my code below:
from numpy import absolute, mean, std
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.model_selection import cross_val_score, LeaveOneOut
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error
from sklearn.feature_selection import RFECV
from sklearn.pipeline import Pipeline
# encode Gender as a binary variable (M = 1, F = 0)
dataset['Gender'] = dataset['Gender'].replace({'M': 1, 'F': 0})
# select predictors and dependent
X = dataset.iloc[:,12:]
y = dataset.iloc[:,2]
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X = scaler.fit_transform(X)
First I run LOOCV with all features. This runs fine:
## LOOCV with all features
# find number of samples
n = X.shape[0]
# create loocv procedure
cv = LeaveOneOut()
# create model
from sklearn.svm import SVR
regressor = SVR(kernel = 'rbf')
# evaluate model
scores = cross_val_score(regressor, X, y, scoring='neg_mean_squared_error', cv=cv)
# force positive
#scores = absolute(scores)
# report performance
print('MSE: %.3f (%.3f)' % (mean(scores), std(scores)))
Next, I want to include RFECV to find the best predicting features for the model. This runs fine for my other regression models.
This is the part of the code where I get the error:
# automatically select the number of features with RFECV
# create pipeline
rfe = RFECV(estimator=SVR(kernel = 'rbf'))
model = SVR(kernel = 'rbf')
pipeline = Pipeline(steps=[('s',rfe),('m',model)])
# find number of samples
n = X.shape[0]
# create loocv procedure
cv = LeaveOneOut()
# evaluate model
scores = cross_val_score(pipeline, X, y, scoring='neg_mean_squared_error', cv=cv)
# report performance
print('MSE: %.3f (%.3f)' % (mean(scores), std(scores)))
The error I receive is:
ValueError: when `importance_getter=='auto'`, the underlying estimator SVR should have `coef_` or `feature_importances_` attribute. Either pass a fitted estimator to feature selector or call fit before calling transform.
I am not sure what this error means. What is causing it?
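For reference, here is a minimal sketch on synthetic data (not my real dataset; `make_regression` and the names `X_demo`, `y_demo`, `rbf_svr`, `linear_svr` and `selector` are placeholders for illustration). It checks which SVR kernels expose the `coef_` attribute that the message refers to. It suggests that an RBF-kernel SVR has no `coef_`, whereas a linear-kernel SVR does and can be used as the ranking estimator inside RFE:

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.svm import SVR

# synthetic stand-in data (placeholder, not the real dataset)
X_demo, y_demo = make_regression(n_samples=40, n_features=6, noise=0.1, random_state=0)

# fit one SVR per kernel and check for the attribute named in the error
rbf_svr = SVR(kernel='rbf').fit(X_demo, y_demo)
linear_svr = SVR(kernel='linear').fit(X_demo, y_demo)
rbf_has_coef = hasattr(rbf_svr, 'coef_')
linear_has_coef = hasattr(linear_svr, 'coef_')
print('rbf has coef_:', rbf_has_coef)
print('linear has coef_:', linear_has_coef)

# RFE with a linear-kernel SVR as the ranking estimator
selector = RFE(estimator=SVR(kernel='linear'), n_features_to_select=3)
selector.fit(X_demo, y_demo)
n_selected = int(selector.support_.sum())
print('selected features:', n_selected)
```

If this check carries over to my data, the error would come from the RFECV step (which needs `coef_` or `feature_importances_` to rank features) rather than from the pipeline itself.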