BorutaShap implementation
I want to use BorutaShap for feature selection in my model. My train_x is a numpy.ndarray, and I want to pass it to the BorutaShap instance. When I call fit, I get this error:
AttributeError: 'numpy.ndarray' object has no attribute 'columns'
Below is my code:
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split, cross_val_score
from xgboost import XGBRegressor
from BorutaShap import BorutaShap

# Numeric columns: impute with the mean, then standardize.
num_trans = Pipeline(steps=[('impute', SimpleImputer(strategy='mean')),
                            ('scale', StandardScaler())])

# Categorical columns: impute with the most frequent value, then one-hot encode.
cat_trans = Pipeline(steps=[('impute', SimpleImputer(strategy='most_frequent')),
                            ('encode', OneHotEncoder(handle_unknown='ignore'))])

preproc = ColumnTransformer(transformers=[('cat', cat_trans, cat_cols),
                                          ('num', num_trans, num_cols)])

# fit_transform and transform return arrays, not DataFrames.
X = preproc.fit_transform(train_data1)
X_final = preproc.transform(test_data1)

xgbr_model = XGBRegressor(random_state=69, tree_method='gpu_hist')

train_x, test_x, train_y, test_y = train_test_split(X, y, test_size=0.2,
                                                    random_state=69)

Feature_Selector = BorutaShap(model=xgbr_model,
                              importance_measure='shap',
                              classification=False)

# This call raises the AttributeError shown above.
Feature_Selector.fit(train_x, train_y, n_trials=10, random_state=69)
Any help will be appreciated!
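A possible direction, sketched under assumptions: the error message suggests that BorutaShap reads X.columns, so it expects a pandas DataFrame and not the bare array that ColumnTransformer returns. The helper below, to_frame, is a hypothetical name and not part of BorutaShap or scikit-learn. It wraps an array in a DataFrame and falls back to placeholder column names (f0, f1, ...) when none are supplied. The random train_x here is only a stand-in for the real data.

```python
import numpy as np
import pandas as pd


def to_frame(X, feature_names=None):
    """Wrap a 2-D array (dense or SciPy sparse) in a DataFrame with named columns."""
    # ColumnTransformer may return a SciPy sparse matrix; densify it first.
    if hasattr(X, "toarray"):
        X = X.toarray()
    X = np.asarray(X)
    if feature_names is None:
        # Placeholder names (an assumption); real feature names are preferable.
        feature_names = [f"f{i}" for i in range(X.shape[1])]
    return pd.DataFrame(X, columns=list(feature_names))


# Stand-in for train_x from the question (random data, for illustration only).
rng = np.random.default_rng(69)
train_x = rng.normal(size=(5, 3))

train_x_df = to_frame(train_x)
print(train_x_df.columns.tolist())  # ['f0', 'f1', 'f2']
```

With the objects from the question, the names could probably come from preproc.get_feature_names_out(), assuming a scikit-learn version recent enough that every step in both pipelines supports that method. The call would then become Feature_Selector.fit(to_frame(train_x, names), train_y, n_trials=10, random_state=69). Since train_test_split only selects rows, the column order of X should still match those names.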
Topic boruta shap feature-selection python
Category Data Science