I want to use BorutaShap for feature selection in my model. I have my train_x as a numpy.ndarray and I want to pass it to the BorutaShap instance. When I try to fit, I get the error:

```
AttributeError: 'numpy.ndarray' object has no attribute 'columns'
```

Below is my code:

```python
num_trans = Pipeline(steps = [('impute', SimpleImputer(strategy = 'mean')),
                              ('scale', StandardScaler())])
cat_trans = Pipeline(steps = [('impute', SimpleImputer(strategy = 'most_frequent')),
                              ('encode', OneHotEncoder(handle_unknown = 'ignore'))])

from sklearn.compose import ColumnTransformer
preproc = ColumnTransformer(transformers = [('cat', …
```
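A likely cause: BorutaShap reads `X.columns` internally, which a plain ndarray does not have. A minimal sketch of the workaround, assuming the fix is simply to wrap the array in a pandas DataFrame before fitting (`feature_names` here is a hypothetical stand-in; in practice the names could come from the fitted ColumnTransformer via `get_feature_names_out()`):

```python
import numpy as np
import pandas as pd

# Stand-in for the preprocessed training array produced by the pipeline.
train_x = np.random.rand(10, 3)

# Hypothetical column names; BorutaShap only needs *some* labels so that
# .columns exists on the object it receives.
feature_names = [f"f{i}" for i in range(train_x.shape[1])]
train_x_df = pd.DataFrame(train_x, columns=feature_names)

print(train_x_df.columns.tolist())  # ['f0', 'f1', 'f2']
```

Passing `train_x_df` (instead of the raw ndarray) to the selector's `fit` should avoid the AttributeError.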
I was trying to select the most important features of a data set using Boruta in Python. I split the data into a training and a test set, fitted an SVM regressor to the data, and then used Boruta to measure feature importance. The code is as follows:

```python
from sklearn.svm import SVR
svclassifier = SVR(kernel='rbf', C=1e4, gamma=0.1)
svm_model = svclassifier.fit(x_train, y_train)

from boruta import BorutaPy
feat_selector = BorutaPy(svclassifier, n_estimators='auto', verbose=2, random_state=1)
feat_selector.fit(x_train, y_train)
feat_selector.support_
feat_selector.ranking_
X_filtered = feat_selector.transform(x_train)
```

But I get this …
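One probable issue with the setup above: BorutaPy ranks features by reading the wrapped estimator's `feature_importances_` attribute after each fit, and a kernel SVR never exposes one; a tree ensemble such as `RandomForestRegressor` does. A small sketch of the compatibility check (the two classes below are hypothetical stand-ins, not the real sklearn estimators):

```python
class SVRLike:
    """Stand-in for a kernel model like SVR: no per-feature importances."""

class ForestLike:
    """Stand-in for a tree ensemble: exposes feature_importances_."""
    feature_importances_ = [0.5, 0.3, 0.2]

def boruta_compatible(estimator):
    # BorutaPy needs this attribute to rank features against shadow features.
    return hasattr(estimator, "feature_importances_")

print(boruta_compatible(SVRLike()))     # False
print(boruta_compatible(ForestLike()))  # True
```

So swapping the SVR for an importance-exposing regressor inside `BorutaPy(...)` is the usual remedy, with the SVR kept for the final model if desired.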
I have been asked to look at XGBoost (as implemented in R, with a maximum of around 50 features) as an alternative to an existing logistic regression model, not developed by me, built from a very large set of credit-risk data containing a few thousand predictors. The documentation surrounding the logistic regression is very well prepared, and a record has been kept of the reason each variable was excluded. Among those reasons are: automated data …
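The exclusion audit trail described above can be kept mechanically during screening. A minimal sketch, with hypothetical names (`exclusions`, `screen`, the thresholds) chosen purely for illustration:

```python
# Record why each predictor is dropped, so the variable-selection
# documentation can be regenerated rather than maintained by hand.
exclusions = {}

def screen(name, values, min_unique=2):
    """Apply simple exclusion rules; return True if the predictor survives."""
    if all(v is None for v in values):
        exclusions[name] = "all values missing"
        return False
    if len(set(values)) < min_unique:
        exclusions[name] = "zero variance"
        return False
    return True

screen("income", [100, 200, 300])    # kept
screen("constant_flag", [1, 1, 1])   # dropped
print(exclusions)  # {'constant_flag': 'zero variance'}
```

The same pattern extends to the other rules in the list (correlation cuts, missing-rate cuts, and so on), each appending its own reason string.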
I am interested in learning what routine others use (if any) for feature reduction/selection. For example, if my data has several thousand features, I typically try two to four things right away, depending on the circumstances. Zero variance/near-zero variance: using the R package caret's nzv, I find a very small percentage of features have zero variance and a few more have near-zero variance. Then, using nzv$PercentUnique, I may remove the bottom quartile of features depending on the range of the PercentUnique values. Correlation: to find multicollinearity …
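For comparison, the first two filters in that routine can be sketched in Python with NumPy (this is an assumed analogue of the caret workflow, not a translation of it; the 0.95 correlation cut is an arbitrary illustrative threshold):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
# Add a column that duplicates column 0 (scaled) and a constant column.
X = np.column_stack([X, X[:, 0] * 2.0, np.ones(100)])

# 1) Drop zero-variance columns (the constant one).
X = X[:, X.var(axis=0) > 0.0]

# 2) Drop one column from each highly correlated pair.
corr = np.abs(np.corrcoef(X, rowvar=False))
n = corr.shape[0]
drop = set()
for i in range(n):
    for j in range(i + 1, n):
        if i not in drop and j not in drop and corr[i, j] > 0.95:
            drop.add(j)
X = X[:, [c for c in range(n) if c not in drop]]

print(X.shape)  # (100, 3)
```

A near-zero-variance rule in the caret sense would additionally look at the frequency ratio of the two most common values and the percent of unique values, rather than variance alone.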