LIME is observing categorical features even though I am not passing any categorical features
Here is the code:
import lime.lime_tabular
import matplotlib.pyplot as plt

predict_fn = lambda x: xgb_model.predict_proba(x).astype(float)
feature_names = X_train.columns
for i in range(x_val.shape[0]):
    # Get the LIME explanation for the current validation point
    val_point = x_val.values[i]
    print(val_point)
    print(val_point.shape)
    explainer = lime.lime_tabular.LimeTabularExplainer(training_data=Xs_train_array,
                                                       feature_names=feature_names,
                                                       training_labels=y_train,
                                                       mode='classification',
                                                       kernel_width=5)
    exp = explainer.explain_instance(val_point, predict_fn, num_features=10)
    exp.as_pyplot_figure()
    plt.tight_layout()
A few notes:
- Xs_train_array has shape (103,) and is of type float.
- There are no categorical variables.
Here is the error message I'm receiving:
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-24-ee5cbeeb11ea> in <module>
     19     #print("Validation Prediction Probabilities: {}".format(xgb_model.predict_proba(val_point)))
     20
---> 21     exp = explainer.explain_instance(val_point, predict_fn, num_features=10)
     22     exp.as_pyplot_figure()
     23     plt.tight_layout()

~/opt/anaconda3/lib/python3.7/site-packages/lime/lime_tabular.py in explain_instance(self, data_row, predict_fn, labels, top_labels, num_features, num_samples, distance_metric, model_regressor)
    335             # Preventative code: if sparse, convert to csr format if not in csr format already
    336             data_row = data_row.tocsr()
--> 337         data, inverse = self.__data_inverse(data_row, num_samples)
    338         if sp.sparse.issparse(data):
    339             # Note in sparse case we don't subtract mean since data would become dense

~/opt/anaconda3/lib/python3.7/site-packages/lime/lime_tabular.py in __data_inverse(self, data_row, num_samples)
    534         inverse = data.copy()
    535         for column in categorical_features:
--> 536             values = self.feature_values[column]
    537             freqs = self.feature_frequencies[column]
    538             inverse_column = self.random_state.choice(values, size=num_samples,

KeyError: 87
This confuses me because I never passed categorical_features, so I don't know where LIME is getting column 87 from.
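For what it's worth, a shape check along the following lines might help narrow this down. It is only a sketch and rests on two assumptions I have not confirmed: that LIME, when it discretizes continuous features (which I believe is its default), internally treats every column index of the explained row as categorical, and that a KeyError on a column index could therefore mean the row has more columns than the training data. check_feature_alignment is a hypothetical helper, not part of LIME, and the arrays are dummy stand-ins with made-up shapes for Xs_train_array and val_point.

```python
import numpy as np


def check_feature_alignment(training_data, row):
    """Return a list of shape problems; the list is empty if everything lines up.

    Hypothetical helper for illustration only (not part of LIME).
    """
    training_data = np.asarray(training_data)
    row = np.asarray(row)
    problems = []

    # The explainer is assumed to want a 2-D (rows, features) training array.
    if training_data.ndim != 2:
        problems.append(
            f"training data should be 2-D (rows, features), got shape {training_data.shape}"
        )

    # A single instance to explain is assumed to be a 1-D feature vector.
    if row.ndim != 1:
        problems.append(f"row should be 1-D, got shape {row.shape}")

    # Only compare feature counts when both shapes are usable.
    if training_data.ndim == 2 and row.ndim == 1:
        n_train, n_row = training_data.shape[1], row.shape[0]
        if n_train != n_row:
            problems.append(
                f"training data has {n_train} features but row has {n_row}"
            )
    return problems


# Dummy stand-ins for Xs_train_array and val_point (shapes are made up).
train = np.zeros((50, 87))
point = np.zeros(103)
for problem in check_feature_alignment(train, point):
    print(problem)
```

In the real notebook the call would be check_feature_alignment(Xs_train_array, val_point); an empty list would rule this explanation out.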
Any help would be much appreciated. I used the exact same code on another dataset without any errors, and I can't figure out what is causing this one.