    exp = explainer.explain_instance(df_val_final.Description[idx], predproba_list, num_features=5, top_labels=2)

While executing explain_instance of LimeTextExplainer, the above statement keeps running indefinitely with the warning message below. Execution stops only if I interrupt the kernel.

    C:\ProgramData\Anaconda3\lib\site-packages\fastai\torch_core.py:83: UserWarning: Tensor is int32: upgrading to int64; for better performance use int64 input
      warn('Tensor is int32: upgrading to int64; for better performance use int64 input')
    C:\ProgramData\Anaconda3\lib\site-packages\fastai\torch_core.py:83: UserWarning: Tensor is int32: upgrading to int64; for better performance use int64 input
      warn('Tensor is int32: upgrading to int64; for better performance …
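For reference, explain_instance expects a classifier function that takes a list of raw strings and returns a 2-D numpy array of class probabilities, and it calls that function on the order of a few thousand perturbed copies of the text by default, so a prediction function that scores one document at a time can look like a hang. A minimal sketch of that contract with a dummy batched scorer standing in for the real model (an illustration, not the asker's fastai setup):

    import numpy as np
    from lime.lime_text import LimeTextExplainer

    # Dummy stand-in for a real model: scores a *batch* of texts at once and
    # returns an (n_samples, n_classes) float array -- the contract LIME expects.
    def predproba_list(texts):
        scores = np.array([[len(t) % 7, 10 - len(t) % 7] for t in texts], dtype=np.float64)
        return scores / scores.sum(axis=1, keepdims=True)

    explainer = LimeTextExplainer(class_names=['negative', 'positive'])
    # num_samples controls how many perturbed texts are scored; lowering it speeds things up.
    exp = explainer.explain_instance("a short example sentence to explain",
                                     predproba_list,
                                     num_features=5, top_labels=2, num_samples=500)
    print(exp.as_list(label=exp.available_labels()[0]))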
I am trying to interpret a black box model. This model is a random forest that I am using to make predictions. I have read that LIME is a way to interpret black box models, but I don't quite know how to interpret the following graphs: If someone could help me to interpret them or tell me how to do it, it would be of great help. Thank you.
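In case it helps to see where those graphs come from, here is a minimal, self-contained sketch (using a toy sklearn dataset, not the asker's data) that produces the same kind of LIME output for a random forest. Each entry pairs a feature condition with its local weight toward the explained class, which is what the horizontal bars in the LIME plot encode:

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from lime.lime_tabular import LimeTabularExplainer

    data = load_breast_cancer()
    X, y = data.data, data.target
    rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

    explainer = LimeTabularExplainer(X,
                                     feature_names=list(data.feature_names),
                                     class_names=list(data.target_names),
                                     discretize_continuous=True)
    exp = explainer.explain_instance(X[0], rf.predict_proba, num_features=5)
    # Each tuple is (feature condition, local weight): positive weights push the
    # prediction toward the explained class, negative weights push away from it.
    print(exp.as_list())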
I am trying to explain the importance of a sentence using the following pipeline with LimeTextExplainer from the LIME package.

    Pipeline(steps=[('vect', CountVectorizer()),
                    ('tfidf', TfidfTransformer()),
                    ('clf', LogisticRegression())])

When I explain a sentence using the code below, the "importance" of single words is shown, while I want pairs.

    exp = explainer.explain_instance(text, cls.predict_proba, num_features=7)
    exp.show_in_notebook(text=False)

Is it possible to display the importance of pairs of words?
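For context, a self-contained version of this setup looks like the sketch below (toy data, not the original corpus). Note that LimeTextExplainer builds its explanation over the tokens produced by its own split_expression, so out of the box the weights it reports are per single word; it does not natively report weights for word pairs, regardless of whether the vectorizer inside the pipeline uses n-grams:

    from sklearn.pipeline import Pipeline
    from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
    from sklearn.linear_model import LogisticRegression
    from lime.lime_text import LimeTextExplainer

    texts = ["good movie great acting", "terrible plot bad acting",
             "great plot good acting", "bad movie terrible acting"]
    labels = [1, 0, 1, 0]

    cls = Pipeline(steps=[('vect', CountVectorizer()),
                          ('tfidf', TfidfTransformer()),
                          ('clf', LogisticRegression())])
    cls.fit(texts, labels)

    explainer = LimeTextExplainer(class_names=['neg', 'pos'])
    exp = explainer.explain_instance("great movie bad plot",
                                     cls.predict_proba, num_features=7)
    # Weights are attached to single tokens from LIME's own tokenization.
    print(exp.as_list())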
I am working with the LIME implementation by Marco Ribeiro (https://github.com/marcotcr/lime). Specifically, I am using the LimeTabularExplainer, as I have a mixture of numerical and categorical features in my dataset. How would I represent a categorical feature that may take on zero or more values in a single example? I understand that the API requires categorical features to be converted to an integer representation, but how would I represent combinations of values for one categorical feature? To illustrate the circumstance, see the …
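For reference, LimeTabularExplainer expects each categorical feature to hold exactly one integer-encoded value per row, declared via categorical_features and categorical_names as in the sketch below (toy data). One workaround for a set-valued feature, offered only as an assumption rather than anything the API supports directly, is to expand it into one binary (0/1) categorical column per possible value:

    import numpy as np
    from lime.lime_tabular import LimeTabularExplainer

    # Toy training matrix: column 0 is numerical, columns 1-3 are binary indicator
    # columns ("has_red", "has_blue", "has_green") expanded from one set-valued
    # "colors" feature, so each column holds a single integer code per row.
    X_train = np.array([[1.2, 1, 0, 1],
                        [3.4, 0, 1, 0],
                        [2.1, 1, 1, 0],
                        [0.7, 0, 0, 1]])

    explainer = LimeTabularExplainer(
        X_train,
        feature_names=['amount', 'has_red', 'has_blue', 'has_green'],
        categorical_features=[1, 2, 3],                       # integer-encoded columns
        categorical_names={1: ['no', 'yes'], 2: ['no', 'yes'], 3: ['no', 'yes']},
        class_names=['negative', 'positive'],
        discretize_continuous=True)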
I am working on an ML tutorial project with my own dataset. I built a model on the training set and generated predictions on the test set; the shape of the test set is (418, 10). My code for model training and prediction is below.

    rfs_clf = forest_clf = RandomForestClassifier(n_estimators=110, max_depth=8, max_features='auto',
                                                  random_state=0, oob_score=False, min_samples_split=2,
                                                  criterion='gini', min_samples_leaf=2, bootstrap=False)
    rfs_clf.fit(X_train, y_train)
    y_f_predict = rfs_clf.predict_proba(X_test).astype(float)

Now I am trying to explain the predictions using the Lime package available here.

    explainer = lime.lime_tabular.LimeTabularExplainer(X_train.values, feature_names = …
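The question is cut off, but for completeness, a typical way the rest of this pattern is wired up (an assumption about the intent, reusing the question's variable names, not the asker's actual code) looks like:

    import lime.lime_tabular

    # Build the explainer from the training data once ...
    explainer = lime.lime_tabular.LimeTabularExplainer(
        X_train.values,
        feature_names=list(X_train.columns),
        class_names=['0', '1'],            # assumed binary target
        discretize_continuous=True)

    # ... then explain one test row at a time; LIME needs the probability
    # function itself, not precomputed predictions.
    i = 0
    exp = explainer.explain_instance(X_test.values[i], rfs_clf.predict_proba, num_features=5)
    print(exp.as_list())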
I have been looking into outputting a model explainer artefact at the time of training my Keras+TensorFlow neural network. LIME seems like a great choice; however, my data is very large, and I read it from disk one batch at a time because it is impractical and inefficient to hold in memory. LIME appears to require the whole training dataset as input in order to create a surrogate model. Is it appropriate to use only a sample …
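As far as I understand, LimeTabularExplainer mainly uses the training data to estimate per-feature statistics (means, scales, discretization bins) that drive its perturbations, so passing a representative subsample is a common approach. A sketch under that assumption; n_train_rows, load_rows, feature_names, class_names, and keras_model are all placeholders for the asker's own pipeline:

    import numpy as np
    from lime.lime_tabular import LimeTabularExplainer

    # Draw a representative sample of rows instead of loading the full dataset.
    rng = np.random.default_rng(0)
    sample_idx = rng.choice(n_train_rows, size=10_000, replace=False)  # n_train_rows: total rows on disk (placeholder)
    X_sample = load_rows(sample_idx)                                   # load_rows: your own batched reader (placeholder)

    explainer = LimeTabularExplainer(X_sample,
                                     feature_names=feature_names,
                                     class_names=class_names,
                                     discretize_continuous=True)

    # Keras models return probabilities from predict(); stack both classes if the
    # network has a single sigmoid output.
    def predict_proba(x):
        p = keras_model.predict(x)
        return np.hstack([1.0 - p, p]) if p.shape[1] == 1 else p

    exp = explainer.explain_instance(X_sample[0], predict_proba, num_features=10)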
I am new to the concept of model interpretability using the LIME method. I am following the tutorial LIME for spectrogram classification. I am finding it hard to understand the color coding -- before using LIME, the important features were already visible, but after applying LIME there is no way to see the colors that highlight the important features. The last set of images below the section "Compute LIME" shows the plot of the spectrograms before and after LIME -- to me they look …
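Assuming the tutorial relies on lime.lime_image (an assumption; the question does not show the code), the colored overlay comes from get_image_and_mask, whose flags control whether only positive-weight superpixels are shown and whether the rest of the image is hidden. A minimal sketch with a dummy image and classifier:

    import numpy as np
    from lime import lime_image
    from skimage.segmentation import mark_boundaries

    # Dummy 3-channel "spectrogram" and a dummy classifier that scores a batch of
    # images and returns (n_samples, n_classes) probabilities.
    image = np.random.rand(64, 64, 3)
    def classifier_fn(images):
        scores = images.mean(axis=(1, 2, 3))
        return np.column_stack([1.0 - scores, scores])

    explainer = lime_image.LimeImageExplainer()
    explanation = explainer.explain_instance(image, classifier_fn,
                                             top_labels=2, hide_color=0, num_samples=200)

    # positive_only=True keeps only superpixels that push toward the label;
    # hide_rest=True blanks everything else, which is what makes regions "light up".
    temp, mask = explanation.get_image_and_mask(explanation.top_labels[0],
                                                positive_only=True,
                                                num_features=5,
                                                hide_rest=True)
    overlay = mark_boundaries(temp, mask)   # boundaries drawn around the kept superpixels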
Is it possible to train a LIME explainer for a binary classifier on a dataset without labels? I need to understand what the value is of storing a LIME explainer object trained on the same data that was used to train the model. In general, does it make sense to keep a trained LIME explainer around to generate explanations in production, or is it better to train the LIME explainer on production data whenever it is needed? Another question: if I train a …
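For what it's worth, LimeTabularExplainer is constructed from the feature matrix alone; labels are an optional argument (only some discretizers use them), so an unlabeled dataset is enough to build one. A sketch of building the explainer once and persisting it for reuse, under the assumption that reuse is the goal; dill is used because the plain pickle module may not handle LIME's internal closures:

    import dill
    import numpy as np
    from lime.lime_tabular import LimeTabularExplainer

    X_unlabeled = np.random.rand(500, 8)   # stand-in for unlabeled training features

    explainer = LimeTabularExplainer(
        X_unlabeled,
        feature_names=[f'f{i}' for i in range(8)],
        class_names=['neg', 'pos'],
        discretize_continuous=True)        # default quartile discretizer needs no labels

    # Persist the explainer alongside the model artefact and reload it later.
    with open('lime_explainer.dill', 'wb') as fh:
        dill.dump(explainer, fh)
    with open('lime_explainer.dill', 'rb') as fh:
        explainer = dill.load(fh)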
I am trying to use the LimeTabularExplainer class and the explain_instance function to find explanations for my LightGBM (lgb) model. However, the lgb model uses a complex feature set that is not interpretable. I want to pass a subset of the original features (which are interpretable) to the LIME explainer, so that the resulting explanations are also interpretable. In sections 3.1 and 3.3 of the original paper (https://arxiv.org/abs/1602.04938), the authors discuss this.

    rf = sklearn.ensemble.RandomForestClassifier(n_estimators=500)
    rf.fit(train, labels_train)

    explainer = lime.lime_tabular.LimeTabularExplainer(train,
                                                       feature_names=feature_names,
                                                       class_names=target_names,
                                                       discretize_continuous=True)
    exp …
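One way to realize the interpretable-representation idea from sections 3.1/3.3 in code (a sketch, not a dedicated argument that LimeTabularExplainer provides) is to build the explainer on the interpretable features and hide the mapping to the model's engineered features inside the prediction function. Here, to_model_features, train_interpretable, interpretable_feature_names, and lgb_model are hypothetical placeholders:

    from lime.lime_tabular import LimeTabularExplainer

    def predict_fn(x_interpretable):
        """x_interpretable: (n_samples, n_interpretable_features) perturbations from LIME."""
        # to_model_features is a hypothetical mapping from the interpretable
        # representation to the complex feature set the LightGBM model was trained on.
        x_model = to_model_features(x_interpretable)
        return lgb_model.predict_proba(x_model)

    explainer = LimeTabularExplainer(train_interpretable,        # interpretable version of the training data
                                     feature_names=interpretable_feature_names,
                                     class_names=target_names,
                                     discretize_continuous=True)
    exp = explainer.explain_instance(train_interpretable[0], predict_fn, num_features=10)
    print(exp.as_list())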
“Machine Learning Interpretability” or “Explainable Artificial Intelligence” has become quite popular in the machine learning community and in recent research. The goal is to make complex (deep learning) models explainable, such that one can understand why the model made a particular decision. I have looked at various algorithms that do this (prominent ones like LIME, SHAP, and Grad-CAM, but I've also skimmed many papers that present very “special” approaches). Since I am working with image data, I am particularly …
I am exploring the use of lime for spam/ham categorisation. specifically I have a data frame having list of messages. I would need to identify which messages are spam and which ones are ham by using a set of words (100). I would need to find to test the accuracy of the model. I found some articles on towardsdatasciene and medium that helped me a bit, but I would need a really small example on what I would need (already …
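A very small, self-contained sketch of that workflow (toy messages standing in for the real data frame; the 100-word vocabulary is enforced with max_features=100, and on real data you would evaluate on a held-out test set rather than the training messages):

    from sklearn.pipeline import make_pipeline
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from lime.lime_text import LimeTextExplainer

    messages = ["win a free prize now", "meeting at noon tomorrow",
                "free cash click here", "lunch with the team today",
                "claim your free reward", "project update attached"]
    labels = [1, 0, 1, 0, 1, 0]          # 1 = spam, 0 = ham

    # Limit the vocabulary to (at most) 100 words, as in the question.
    model = make_pipeline(TfidfVectorizer(max_features=100),
                          LogisticRegression())
    model.fit(messages, labels)
    print("training accuracy:", accuracy_score(labels, model.predict(messages)))

    explainer = LimeTextExplainer(class_names=['ham', 'spam'])
    exp = explainer.explain_instance("claim a free prize today",
                                     model.predict_proba, num_features=5)
    print(exp.as_list())        # (word, weight) pairs for this one message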
Here is the code:

    predict_fn = lambda x: xgb_model.predict_proba(x).astype(float)
    feature_names = X_train.columns

    for i in range(x_val.shape[0]):
        # Get the explanation for Logistic Regression
        val_point = x_val.values[i]
        print(val_point)
        print(val_point.shape)
        explainer = lime.lime_tabular.LimeTabularExplainer(training_data=Xs_train_array,
                                                           feature_names=feature_names,
                                                           training_labels=y_train,
                                                           mode='classification',
                                                           kernel_width=5)
        exp = explainer.explain_instance(val_point, predict_fn, num_features=10)
        exp.as_pyplot_figure()
        plt.tight_layout()

A few notes: Xs_train_array is of size (103,) and is of type float. There are no categorical variables. Here is the error message I'm receiving:

    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last) …
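It is hard to pin down the KeyError from the truncated traceback, but for comparison, the usual shape of this pattern builds the explainer once from a 2-D (n_samples, n_features) training matrix and a plain list of feature names, then explains one 1-D row at a time. A sketch reusing the question's variable names, not a confirmed fix:

    import numpy as np
    import lime.lime_tabular
    import matplotlib.pyplot as plt

    # Build the explainer once, from a 2-D training matrix of shape (n_samples, n_features);
    # feature_names is a plain list of strings matching those columns.
    explainer = lime.lime_tabular.LimeTabularExplainer(
        training_data=np.asarray(X_train),          # 2-D, not a single (103,) vector
        feature_names=list(X_train.columns),
        training_labels=np.asarray(y_train),
        mode='classification',
        kernel_width=5)

    predict_fn = lambda x: xgb_model.predict_proba(x).astype(float)

    # Explain each validation row (a 1-D vector) with the same explainer.
    for i in range(x_val.shape[0]):
        exp = explainer.explain_instance(x_val.values[i], predict_fn, num_features=10)
        exp.as_pyplot_figure()
        plt.tight_layout()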