Multi-valued categorical features in LIME

I am working with the LIME implementation by Marco Ribeiro (https://github.com/marcotcr/lime). Specifically, I am using LimeTabularExplainer because my dataset has a mixture of numerical and categorical features. How should I represent a categorical feature that can take zero or more values in a single example? I understand that the API requires categorical features to be converted to an integer representation, but how would I represent a combination of values for one categorical feature? To illustrate, see the example dataset attached as an image and consider the "Comorbidities" feature.

One approach I investigated was to treat the presence of each value as its own binary categorical feature. However, the number of features grows quickly, since I have several such features and their examples take on many combinations of values. I am concerned that this approach does not allow efficient sampling around the example being explained. Given that LimeTabularExplainer requires the values in categorical columns to be integers, how should I encode these "multi-valued" categorical features as integers? Thank you for considering my question!
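For concreteness, the binary-indicator approach described above can be sketched as follows. This is only a minimal illustration, not a recommended solution: the toy rows, the `multi_hot_encode` helper, and the "absent"/"present" labels are all hypothetical. The last two variables are shaped the way the `categorical_features` and `categorical_names` arguments of LimeTabularExplainer are usually given, assuming the indicator columns are the only columns in the data.

```python
# Hypothetical toy data: each row lists zero or more comorbidities.
rows = [
    ["diabetes", "hypertension"],
    [],
    ["asthma"],
    ["diabetes", "asthma", "hypertension"],
]


def multi_hot_encode(rows, prefix):
    """Expand one multi-valued categorical feature into 0/1 indicator columns."""
    # Fixed, sorted vocabulary so that column order is reproducible.
    vocabulary = sorted({value for row in rows for value in row})
    encoded = [[1 if value in row else 0 for value in vocabulary] for row in rows]
    names = [f"{prefix}={value}" for value in vocabulary]
    return encoded, names


encoded, feature_names = multi_hot_encode(rows, "Comorbidities")

# Each indicator column is itself a categorical feature with two integer levels.
categorical_features = list(range(len(feature_names)))
categorical_names = {i: ["absent", "present"] for i in categorical_features}

for name, column in zip(feature_names, zip(*encoded)):
    print(name, column)
```

With this layout an example with no comorbidities is simply a row of zeros, and the number of columns equals the number of distinct values, which is exactly where the growth in feature count comes from.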

Topic categorical-encoding lime explainable-ai

Category Data Science
