XGBoost Feature Importance, Permutation Importance, and Model Evaluation Criteria

I have built an XGBoost classification model in Python on an imbalanced dataset (~1 million positive examples and ~12 million negative examples), where the features are binary indicators of user interaction with web page elements (e.g. whether the user scrolled to the reviews or not) and the target is a binary retail action. My ultimate goal is not so much to obtain a model with optimal decision-rule performance as to understand which user actions/features are important in determining the positive retail action.

I have read quite a bit in forums and the literature about evaluating and optimizing an XGBoost model and the subsequent decision rule, which I assume is required before I can reach my ultimate goal. There seem to be many different ways to evaluate the decision rule (e.g. Area Under the Precision-Recall Curve, AUROC, etc.) and the model itself (e.g. log-loss). I believe that both AUC and log-loss are insensitive to class balance, so I don't think that is a concern. However, I am not sure which evaluation method is most appropriate for my ultimate goal, and I would appreciate some guidance from someone with more experience in these matters.
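For reference on the class-balance point, here is a minimal sketch of the "no-skill" baselines for the three metrics mentioned above, using only the class counts from the question. It relies on standard facts rather than anything specific to XGBoost: an uninformative ranking has an AUROC of 0.5 at any prevalence, its expected precision equals the prevalence, and the log-loss of a constant predictor that always outputs the prevalence is the entropy of the class distribution. The function name is made up for illustration.

```python
import math


def constant_predictor_log_loss(prevalence):
    """Log-loss of a model that always predicts `prevalence` as P(y=1)."""
    p = prevalence
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))


# Class counts taken from the question (approximate).
n_pos, n_neg = 1_000_000, 12_000_000
prevalence = n_pos / (n_pos + n_neg)

# Baselines for an uninformative model on this data.
baseline_auroc = 0.5                  # does not depend on prevalence
baseline_auprc = prevalence           # expected precision of a random ranking
baseline_log_loss = constant_predictor_log_loss(prevalence)

# The same constant-predictor baseline on a perfectly balanced dataset.
balanced_log_loss = constant_predictor_log_loss(0.5)

print(f"prevalence:          {prevalence:.4f}")
print(f"baseline AUROC:      {baseline_auroc:.4f}")
print(f"baseline AUPRC:      {baseline_auprc:.4f}")
print(f"baseline log-loss:   {baseline_log_loss:.4f}")
print(f"balanced log-loss:   {balanced_log_loss:.4f}")
```

Under these assumptions, only the AUROC baseline is unaffected by the class ratio. Log-loss and precision-recall values are therefore best read relative to the baseline for this dataset's prevalence, not against numbers from a balanced problem.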

Edit: I did also try permutation importance on my XGBoost model as suggested in an answer. I saw pretty similar results to XGBoost's native feature importance. Should I now trust the permutation importance, or should I try to optimize the model by some evaluation criteria and then use XGBoost's native feature importance or permutation importance? In other words, do I need to have a reasonable model by some evaluation criteria before trusting feature importance or permutation importance?
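To make the permutation-importance idea concrete, here is a minimal NumPy sketch of the technique: shuffle one feature column at a time on held-out data and record how much the log-loss worsens. Everything in it is an illustrative assumption, not the actual setup: the data are synthetic binary features, and `stand_in_model` is a hypothetical fixed scoring function standing in for whatever returns the fitted model's predicted probability of the positive class (for example the positive-class column of an XGBoost classifier's `predict_proba`). scikit-learn also provides `sklearn.inspection.permutation_importance` for fitted estimators.

```python
import numpy as np


def log_loss(y_true, p, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped for stability."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))


def permutation_importance(predict_proba, X, y, n_repeats=5, seed=0):
    """Mean increase in log-loss when each column of X is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = log_loss(y, predict_proba(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Break the link between feature j and the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(log_loss(y, predict_proba(X_perm)) - baseline)
        importances[j] = np.mean(increases)
    return baseline, importances


# Synthetic held-out data (assumption): three binary features, where
# feature 0 matters a lot, feature 1 a little, and feature 2 not at all.
data_rng = np.random.default_rng(42)
n_rows = 20_000
X_holdout = data_rng.integers(0, 2, size=(n_rows, 3)).astype(float)
true_logit = -3.0 + 2.0 * X_holdout[:, 0] + 0.5 * X_holdout[:, 1]
y_holdout = (data_rng.random(n_rows) < 1 / (1 + np.exp(-true_logit))).astype(float)


def stand_in_model(X):
    """Hypothetical stand-in for the fitted model's P(y=1 | x)."""
    return 1 / (1 + np.exp(-(-3.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1])))


baseline, importances = permutation_importance(stand_in_model, X_holdout, y_holdout)
print("baseline log-loss:", round(baseline, 4))
for j, imp in enumerate(importances):
    print(f"feature {j}: mean log-loss increase = {imp:.4f}")
```

Because the importance is defined as a drop in held-out performance, the numbers only describe what this particular model relies on. If the baseline score is close to that of an uninformative model, the increases will be small and noisy whichever feature is shuffled.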

Topic predictor-importance xgboost evaluation classification

Category Data Science


So your goal is only feature importance from xgboost?

Then focus on splitting rather than on evaluation metrics.

I would suggest reading this. The default importance measures from tree-based methods can be misleading.
