Feature importance by removing all other features?

For neural network feature importance, can I zero out all features except one in order to gauge that feature's importance? I know that shuffling (permuting) a feature is one standard approach.
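For comparison, the shuffling approach mentioned above can be sketched as follows. This is a minimal, self-contained illustration, not library code: `score_fn` and the toy "model" inside it are hypothetical stand-ins for a trained network's evaluation.

```python
import numpy as np

def permutation_importance(score_fn, X, y, n_repeats=5, seed=0):
    """Importance of feature j = mean drop in score when column j is shuffled.
    score_fn(X, y) -> float, higher is better (e.g. accuracy)."""
    rng = np.random.default_rng(seed)
    base = score_fn(X, y)
    imp = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break the feature/target link
            drops.append(base - score_fn(Xp, y))
        imp[j] = np.mean(drops)
    return imp

# Toy check with a hypothetical "model" that only looks at column 1:
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = (X[:, 1] > 0).astype(int)
accuracy = lambda X, y: np.mean(((X[:, 1] > 0).astype(int)) == y)
print(permutation_importance(accuracy, X, y))  # column 1 dominates
```

Because shuffling keeps each feature's marginal distribution intact, it avoids feeding the model inputs (all-zero rows) it never saw during training.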

For example, leaving in only the 4th feature:

feature_4 = [
    [0., 0., 0., 1.15, 0.],
    [0., 0., 0., 1.76, 0.],
    [0., 0., 0., 2.31, 0.],
    [0., 0., 0., 0.94, 0.],
]

_, probabilities = model.predict(feature_4)
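Looping that zero-out idea over every feature gives one importance score per column. A minimal sketch, where `predict_fn` is a hypothetical stand-in for `model.predict` and the sigmoid-of-linear "model" is invented for illustration:

```python
import numpy as np

def zero_out_importance(predict_fn, X):
    """Keep one column at a time, zero the rest, and measure how far the
    prediction moves from the all-zeros baseline."""
    baseline = predict_fn(np.zeros_like(X))
    scores = []
    for j in range(X.shape[1]):
        masked = np.zeros_like(X)
        masked[:, j] = X[:, j]          # only feature j survives
        scores.append(np.mean(np.abs(predict_fn(masked) - baseline)))
    return np.array(scores)

# Hypothetical model: sigmoid of a linear score, weighted mostly on feature 4.
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
w = np.array([0.1, 0.0, -0.2, 2.0, 0.0])
predict = lambda X: sigmoid(X @ w)

X = np.array([[0.3, 1.0, 0.5, 1.15, 0.2],
              [0.1, 0.4, 0.9, 1.76, 0.7]])
print(zero_out_importance(predict, X))
```

Note that the all-zeros rows this produces may lie far outside the training distribution, which is one practical objection to the zero-out approach.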

The non-linearity of the activation functions worries me, because the activation of a sum is not the sum of the individual activations:

from scipy.special import expit  # aka sigmoid

expit(2.0)
# 0.8807970779778823

expit(1.0) + expit(1.0)
# 1.4621171572600098

And softmax seems even less straightforward than sigmoid, since each output probability depends on all of the logits at once.
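That coupling between outputs is easy to demonstrate: the probabilities from combined logits are not the sum of the probabilities from each logit vector alone. A small sketch (the logit vectors here are made up for illustration):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())      # shift by the max for numerical stability
    return e / e.sum()

a = np.array([2.0, 0.0, 0.0])    # logits with only one contribution active
b = np.array([0.0, 2.0, 0.0])    # logits with a different contribution active

print(softmax(a + b))            # joint effect
print(softmax(a) + softmax(b))   # sum of individual effects: not even a
                                 # probability vector (it sums to 2)
```

So for a softmax head, zeroing features changes the normalizing denominator for every class, not just the class a feature feeds into.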

Tags: feature-importances, interpretation, activation-function, deep-learning, neural-network
