What is "Gradient × Hidden States" explainability method? Is there any documentation about it?
I am doing a literature review on post-hoc, gradient-based explainability methods. I stumbled upon one I hadn't heard of, used to extract highlights from a trained model in this post-hoc fashion:
We compute gradients w.r.t. the hidden states of each layer, and multiply the resultant vectors by the hidden state vectors themselves: $\nabla_{H_i} \times H_i \in \mathbb{R}^{N+M}$, for $0 \leq i \leq L + 1$. — Marco V. Treviso et al., submission to the Explainable Quality Estimation Shared Task
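If I understand the quote correctly, the idea is an elementwise product between a hidden state vector and the gradient of the model's score with respect to that same vector, giving one relevance value per hidden dimension. Here is a minimal NumPy sketch of my reading of it, on a toy linear model where the gradient can be written down by hand; all names here are my own, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "network": hidden state h = W1 @ x, scalar score s = w2 @ h.
W1 = rng.normal(size=(4, 3))
w2 = rng.normal(size=4)
x = rng.normal(size=3)

h = W1 @ x  # hidden state of the (single) layer
s = w2 @ h  # model score we want to explain

# For this linear model, the gradient of s w.r.t. h is just w2.
grad_h = w2

# "Gradient x Hidden States": elementwise product, yielding one
# relevance score per hidden dimension.
relevance = grad_h * h

# For a linear layer the relevances sum back to the score itself,
# which is one way to read what the method highlights: how much each
# hidden dimension contributes to the output.
print(np.allclose(relevance.sum(), s))  # True
```

In a real Transformer one would obtain `grad_h` via autograd (e.g. `torch.autograd.grad` on the hidden states) rather than analytically, and the per-dimension products are typically aggregated per token to produce highlights.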
I don't understand what this method highlights in terms of explainability, and I haven't found any further documentation on it.
Topic explainable-ai gradient-descent
Category Data Science