Why does Keras only have 3 types of attention layers?
The Keras documentation lists only three attention layers (a short usage sketch follows the list):
- MultiHeadAttention layer
- Attention layer
- AdditiveAttention layer
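For context, this is roughly how I understand those three layers are invoked; the tensor shapes and arguments here are just made up for illustration:

```python
import tensorflow as tf

# Toy shapes: batch of 2, query length 4, value length 6, feature dim 8.
query = tf.random.normal((2, 4, 8))
value = tf.random.normal((2, 6, 8))

# Dot-product attention (Luong-style scoring).
dot_out = tf.keras.layers.Attention()([query, value])

# Additive attention (Bahdanau-style scoring).
add_out = tf.keras.layers.AdditiveAttention()([query, value])

# Multi-head scaled dot-product attention (Transformer-style).
mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=8)
mha_out = mha(query, value)

print(dot_out.shape, add_out.shape, mha_out.shape)  # each (2, 4, 8)
```

As far as I can tell, Attention implements dot-product scoring, AdditiveAttention implements additive scoring, and MultiHeadAttention is the Transformer-style multi-head variant.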
However, in theory there are many more types of attention, e.g. (some of these may be synonyms):
- Global
- Local
- Hard
- Bahdanau attention
- Luong attention
- Self
- Additive
- Latent
- what else?
Are the other types just not practical, or can they actually be derived from the existing implementations? Can someone please shed some light, with examples?
Topic attention-mechanism keras deep-learning
Category Data Science