Could attention_mask in T5 be a float in [0, 1]?

I was inspecting the T5 model from Hugging Face (https://huggingface.co/docs/transformers/model_doc/t5). The attention_mask argument is presented as:

attention_mask (torch.FloatTensor of shape (batch_size, sequence_length), optional) — Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:
1 for tokens that are not masked,
0 for tokens that are masked. 
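
For context, as far as I understand, inside transformers this 0/1 mask is turned into an additive bias that is added to the attention scores before the softmax. A minimal sketch of that conversion (not the library's exact code) would be:

import torch

def extend_attention_mask(attention_mask: torch.Tensor, dtype=torch.float32) -> torch.Tensor:
    # Broadcast (batch, seq_len) -> (batch, 1, 1, seq_len) so it can be added
    # to attention scores of shape (batch, heads, query_len, key_len).
    extended = attention_mask[:, None, None, :].to(dtype)
    # 1 -> 0.0 (keep the token), 0 -> very large negative number
    # (effectively -inf, so the token gets ~0 weight after the softmax).
    return (1.0 - extended) * torch.finfo(dtype).min

mask = torch.tensor([[1, 1, 1, 0, 0]])  # last two positions are padding
print(extend_attention_mask(mask))

If that is what happens, an intermediate value such as 0.5 would still be multiplied by that huge negative constant, so I am not sure it would really behave as a "soft" attention weight.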

I was wondering whether something softer could be used here: not only masking out the padding tokens, but also choosing how much attention should be given to each token.
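
Concretely, I mean something like the following hypothetical experiment (assuming the t5-small checkpoint; the position and the 0.5 value are just illustrations):

import torch
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: The house is wonderful.",
                   return_tensors="pt")

# Start from the usual 0/1 mask, then soften one position.
soft_mask = inputs["attention_mask"].to(torch.float)
soft_mask[0, 3] = 0.5  # hypothetical: pay "half" attention to the fourth token

outputs = model.generate(input_ids=inputs["input_ids"], attention_mask=soft_mask)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

This runs without errors, but I do not know whether the 0.5 has the effect I intend or whether it is effectively treated like 0.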

This question is related to the one asked here: Can the attention mask hold values between 0 and 1?

Do you know whether the attention_mask vector is used in any other way where a non-integer value could harm the model?

Thank you for your time and advice.

Topic huggingface transformer attention-mechanism deep-learning nlp

Category Data Science
