Imbalanced Dataset (Transformers): How to Decide on Class Weights?
I'm using Simple Transformers to train and evaluate a model.
Since the dataset I am using is severely imbalanced, it is recommended that I assign a weight to each label. An example of assigning weights in Simple Transformers is given here.
My question, however, is: how exactly do I choose the appropriate weight for each class? Is there a specific methodology, e.g., a formula that uses the ratio of the labels?
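To make the question concrete, the kind of formula I have in mind is the common inverse-frequency ("balanced") heuristic, where the weight of class c is n_samples / (n_classes * count_c). Below is a minimal sketch of it, assuming integer labels and a made-up 90/10 toy split; the function name `balanced_class_weights` is my own, not part of any library. I don't know whether this is the recommended choice, which is part of what I'm asking.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count_c).

    Hypothetical helper for illustration; rarer classes get larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        cls: n_samples / (n_classes * count)
        for cls, count in sorted(counts.items())
    }


# Toy, made-up label distribution: 90 samples of class 0, 10 of class 1.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)

# Ordered by class index, which is the form a per-class weight list
# usually takes when it is handed to a loss function or a model.
weight_list = [weights[cls] for cls in sorted(weights)]
print(weight_list)
```

With this heuristic the minority class ends up weighted 9 times more heavily than the majority class, mirroring the 90/10 ratio of the labels.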
Follow-up question: Are the weights for a given dataset universal? I.e., if I use a totally different model, can I reuse the same weights, or should I assign different weights depending on the model?
p.s.1. If it makes any difference, I'm using RoBERTa.
p.s.2. There is a similar question here; however, I believe that my question is not a duplicate because a) the linked question is about Keras, whereas mine is about Transformers, and b) I'm also asking for general recommendations on how weight values are decided, which the linked question does not cover.
Topic bert transfer-learning imbalance class-imbalance
Category Data Science