Overfitting in Huggingface's TFBertForSequenceClassification
I'm using Huggingface's TFBertForSequenceClassification for multilabel classification of tweets. During training the model achieves good accuracy, but the validation accuracy is poor. I've tried to counter the overfitting with dropout, but validation performance is still poor. The model is as follows:
import tensorflow as tf
from transformers import BertConfig, TFBertForSequenceClassification

# Get and configure the BERT model (extra hidden dropout to fight overfitting)
config = BertConfig.from_pretrained('bert-base-uncased', hidden_dropout_prob=0.5, num_labels=13)
bert_model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased', config=config)

# Optimizer, loss, and metric
optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5, epsilon=0.00015, clipnorm=0.01)
loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
metric = tf.keras.metrics.CategoricalAccuracy('accuracy')

bert_model.compile(optimizer=optimizer, loss=loss, metrics=[metric])
bert_model.summary()
The summary is as follows:

[model summary output]
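One detail worth flagging in the compile step above: CategoricalCrossentropy with CategoricalAccuracy assumes exactly one label per tweet (a softmax over the 13 classes). If the task is genuinely multilabel, i.e. a tweet can carry several labels at once, the usual setup is independent sigmoid outputs with binary cross-entropy. A minimal sketch, reusing the optimizer above and assuming multi-hot label vectors of length 13:

# Assumes labels are multi-hot vectors of length 13 (several 1s allowed per tweet)
loss = tf.keras.losses.BinaryCrossentropy(from_logits=True)
# threshold=0.0 on raw logits corresponds to a sigmoid probability of 0.5
metric = tf.keras.metrics.BinaryAccuracy('accuracy', threshold=0.0)
bert_model.compile(optimizer=optimizer, loss=loss, metrics=[metric])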
When I fit the model with

history = bert_model.fit(train_ds, epochs=30, validation_data=test_ds)

the training accuracy keeps improving while the validation accuracy stays low.
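For reference, train_ds and test_ds above are assumed to be tf.data.Dataset objects of tokenized tweets. A minimal sketch of how they might be built; train_texts, train_labels, test_texts, and test_labels are hypothetical placeholders, not from the original post:

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

def make_dataset(texts, labels, batch_size=32, shuffle=False):
    # Tokenize to fixed-length input_ids / attention_mask / token_type_ids tensors
    enc = tokenizer(texts, truncation=True, padding='max_length', max_length=128, return_tensors='tf')
    ds = tf.data.Dataset.from_tensor_slices((dict(enc), labels))
    if shuffle:
        ds = ds.shuffle(len(texts))
    return ds.batch(batch_size)

train_ds = make_dataset(train_texts, train_labels, shuffle=True)
test_ds = make_dataset(test_texts, test_labels)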
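Since the run goes a full 30 epochs while validation accuracy stalls, one standard way to limit the train/validation gap is an early-stopping callback on the validation loss. A sketch (the patience value is illustrative, not from the post):

# Stop when val_loss stops improving and roll back to the best weights
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
history = bert_model.fit(train_ds, epochs=30, validation_data=test_ds, callbacks=[early_stop])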
Tags: huggingface, bert, overfitting
Category: Data Science