Overfitting in Huggingface's TFBertForSequenceClassification
I'm using Huggingface's TFBertForSequenceClassification for multilabel classification of tweets. During training the model achieves good accuracy, but the validation accuracy is poor. I've tried to counter the overfitting with dropout, but validation performance is still poor. The model is as follows:
import tensorflow as tf
from transformers import BertConfig, TFBertForSequenceClassification

# Get and configure the BERT model (extra hidden dropout to fight overfitting)
config = BertConfig.from_pretrained('bert-base-uncased', hidden_dropout_prob=0.5, num_labels=13)
bert_model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased', config=config)

# Optimizer, loss, and metric
optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5, epsilon=0.00015, clipnorm=0.01)
loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
metric = tf.keras.metrics.CategoricalAccuracy('accuracy')

bert_model.compile(optimizer=optimizer, loss=loss, metrics=[metric])
bert_model.summary()
The summary is as follows:

[model summary output]
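One detail worth flagging in the compile step above: CategoricalCrossentropy with CategoricalAccuracy assumes exactly one label per tweet (a softmax over the 13 classes). If the task is genuinely multilabel, i.e. a tweet can carry several labels at once, the usual setup is independent sigmoid outputs with binary cross-entropy. A minimal sketch, reusing the optimizer above and assuming multi-hot label vectors of length 13:

# Assumes labels are multi-hot vectors of length 13 (several 1s allowed per tweet)
loss = tf.keras.losses.BinaryCrossentropy(from_logits=True)
# threshold=0.0 on raw logits corresponds to a sigmoid probability of 0.5
metric = tf.keras.metrics.BinaryAccuracy('accuracy', threshold=0.0)
bert_model.compile(optimizer=optimizer, loss=loss, metrics=[metric])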
When I fit the model with

history = bert_model.fit(train_ds, epochs=30, validation_data=test_ds)

the training accuracy keeps improving while the validation accuracy stays low.
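For reference, train_ds and test_ds above are assumed to be tf.data.Dataset objects of tokenized tweets. A minimal sketch of how they might be built; train_texts, train_labels, test_texts, and test_labels are hypothetical placeholders, not from the original post:

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

def make_dataset(texts, labels, batch_size=32, shuffle=False):
    # Tokenize to fixed-length input_ids / attention_mask / token_type_ids tensors
    enc = tokenizer(texts, truncation=True, padding='max_length', max_length=128, return_tensors='tf')
    ds = tf.data.Dataset.from_tensor_slices((dict(enc), labels))
    if shuffle:
        ds = ds.shuffle(len(texts))
    return ds.batch(batch_size)

train_ds = make_dataset(train_texts, train_labels, shuffle=True)
test_ds = make_dataset(test_texts, test_labels)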
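Since the run goes a full 30 epochs while validation accuracy stalls, one standard way to limit the train/validation gap is an early-stopping callback on the validation loss. A sketch (the patience value is illustrative, not from the post):

# Stop when val_loss stops improving and roll back to the best weights
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
history = bert_model.fit(train_ds, epochs=30, validation_data=test_ds, callbacks=[early_stop])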
Tags: huggingface, bert, overfitting
Category: Data Science