Is it possible to fine-tune BERT by training it on multiple datasets? (Each dataset having its own purpose)

BERT can be fine-tuned on a dataset for a specific task. Is it possible to fine-tune a single BERT model on all of these datasets, covering different tasks, and then use that one model for every task instead of fine-tuning a separate BERT model for each one?

Topic: bert, transformer, finetuning, transfer-learning, nlp

Category: Data Science


This is possible, but a single shared model will generally not match task-specific fine-tuning. Each NLP task has its own optimal loss value, and when several tasks are fine-tuned on the same parameters, the model usually cannot reach the optimum of every individual task at the same time, so performance on some tasks tends to degrade compared with one fine-tuned model per task.
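If you still want to try it, the usual setup is multi-task fine-tuning: one shared BERT encoder with a separate classification head per task, trained on batches drawn from the different datasets. Below is a minimal sketch of that idea using PyTorch and Hugging Face `transformers`; the task names ("sentiment", "topic"), label counts, and the commented training loop over `mixed_task_batches` are hypothetical placeholders, not something defined in the question.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class MultiTaskBert(nn.Module):
    """One shared BERT encoder with a separate linear head per task."""
    def __init__(self, task_num_labels):
        super().__init__()
        self.encoder = BertModel.from_pretrained("bert-base-uncased")
        hidden = self.encoder.config.hidden_size
        # One classification head per task, keyed by task name
        self.heads = nn.ModuleDict({
            task: nn.Linear(hidden, n_labels)
            for task, n_labels in task_num_labels.items()
        })

    def forward(self, task, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.pooler_output            # [CLS]-based sentence representation
        return self.heads[task](pooled)       # logits for the selected task

# Hypothetical example: two tasks sharing one encoder
model = MultiTaskBert({"sentiment": 2, "topic": 5})
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# During training you would alternate batches from each task's dataset,
# so the shared encoder receives gradients from every task's loss:
# for task, input_ids, attention_mask, labels in mixed_task_batches:
#     logits = model(task, input_ids, attention_mask)
#     loss = loss_fn(logits, labels)
#     loss.backward()
#     optimizer.step()
#     optimizer.zero_grad()
```

Because all tasks update the same encoder weights, related tasks can help each other, while unrelated tasks pull the shared parameters in different directions, which is exactly why the joint model may not reach each task's individual optimum.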
