How long does it take to fine-tune XLNet?
XLNet takes considerably more time than BERT to pre-train, and the XLNet paper reports that it outperforms BERT on 20 NLP tasks. How long does fine-tuning XLNet take, assuming it runs on Google Colab?
(Assume a text summarization task with around 4,000 examples.)
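To frame what I am asking, fine-tuning time is roughly the number of optimizer steps multiplied by the time per step. Below is a minimal back-of-the-envelope sketch of that arithmetic. Only the 4,000 examples come from my setup; the function name `estimate_finetune_minutes` and the batch size, epoch count, and seconds-per-step values are hypothetical placeholders, not measurements. The seconds-per-step figure in particular would have to be measured by timing a few steps on whatever GPU Colab assigns.

```python
import math


def estimate_finetune_minutes(num_examples, batch_size, epochs, seconds_per_step):
    """Rough wall-clock estimate: total optimizer steps times time per step."""
    # The last batch of an epoch may be partial, so round up.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    return total_steps * seconds_per_step / 60.0


# 4000 examples is from the question; every other value is a placeholder
# assumption and should be replaced with numbers measured on the actual run.
minutes = estimate_finetune_minutes(
    num_examples=4000,
    batch_size=8,          # assumed; limited by GPU memory and sequence length
    epochs=3,              # assumed
    seconds_per_step=1.5,  # assumed; time a few real steps to get this
)
print(f"Estimated fine-tuning time: {minutes:.1f} minutes")
```

This ignores evaluation passes, checkpointing, and data loading, so it is a lower bound rather than a prediction. What I am missing is a realistic seconds-per-step value for XLNet on Colab.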
Topic pretraining bert finetuning nlp
Category Data Science