Is it okay to fine-tune BERT with a large context for sequence classification?
I want to create a BERT model for sequence classification. At inference time, the input will be a pair of sentences. However, I want to fine-tune the model on large-context data consisting of multiple sentences, where the number of tokens may exceed 512. Is it okay if the length of the training examples differs from the length of the actual inputs at inference time?
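For context, this is roughly the setup I have in mind. It is only a minimal sketch assuming the Hugging Face transformers library and the bert-base-uncased checkpoint; the sentences, label count, and model name are placeholders:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder checkpoint; any BERT-style encoder with a 512-token limit applies.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Intended inference input: a sentence pair encoded as one sequence
# ([CLS] sent_a [SEP] sent_b [SEP]).
sent_a = "The first sentence."
sent_b = "The second sentence."
inputs = tokenizer(sent_a, sent_b, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
print(logits.shape)  # (1, num_labels)

# The fine-tuning data is much longer (multiple sentences, possibly > 512 tokens),
# so the same tokenizer call would truncate it to 512 tokens.
```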
Thanks
Topic: bert, finetuning
Category: Data Science