How does the pretraining part actually work in wav2vec models? Which data qualifies as adequate for fine-tuning a speech-to-text model?

Pretraining and fine-tuning the wav2vec 2.0 algorithm, the new approach used by Facebook AI to do speech-to-text for low-resource languages.

I didn't actually get how the model does the pretraining part, so I'd be grateful if someone could help me. I read the paper https://arxiv.org/abs/2006.11477 but ended up not grasping the notion of pretraining in this context. The question is: HOW do we do pretraining?
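To make the question concrete: my rough reading of the paper is that pretraining needs only raw, unlabeled audio. The model masks spans of the latent speech representations and is trained with a contrastive loss to pick the true quantized latent for each masked position out of a set of distractors. Below is a minimal sketch I pieced together from the Hugging Face transformers documentation for `Wav2Vec2ForPreTraining` (the `facebook/wav2vec2-base` checkpoint and the internal helpers `_compute_mask_indices` / `_sample_negative_indices` come from that documented example, not from the paper itself, and random noise stands in for real speech):

```python
import torch
from transformers import AutoFeatureExtractor, Wav2Vec2ForPreTraining
from transformers.models.wav2vec2.modeling_wav2vec2 import (
    _compute_mask_indices,
    _sample_negative_indices,
)

feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2ForPreTraining.from_pretrained("facebook/wav2vec2-base")

# One second of 16 kHz audio; no transcript is needed for pretraining.
waveform = torch.randn(16000).numpy()  # stand-in for a real recording
input_values = feature_extractor(
    waveform, sampling_rate=16000, return_tensors="pt"
).input_values

# The CNN feature encoder downsamples the waveform, so compute how many
# latent frames it produces before choosing which ones to mask.
batch_size, raw_sequence_length = input_values.shape
sequence_length = model._get_feat_extract_output_lengths(raw_sequence_length).item()

# Randomly mask spans of latent frames and sample distractor (negative) frames.
mask_time_indices = _compute_mask_indices(
    shape=(batch_size, sequence_length), mask_prob=0.2, mask_length=2
)
sampled_negative_indices = _sample_negative_indices(
    features_shape=(batch_size, sequence_length),
    num_negatives=model.config.num_negatives,
    mask_time_indices=mask_time_indices,
)
mask_time_indices = torch.tensor(mask_time_indices, dtype=torch.bool)
sampled_negative_indices = torch.tensor(sampled_negative_indices, dtype=torch.long)

# The loss is contrastive: identify the true quantized latent for each masked
# frame among the sampled distractors (plus a codebook diversity term).
model.train()
loss = model(
    input_values,
    mask_time_indices=mask_time_indices,
    sampled_negative_indices=sampled_negative_indices,
).loss
```

And for the second part of my question, my understanding is that fine-tuning then needs labeled pairs of audio and transcripts, with a CTC head trained on top of the pretrained encoder. Here is a sketch of what I think one fine-tuning step looks like, again assuming the transformers API (`Wav2Vec2ForCTC` with the `facebook/wav2vec2-base-960h` checkpoint, and a made-up one-second clip with the transcript "HELLO WORLD"):

```python
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# One labeled pair: 16 kHz mono audio plus its character-level transcript.
waveform = torch.randn(16000).numpy()  # stand-in for a real recording
inputs = processor(waveform, sampling_rate=16000, return_tensors="pt")
inputs["labels"] = processor(text="HELLO WORLD", return_tensors="pt").input_ids

model.train()
loss = model(**inputs).loss  # CTC loss against the transcript
loss.backward()              # an optimizer step would follow in a real loop
```

Is that the right picture? And if so, what makes a labeled dataset of this kind adequate for a low-resource language?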

Note: I'm a beginner in ML. So far I've done some projects with NLP, and I have an idea of how Transformers work but no hands-on experience with them. The simpler the answer, the easier it will be for me to understand. Thanks!

Topic: transfer-learning speech-to-text unsupervised-learning semi-supervised-learning

Category: Data Science
