What are the differences between self-supervised and semi-supervised learning in NLP?

The GPT-1 paper mentions both semi-supervised learning and unsupervised pre-training, but they seem the same to me. Moreover, Semi-supervised Sequence Learning by Dai and Le also looks more like self-supervised learning. So what are the key differences between them?

Topic: pretraining, semi-supervised-learning, nlp

Category: Data Science


Semi-supervised learning means you have labels for only a fraction of the data; in self-supervised learning no human-provided labels are available at all. Imagine a huge question/answer dataset. No one labels that data, but you can still learn question answering, because the relation between a question and its answer can be retrieved from the data itself.
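To make the contrast concrete, here is a minimal sketch assuming a toy corpus (the sentences and the "predict the last word" task are illustrative, not from the original post):

```python
# Hypothetical toy corpus of sentences.
corpus = ["the cat sat", "dogs bark loudly", "birds can fly", "fish swim fast"]

# Semi-supervised: only a small fraction of examples carries human labels;
# the rest of the data is used without any labels.
labeled = [("the cat sat", "animal"), ("dogs bark loudly", "animal")]
unlabeled = corpus[2:]  # no labels exist for these examples

# Self-supervised: labels are manufactured from the data itself,
# e.g. predict the last word of each sentence from its prefix.
self_supervised = [(" ".join(s.split()[:-1]), s.split()[-1]) for s in corpus]
# first derived example: ("the cat", "sat")
```

The key point: in the semi-supervised case a human produced `labeled`, while in the self-supervised case every example in `self_supervised` was derived mechanically from the raw text.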

Or, when modeling documents, you need sentences that are similar and sentences that are dissimilar in order to learn document embeddings, but such detailed labels are usually not available. In this case you count sentences from the same document as similar and sentences from two different documents as dissimilar, and train your model on those derived labels (example idea: you can run topic modeling on the data to make the similar/dissimilar labels more accurate). Deriving the labels from the structure of the data itself is what makes this self-supervised.
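The pair-construction idea above can be sketched as follows; the corpus and the function name are hypothetical, and a real pipeline would feed these pairs into a contrastive training objective:

```python
import itertools
import random

# Hypothetical toy corpus: each document is a list of sentences.
documents = [
    ["The cat sat on the mat.", "It purred softly.", "Cats enjoy warm spots."],
    ["Stock prices fell sharply.", "Investors reacted quickly.", "Markets stay volatile."],
]

def make_contrastive_pairs(docs, n_negatives=3, seed=0):
    """Build labeled sentence pairs without human annotation:
    pairs from the same document -> similar (label 1),
    pairs from two different documents -> dissimilar (label 0)."""
    rng = random.Random(seed)
    pairs = []
    # Positive pairs: every sentence combination within one document.
    for doc in docs:
        for a, b in itertools.combinations(doc, 2):
            pairs.append((a, b, 1))
    # Negative pairs: sentences sampled from two different documents.
    for _ in range(n_negatives):
        i, j = rng.sample(range(len(docs)), 2)
        pairs.append((rng.choice(docs[i]), rng.choice(docs[j]), 0))
    return pairs

pairs = make_contrastive_pairs(documents)
```

With two three-sentence documents this yields six similar pairs and, here, three sampled dissimilar pairs; the labels come entirely from document membership, not from an annotator.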
