How to detect out-of-domain text input?
I have a text classifier that covers around 40 classes. The problem is that it has no way to handle input that doesn't match any of those classes: the model still assigns one of the valid intents. So my question is, what are the ways to identify that an input is out of domain? Right now I am using Facebook's fastText as the supervised classifier, but it gives very high confidence even when the input is out-of-domain.
Please note that I do not currently have any negative-class data, so I cannot train a binary classifier to distinguish in-domain from out-of-domain input beforehand. I have to do this using only the in-domain data (from all the valid classes), with no out-of-domain data. I believe this is still possible without negative-class data, but if it is not, I will fall back to the approach mentioned above (running a binary classifier before passing data to the main classifier).
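To make the in-domain-only constraint concrete, here is a rough sketch of the kind of check I mean. It is only an illustration under assumptions, not a tested solution. It builds one centroid per class from the in-domain training texts and rejects any input whose best cosine similarity falls below a cutoff. The `embed` function is a toy bag-of-words stand-in for real sentence vectors, and the training examples, the labels, and the `THRESHOLD` value are all hypothetical; the cutoff would have to be tuned on held-out in-domain data.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for real sentence embeddings."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def fit_centroids(examples):
    """Sum the vectors of each class. Cosine ignores scale, so the
    sum behaves like the mean here."""
    centroids = {}
    for text, label in examples:
        centroids.setdefault(label, Counter()).update(embed(text))
    return centroids


def classify_or_reject(text, centroids, threshold):
    """Return the closest class, or None if nothing is similar enough."""
    vec = embed(text)
    score, label = max((cosine(vec, c), lbl) for lbl, c in centroids.items())
    return label if score >= threshold else None


# Hypothetical in-domain training data (two classes instead of 40).
train = [
    ("book a flight to paris", "book_flight"),
    ("book a flight to london tomorrow", "book_flight"),
    ("what is the weather today", "weather"),
    ("weather forecast for tomorrow", "weather"),
]
THRESHOLD = 0.3  # assumed value; tune on held-out in-domain data

centroids = fit_centroids(train)
in_domain = classify_or_reject("book a flight to berlin", centroids, THRESHOLD)
out_of_domain = classify_or_reject("my printer is broken", centroids, THRESHOLD)
```

With this toy data the first query is close to the `book_flight` centroid and keeps that label, while the second shares almost no vocabulary with either class and comes back as `None`. Whether a distance-based cutoff like this holds up with 40 real classes is exactly what I am unsure about.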
Suggestions for possible algorithms and/or pipelines for achieving this are most welcome, as are framework recommendations (if any).
Topic fasttext text-classification classification
Category Data Science