How can I use all candidate spelling corrections for my documents before clustering them?
I have a data set of many documents, each 50 to 100 words long.
I need to clean the data by correcting the misspelled words in those documents.
I have an algorithm that predicts possible correct words for each misspelled word.
The problem is that I would have to choose among, or manually verify, the predictions made by that algorithm in order to fix the spelling errors in the documents.
Instead of picking a single correction, can I include all of the predicted candidate words in each document's word vector and then cluster on those vectors?
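
To make the idea concrete, here is a minimal sketch of what I have in mind, not a tested solution. The `candidates` dictionary is a hypothetical stand-in for the output of my spelling algorithm, and splitting one unit of weight evenly among the candidates is just an assumption I made so that a misspelled word does not count more than a correctly spelled one.

```python
from collections import Counter

# Hypothetical output of the spelling algorithm:
# misspelled word -> list of candidate corrections.
candidates = {
    "aple": ["apple", "ample"],
    "bananna": ["banana"],
}

def expand_document(text, candidates):
    """Build a weighted bag of words in which each misspelled token is
    replaced by all of its candidates, sharing one unit of weight."""
    weights = Counter()
    for token in text.lower().split():
        # Tokens without candidates are kept unchanged.
        options = candidates.get(token) or [token]
        for option in options:
            weights[option] += 1.0 / len(options)
    return weights

def to_vector(weights, vocabulary):
    """Turn a weighted bag of words into a fixed-length vector."""
    return [weights.get(term, 0.0) for term in vocabulary]

docs = ["I ate an aple", "the bananna and the apple"]
bags = [expand_document(d, candidates) for d in docs]

# Shared vocabulary over all expanded documents.
vocabulary = sorted(set().union(*bags))
vectors = [to_vector(b, vocabulary) for b in bags]
```

The resulting `vectors` could then be passed to whatever clustering method is used on ordinary word-count vectors. What I am unsure about is whether the extra, wrong candidates (such as "ample" above) add enough noise to hurt the clusters.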
Category Data Science