How can we perform STS (Semantic Textual Similarity) on an unlabelled dataset using deep learning?
How do you implement STS (Semantic Textual Similarity) on an unlabelled dataset? The dataset has three columns: Unique_id, text1 (contains a paragraph), and text2 (contains a paragraph).
Ex: Column representation: Unique_id | Text1 | Text2
Unique_id 0
Text1
public show for Reynolds suspension of his coaching licence. portrait Sir Joshua Reynolds portrait of omai will get a public airing following fears it would stay hidden because of an export wrangle.
Text2
then requested to do so by Spain's anti-violence commission. The fine was far less than the expected amount of about £22 000 or even the suspension of his coaching license.
Unique_id 1
Text1
Groening. Gervais has already begun writing the script but is keeping its subject matter a closely guarded secret. he will also write a part for himself in the episode. I've got the rough idea but this is the most intimidating project of my career.
Text2
Philadelphia said they found insufficient evidence to support the woman s allegations regarding an alleged incident in January 2004. The woman reported the allegations to Canadian authorities last month. Cosby s lawyer Walter m Phillips jr said the comedian was pleased with the decision.
In the above problem, I have to compare the two paragraphs in each row, i.e. Text1 and Text2, and decide whether they are semantically similar. If they are, the output should be '1'; if not, '0'.
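To make the expected input and output concrete, here is a minimal sketch of the pipeline I have in mind. It is not a solution: the bag-of-words embed function is only a stand-in for a deep-learning sentence encoder, the threshold of 0.5 is an arbitrary assumption (with no labels it would have to be chosen by inspecting the scores), and the names THRESHOLD, embed, cosine, and label_pair are hypothetical, made up for illustration. The two rows are shortened excerpts of the examples above.

```python
import math
import re
from collections import Counter

# Assumed cut-off; without labels it can only be picked by inspecting scores.
THRESHOLD = 0.5


def embed(text):
    """Stand-in encoder: lowercase bag-of-words counts.

    A pretrained sentence-embedding model would replace this function.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def label_pair(text1, text2, threshold=THRESHOLD):
    """Return 1 if the pair scores at or above the threshold, else 0."""
    return 1 if cosine(embed(text1), embed(text2)) >= threshold else 0


# Rows in the layout Unique_id | Text1 | Text2 (excerpts of the examples above).
rows = [
    (0,
     "public show for Reynolds suspension of his coaching licence.",
     "The fine was far less than the expected amount of about £22 000 "
     "or even the suspension of his coaching license."),
    (1,
     "Gervais has already begun writing the script but is keeping its "
     "subject matter a closely guarded secret.",
     "The woman reported the allegations to Canadian authorities last month."),
]

for unique_id, text1, text2 in rows:
    print(unique_id, label_pair(text1, text2))
```

Only embed would change if a neural encoder were used; the cosine score and the thresholding into 1 or 0 would stay the same.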
Any link to a reference implementation, or any other suggestion, would be welcome.
Thanks in advance!
Topic unsupervised-learning deep-learning nlp similarity
Category Data Science