Conceptual question about cosine similarity and clustering algorithms for word embeddings

sigma_factor

January 20, 2022, 12:33

Is the following statement true? https://stats.stackexchange.com/q/256778

The value of the cosine similarity between two terms is not, by itself, an indicator of whether they are similar.

If so, how is the use of clustering algorithms such as DBSCAN on word embeddings justified? As far as I know, DBSCAN only looks at a point's immediate neighbours when deciding whether to include it in a cluster. That seems like the wrong approach: perhaps we should instead compare every word with every other word and keep only the top-ranked words.
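To make the two approaches concrete, here is a minimal sketch in plain Python. The 2-D "embeddings", the word list, and the helper names `eps_neighbors` and `top_k` are all made up for illustration; real word vectors have hundreds of dimensions. `eps_neighbors` mimics the kind of neighbourhood query DBSCAN runs for each point (an absolute cutoff on cosine distance, checked against every other word), while `top_k` is the ranking approach described above.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings, invented for this example only.
embeddings = {
    "cat": (1.0, 0.1),
    "dog": (0.9, 0.2),
    "kitten": (1.0, 0.0),
    "car": (0.1, 1.0),
    "truck": (0.2, 0.9),
}

def eps_neighbors(word, eps):
    """Words whose cosine distance (1 - similarity) to `word` is at most eps.

    This is an absolute threshold: every other word is checked,
    and a word with no close neighbours gets an empty list.
    """
    return sorted(
        other
        for other in embeddings
        if other != word
        and 1.0 - cosine_similarity(embeddings[word], embeddings[other]) <= eps
    )

def top_k(word, k):
    """The k words ranked most similar to `word`.

    This is a relative ranking: it always returns k words,
    however low their absolute similarity is.
    """
    others = [w for w in embeddings if w != word]
    others.sort(
        key=lambda w: cosine_similarity(embeddings[word], embeddings[w]),
        reverse=True,
    )
    return others[:k]

print(eps_neighbors("cat", 0.05))  # threshold-based neighbourhood
print(top_k("cat", 3))             # ranking-based neighbourhood
```

With these toy vectors, the threshold query for "cat" returns only "dog" and "kitten", whereas the top-3 ranking also pulls in "truck", even though its similarity to "cat" is low. So the two approaches differ mainly in whether "similar" is defined by an absolute cutoff (which has to be tuned to the embedding space) or by rank.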

Topic cosine-distance word-embeddings clustering

Category Data Science
