DBSCAN produces one huge cluster that absorbs noisy points

I'm currently trying to cluster customer service email answers (NLP).

When I use DBSCAN with TF-IDF embeddings + Annoy indexes, I get good clusters.

But when I use DBSCAN with FastText embeddings + Annoy indexes, the clusters are good except for the one labeled 0, which seems to include lots of noisy points (points that should be labeled -1 instead of 0).

Does anyone have an idea of what could cause this? I'm using eps=0.5 in both cases.
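
One check that may help here, sketched under assumptions: compare how far each point is from its k-th nearest neighbour in each embedding space, since a fixed eps=0.5 only means the same thing in both cases if those distances are on a similar scale. The sketch below uses plain Python and cosine distance (an assumption, since the Annoy metric in use is not stated above). The toy vectors and the helper names `kth_neighbor_distances` and `fraction_within` are made up for illustration and are not real TF-IDF or FastText output.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def kth_neighbor_distances(vectors, k):
    """Sorted list of each vector's distance to its k-th nearest other vector."""
    result = []
    for i, u in enumerate(vectors):
        dists = sorted(
            cosine_distance(u, v) for j, v in enumerate(vectors) if j != i
        )
        result.append(dists[k - 1])
    return sorted(result)


def fraction_within(kth_distances, eps):
    """Share of points whose k-th neighbour lies within eps."""
    return sum(d <= eps for d in kth_distances) / len(kth_distances)


# Hypothetical toy data: vectors with little overlap (stand-in for sparse
# features) versus vectors that all share a large common component
# (stand-in for dense averaged embeddings).
sparse_like = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0.1, 0]]
dense_like = [[1, 1, 1.2], [1, 1.1, 1], [1.2, 1, 1], [1, 0.9, 1.1]]

EPS = 0.5  # same value as in the question
K = 2      # assumed neighbour rank, chosen only for this example

sparse_share = fraction_within(kth_neighbor_distances(sparse_like, K), EPS)
dense_share = fraction_within(kth_neighbor_distances(dense_like, K), EPS)
print("sparse-like share within eps:", sparse_share)
print("dense-like share within eps:", dense_share)
```

If nearly every point in the FastText space already has its k-th neighbour within 0.5, most points would count as core points and chain into a single cluster, which might explain label 0 absorbing points expected to be -1. In that case a smaller eps, read off the sorted k-th neighbour distances, could be worth trying for that embedding only.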

