Is Elasticsearch recommended if the attributes being searched are not large text documents?

We are currently developing a system on the MEAN stack with MongoDB as the backend. We store employees' names and IDs, and our client wants a pretty good (read: Google-like) search over employees' records. He needs the system to suggest employees even if he has misspelled the name, etc.

One of the suggestions from our development lead was to use Elasticsearch, but from what I have seen, Elasticsearch is preferred mainly in scenarios where we are searching large text documents. I have a feeling that using Elasticsearch in our case will only add overhead, as the attributes being searched are just names, emails, IDs, etc.

Currently, our system only uses some basic regex, but I believe we can improve our search with some effort and without moving to Elasticsearch (though the development lead feels that we would only be reinventing the wheel and should use technologies that do this for us).

So, I would like some guidance on whether to move to Elasticsearch or to stick with our current resources and build a better-optimized search than what we have right now.

Topic search-engine mongodb search

Category Data Science


As you mentioned, you are searching only names, emails, IDs, etc., which are not large text.

So consider a case where you have 6 documents/records with the names below; it helps show how much large text really matters.

  1. Rohit kumar Bhatnagar
  2. Shilpa Shinde
  3. Manoj kumar
  4. Rohit Bhatnagar Kandaswamy
  5. Rohit kumar shindey Bhatnagar
  6. Rohit Bhatnagar

If a user searches for Rohit Bhatnagar, then with regex you will be showing results in one of two ways:

Case I: The regex is a strict match

  1. Rohit Bhatnagar Kandaswamy
  2. Rohit Bhatnagar

Case II: The regex is relaxed

  1. Rohit kumar Bhatnagar
  2. Rohit Bhatnagar Kandaswamy
  3. Rohit kumar shindey Bhatnagar
  4. Rohit Bhatnagar

If we examine Case I, you are missing two things: relaxation (records 1 and 5 are missed) and ranking, where the exact match, which is likely the most relevant, should come on top.

In Case II, we relaxed the match, but the exact match is still at the bottom of the ranking.
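The two cases above can be reproduced with a minimal Python sketch. The helper `regex_search` and the two patterns are illustrative assumptions, not the asker's actual code; results come back in record order because a plain regex filter has no notion of ranking.

```python
import re

# The six example records from above, in record order.
NAMES = [
    "Rohit kumar Bhatnagar",
    "Shilpa Shinde",
    "Manoj kumar",
    "Rohit Bhatnagar Kandaswamy",
    "Rohit kumar shindey Bhatnagar",
    "Rohit Bhatnagar",
]

def regex_search(pattern, names):
    """Return the names matching `pattern` (hypothetical helper), in input order."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [name for name in names if rx.search(name)]

# Case I: strict match, the two words must be adjacent.
strict = regex_search(r"Rohit Bhatnagar", NAMES)

# Case II: relaxed match, anything may appear between the two words.
relaxed = regex_search(r"\bRohit\b.*\bBhatnagar\b", NAMES)

print(strict)   # records 4 and 6
print(relaxed)  # records 1, 4, 5 and 6, with the exact match last
```

In both cases the regex only filters; it cannot say that record 6 is a better match than record 4 or 5.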

So if you need relevant search, then yes, you can use a search engine. You can also tune whether record 5 should match at all, since it looks irrelevant: you can control how many words may appear between the query terms for a document to count as a match. If we allow at most 1 word in between, record 5 is eliminated from the results.
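To make the "words in between" idea concrete, here is a small pure-Python sketch. The functions `words_between` and `phrase_search` and the ranking rule (fewer gap words first, then shorter names) are assumptions for illustration; in Elasticsearch, the `slop` parameter of a `match_phrase` query plays a broadly similar role, although its exact semantics and scoring differ.

```python
NAMES = [
    "Rohit kumar Bhatnagar",
    "Shilpa Shinde",
    "Manoj kumar",
    "Rohit Bhatnagar Kandaswamy",
    "Rohit kumar shindey Bhatnagar",
    "Rohit Bhatnagar",
]

def words_between(query, name):
    """Smallest number of extra words separating the query terms, in order.

    Returns None when the terms do not all occur in order in `name`.
    """
    terms = query.lower().split()
    tokens = name.lower().split()
    if not terms:
        return None
    best = None
    for start, token in enumerate(tokens):
        if token != terms[0]:
            continue
        pos = start
        try:
            # Greedily take the earliest later occurrence of each next term.
            for term in terms[1:]:
                pos = tokens.index(term, pos + 1)
        except ValueError:
            continue
        gap = (pos - start + 1) - len(terms)
        if best is None or gap < best:
            best = gap
    return best

def phrase_search(query, names, max_gap=1):
    """Keep names whose gap is at most `max_gap`; rank tighter, shorter matches first."""
    hits = []
    for name in names:
        gap = words_between(query, name)
        if gap is not None and gap <= max_gap:
            hits.append((gap, len(name.split()), name))
    hits.sort(key=lambda hit: (hit[0], hit[1]))
    return [name for _, _, name in hits]

for name in phrase_search("Rohit Bhatnagar", NAMES, max_gap=1):
    print(name)
```

With `max_gap=1`, record 5 drops out and the exact match is ranked first; raising `max_gap` to 2 brings record 5 back at the bottom.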

Beyond search relevance, you can scale your search horizontally if QPS is high. You can also use machine learning techniques (learning to rank) and apply synonyms, stemming, etc. There are many other features, such as joins, streaming, and sharding, which you can learn about from the documentation:

Solr: http://lucene.apache.org/solr/guide/7_6/

ES: https://www.elastic.co/guide/en/elasticsearch/reference/6.4/index.html

If any of the above is required now or in the near future, you can use Elasticsearch or Solr as the search engine; otherwise you should be fine with the current system.
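On the misspelling requirement from the question: if you stay with the current system, a rough typo-tolerant lookup can be sketched with Python's standard `difflib` module. This is only an illustration under assumptions (the helper `fuzzy_search` and the 0.6 cutoff are made up here), and it scans every name, so it would only suit a small employee list.

```python
from difflib import SequenceMatcher

NAMES = [
    "Rohit kumar Bhatnagar",
    "Shilpa Shinde",
    "Manoj kumar",
    "Rohit Bhatnagar Kandaswamy",
    "Rohit kumar shindey Bhatnagar",
    "Rohit Bhatnagar",
]

def fuzzy_search(query, names, cutoff=0.6):
    """Return names whose similarity to `query` is at least `cutoff`, best first.

    `cutoff` is an arbitrary illustrative threshold, not a recommended value.
    """
    q = query.lower()
    scored = []
    for name in names:
        score = SequenceMatcher(None, q, name.lower()).ratio()
        if score >= cutoff:
            scored.append((score, name))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]

# "Bhatnagr" is a deliberate misspelling of "Bhatnagar".
print(fuzzy_search("Rohit Bhatnagr", NAMES))
```

A search engine does the same kind of thing with an index instead of a full scan, which is where it starts to pay off as the data or the query load grows.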
