
I am using LSA, TF-IDF, BM25, and ensemble models for text search, and I compute a similarity score to rank the results. I would like to choose a threshold for that score, below which nothing is displayed. For example, if the similarity score is less than 0.7, I would like to return "No Result Found".

I know there won't be a single value that works everywhere, but I would appreciate suggestions on how to find a suitable one.

My thoughts:

Idea 1: Calculate the similarity scores for all past searches, take the average of the top 10 scores for each search, and then use the mean of those averages as the threshold (I'm not sure this is a good idea).
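To make Idea 1 concrete, here is a rough sketch of the calculation. It assumes the scores from past searches are available as one list of floats per query; the function name `top_k_mean_threshold` and the input layout are made up for illustration and are not part of any library.

```python
from statistics import mean


def top_k_mean_threshold(scores_per_query, k=10):
    """Mean of the per-query averages of the top-k similarity scores.

    scores_per_query: one list of similarity scores per past search
    (hypothetical layout, adapt to however the search logs are stored).
    """
    per_query_averages = []
    for scores in scores_per_query:
        if not scores:
            # Skip searches that returned nothing at all.
            continue
        top_k = sorted(scores, reverse=True)[:k]
        per_query_averages.append(mean(top_k))
    if not per_query_averages:
        raise ValueError("no past scores to estimate a threshold from")
    return mean(per_query_averages)


# Two past searches, averaging the top 2 scores of each.
past_scores = [[0.9, 0.7, 0.2], [0.5, 0.3]]
threshold = top_k_mean_threshold(past_scores, k=2)
```

Note that a mean-based cutoff like this sits in the middle of the top-k averages, so it would likely hide results for a large share of past searches, which may be one reason to doubt this idea.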

Idea 2: Deploy the model to production showing the top 15-20 results per search, and collect users' click data. After a month, the data might show, for example, that 95% of the time users do not click any result with a score below 0.60, which would suggest a threshold.
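A minimal sketch of how Idea 2 could be turned into a number, assuming the click logs can be reduced to a flat list of similarity scores of the results users actually clicked. The names `click_based_threshold` and `search_or_no_result`, and the `(doc_id, score)` result format, are hypothetical.

```python
import math


def click_based_threshold(clicked_scores, coverage=0.95):
    """Largest cutoff that still keeps at least `coverage` of clicked results.

    clicked_scores: similarity scores of results that users clicked
    (assumed to come from a month or so of production logs).
    """
    if not clicked_scores:
        raise ValueError("no click data collected yet")
    ordered = sorted(clicked_scores)
    n = len(ordered)
    # Number of clicked results that must stay at or above the cutoff.
    keep = math.ceil(coverage * n)
    return ordered[n - keep]


def search_or_no_result(results, threshold):
    """Filter ranked (doc_id, score) pairs; fall back to a message if none pass."""
    kept = [(doc_id, score) for doc_id, score in results if score >= threshold]
    return kept if kept else "No Result Found"


# Toy click log: 20 clicked results with scores 0.05, 0.10, ..., 1.00.
clicked = [i / 20 for i in range(1, 21)]
threshold = click_based_threshold(clicked, coverage=0.95)
page = search_or_no_result([("doc_a", 0.42), ("doc_b", 0.03)], threshold)
```

The `coverage` parameter makes the trade-off explicit: lowering it raises the cutoff and hides more low-scoring results, at the cost of sometimes hiding something a user would have clicked.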
