Given a query sentence, we retrieve similar sentences from our corpus using transformer-based models for semantic textual similarity.

  • For one query sentence, we might get 200 similar sentences with scores ranging from 0.95 to 0.55.

  • For a second query sentence, we might get 200 similar sentences with scores ranging from 0.44 to 0.27.

  • For a third query sentence, we might only get 100 similar sentences with scores ranging from 0.71 to 0.11.

In all those cases, is there a way to predict where our threshold should be without losing too many relevant sentences? A similarity score of 1.0 does not mean that two documents are twice as similar as a pair scoring 0.5. Is there a way to determine the topk parameter (how many of the top-scoring sentences we should return)?

DarknessPlusPlus

1 Answer

As far as I know, there is no satisfactory answer:

  • One uses a threshold to avoid having to choose a specific K in a top-K approach. The threshold is often selected manually to eliminate the sentences that are clearly not relevant. In my view, this makes the method better suited to favouring recall.
  • Conversely, one uses a top-K approach to avoid having to select a threshold. I think K is often set quite low in order to keep mostly relevant sentences, i.e. it's an approach better suited to high-precision tasks.
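
To make the contrast concrete, here is a minimal sketch of the two selection strategies. The function names and the scores are invented for illustration; they do not come from any particular library.

```python
def select_by_threshold(scored, threshold):
    """Keep every (sentence, score) pair whose score is at least `threshold`."""
    return [(sentence, score) for sentence, score in scored if score >= threshold]


def select_top_k(scored, k):
    """Keep the `k` highest-scoring (sentence, score) pairs."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical similarity scores for one query (invented values).
scored = [("a", 0.95), ("b", 0.80), ("c", 0.62), ("d", 0.55), ("e", 0.30)]

by_threshold = select_by_threshold(scored, 0.5)  # size depends on the scores
by_top_k = select_top_k(scored, 2)               # size is always at most k
```

With a threshold, the number of returned sentences varies from query to query; with top K, it is fixed, regardless of how low the scores go.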

The choice depends on the task:

  • First, the approach can be chosen based on what is done with the selected sentences: for something like question answering, one usually wants high precision. For information retrieval, one wants high recall. For a search engine, just rank the sentences by decreasing similarity.
  • Then, for the value itself (K or threshold), the ideal case is to do some hyper-parameter tuning, i.e. testing multiple values and evaluating the results. If this is not convenient or doable for the task, then look at a few examples and manually select a value which looks reasonable.
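
The tuning step could look roughly like the sketch below. It assumes you have a small set of queries with manually labelled relevant sentences, and it uses mean F1 as the selection criterion; both the labelled data and the choice of F1 are assumptions on my part, and a recall-oriented task would want a different metric. All names and values are hypothetical.

```python
def precision_recall_f1(selected, relevant):
    """Compute precision, recall and F1 for one query from two collections of ids."""
    selected, relevant = set(selected), set(relevant)
    true_positives = len(selected & relevant)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


def tune_threshold(queries, candidates):
    """Return (threshold, mean F1) for the candidate with the highest mean F1.

    `queries` is a list of (scored, relevant_ids) pairs, where `scored`
    is a list of (sentence_id, similarity_score) pairs.
    """
    best = None
    for threshold in candidates:
        f1_scores = []
        for scored, relevant in queries:
            selected = [sid for sid, score in scored if score >= threshold]
            f1_scores.append(precision_recall_f1(selected, relevant)[2])
        mean_f1 = sum(f1_scores) / len(f1_scores)
        if best is None or mean_f1 > best[1]:
            best = (threshold, mean_f1)
    return best


# Invented labelled data: two queries, each with scored candidates
# and the set of ids judged relevant by hand.
queries = [
    ([("s1", 0.9), ("s2", 0.7), ("s3", 0.4), ("s4", 0.2)], {"s1", "s2"}),
    ([("s5", 0.8), ("s6", 0.6), ("s7", 0.5), ("s8", 0.3)], {"s5", "s6", "s7"}),
]

best_threshold, best_f1 = tune_threshold(queries, [0.3, 0.5, 0.7])
```

The same loop works for K: replace the threshold filter with a sort and a slice, and sweep over candidate values of K instead.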
Erwan