
I am working on a multilabel recommender project and I am trying to evaluate it as a ranking problem.

I calculate recall@k and precision@k, and both look reasonable: recall increases and precision decreases as I try higher K values, which is expected.

However, NDCG@K increases up to a certain K and stays the same after that. How can we explain such behaviour?

Tasos

1 Answer


It is hard to give a precise answer for your specific problem (in particular, the value of K beyond which you no longer see an increase), and I guess you might have moved on from this question by now.

But from a theoretical perspective, DCG is a sum of relevance values weighted by a discount function: the item at rank $i$ contributes $rel_i \cdot d(i)$. In a multilabel ranking problem you will use a binary relevance function ($rel_i$ is 1 if the label at rank $i$ belongs to the ground-truth label set, 0 otherwise). The discount function is by definition decreasing in the rank, so the contributions of poorly ranked items vanish towards 0. The most common version of DCG uses a $\frac{1}{\log_2(\text{rank}+1)}$ discount, so the contribution of a very poorly ranked positive label ends up being $O\big(\frac{1}{\log K}\big)$, i.e. essentially 0 for very large $K$. Once $K$ is past the well-ranked positive labels, increasing it further adds almost nothing to either DCG@K or the ideal DCG@K it is normalized by, which is why NDCG@K plateaus.
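To make this concrete, here is a minimal sketch in Python (the helper functions and toy data are mine, not from the question; it assumes binary relevance and the common convention of truncating both DCG and IDCG at K):

```python
import numpy as np

def dcg_at_k(relevance, k):
    """DCG@K with the standard 1/log2(rank + 1) discount (ranks are 1-based)."""
    rel = np.asarray(relevance, dtype=float)[:k]
    ranks = np.arange(1, rel.size + 1)
    return float(np.sum(rel / np.log2(ranks + 1)))

def ndcg_at_k(relevance, k):
    """NDCG@K = DCG@K of the predicted ranking / DCG@K of the ideal ranking."""
    ideal = np.sort(relevance)[::-1]          # all positive labels ranked first
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(relevance, k) / idcg if idcg > 0 else 0.0

# Toy ranking of 1000 labels: three are relevant, two near the top, one buried deep.
relevance = np.zeros(1000, dtype=int)
relevance[[1, 4, 800]] = 1                    # 0-based positions -> ranks 2, 5 and 801

for k in (5, 10, 50, 100, 500, 1000):
    print(f"NDCG@{k}: {ndcg_at_k(relevance, k):.4f}")
```

Running this produces the kind of curve described in the question: NDCG@K rises while K is still sweeping up the well-ranked positives, then flattens out; even when the buried positive finally enters at K = 1000, its $\frac{1}{\log_2(802)} \approx 0.1$ contribution only nudges the score by a few hundredths.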

This can be alleviated by switching to a discount function that decreases more slowly, e.g. by changing the base of your log, or by using $\log(\log(\text{rank}))$ or a low power of the rank.
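For illustration, a small sketch comparing how quickly a few such discounts decay with rank (the specific alternative forms are my own examples, not taken from any particular paper):

```python
import numpy as np

# Each discount maps a 1-based rank to a weight; the flatter the curve,
# the more weight deeply ranked positive labels keep in DCG@K.
# Note: with the plain 1/log_b(rank+1) form, changing the log base only rescales
# DCG and IDCG by the same constant, which cancels in the NDCG ratio; the base
# matters mainly for variants that apply no discount to the first few ranks.
discounts = {
    "1/log2(r+1) (standard)":            lambda r: 1.0 / np.log2(r + 1),
    "1/log2(1 + log2(1+r)) (log-of-log)": lambda r: 1.0 / np.log2(1 + np.log2(1 + r)),
    "1/r**0.1 (low power of rank)":      lambda r: 1.0 / r ** 0.1,
}

ranks = np.array([1.0, 10.0, 100.0, 1_000.0, 10_000.0])
for name, d in discounts.items():
    print(f"{name:38s}", np.round(d(ranks), 3))
```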

Also, in general, NDCG@K converges to 1 as K goes to infinity.

See this paper for some theoretical work characterizing NDCG.