
I am using learning to rank in an e-commerce application. More precisely, I use the LambdaMART implementations in LightGBM and XGBoost.

I have a dataset of search queries and, for each query, the list of items that were displayed together with the action taken on each item. The actions in my dataset are purchased, bookmarked, clicked and viewed. I want to use these actions to assign a relevance label to each item in my training dataset.

I could assign the following relevance scores: {purchased: 3, bookmarked: 2, clicked: 1, viewed: 0}. While the scores do not affect how the pairs are built, they do affect ΔNDCG, so a mapping like {purchased: 10, bookmarked: 2, clicked: 1, viewed: 0} might help push purchased items forward. Additionally, the data is very sparse (a lot of items are only viewed), which probably means the algorithm sees a lot of ties, and that does not help convergence.
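For concreteness, here is a minimal sketch of my setup (with placeholder data; in LightGBM, the label_gain parameter sets the gain each label contributes to ΔNDCG, so the gains can be changed without relabeling the data):

```python
import numpy as np
import lightgbm as lgb

# Map raw actions to integer labels (the scheme discussed above).
label_map = {"purchased": 3, "bookmarked": 2, "clicked": 1, "viewed": 0}

# Placeholder data: 10 queries, 10 displayed items each, 5 features per item.
X = np.random.rand(100, 5)
actions = np.random.choice(list(label_map), size=100)
y = np.array([label_map[a] for a in actions])
group = [10] * 10

# Rather than inflating the label itself (purchased = 10), the per-label gains
# used in ΔNDCG can be set directly, e.g. gain(purchased) = 10:
ranker = lgb.LGBMRanker(objective="lambdarank", label_gain=[0, 1, 2, 10])
ranker.fit(X, y, group=group)
```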

Does anyone have a rigorous way of choosing relevance labels? And how do you handle the sparsity issue?


1 Answer


It might be worth investigating your dataset a little more before assigning the relevance labels. While I can't think of a "rigorous" way to assign them, my general approach would be to do some statistical analysis on the features.

You could use a logistic regression model to predict the probability of a user purchasing an item based on features such as item price, category, and user demographics. The predicted probability can then be scaled to assign relevance scores.
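As a rough sketch (the features and the 0-4 grade scale here are just assumptions), binning the predicted probability into integer grades could look like this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder (query, item) features: e.g. price, category id, user age.
X = np.random.rand(1000, 3)
purchased = np.random.randint(0, 2, size=1000)  # 1 if the item was purchased

# Fit P(purchase | features), then bin the probability into grades 0..4.
model = LogisticRegression().fit(X, purchased)
proba = model.predict_proba(X)[:, 1]
relevance = np.round(proba * 4).astype(int)  # 0 = least relevant, 4 = most
```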

Then try a t-test to compare the average engagement scores (e.g., clicks, purchases) of items that were bookmarked versus those that were only viewed. If the p-value is below your chosen threshold (e.g., 0.05), you can reject the null hypothesis and conclude that bookmarking is significantly more indicative of relevance.
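A sketch with scipy (the engagement numbers are made up; Welch's variant is used since the two groups likely have unequal variances):

```python
import numpy as np
from scipy import stats

# Made-up per-item engagement scores, grouped by action.
bookmarked = np.array([5.1, 7.3, 6.8, 9.0, 4.2])
viewed_only = np.array([1.2, 0.8, 2.1, 1.5, 0.9])

# Welch's t-test: is mean engagement different between the groups?
t_stat, p_value = stats.ttest_ind(bookmarked, viewed_only, equal_var=False)
if p_value < 0.05:
    print("Reject the null: bookmarking is associated with higher engagement")
```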

You could also try multiple regression, classification with a random forest to predict user actions, k-means clustering to group items based on user interaction patterns, etc.
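For instance, a k-means sketch on (assumed) per-item interaction rates, where the resulting clusters could serve as coarse relevance tiers:

```python
import numpy as np
from sklearn.cluster import KMeans

# Assumed per-item profile: [view rate, click rate, bookmark rate, purchase rate].
item_profiles = np.random.rand(500, 4)

# Group items into 4 engagement tiers; ordering the clusters by mean purchase
# rate would turn the cluster ids into coarse relevance grades.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(item_profiles)
tiers = kmeans.labels_
```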

As for handling sparsity, evaluate whether data augmentation techniques can help create a more balanced dataset by simulating additional user interactions, or do some additional feature engineering on your dataset.
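One feature-engineering direction, sketched with a toy interaction log: aggregate per-item action counts across all queries, so an item that is only viewed in a given query still carries a global engagement signal:

```python
import pandas as pd

# Toy interaction log: one row per (item, action) event.
log = pd.DataFrame({
    "item_id": [1, 1, 2, 2, 2, 3],
    "action":  ["viewed", "clicked", "viewed", "viewed", "purchased", "viewed"],
})

# Dense per-item action counts, plus a click-through-rate feature.
counts = log.pivot_table(index="item_id", columns="action",
                         aggfunc="size", fill_value=0)
counts["ctr"] = counts.get("clicked", 0) / counts["viewed"].clip(lower=1)
```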

If you haven't already investigated them, regularization techniques in LightGBM and XGBoost can help prevent overfitting, especially on sparse datasets.

In LightGBM you can use L1 and L2 regularization; similarly, XGBoost offers alpha for L1 and lambda for L2. Experimenting with different ranking approaches, such as pairwise versus listwise, can also help address sparsity issues.
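A quick sketch of the equivalent knobs in both libraries' scikit-learn wrappers (the values are just starting points to tune):

```python
import lightgbm as lgb
import xgboost as xgb

# LightGBM: reg_alpha/reg_lambda map to the core lambda_l1/lambda_l2 params.
lgb_ranker = lgb.LGBMRanker(
    objective="lambdarank",
    reg_alpha=0.1,   # L1
    reg_lambda=1.0,  # L2
)

# XGBoost: reg_alpha/reg_lambda are the alpha/lambda mentioned above;
# "rank:pairwise" is the pairwise objective, "rank:ndcg" the listwise-style one.
xgb_ranker = xgb.XGBRanker(
    objective="rank:ndcg",
    reg_alpha=0.1,   # L1 (alpha)
    reg_lambda=1.0,  # L2 (lambda)
)
```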

Hopefully this gives a bit more direction for your problem; trial and error with cross-validation should lead you to suitable relevance labels and address those sparsity concerns.
