In search, the position of a result strongly affects its click-through rate. How do people usually deal with this? In practice, how do you remove this bias to create unbiased training data for a learning-to-rank model?
2 Answers
What are your output (prediction) and inputs?
The position bias is itself very important for predicting the click-through rate, but it is much less useful if you want to assess how your ad copy or page title relates to the click-through rate.
One potential way to handle it is to model the impact of the position of the ad (or page, if it's for SEO), and then normalize the data, either by including the impact factor as a feature in your model or by adjusting the observed result by the impact factor.
I would suspect the impact follows a log function, as the number of clicks tends to drop significantly after the top 1/2 of the first page.
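As a rough illustration of "adjusting the result by the impact factor", here is a minimal sketch of a clicks-over-expected-clicks style normalization. The log-shaped `position_prior` is only the guess made above, not a measured curve; in practice you would estimate it from your own logs. The function names and the `(rank, clicked)` input format are assumptions made for this example.

```python
import math


def position_prior(rank):
    """Assumed chance that a result at this 1-based rank is examined.

    Hypothetical log-shaped decay: 1.0 at rank 1, 0.5 at rank 3, and so on.
    """
    return 1.0 / math.log2(rank + 1)


def clicks_over_expected_clicks(impressions):
    """Position-normalized click score for one item.

    impressions: iterable of (rank, clicked) pairs, one per time the item
    was shown. Each impression adds the position prior to the expected
    click mass, so clicks earned at low positions count for more.
    """
    clicks = 0
    expected = 0.0
    for rank, clicked in impressions:
        clicks += int(clicked)
        expected += position_prior(rank)
    return clicks / expected if expected else 0.0


# Two items with the same raw CTR (1 click in 2 impressions),
# but shown at different positions.
item_top = [(1, True), (1, False)]
item_low = [(3, True), (3, False)]
score_top = clicks_over_expected_clicks(item_top)
score_low = clicks_over_expected_clicks(item_low)
```

Under this assumed prior, the item shown at rank 3 ends up with the higher normalized score, because it earned the same clicks from a position that is examined less often. The normalized score (or the prior itself, as a feature or weight) could then feed the training data instead of the raw click-through rate.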
Discounted cumulative gain (DCG) is a common way to measure the quality of search results: it scores the usefulness, or gain, of each document and discounts that gain according to the document's position in the result list.
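For reference, a minimal sketch of DCG in its common form, where the gain at 1-based position i is divided by log2(i + 1). The graded relevance labels in the usage lines are made up for illustration, and `ndcg` is the usual normalization by the ideal ordering.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain for relevance labels in ranked order."""
    return sum(
        rel / math.log2(i + 1)
        for i, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG divided by the DCG of the best possible ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical graded relevance labels for a ranked list of six results.
ranked = [3, 2, 3, 0, 1, 2]
score = dcg(ranked)
normalized = ndcg(ranked)
```

Note that the same relevant document contributes less the lower it is placed: `dcg([1, 0])` is 1.0, while `dcg([0, 1])` is about 0.63.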