Consider a learning-to-rank setting where I learn from $N$ items displayed to the user for every query. Suppose I can quantify the examination probability $P[E_i]$ of each position $i$ given that the first position was examined (i.e. relative to the first position). The user feedback is binary: click / no-click.
Now suppose I'm using the softmax cross-entropy listwise loss (don't ask why - suppose it's a constraint of the system) on each query. What would be the 'correct' way to incorporate the position bias information?
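For concreteness, by softmax cross-entropy listwise loss I mean the standard formulation, with $s_i$ the model's score for the item at position $i$ and $y_i \in \{0, 1\}$ the click label:

$$\mathcal{L} = -\sum_{i=1}^{N} y_i \log \frac{e^{s_i}}{\sum_{j=1}^{N} e^{s_j}}.$$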
Intuitively, a click at a lower position is more informative than a click at a higher position, since the lower position is less likely to have been examined at all. So it seems beneficial to weight the loss term for a click at position $i$ inversely proportionally to $P[E_i]$. On the other hand, a non-click at a lower position is less informative (the user may simply not have examined it), so it would make sense to weight the terms inside the softmax such that a lower examination probability results in a smaller effect. But what are the correct weights? It's not obvious.
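To make the two candidate schemes concrete, here is a minimal sketch (PyTorch, single query; the function names and the exact placement of the weights are my own guesses, not something I claim to be correct):

```python
import torch
import torch.nn.functional as F

def ips_weighted_listwise_loss(scores, clicks, p_exam):
    """Candidate (a): inverse-propensity weighting of the clicked terms.

    scores: (N,) model scores for the N displayed items
    clicks: (N,) binary click labels
    p_exam: (N,) examination probability of each position, relative to position 1
    """
    log_probs = F.log_softmax(scores, dim=-1)
    # Upweight a click at a rarely-examined position by 1 / P[E_i]
    weights = clicks / p_exam
    return -(weights * log_probs).sum()

def exam_weighted_softmax_loss(scores, clicks, p_exam):
    """Candidate (b): fold the examination probability into the softmax itself.

    Adding log P[E_j] to each logit multiplies e^{s_j} by P[E_j] in the
    normalizer, so rarely-examined non-clicked positions "compete" less.
    """
    adjusted = scores + torch.log(p_exam)
    log_probs = F.log_softmax(adjusted, dim=-1)
    return -(clicks * log_probs).sum()

# Toy example: 4 positions, a single click at position 3 (index 2)
scores = torch.tensor([2.0, 1.5, 1.0, 0.5], requires_grad=True)
clicks = torch.tensor([0.0, 0.0, 1.0, 0.0])
p_exam = torch.tensor([1.0, 0.8, 0.5, 0.3])  # relative to position 1
print(ips_weighted_listwise_loss(scores, clicks, p_exam))
print(exam_weighted_softmax_loss(scores, clicks, p_exam))
```

Scheme (a) matches the "click at a low position is worth more" intuition, and scheme (b) the "non-click at a low position should count less" intuition, but I don't see how to derive either (or some combination) as the principled choice.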