2

On my website, users can either "like" or "dislike" a posted comment. I want to add a link that sorts comments by how well liked they are, so that the most liked ones appear at the top.

Of course, I cannot sort by the number of likes alone; I have to subtract the dislikes. But what if the difference (likes - dislikes) is the same for two comments, e.g. 10 - 8 = 2 and 5 - 3 = 2? I think that in this case the comment with 10 likes and 8 dislikes should come before the one with 5 likes and 3 dislikes.

So, is there a formula that takes the number of likes and the number of dislikes and returns a meaningful rating that I can sort by?

  • Why can't you code this directly? Given a list pick largest "like-dislike" and if two are equal sort by largest "like"? – Timothy Wagner Nov 27 '10 at 17:07
  • Yes, your criterion defines a perfectly good total order relation. Doesn't your programming language allow you to sort using arbitrary user-defined comparison functions? –  Nov 27 '10 at 18:23
  • ...Although you should really look at How Not To Sort By Average Rating by Evan Miller, which solves your underlying problem and not just the symptom. –  Nov 27 '10 at 18:58
  • There are many algorithms used for this problem because there are many ways to interpret and to use the data. One very different strategy would be to put comments on top if they've been rated many times, with a mix of likes and dislikes. That means people are interested in the comment and are likely to respond. (This could lead to intelligent debate or to name-calling, but both can result in more people visiting your site.) – Jonas Kibelbek Nov 28 '10 at 05:48
  • Jonas Kibelbek, what would that algorithm look like? –  Nov 29 '10 at 19:34
  • A friend suggested (L*(L-D))/(L+D+1). I don't know where he got it from. Would it work? –  Nov 29 '10 at 19:46
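
The rule suggested in the first two comments (sort by likes minus dislikes, break ties by likes) needs no special formula, only a composite sort key. A minimal Python sketch, assuming each comment is stored as a `(likes, dislikes)` pair; the comment ids and vote counts are made up for illustration:

```python
# Hypothetical data: comment id -> (likes, dislikes).
comments = {
    "a": (5, 3),
    "b": (10, 8),
    "c": (7, 1),
    "d": (2, 2),
}

def sort_key(votes):
    likes, dislikes = votes
    # Primary key: net score. Secondary key: number of likes,
    # so (10, 8) outranks (5, 3) when both have a net score of 2.
    return (likes - dislikes, likes)

# reverse=True puts the largest key (the best comment) first.
ranked = sorted(comments, key=lambda cid: sort_key(comments[cid]), reverse=True)
print(ranked)  # ['c', 'b', 'a', 'd']
```

Tuples compare element by element, so the second component only matters when the net scores are equal.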

3 Answers

2

You have to decide on a criterion that says when a pair $(L_1,D_1)$ for comment 1 is better than a pair $(L_2,D_2)$ for comment 2. One way is just to subtract, so $(L_1,D_1) \geq (L_2,D_2)$ if $L_1-D_1 \geq L_2-D_2$. But on Stack Exchange there are many more upvotes than downvotes, so maybe you want $(L_1,D_1) \geq (L_2,D_2)$ if $L_1-10D_1 \geq L_2-10D_2$ or some such. Maybe you want to compare on $\frac{L-D}{L+D}$, which is defined only when $L+D>0$. There are many choices, and you need to consider your audience and their behavior to select one. Any such function maps $(L,D)$ to a number, which you can then sort by.
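
To see how much the choice matters, the three candidate scores above can be compared on the same data. This is only an illustrative sketch: the function names, the sample votes, and the value returned for a comment with no votes are assumptions, not part of the answer.

```python
def net_score(likes, dislikes):
    return likes - dislikes

def weighted_score(likes, dislikes, penalty=10):
    # Each dislike counts `penalty` times as much as a like.
    return likes - penalty * dislikes

def ratio_score(likes, dislikes):
    total = likes + dislikes
    # (L - D) / (L + D) is undefined with no votes; 0.0 is an assumed fallback.
    return (likes - dislikes) / total if total else 0.0

# Hypothetical (likes, dislikes) pairs.
votes = [(10, 8), (5, 3), (3, 0), (40, 4)]

# Rank best-first under each scoring function.
rankings = {
    score.__name__: sorted(votes, key=lambda v: score(*v), reverse=True)
    for score in (net_score, weighted_score, ratio_score)
}
for name, order in rankings.items():
    print(name, order)
```

On this data the net score puts `(40, 4)` first, while the other two put the small but unanimous `(3, 0)` first, which is the kind of trade-off the answer asks you to weigh against your audience.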

Ross Millikan
1

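You can try $$f(L,D) = L - D + \frac{1}{D+2}.$$ Since $L - D$ is an integer and $0 < \frac{1}{D+2} \leq \frac{1}{2}$, sorting by $f$ in ascending order sorts first according to $L - D$ (ascending) and then, among ties, according to $D$ (descending).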
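
A quick check of this ordering, as a sketch with made-up vote counts. `Fraction` is used so that the small tie-breaking term is exact. The variant `g`, which subtracts the fraction so that a descending sort puts the higher-volume comment first among ties (as the question asks), is an assumption added here and not part of the answer.

```python
from fractions import Fraction

def f(likes, dislikes):
    # The answer's function; the fractional part always lies in (0, 1/2].
    return likes - dislikes + Fraction(1, dislikes + 2)

def g(likes, dislikes):
    # Assumed variant: same primary key, tie-break reversed.
    return likes - dislikes - Fraction(1, dislikes + 2)

# Hypothetical (likes, dislikes) pairs.
votes = [(5, 3), (10, 8), (7, 1), (4, 4)]

ascending_f = sorted(votes, key=lambda v: f(*v))
descending_g = sorted(votes, key=lambda v: g(*v), reverse=True)

print(ascending_f)   # [(4, 4), (10, 8), (5, 3), (7, 1)]
print(descending_g)  # [(7, 1), (10, 8), (5, 3), (4, 4)]
```

In both orderings `(10, 8)` comes before `(5, 3)`; only the descending sort by `g` also puts the highest net score on top.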

Yuval Filmus
0

Here's what governs (or at least used to govern) the Reddit "best" comment sorting:

http://www.evanmiller.org/how-not-to-sort-by-average-rating.html

The lower bound of the Wilson score confidence interval is a conservative estimate of the fraction of likes: "at least" how good the comment is, given the votes so far.

However, if you want to draw people into conversations and also show "new", "untested" comments on top, you might want to consider the upper bound (replace the "+/-" with a "+").

That way, the comments are sorted by the optimistic potential they have ("at most"), given current votes.

So, use the lower bound to see proven-good comments on top, and the upper bound if you want new comments (of unknown quality) to be above.
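
Both bounds come from the same standard Wilson score formula. A minimal Python sketch, assuming a 95% interval (`z = 1.96`) and an interval of `[0, 1]` for comments with no votes; the function name, comment labels, and vote counts are hypothetical:

```python
import math

def wilson_bounds(likes, dislikes, z=1.96):
    """Return (lower, upper) Wilson score bounds on the fraction of likes."""
    n = likes + dislikes
    if n == 0:
        # No votes yet: nothing is known (assumed convention).
        return 0.0, 1.0
    p = likes / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    denom = 1 + z * z / n
    return (centre - margin) / denom, (centre + margin) / denom

# Hypothetical comments: label -> (likes, dislikes).
votes = {"proven": (90, 10), "new": (1, 0), "mixed": (10, 8)}

# Pessimistic ranking: proven-good comments first.
by_lower = sorted(votes, key=lambda c: wilson_bounds(*votes[c])[0], reverse=True)
# Optimistic ranking: comments that might still turn out great first.
by_upper = sorted(votes, key=lambda c: wilson_bounds(*votes[c])[1], reverse=True)

print(by_lower)  # ['proven', 'mixed', 'new']
print(by_upper)  # ['new', 'proven', 'mixed']
```

The comment with a single like has a wide interval, so it sinks under the lower bound and rises under the upper bound, which is the behaviour described above.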

danuker