I have two lists where each item in the first list has a rating for each item in the second. I need to determine an optimal matching (or the best x scenarios) where items are matched, but each item can only be matched once.

For example, there are 150 movies available, 100 people, and each person has rated each movie between 0 and 100. Once a movie is given to a person, it is no longer available for another person.

In this example, I'd like to find the scenario in which the lowest rating among all assigned person/movie pairs is as high as possible. Then, among those scenarios, the average of all ratings should be as high as possible. Ideally, no one person would be unhappy with their movie (data permitting). There might be multiple optimal solutions, so I'd like to determine and rank the top 5 scenarios.

I plan to implement this in Node.js and JavaScript, so the memory footprint of the data structure(s) is a factor.

What would be the optimal data structure and approach to solve something like this?

Raphael
Gary

1 Answer

Your problem is closely related to the assignment problem, and can be solved efficiently using techniques derived from that literature. I'll show how to solve it, by building up a solution step by step.

Maximizing the lowest rating

Here is how you could efficiently find the assignment that maximizes the lowest rating. First, focus on the associated decision problem:

Input: movie ratings; an integer $t$
Goal: find an assignment where each person is assigned a movie they rate at least $t$

This decision problem is an instance of bipartite matching, and can be solved using standard algorithms for it. Namely, build a bipartite graph with an edge from each person to every movie they rated at least $t$, and look for a matching that covers every person.

Now, to find the assignment that maximizes the lowest rating, use binary search over $t$.
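Here is a minimal sketch of that idea in JavaScript. It assumes the ratings are stored as an array of rows, `ratings[p][m]`, with integer values from 0 to 100, and it uses a simple augmenting-path matching rather than a faster algorithm such as Hopcroft-Karp. The function names `assignWithThreshold` and `maxMinRating` are just illustrative.

```javascript
// Assumption: ratings[p][m] is the rating (0..100) person p gave movie m.
// Returns match[p] = movie given to person p, or null if some person
// cannot be given a movie they rated at least t.
function assignWithThreshold(ratings, t) {
  const people = ratings.length;
  const movies = people === 0 ? 0 : ratings[0].length;
  const ownerOf = new Int32Array(movies).fill(-1); // movie -> person, -1 if free

  // Try to give person p a movie, moving earlier people to other movies if needed.
  function tryAssign(p, visited) {
    for (let m = 0; m < movies; m++) {
      if (ratings[p][m] < t || visited[m]) continue;
      visited[m] = 1;
      if (ownerOf[m] === -1 || tryAssign(ownerOf[m], visited)) {
        ownerOf[m] = p;
        return true;
      }
    }
    return false;
  }

  for (let p = 0; p < people; p++) {
    if (!tryAssign(p, new Uint8Array(movies))) return null;
  }
  const match = new Array(people);
  for (let m = 0; m < movies; m++) {
    if (ownerOf[m] !== -1) match[ownerOf[m]] = m;
  }
  return match;
}

// Binary search for the largest t in [lo, hi] that still admits an assignment.
// This works because feasibility at t implies feasibility at every smaller t.
function maxMinRating(ratings, lo = 0, hi = 100) {
  let best = null;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const match = assignWithThreshold(ratings, mid);
    if (match) {
      best = { t: mid, match };
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return best; // null if even t = lo is infeasible (e.g. more people than movies)
}
```

For example, with two people and three movies, `maxMinRating([[90, 10, 50], [80, 20, 40]])` finds `t = 50` with the first person getting movie 2 and the second getting movie 0. With ratings limited to 0..100, the binary search needs at most 7 matching runs.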

This doesn't solve your problem yet, because it doesn't take into account that you want to resolve ties by maximizing the average rating.

Resolving ties by maximizing the average rating

First, find the largest value $t$ such that there exists an assignment where everyone is assigned a movie they rate $\ge t$. (See above for how to do it.) Now if there are multiple such assignments, we want to find one of them where the average rating is maximized.

How do we do that? Answer: use a standard algorithm for the assignment problem, such as the Hungarian algorithm.

In more detail, we do several steps:

  1. Delete any (person,movie) rating where the person has rated the movie something less than $t$. It's not OK to assign that person that movie.

  2. Now, you have an instance of the assignment problem, where each person has rated some of the movies with scores that are $t$ or larger. You want to find a matching (assignment) that maximizes the average of the scores of assigned movies. Since every solution must assign 1 movie to each of the 100 people, the size of every assignment is the same, so maximizing the average is equivalent to maximizing the sum. Maximizing the sum of the ratings is exactly the assignment problem. Therefore, you can use any standard algorithm for solving the assignment problem (e.g., the Hungarian algorithm).

This will find an optimal solution: i.e., an assignment that maximizes the lowest rating, and that resolves ties by maximizing the average rating.
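The two steps above could be sketched as follows in JavaScript. This is one possible implementation under some assumptions: it uses the common $O(n^2 m)$ potentials-based formulation of the Hungarian algorithm, turns maximization into minimization via `maxRating - rating`, and models the "deleted" pairs from step 1 by a very large cost instead of physically removing them. The name `bestAverageAssignment` and the constant `FORBIDDEN` are made up for illustration; `t` is the threshold found earlier.

```javascript
// Assumption: ratings[p][m] is the rating person p gave movie m, and there
// are at least as many movies as people. Pairs rated below t are forbidden.
// Returns { match, total, average }, or null if no valid assignment exists.
function bestAverageAssignment(ratings, t) {
  const n = ratings.length;
  const m = n === 0 ? 0 : ratings[0].length;
  if (n === 0 || n > m) return null;

  const FORBIDDEN = 1e9; // far larger than any sum of real costs
  let maxRating = 0;
  for (const row of ratings) for (const r of row) if (r > maxRating) maxRating = r;
  // Minimizing (maxRating - rating) is the same as maximizing the rating sum.
  const cost = (i, j) => (ratings[i][j] >= t ? maxRating - ratings[i][j] : FORBIDDEN);

  const u = new Float64Array(n + 1); // potentials for people (1-based)
  const v = new Float64Array(m + 1); // potentials for movies (1-based)
  const p = new Int32Array(m + 1);   // p[j] = person matched to movie j, 0 = none
  const way = new Int32Array(m + 1); // back-pointers along the augmenting path

  for (let i = 1; i <= n; i++) {
    p[0] = i;
    let j0 = 0;
    const minv = new Float64Array(m + 1).fill(Infinity);
    const used = new Uint8Array(m + 1);
    do {
      used[j0] = 1;
      const i0 = p[j0];
      let delta = Infinity;
      let j1 = 0;
      for (let j = 1; j <= m; j++) {
        if (used[j]) continue;
        const cur = cost(i0 - 1, j - 1) - u[i0] - v[j];
        if (cur < minv[j]) { minv[j] = cur; way[j] = j0; }
        if (minv[j] < delta) { delta = minv[j]; j1 = j; }
      }
      for (let j = 0; j <= m; j++) {
        if (used[j]) { u[p[j]] += delta; v[j] -= delta; }
        else minv[j] -= delta;
      }
      j0 = j1;
    } while (p[j0] !== 0);
    // Flip the matching along the augmenting path that was found.
    do {
      const j1 = way[j0];
      p[j0] = p[j1];
      j0 = j1;
    } while (j0 !== 0);
  }

  const match = new Array(n);
  let total = 0;
  for (let j = 1; j <= m; j++) {
    if (p[j] === 0) continue;
    const i = p[j] - 1;
    if (ratings[i][j - 1] < t) return null; // a forbidden pair was unavoidable
    match[i] = j - 1;
    total += ratings[i][j - 1];
  }
  return { match, total, average: total / n };
}
```

The working memory here is a handful of typed arrays of length about the number of movies, so for 100 people and 150 movies the ratings matrix itself dominates the footprint.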

If you want to find the top-5 solutions, you can use an algorithm for enumerating solutions to the assignment problem in a streaming/output-sensitive fashion. See, e.g., "Algorithm for a list of best solutions to the Assignment problem" and "Does Ford-Fulkerson always produce the left-most min-cut".

D.W.