
Imagine we have a multiple-choice exam with N questions. Suppose we have a set of K answer sheets for the exam, each with its total score (every question contributes 1 point if answered correctly and 0 otherwise).

How much information about the true answers can we extract from this set, and what is the shape of the space of answer keys that remain possible afterwards? (E.g., is it convex in some sense? Is it a linear subspace? Or something like that?)

E.g. if we have at least one answer sheet with a score of N, then we know all the correct answers. If we have an answer sheet with a score of 0, then we can exclude at least one answer from each question. (I first thought that two different answer sheets that both score N-1 would tell us the correct answers to all questions on which they agree, but on further thought I believe this is incorrect.) Etc.
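For small N, the first two observations above can be checked by brute-force enumeration of every answer key consistent with the given sheets. This is only a minimal sketch: the names `score` and `consistent_keys`, the use of `None` for a blank answer, and the three-option example are all assumptions made for illustration, and the running time is exponential in N.

```python
from itertools import product

def score(sheet, key):
    """Count the questions on which the sheet matches the key (None = blank)."""
    return sum(1 for s, k in zip(sheet, key) if s is not None and s == k)

def consistent_keys(options, sheets):
    """Return every answer key that reproduces all reported scores.

    options: one list of possible answers per question (hypothetical layout).
    sheets:  list of (answers, total_score) pairs.
    Enumerates all keys, so it is only usable for very small exams.
    """
    return [key for key in product(*options)
            if all(score(answers, key) == total for answers, total in sheets)]

# Three questions, each with options A, B, C (an assumed toy exam).
options = [["A", "B", "C"]] * 3

# A sheet with a perfect score leaves exactly one possible key.
perfect = consistent_keys(options, [(("A", "B", "C"), 3)])

# A sheet with score 0 rules out one option per question.
zero = consistent_keys(options, [(("A", "A", "A"), 0)])
```

Here `perfect` contains the single key matching the sheet, while `zero` contains every key that avoids "A" on all three questions.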

What's this problem called? Is it solvable in polynomial time? (It looks like it's reducible to integer programming.)

jkff

1 Answer


Hardness

Recovering the correct answers to the exam is NP-hard. I'll show how to reduce one-in-three 3SAT to it.

Suppose we have a 3CNF formula $\varphi$ on $N$ variables $x_1,\dots,x_N$. The $i$th question on the exam is "Is $x_i$ true?" There is one answer sheet per clause of $\varphi$. If $\varphi$ contains a clause mentioning variables $x_i,x_j,x_k$, say $x_i \lor \neg x_j \lor x_k$, then there's a corresponding answer sheet that has answered "Yes" to question $i$, "No" to question $j$, "Yes" to question $k$, and left all other questions blank, and this answer sheet received a score of exactly 1.
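The construction above can be sketched in a few lines. This is an illustrative sketch only: the function name `clauses_to_sheets`, the DIMACS-style encoding of literals as signed integers, and the use of `None` for a blank answer are assumptions, and each clause is assumed to mention three distinct variables.

```python
def clauses_to_sheets(num_vars, clauses):
    """Build one answer sheet per clause, each with a reported score of 1.

    clauses: list of tuples of nonzero ints, where 3 means x_3 and
             -2 means "not x_2" (assumed DIMACS-style encoding).
    Returns a list of (answers, score) pairs; None marks a blank answer.
    """
    sheets = []
    for clause in clauses:
        answers = [None] * num_vars          # blank on every question by default
        for lit in clause:
            # A positive literal answers "Yes" to "Is x_i true?", a negated one "No".
            answers[abs(lit) - 1] = "Yes" if lit > 0 else "No"
        sheets.append((tuple(answers), 1))   # exactly one of the three answers is right
    return sheets

# The clause x_1 OR (NOT x_2) OR x_3 over four variables.
sheets = clauses_to_sheets(4, [(1, -2, 3)])
```

For this clause the single sheet answers "Yes", "No", "Yes" on questions 1 to 3, leaves question 4 blank, and carries a score of 1.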

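An answer key is consistent with all of these sheets exactly when the corresponding truth assignment makes exactly one literal true in every clause. So if you can extract the correct answer to all of the questions, then you can extract a satisfying assignment for the corresponding one-in-three 3SAT problem, a task that is NP-hard.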

Therefore, you should not expect to find any efficient algorithm for your task (i.e., to extract the correct answers, given the scores on the answer sheets), at least in the worst case.

That's the bad news. Now for the good news:

Algorithms

In practice, a good way to approach your problem might be to throw a SAT solver or ILP solver at it: your problem can be readily formulated as an instance of integer linear programming (ILP), or indeed as an instance of SAT. This sounds like the kind of problem on which such solvers might be effective if $N$ is not too large.
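One possible ILP formulation, given as a sketch with notation that is assumed here rather than taken from the question: let $A_i$ be the set of options for question $i$, let $Q_k$ be the set of questions that sheet $k$ answered, let $s_{k,i}$ be its answer to question $i$, and let $c_k$ be its total score. Introduce a variable $y_{i,a} \in \{0,1\}$ that is 1 exactly when option $a$ is the correct answer to question $i$. The consistent answer keys are then the 0/1 solutions of

$$\sum_{a \in A_i} y_{i,a} = 1 \quad \text{for each question } i, \qquad \sum_{i \in Q_k} y_{i,\,s_{k,i}} = c_k \quad \text{for each sheet } k.$$

The first family of constraints says every question has exactly one correct option; the second says each sheet earns exactly its reported score. In this encoding the space of possible answer keys is the set of integer points of the polytope cut out by these equalities.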

D.W.