10

I would be surprised if this isn't a well-studied problem, but I'm not sure what else to search for at this point: you're given a set of binary $n$-vectors $S \subset \{0,1\}^n$. The problem is to find another set of binary $n$-vectors $B \subset \{0,1\}^n$, with minimal size $|B|$, such that every vector in $S$ can be expressed by the XOR results of some subset of $B$ (so $B$ is essentially a basis for $S$ using XOR instead of addition and allowing only binary coefficients in the linear combination).

In a way, this is a form of PCA for binary vectors. While searching for literature on this problem, I came across the Discrete Basis Problem also discussed in this PhD thesis, which seems closely related. Instead of XOR it uses OR, and here $|B|$ is an additional input (and the task is it to minimise the error in representing $S$ with vectors from $B$). This problem is NP-hard. Does the same apply to the problem I've presented above, or is there an efficient solution? Any pointers to existing literature would be much appreciated.

Raphael
  • 73,212
  • 30
  • 182
  • 400
Martin Ender
  • 663
  • 6
  • 16

1 Answers1

12

If you treat your vectors as over the field $GF(2)$ rather than over the set $\{0,1\}$, then what you ask is to find a basis for the span of a set of vectors. This is a well-studied problem in linear algebra, which you probably know the solution for. (One option is Gaussian elimination.)

Yuval Filmus
  • 280,205
  • 27
  • 317
  • 514