
Let $U$ be the universe and let $S$ be a collection of subsets of $U$ such that $\bigcup_{s \in S} s = U$. I am interested in computing the longest sequence of sets $s_1, \dots, s_k$ such that:

  1. $s_i \in S$ $\forall i \in [k]$

  2. $s_i \not\subseteq \cup_{j=1}^{i-1} s_j$ (so each $s_i$ adds at least one new element on top of the existing $i-1$ sets)

Is there a polynomial-time algorithm that computes this longest sequence of sets for a given $(U, S)$ instance? Or is the problem known to be hard (a reference paper on this would be much appreciated)?
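To make the two conditions concrete, here is a minimal brute-force sketch in Python. It is exponential and only meant for tiny instances; the function names `is_incremental` and `longest_incremental_sequence` are illustrative and not taken from any library.

```python
def is_incremental(seq):
    """Condition 2: every set contributes at least one element not seen before."""
    covered = set()
    for s in seq:
        if s <= covered:  # s is contained in the union of the earlier sets
            return False
        covered |= s
    return True


def longest_incremental_sequence(S):
    """Exhaustive search over orderings of S (exponential time)."""
    sets = [frozenset(s) for s in S]
    best = []

    def extend(seq, covered, remaining):
        nonlocal best
        if len(seq) > len(best):
            best = list(seq)
        for idx, s in enumerate(remaining):
            if not s <= covered:  # s still adds a new element
                rest = remaining[:idx] + remaining[idx + 1:]
                extend(seq + [s], covered | s, rest)

    extend([], frozenset(), sets)
    return best
```

For example, with $S = \{\{1\}, \{1,2\}, \{1,2,3\}\}$ all three sets can be used in increasing order, whereas with $S = \{\{1,2\}, \{2,3\}, \{1,3\}\}$ any two sets already cover $U$, so the longest sequence has length 2.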

2 Answers


Quoted from https://cstheory.stackexchange.com/a/27695/48912:

Here I show that the problem is NP-complete.

We convert a CNF formula to an instance of your problem as follows. Suppose that the CNF has $n$ variables $x_i$ and $m$ clauses $C_j$, where $n<m$. Let $U=\bigcup_i (A_i\cup B_i\cup Z_i)$, where all sets in the union are pairwise disjoint. Specifically, $A_i=\{a_{i,j}\mid x_i\in C_j\}\cup\{a_{i,0}\}$ and $B_i=\{b_{i,j}\mid \bar x_i\in C_j\}\cup\{b_{i,0}\}$, while $Z_i$ is any set of cardinality $k=2n+1$. Also denote $Z=\bigcup_i Z_i$ and fix for every $Z_i$ an increasing family of length $k$ inside it, denoted by $Z_{i,l}$ for $l=1,\dots,k$. The family of sets, denoted $\mathcal F$ (the collection $S$ of the question), is built as follows. For every variable $x_i$, we add $2k$ sets to $\mathcal F$: every set of the form $A_i \cup Z_{i,l}$ and $B_i \cup Z_{i,l}$. For every clause $C_j$, we add one set to $\mathcal F$, which contains $Z$ together with the element $a_{i,j}$ for every $x_i\in C_j$ and the element $b_{i,j}$ for every $\bar x_i\in C_j$.

Suppose that the formula is satisfiable and fix a satisfying assignment. For each variable $x_i$, pick the $k$ sets $B_i \cup Z_{i,l}$ if $x_i$ is true and the $k$ sets $A_i \cup Z_{i,l}$ if $x_i$ is false, in increasing order of $l$, so that the elements corresponding to true literals stay uncovered. These are $nk$ incremental sets. Now add the $m$ sets corresponding to the clauses. Each of these also adds a new element, because every clause contains a true literal. Finally, we can add $n$ more sets (one for each variable) to make the sequence cover $U$, for a total of $n(k+1)+m$ sets.
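The construction and the satisfiable direction can be checked mechanically on small formulas. The sketch below is one possible encoding, not part of the quoted proof: clauses are lists of nonzero integers in DIMACS style (`-i` stands for $\bar x_i$), elements of $U$ are tuples, $Z_{i,l}$ is taken to be the first $l$ elements of $Z_i$, and $b_{i,j}$ is assumed to be created for clauses containing $\bar x_i$. All function names are hypothetical.

```python
def build_instance(n, clauses):
    """Build the variable sets and clause sets of the reduction (k = 2n + 1)."""
    k = 2 * n + 1
    variables = range(1, n + 1)
    # A_i and B_i start with their private elements a_{i,0} and b_{i,0}.
    side = {("a", i): {("a", i, 0)} for i in variables}
    side.update({("b", i): {("b", i, 0)} for i in variables})
    clause_elems = []
    for j, clause in enumerate(clauses, start=1):
        elems = set()
        for lit in clause:
            tag = "a" if lit > 0 else "b"  # a_{i,j} for x_i, b_{i,j} for its negation
            elem = (tag, abs(lit), j)
            side[(tag, abs(lit))].add(elem)
            elems.add(elem)
        clause_elems.append(elems)
    # Z_{i,l}: the first l of the k elements of Z_i (an increasing chain).
    chain = {i: [frozenset(("z", i, t) for t in range(l)) for l in range(1, k + 1)]
             for i in variables}
    Z = frozenset().union(*(chain[i][-1] for i in variables))
    var_sets = {key: [frozenset(elems) | z for z in chain[key[1]]]
                for key, elems in side.items()}
    clause_sets = [Z | frozenset(elems) for elems in clause_elems]
    return k, var_sets, clause_sets


def sequence_from_assignment(n, clauses, assignment):
    """Sequence of length n(k+1)+m described above, for a given assignment."""
    k, var_sets, clause_sets = build_instance(n, clauses)
    seq = []
    for i in range(1, n + 1):  # k sets per variable, leaving true literals uncovered
        seq += var_sets[("b" if assignment[i] else "a", i)]
    seq += clause_sets         # one set per clause
    for i in range(1, n + 1):  # one final set per variable from the other side
        seq.append(var_sets[("a" if assignment[i] else "b", i)][0])
    return seq


def is_incremental(seq):
    covered = set()
    for s in seq:
        if s <= covered:
            return False
        covered |= s
    return True
```

For instance, with `clauses = [[1, -2], [2, 3], [-1, -3], [1, 2, 3]]` and the satisfying assignment `{1: True, 2: True, 3: False}`, the resulting sequence passes `is_incremental`; with the all-false assignment the last clause set adds nothing new and the check fails.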

Now suppose that $n(k+1)+m$ sets form an incremental sequence. At most $k+1$ sets corresponding to $x_i$ can be selected for each $x_i$. Thus, if the sequence contains no clause sets, at most $n(k+1)$ sets can be selected, which is too few. As soon as a clause set is selected, we can pick at most two more sets corresponding to each $x_i$, for a total of at most $2n$ sets. Therefore, at least $n(k-1)$ variable sets must be picked before any clause set is picked. But since we can pick at most $k+1$ sets for each $x_i$, this means that at least one set has been picked for each variable, as $k=2n+1$. This determines the "value" of the variable, so we can pick only "true" clauses.
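The two counting steps in this direction can be spelled out as follows; this is only a restatement of the bounds above with $k = 2n+1$ substituted. The number of variable sets picked before the first clause set is at least

$$\underbrace{n(k+1)+m}_{\text{total}} \;-\; \underbrace{m}_{\text{clause sets}} \;-\; \underbrace{2n}_{\text{variable sets after a clause set}} \;=\; n(k-1).$$

If some variable had no set picked before the first clause set, the remaining $n-1$ variables could contribute at most

$$(n-1)(k+1) \;=\; nk + n - k - 1 \;=\; n(k-1) + (2n - k - 1) \;=\; n(k-1) - 2 \;<\; n(k-1),$$

which is a contradiction, so every variable has at least one set picked before any clause set.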

xskxzr

Consider the decision version of the set-cover problem: given a collection of sets $S_i \subseteq U$, determine whether there exists a cover of size at most $k$.

Given such an instance, we create $k$ dummy elements $e_{d_1}, \dots, e_{d_k}$. Then, for each subset $S_i$, we create $k$ copies, one for each dummy element $e_{d_j}$.

This gives a reduction to the set-cover variant posed in the question. More specifically, if we can obtain a sequence of length at least $k$, then the original instance has a set cover of size $k$. The formal proof is left to the reader.

Hence, the considered problem is NP-hard.

codeR