I read a claim that
there are formulas for which any equivalent CNF has exponential length.
Can you show me an example for such a boolean formula? I have been trying to build it myself and failed.
I read a claim that
there are formulas for which any equivalent CNF has exponential length.
Can you show me an example for such a boolean formula? I have been trying to build it myself and failed.
Try parity. Parity of $n$ variables has formulas of size $O(n^2)$, but every clause in the CNF must be of width $n$, so there must be at least $2^{n-1}$ of them.
Here are some more details. The $O(n^2)$ formula for parity is obtained by a recursive construction. For simplicity, let's assume that $n = 2^m$ and only count leaves. We use the following formulas: $$ \begin{align*} \operatorname{parity}(x_1,\ldots,x_{2n}) = &(\operatorname{parity}(x_1,\ldots,x_n) \land \lnot \operatorname{parity}(x_{n+1},\ldots,x_{2n})) \\ \lor &(\lnot \operatorname{parity}(x_1,\ldots,x_n) \land \operatorname{parity}(x_{n+1},\ldots,x_{2n})) \end{align*}$$ and $\operatorname{parity}(x_1) = x_1$. Let $P(n)$ be the number of leaves in $\operatorname{parity}(x_1,\ldots,x_n)$. Then $$ P(2n) = 4P(n), \quad P(1) = 1. $$ The solution to this recurrence is $P(2^n) = 4^n$. Since parity of $n$ variables can be obtained from parity of $2^{\lceil \log n \rceil} < 2n$ variables, in total parity for $n$ variables can be realized using a formula containing at most $4n^2$ leaves. The length of a binary encoding of the formula is $O(n^2\log n)$ (the $\log n$ factor comes from the fact that the indices have length $\log n$).
Regarding CNFs, let's consider DNFs instead; the situation is symmetric. Suppose $\phi = T_1 \lor \cdots \lor T_\ell$ is a DNF for $\operatorname{parity}$. Each $T_i$ must contain $n$ variables: given the values of even $n-1$ variables, you cannot be certain that $\operatorname{parity}$ is true. So each term "catches" one $1$-input of $\operatorname{parity}$, of which there are $2^{n-1}$.
There might be a difference between equivalence and equisatisfiability that you are getting caught up on.
If you want an example where an equivalent conversion into CNF has an exponential growth, see wikipedia on Conjunctive normal form: Conversion into CNF:
$$(X_1 \wedge Y_1) \vee (X_2 \wedge Y_2) \vee \dots \vee (X_n \wedge Y_n).$$
This formula leads to exponential length CNF formulas, if the conversion must be equivalent.
The article continues on about how by introducing new variables, one can obtain an equisatisfiable formula that grows only linearly. These types of conversions usually create new models that contain the old model, so while the new formula is not equivalent, a satisfying assignment to the new formula will directly give a satisfying assignment to the old formula.
Though, there might be conversions that do not contain the old model, I've not seen one. As a side-note, Boolean satisfiabilty is self-reducible, which means that an algorithm that solves the decision problem, can be used to solve the optimization problem, via forcing one variable, and repeated calls to the solver. This means that even if you have a conversion that generates another, equisatisfiable formula, but for which the models are very different, you can find the satisfying assignment of the original formula (in time bounded by a polynomial of the runtime of the solver).