Counting the number of words accepted by an acyclic NFA

Question

Let $M$ be an acyclic NFA.

Since $M$ is acyclic, $L(M)$ is finite.

Can we compute $|L(M)|$ in polynomial time?

If not, can we approximate it?

Note that the number of words is not the same as the number of accepting paths in $M$, which is easily computable.

Let me mention one obvious approach that doesn't work: convert the NFA to a DFA (which will also be acyclic), then count the number of accepting paths in the DFA. This doesn't result in a polynomial-time algorithm, since the conversion can cause an exponential blowup in the size of the DFA.

score 4 · Answer 1 · edited Oct 09 '20 at 16:17

Here's one approach that I expect should give you a multiplicative-factor approximation, with polynomial time running time.

Let $L$ be a regular language that is a subset of $\{0,1\}^n$, e.g., $L=L(M) \cap \{0,1\}^n$. We will try to compute the approximate size of $L$.

At a high level, our approach to approximate $|L|$ will look something like this:

Pick a fraction $p$, where $0<p<1$.
Choose a regular language $R$ such that, roughly speaking, $R$ is a random subset of $\{0,1\}^n$ of size approximately $p2^n$ (i.e., $|R|\approx p2^n$).
Check whether $L \cap R$ is non-empty. Note that this check can be done in polynomial time.

Repeatedly perform steps 1-3 for various values of $p$. This gives you some information that will let you approximate $|L|$.

In particular, if $|L|=m$, then we'd expect

$$\Pr[L \cap R = \emptyset] = (1-p)^m \approx e^{-pm}.$$

So, if you happen to choose $p=1/m$ and repeat steps 1-3 a bunch of times, you should expect to see an empty intersection about 37% of the time. If you see an empty intersection significantly more often then that, then increase $p$ and try again. If you see an empty intersection significantly less often then that, you might decrease $p$ and try again.

In this way, using something like binary search, you should be able to approximate $|L|$ to within a multiplicative approximation factor.

You'll still need to pick some way to choose $R$ so that it is regular but also behaves like a random subset. There are many possibilities, but one might good way might be to choose a random 2-universal hash $h:\{0,1\}^m \to \{0,1,2,\dots,k-1\}$, pick $y \in \{0,1,\dots,k-1\}$ randomly, and let $R=\{x \in \{0,1\}^n : h(x)=y\}$. Choosing $k=\lceil 1/p \rceil$ gives you a random set $R$ of approximately the right size, and because $h$ is 2-universal, all of the mathematics above should work out properly.

This should solve your problem for the case where all strings in the NFA have the same length, say $n$. If they have varying lengths, then you can handle each possible length separately. Since $M$ is acyclic, the maximum lengths of any string in $L(M){}$ is at most the number of states in $M$, so this does not increase the runtime too much.

(This construction might remind you of the Valiant-Vazirani theorem about unambiguous SAT.)

score 1 · Answer 2 · answered Nov 13 '14 at 20:39

Assume that you can count in polynomial time the number of words of a language given by an acyclic NFA. In this case, considering two acyclic NFAs $A_1$ and $A_2$, you can compute in polynomial time the cardinal $n_1$ (resp. $n_2$) of the language of $A_1$ (resp. $A_2$). By a direct product (preserving acyclicity) you can also compute in polynomial time the cardinal $n_3$ of the intersection of these two languages. The two automata accepts the same language iff $n_1=n_2=n_3$. Therefore you can test the equality of two finite languages given by acyclic automata in polynomial, which is known to be an NP-complete problem. So, unless $P=NP$, you can't solve your problem in polynomial time.

Counting the number of words accepted by an acyclic NFA

2 Answers2

Linked