Smallest DFA that accepts given strings and rejects other given strings

Question

Given two sets $A,B$ of strings over alphabet $\Sigma$, can we compute the smallest deterministic finite-state automaton (DFA) $M$ such that $A \subseteq L(M)$ and $L(M) \subseteq \Sigma^*\setminus B$?

In other words, $A$ represents a set of positive examples. Every string in $A$ needs to be accepted by the DFA. $B$ represents a set of negative examples. No string in $B$ should be accepted by the DFA.

Is there a way to solve this, perhaps using DFA minimization techniques? I could imagine creating a DFA-like automaton that has three kinds of states: accept states, reject states, and "don't-care" states (any input that ends in a "don't-care" state can be either accepted or rejected). But can we then find a way to minimize this to an ordinary DFA?

You could think of this as the problem of learning a DFA, given positive and negative examples.

This is inspired by Is regex golf NP-Complete?, which asks a similar questions for regexps instead of DFAs.

score 8 · Answer 1 · answered Jan 15 '14 at 00:39

There is a lot of literature on learning DFAs given positive and negative samples. If $A$ and $B$ are finite I don't see how the problem would ever be undecidable though. If $A \cap B = \emptyset$ then obviously the DFA that accepts only the strings in $A$ satisfies your requirement and one can simply enumerate all smaller DFAs. If $A \cap B \neq \emptyset$ then clearly no such DFA exists.

Finding the minimum DFA consistent with a given set of strings is NP-complete. This result appears as Theorem 1 in Angluin's paper On the complexity of minimum inference of regular sets. So clearly your problem is also NP-complete.

For lots of good links and discussion on learning regular languages check out the CSTheory blogpost On Learning Regular Languages.

Shaull · Accepted Answer · 2016-10-05T17:24:12.407

7

A DFA as you describe is called a separating DFA. There is some literature on this problem when $A$ and $B$ are regular languages, such as Learning Minimal Separating DFA’s for Compositional Verification, by Yu-Fang Chen, Azadeh Farzan, Edmund M. Clarke, Yih-Kuen Tsay, Bow-Yaw Wang

Note that as @reinierpost states, without any restrictions on A and B, the problem may become undecidable.

edited Oct 05 '16 at 17:24

answered Jan 13 '14 at 14:39

Shaull

17,814
1
41
67

Smallest DFA that accepts given strings and rejects other given strings

2 Answers2

Linked

Related