How to prove every context-free language over a unary alphabet is regular?

Question

How can I show that every context-free language over a unary alphabet is regular?

Yuval Filmus · Answer 1 · 2018-06-25T14:51:40.170

The following proof follows Pighizzini, Shallit and Wang, Unary Context-Free Grammars and Pushdown Automata, Descriptional Complexity and Auxiliary Space Lower Bounds.

Let $L$ be a unary context-free language. Assume for simplicity that $\epsilon \notin L$, and consider a grammar $G = \langle V,\{a\},P,S \rangle$ for $L$ in Chomsky normal form. Denote $h = |V|$.
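To experiment with the setup, here is a minimal sketch that computes, for a unary grammar in Chomsky normal form, which lengths up to a bound each nonterminal derives. The function name `unary_lengths`, the rule encoding, and the example grammar ($S \to AA \mid AT$, $T \to SA$, $A \to a$, so $h = 3$) are assumptions made for illustration; they do not come from the cited paper.

```python
def unary_lengths(binary_rules, terminal_rules, bound):
    """Return {A: set of n <= bound such that A derives a^n}.

    binary_rules: list of (A, B, C) for productions A -> B C
    terminal_rules: set of nonterminals A having a production A -> a
    (Both encodings are illustrative assumptions.)
    """
    nonterminals = set(terminal_rules)
    for a, b, c in binary_rules:
        nonterminals.update((a, b, c))
    lengths = {x: set() for x in nonterminals}
    if bound >= 1:
        for x in terminal_rules:
            lengths[x].add(1)
    # Process lengths in increasing order: a word of length n derived via
    # A -> B C splits into two nonempty, strictly shorter parts.
    for n in range(2, bound + 1):
        for a, b, c in binary_rules:
            if any(k in lengths[b] and (n - k) in lengths[c] for k in range(1, n)):
                lengths[a].add(n)
    return lengths


# Hypothetical example grammar: S -> A A | A T, T -> S A, A -> a.
EXAMPLE_BINARY = [("S", "A", "A"), ("S", "A", "T"), ("T", "S", "A")]
EXAMPLE_TERMINAL = {"A"}
example_lengths = unary_lengths(EXAMPLE_BINARY, EXAMPLE_TERMINAL, 10)
print(sorted(example_lengths["S"]))
```

For this example grammar the start symbol derives exactly the even lengths, which is the kind of eventually periodic pattern the proof below establishes in general.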

In the sequel, whenever we say parse tree, we mean parse tree in $G$.

Let $\Pi$ be the collection of all triples $(U,i,j)$ such that $U$ is a parse tree rooted at some nonterminal $A$ whose yield is the sentential form $a^i A a^j$ (so exactly one leaf of $U$ is labeled $A$), where $0 < i+j < 2^h$.

Let $(U,i,j) \in \Pi$, and let $A$ be the label of the root of $U$. Given a parse tree $S$ containing a node $v$ labeled $A$, we can "pump" $S$ by $U$ by replacing $v$ with a copy of $U$, attaching the children of $v$ to the leaf of $U$ labeled $A$.

Lemma. If $\ell > 2^{h-1}$ and $a^\ell \in L$ and $T$ is a parse tree for $a^\ell$, then there exists a triple $(U,i,j) \in \Pi$ and a parse tree $S$ for $a^{\ell-i-j}$ such that $T$ is obtained from $S$ by pumping by $U$.

Proof. Since $\ell > 2^{h-1}$, $T$ must have depth at least $h+1$ (recall that the last level corresponds to productions of the form $A \to a$), and so it contains a root-to-leaf path with at least $h+1$ edges. This path contains at least $h+1$ nonterminals, and since $|V| = h$, one of them must repeat. Consider such a repetition within the last $h+1$ nonterminals of a longest path. The repetition corresponds to a triple $(U,i,j) \in \Pi$ (note $i+j < 2^h$ since we chose a repetition within the last $h+1$ nonterminals, and $i+j > 0$ since the grammar is in Chomsky normal form). By "pumping out" this derivation, we obtain the parse tree $S$ for $a^{\ell-i-j}$. $\quad\square$
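
The counting behind the two bounds in the proof can be spelled out as follows. Here $d$ denotes the number of edges on a longest root-to-leaf path of $T$; the symbol $d$ is introduced only for this check. The nonterminal nodes form a binary tree of height $d-1$, and each lowest nonterminal produces one letter, so

$$\ell \le 2^{d-1}, \qquad\text{hence}\qquad \ell > 2^{h-1} \;\Longrightarrow\; d-1 > h-1 \;\Longrightarrow\; d \ge h+1.$$

Applying the same count to the subtree rooted at the upper occurrence of the repeated nonterminal, whose longest path has at most $h+1$ edges, and using that the lower occurrence yields at least one letter:

$$i + j + 1 \le 2^{h}, \qquad\text{that is,}\qquad i + j \le 2^{h} - 1 < 2^{h}.$$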

Corollary. Every parse tree of a word in $L$ can be obtained in the following way:

1. Start with a parse tree for some $a^\ell$, where $\ell \leq 2^{h-1}$.
2. Repeatedly pump by $U$ for some $(U,i,j) \in \Pi$.

Using this, we can construct an NFA for $L$:

Guess a parse tree for some $a^\ell$, where $\ell \leq 2^{h-1}$, and read the word $a^\ell$.
Set $X$ to be the set of nonterminals appearing in the parse tree.
Perform the following operation an arbitrary number of times:
1. Guess $(U,i,j) \in \Pi$ such that the label of the root of $U$ appears in $X$.
2. Read $a^{i+j}$.
3. Add all nonterminals in $U$ to $X$.

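The automaton only needs to remember the set $X \subseteq V$ and a counter bounded by $2^h$ for the block of letters currently being read, so it has finitely many states. This shows that $L$ is regular.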
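
The nondeterministic procedure above can be simulated by a search over configurations. The following is a minimal sketch, not the construction from the cited paper: it assumes parse trees have already been summarised as `(length, nonterminals)` pairs and triples $(U,i,j)$ as `(root, i + j, nonterminals of U)`. The function name `nfa_accepts` and the hand-derived data for the hypothetical grammar $S \to AA \mid AT$, $T \to SA$, $A \to a$ (so $h = 3$ and $2^{h-1} = 4$) are illustrative assumptions, and only part of $\Pi$ is listed.

```python
from collections import deque


def nfa_accepts(n, base_trees, pumps):
    """Decide whether some run of the procedure reads exactly a^n.

    base_trees: list of (length, nonterminals), one per initial parse tree guess
    pumps: list of (root, i_plus_j, nonterminals), one per triple (U, i, j)
    A configuration is (letters read so far, current set X).
    """
    start = {(length, frozenset(x)) for length, x in base_trees if length <= n}
    seen = set(start)
    queue = deque(start)
    while queue:
        read, x = queue.popleft()
        if read == n:
            return True
        for root, k, u in pumps:
            # A pump is allowed only if its root label already appears in X.
            if root in x and read + k <= n:
                nxt = (read + k, x | frozenset(u))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False


# Hand-derived summaries for the hypothetical grammar (assumed, not computed):
# parse trees for a^2 and a^4, and the pumps S => a S a and T => a T a.
BASE_TREES = [(2, {"S", "A"}), (4, {"S", "A", "T"})]
PUMPS = [("S", 2, {"S", "A", "T"}), ("T", 2, {"S", "A", "T"})]

accepted = [n for n in range(1, 11) if nfa_accepts(n, BASE_TREES, PUMPS)]
print(accepted)
```

The search terminates because every pump reads at least one letter, and the set of configurations for a fixed input is finite; this mirrors the finiteness of the automaton's state set.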

Parikh's theorem can be proved in the same way.

score 3 · Answer 2 · answered Oct 21 '13 at 16:34

You could use the more general fact (due to Parikh) that the commutative image of a context-free language of $A^*$ is a rational subset of $\mathbb{N}^{|A|}$. For a unary alphabet, this gives your statement.

J.-E. Pin

score 0 · Answer 3 · answered Mar 02 '24 at 10:11

This follows easily from Parikh's theorem, but there is also a relatively short proof using the pumping lemma (which is easier to prove than Parikh's theorem).

Mati

score -1 · Answer 4 · answered Oct 21 '13 at 14:30

This was my first attempt at it:

Let $L$ be our context-free language over the alphabet $\{1\}$. Apply the pumping lemma for context-free languages: let $p$ be the pumping constant and let $m \ge p$ with $1^m \in L$.

The lemma gives a decomposition $s = 1^m = uvwxy$. Write $a_m = |uwy|$ and $b_m = |vx|$, so that $s = 1^{a_m}1^{b_m}$ with $1 \le b_m \le p$.

Such a decomposition exists for every such string because $L$ is context-free. Now let us define two more sets:

$$Mod = \{m \in \mathbb{N} \mid 1^m \in L\}$$

$$ L' = \{x \in L \mid |x| < p\}$$

Now we can write our language $L$ as a finite union of regular languages, which means it is regular:

$$ L = L' \cup \bigcup_{m \in Mod} 1^{a_m}1^{b_m} = L' \cup \bigcup_{m \in Mod} 1^{a_m}(1^{b_m})^* $$
