Consider a context-free grammar in which every rule has at most one non-terminal on its right-hand side. What is the class of languages generated by such grammars?

If you like, we may assume that all rules are one of the following forms, where $V, V'$ denote non-terminals and $a$ denotes a terminal:

(i) $V \to aV'$

(ii) $V \to V' a$

(iii) $V \to \epsilon$

If all rules are of the form (i) or (iii), we get exactly the class of regular languages. On the other hand, with all three forms we can also generate non-regular languages, such as the language of palindromes.
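To make the three rule forms concrete, here is a minimal sketch of a membership test for grammars restricted to forms (i)-(iii). The rule encoding, the name `accepts`, and the palindrome grammar `PALINDROME_RULES` over $\{a, b\}$ are illustrative assumptions, not taken from any library. Since every rule consumes a terminal from one end of the string or derives $\epsilon$, it suffices to track a non-terminal together with a substring span.

```python
from functools import lru_cache

# Hypothetical encoding of the three rule forms as (head, kind, terminal, tail):
#   "right": V -> a V'    (form i)
#   "left":  V -> V' a    (form ii)
#   "eps":   V -> epsilon (form iii)
# Assumed example grammar for palindromes over {a, b}:
#   S -> aA | bB | eps,  A -> Sa | eps,  B -> Sb | eps
PALINDROME_RULES = [
    ("S", "eps", None, None),
    ("S", "right", "a", "A"),
    ("S", "right", "b", "B"),
    ("A", "eps", None, None),
    ("A", "left", "a", "S"),
    ("B", "eps", None, None),
    ("B", "left", "b", "S"),
]


def accepts(rules, start, word):
    """Return True if `start` derives `word` under rules of forms (i)-(iii)."""

    @lru_cache(maxsize=None)
    def derives(nt, i, j):
        # Does non-terminal `nt` derive the substring word[i:j]?
        for head, kind, term, tail in rules:
            if head != nt:
                continue
            if kind == "eps" and i == j:
                return True
            if (kind == "right" and i < j and word[i] == term
                    and derives(tail, i + 1, j)):
                return True
            if (kind == "left" and i < j and word[j - 1] == term
                    and derives(tail, i, j - 1)):
                return True
        return False

    return derives(start, 0, len(word))
```

Each recursive call shrinks the span by one symbol, so the search terminates, and memoization bounds the work by the number of (non-terminal, span) pairs. The recursion depth grows with the word length, so this sketch is meant for short inputs only.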

Therefore, this class lies somewhere between the regular languages and the context-free languages. Does it perhaps coincide with the unambiguous context-free languages or the deterministic context-free languages?

Caleb Stanford

1 Answer

A grammar in which every production has at most one non-terminal on its right-hand side is called a linear grammar. If the non-terminal is rightmost in all productions, it is a right linear grammar; if the non-terminal is leftmost in all productions, it is a left linear grammar. Left and right linear grammars generate exactly the regular languages (every regular language has both a right and a left linear grammar). Unrestricted linear grammars generate a proper superset of the regular languages ($\{a^nb^n\}$ is linear), and the linear languages are not all deterministic (the set of palindromes is linear but not deterministic context-free). They are also a proper subset of the context-free languages: the Dyck language [Note 1] has no linear grammar. Since the Dyck language has a deterministic grammar, the linear languages are not a superset of the deterministic context-free languages either.
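The definitions above are purely syntactic, so they can be checked mechanically. The following is a minimal sketch; the function name `classify` and the encoding of a production as a `(head, body)` pair with `body` a tuple of symbols are assumptions made for illustration.

```python
def classify(productions, nonterminals):
    """Classify a grammar as right linear, left linear, linear, or not linear.

    `productions` is a list of (head, body) pairs, where `body` is a tuple
    of symbols; `nonterminals` is the set of non-terminal symbols.
    """
    right = left = True
    for _head, body in productions:
        positions = [k for k, sym in enumerate(body) if sym in nonterminals]
        if len(positions) > 1:
            return "not linear"          # two or more non-terminals in a body
        if positions:
            if positions[0] != len(body) - 1:
                right = False            # non-terminal is not rightmost
            if positions[0] != 0:
                left = False             # non-terminal is not leftmost
    if right:
        return "right linear"
    if left:
        return "left linear"
    return "linear"


# S -> a S b | epsilon, a linear grammar for a^n b^n
anbn = [("S", ("a", "S", "b")), ("S", ())]
# S -> epsilon | [ S ] S, the Dyck grammar from Note 1
dyck = [("S", ()), ("S", ("[", "S", "]", "S"))]
```

Here `classify(anbn, {"S"})` yields `"linear"` and `classify(dyck, {"S"})` yields `"not linear"`. Note that this inspects one particular grammar, not the language: a "not linear" result does not by itself show that no linear grammar exists for the same language.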

It is undecidable whether the language generated by a given CFG has a linear grammar, a result proven by Sheila Greibach in 1966 (http://dl.acm.org/citation.cfm?doid=321356.321365).


Notes

  1. The Dyck language is the set of all strings of square brackets where the brackets are balanced, given by the deterministic CFG:

$$S \to \epsilon \mid \texttt{[}\; S\; \texttt{]}\; S$$
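As an illustration of why this grammar can be parsed deterministically, here is a minimal sketch of a recognizer that follows the two alternatives of $S$ using one symbol of lookahead. The name `is_dyck` is hypothetical, and the trailing $S$ of the second alternative is handled with a loop rather than a recursive call.

```python
def is_dyck(word):
    """Recognize balanced square brackets by following S -> eps | [ S ] S."""
    pos = 0

    def parse_s():
        nonlocal pos
        # Lookahead '[' selects S -> [ S ] S; anything else selects S -> eps.
        while pos < len(word) and word[pos] == "[":
            pos += 1                      # consume '['
            if not parse_s():             # inner S
                return False
            if pos >= len(word) or word[pos] != "]":
                return False              # expected the matching ']'
            pos += 1                      # consume ']'
            # The loop continues with the trailing S of the production.
        return True

    # The whole input must be consumed for the word to be accepted.
    return parse_s() and pos == len(word)
```

The choice between the two alternatives never requires backtracking, which is the sense in which the grammar is deterministic; the nesting depth of the input still needs a stack (here, the call stack).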

rici