16

I was thinking of grammars for indendation-sensitive languages and it looks like CF grammars would do the trick if combined with parameters. As an example, consider this fragment for simplified Python grammar in ANTLR-like format:

// on top-level the statements have empty indent
program  
    : statement('')+
    ;

// let's consider only one compound statement and one simple statement for now
statement(indent) 
    : ifStatement(indent)
    | passStatement(indent)
    ;

passStatement(indent)
    : indent 'pass' NEWLINE
    ;

// statements under if must have current indent plus 4 spaces
ifStatement(indent)
    : indent 'if' expression ':' NEWLINE (statement(indent '    ')+)
    ;

My question: Does this kind of grammars (CFG with parameters) have a name?

It looks that it wouldn't be hard to write a recursive descent parser for this grammar (parameters should be basically parsers). What could be the difficulties with this approach?

Does addition of parameters raise the supported language class above context-free?

Aivar
  • 285
  • 1
  • 7

3 Answers3

15

Affix grammars (parameterised context-free grammars) were studied extensively by the eminent Dutch computer scientist Cornelis HA Koster, starting with his 1962 paper "Basic English, a generative grammar for a part of English", co-written with LGLT Meertens. In 1970, he produced a formalism of the concept; a useful overview is available in his 1971 paper "Affix Grammars for Programming Languages", a version of which I found on Citeseer.

In that paper, Koster compares his formalism (and another similar one) with Van Wijngaarden two-level grammars, and finds them to be very similar.

Dick Grune's invaluable annotated bibliography of parsing techniques includes a large number of other useful references for affix grammars and other non-Chomskyian formalisms. (See section 18.2.6 of the bibliography, although there are useful papers in other sections.) Grune covers affix grammars briefly in ยง15.3.2 of the second edition of Parsing Techniques: A Practical Guide (and even more briefly in the first edition, available online) mentioning the fact that it is easy to adapt top-down (and other) parsing techniques.

Unless the domain of parameters is finite, in which case they can be removed by statically generating all possible productions resulting in a CFG, the class of affix grammars is certainly a strict superset of the set of context-free grammars. (An affix grammar for $a^n b^n c^n$ can be found in Dick Grune's Parsing Techniques reference above.)

Koster, who was also an editor of the Algol 68 report, was the original developer of the Compiler Description Language (CDL), based on his ideas about affix grammars. This toolkit and its later derivatives were used in production for many years. This page, which I found with a Google search and whose permanence I cannot guarantee, has links to the manual and download site for CDL3.

rici
  • 12,150
  • 22
  • 40
6

Take the pumping lemma for CFGs:

Take the grammar

S -> A("")
A(p) -> p 
      | p '\n' A(p"*") '\n' p 

This describes a star triangle:

*
**
***
**
*

There is no way to split a star triangle up in 5 parts $uvwxy$ such that $\{uv^nwx^ny|n>0\} $ is also a star triangle (with $vx$ non empty).

This means that the star triangle is not a context free language.

Or a simpler example:

S-> B("")
B(p)-> p 'a' p 'a' p
     | B(p 'b')

This describes the language $\{b^nab^nab^n | n\ge0 \}$ which is not context free.

ratchet freak
  • 4,696
  • 1
  • 18
  • 15
3

I've never seen this formalism presented (even in something like Grune's Parsing Techniques), depending on the details on how you define exactly "parameters should be basically parsers" it looks mappable to van Wijngaarden two level grammars, which have the same power as unrestricted phase structure grammar (i.e. more powerful than context sensitive, you could write a VW grammar which gives all halting programs).

AProgrammer
  • 3,099
  • 18
  • 20