3

I am trying to find a CFG for this CFL:

L = $\{ w \mid w \text{ has an equal number of 0's and 1's} \}$

Is there a way to count the number of 0's or 1's in the string?

John L.
Doc

2 Answers

9

The question on defining a context-free grammar for $\{w\in\{0,1\}^*:\#_0(w)=\#_1(w)\}$ restricts its answers to one particular grammar, and the proof of its correctness is somewhat involved. Here I would like to show a different grammar that is easier to figure out.

Since the requirement is the same number of 0's and 1's, we can grow the start non-terminal $S$ by a pair of 0 and 1 in all sorts of ways. (This is more or less what your idea of "counting" should be.) We will have the following production rules, $$S\to S01\mid S10\mid 01S\mid 10S\mid 0S1\mid 1S0\mid \epsilon.$$

However, if we check closely, the above grammar does not generate 00111100. The above grammar can generate the first part 0011 as a whole, but it then has no way to attach the second part 1100. That suggests adding another rule, $S\to SS$. Once we have added that rule, the rules $S\to S01\mid S10\mid 01S\mid 10S$ become redundant. The final grammar is $$S\to SS\mid 0S1\mid 1S0\mid \epsilon.$$
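The two claims above can be checked by brute force for short words. The following is a minimal sketch, not part of the original answer: the helper names `generate` and `balanced`, the encoding of rules as strings (with `S` for the non-terminal and the empty string for $\epsilon$), and the length bound 8 are all assumptions made for illustration. The first rule set is taken to include $0S1$ and $1S0$.

```python
from itertools import product

def generate(rules, max_len):
    """Return every terminal string of length <= max_len derivable from S.

    `rules` lists the right-hand sides for S; the character 'S' stands for
    the non-terminal and '' stands for epsilon (illustrative encoding).
    """
    words = set()
    changed = True
    while changed:  # iterate until no new word appears (least fixed point)
        changed = False
        for rhs in rules:
            slots = rhs.count("S")
            # Plug already-derived words into each S of the right-hand side.
            for choice in product(sorted(words), repeat=slots):
                parts = iter(choice)
                word = "".join(next(parts) if c == "S" else c for c in rhs)
                if len(word) <= max_len and word not in words:
                    words.add(word)
                    changed = True
    return words

def balanced(max_len):
    """All words over {0,1} of length <= max_len with as many 0's as 1's."""
    return {"".join(p)
            for n in range(max_len + 1)
            for p in product("01", repeat=n)
            if p.count("0") == p.count("1")}

first_rules = ["S01", "S10", "01S", "10S", "0S1", "1S0", ""]
final_rules = ["SS", "0S1", "1S0", ""]

print("00111100" in generate(first_rules, 8))       # first grammar misses it
print(generate(final_rules, 8) == balanced(8))      # final grammar, up to length 8
```

A check like this only covers words up to the chosen length; it does not replace a proof that the final grammar generates all of $L$.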


It is clear that every word generated by that grammar has the same number of 0's and 1's, since each rule adds either nothing or exactly one 0 and one 1.

Are all such words generated by that grammar?

The rule $S\to\epsilon$ shows that the unique word of length 0 can be generated.

As the induction hypothesis, suppose all such words of length less than $2n$ can be generated, where $n\ge1$.

Let $w$ be such a word of length $2n$.

Let $d(k)$ be the number of 0's minus the number of 1's in the first $k$ letters of $w$. In particular, $d(2n)=0$. Since the situation is symmetric with respect to 0 and 1, WLOG assume $w$ starts with a 0, i.e., $d(1)=1$.

  • If $w$ ends with a 1, then $w=0w'1$ for some word $w'$. Since $w'$ has the same number of 0's and 1's and is shorter than $w$, the induction hypothesis gives $S\Rightarrow^*w'$. So $S\Rightarrow0S1\Rightarrow^*0w'1=w$.
  • If $w$ ends with a 0, then $d(2n-1)=-1$. Since $d(k)$ changes by exactly 1 at each step and goes from $d(1)=1$ to $d(2n-1)=-1$, we have $d(k)=0$ for some $1\lt k\lt 2n-1$. Let $w_1$ be the first $k$ letters of $w$ and $w_2$ the rest of $w$. Both $w_1$ and $w_2$ have the same number of 0's and 1's and are shorter than $w$. By the induction hypothesis, $S\Rightarrow^*w_1$ and $S\Rightarrow^*w_2$. Hence $S\Rightarrow SS\Rightarrow^*w_1w_2=w$.

This completes the induction. All such words can be generated.
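The induction above is constructive, so it can be turned into a procedure that builds a derivation for any given word. The following is a minimal sketch under stated assumptions: the function names `derive` and `expand` and the nested-tuple representation of a derivation are hypothetical choices for illustration, and the split point is taken to be the smallest $k$ with $d(k)=0$, whereas the proof allows any such $k$.

```python
def derive(w):
    """Return a derivation of w from S as a nested tuple.

    Each node is (rule, children...), where rule is one of
    'eps', '0S1', '1S0', 'SS'. Assumes w is over {'0', '1'}.
    """
    if set(w) - {"0", "1"} or w.count("0") != w.count("1"):
        raise ValueError("w must have equally many 0's and 1's")
    if w == "":
        return ("eps",)                       # S -> epsilon
    if w[0] != w[-1]:
        # First case of the proof: w = 0w'1 (or, symmetrically, 1w'0).
        return (w[0] + "S" + w[-1], derive(w[1:-1]))
    # Second case: w starts and ends with the same letter, so the running
    # difference d(k) must hit 0 at some proper prefix length k.
    d = 0
    for k, c in enumerate(w[:-1], start=1):
        d += 1 if c == "0" else -1
        if d == 0:
            return ("SS", derive(w[:k]), derive(w[k:]))
    raise AssertionError("unreachable for a balanced word")

def expand(tree):
    """Read the generated word back off a derivation tree."""
    rule = tree[0]
    if rule == "eps":
        return ""
    if rule == "SS":
        return expand(tree[1]) + expand(tree[2])
    return rule[0] + expand(tree[1]) + rule[2]

tree = derive("00111100")
print(tree)
print(expand(tree))
```

For 00111100 the running difference is $1,2,1,0,\dots$, so the first split is at $k=4$, giving $w_1=0011$ and $w_2=1100$, each of which is then handled by the first case.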


Exercise. $L$ cannot be generated by a context-free grammar that has no non-terminal other than the start symbol and has no production rule in which the start symbol appears on the right-hand side more than once. In other words, a context-free grammar for $L$ must contain either at least two non-terminals or a production rule that contains two $S$s on the right-hand side, such as $S\to SS$ or $S\to 0S1S$.

John L.
-1

S → B1S | S1B | 1SB | 1BS | BS1 | SB1 | ɛ

B → 0

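Basically, this is it. For every 1 it places, there has to be a derived 0 in the string, so the string stays balanced. Note, however, that not every order of 1's and 0's can be derived this way: 00111100 starts with 00 and ends with 00, and no rule above produces such a word, so a rule like S → SS is still needed.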

Or251