
I remember coming across the following question about a language that supposedly is context-free, but I was unable to find a proof of the fact. Have I perhaps misremembered the question?

Anyway, here's the question:

Show that the language $L = \{xy \mid |x| = |y|, x\neq y\}$ is context-free.

Raphael
Dave Clarke

1 Answer


Claim: $L$ is context-free.

Proof Idea: There has to be at least one difference between the first and second half; we give a grammar that makes sure to generate one and leaves the rest arbitrary.

Proof: For the sake of simplicity, assume a binary alphabet $\Sigma = \{a,b\}$. The proof readily extends to larger alphabets. Consider the grammar $G$:

$\qquad\begin{align} S &\to AB \mid BA \\ A &\to a \mid aAa \mid aAb \mid bAa \mid bAb \\ B &\to b \mid aBa \mid aBb \mid bBa \mid bBb \end{align}$

It is quite clear that it generates

$\qquad \mathcal{L}(G) = \{ \underbrace{w_1}_k x \underbrace{w_2v_1}_{k+l}y\underbrace{v_2}_l \mid |w_1|=|w_2|=k, |v_1|=|v_2|=l, x\neq y \} \subseteq \Sigma^*;$

the suspicious may perform a nested induction over $k$ and $l$ with case distinction over pairs $(x,y)$.

The length of a word in $\mathcal{L}(G)$ is $2(k+l+1)$. The letters $x$ and $y$ occur at positions $k+1$ and $2k+l+2$, respectively. When we split the word in half, i.e. after $(k+l+1)$ letters, the first half contains the letter $x$ at position $k+1$, and the second half contains the letter $y$ at position $(2k+l+2) - (k+l+1) = k+1$.
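
As a concrete instance (an illustrative choice, not taken from the answer itself), take $k = l = 1$ with $w_1 = b$, $x = a$, $w_2 = a$, $v_1 = a$, $y = b$, $v_2 = b$. The word $baaabb$ then has the derivation

$\qquad S \Rightarrow AB \Rightarrow bAa\,B \Rightarrow baa\,B \Rightarrow baa\,aBb \Rightarrow baaabb.$

Its length is $2(k+l+1) = 6$, with $x = a$ at position $k+1 = 2$ and $y = b$ at position $2k+l+2 = 5$. Splitting after three letters gives the halves $baa$ and $abb$, which differ at position $2$.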

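Therefore, $x$ and $y$ occupy the same position in their respective halves, so every word in $\mathcal{L}(G)$ lies in $L$. Conversely, any word in $L$ has two halves of equal length $n$ that differ at some position $k+1$; choosing $l = n-k-1$ puts it in the form above, since $G$ imposes no other restrictions on its language. Hence $\mathcal{L}(G) = L$.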
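
For the suspicious, the claim can also be sanity-checked by brute force on short words. The following is a minimal sketch and no substitute for the induction; the helper names (`derive`, `language_of_G`, `language_L`) and the length bound are assumptions made for illustration. It enumerates every word the grammar derives up to a given length and compares the result with the definition of $L$.

```python
from itertools import product

SIGMA = "ab"  # binary alphabet, as assumed in the proof


def in_L(w):
    """Membership test for L = { xy : |x| = |y|, x != y }."""
    if len(w) % 2:
        return False
    half = len(w) // 2
    return w[:half] != w[half:]


def derive(centre, max_len):
    """Words derivable from A (centre 'a') or B (centre 'b'), length <= max_len.

    Mirrors the rules  A -> a | aAa | aAb | bAa | bAb  (and likewise for B):
    start from the single centre letter and wrap one letter on each side.
    """
    words = {centre}
    frontier = {centre}
    while frontier:
        wrapped = set()
        for w in frontier:
            if len(w) + 2 <= max_len:
                for left in SIGMA:
                    for right in SIGMA:
                        wrapped.add(left + w + right)
        words |= wrapped
        frontier = wrapped
    return words


def language_of_G(max_len):
    """All words of length <= max_len derivable from S -> AB | BA."""
    a_words = derive("a", max_len - 1)
    b_words = derive("b", max_len - 1)
    words = set()
    for u in a_words:
        for v in b_words:
            if len(u) + len(v) <= max_len:
                words.add(u + v)  # S -> AB
                words.add(v + u)  # S -> BA
    return words


def language_L(max_len):
    """All words of L with length <= max_len, straight from the definition."""
    return {
        "".join(t)
        for n in range(0, max_len + 1, 2)
        for t in product(SIGMA, repeat=n)
        if in_L("".join(t))
    }


if __name__ == "__main__":
    n = 8
    print(language_of_G(n) == language_L(n))  # True
```

Agreement up to a fixed length proves nothing about longer words, of course, but it catches slips in the grammar quickly.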


The interested reader may enjoy two follow-up problems:

Exercise 1: Come up with a PDA for $L$!

Exercise 2: What about $\{xyz \mid |x|=|y|=|z|, x\neq y \lor y \neq z \lor x \neq z\}$?

NerdOnTour
Raphael