Here are three brand new problems about context-free language. (
Sipser 2.22 - 24) There problems are pretty tricky, and needed a little time to work them out, some solutions are taken from online study materials.
Let $C=\{xay|x,y \in \{0,1\}^* \textrm{ and }x \neq y\}$. Show that $C$ is context free.
We are going to design an NPDA for this language. Note that a string of the form $xay$ is a member of language $C$ if and only if $x \neq y$, and the latter holds if and only if for some position $i, x_i \neq y_i$. A PDA can recognize language $C$ as
follows:
- Skip input symbols and use the stack to remember the number of symbols skipped, until a position in $x$ is nondeterministically guessed.
- Remember the symbol at this position in its nite control, and skip all the symbols until it seems $a$.
- Using the stack, skip an equal number of symbols as in Step 1.
- Check if the current symbol is diff erent from the symbol remembered in Step 2.
More precisely, the PDA does the following:
- Read the next input symbol, and push a 1 onto the stack.
- Nondeterministically jump to either Step 1 or Step 3.
- Remember the current input symbol in the nite control.
- Read the next input symbols until $a$ is read.
- Read the next symbol, and pop the stack.
- If the stack is empty, go to Step 7, else go to Step 5.
- Accept if the current input symbol is di fferent from the symbol remembered in Step 3. Otherwise reject.
Another problem of the same type is as follows:
Let $D=\{xy|x,y \in \{0,1\}^* \textrm{ and }|x| = |y|\}$. Show that $D$ is context free.
We construct a PDA that recognizes language D. A string w is a member of language
$D$ if and only if $|w|$ is even and $w$ can be written as $w = xy$ where $|x|= |y|$and $x_i \neq y_i$ for some
position $i$. The challenge is to guess the right positions in $x$ and $y$. We make the key observation
that if $w = xy$ where $|x| = |y|$, then for each index $i$, the number of symbols in w from $x_{i+1}$ to
$y_i$, is exactly equal to the number of all other symbols in $w$. A PDA can recognize language D by
guessing a place in $x$ and a place in $y$, and using the stack to check whether the number of symbols
between the two places equals the number of other symbols. More precisely, the PDA does the
following:
- Read the next input symbol and push a 1 onto the stack.
- Nondeterministically jump to either Step 1 or Step 3.
- Remember the current input symbol in the finite control.
- Read the next input symbol and pop the stack. Repeat until the stack is empty.
- Read the next input symbol and push a 1 onto the stack.
- Nondeterministically jump to either Step 5 or Step 7.
- Reject if the current symbol is the same as the symbol remembered in Step 3.
- Read the next input symbol and pop the stack. Repeat until the stack is empty.
- Accept if the input is empty.
The stack memory goes through two maximas (in general case - without considering the boundary condition) and the distance between the two peaks is the distance between $x_i$ and $y_i$. At the end of the input string, the stack is
necessarily empty. So the only accepting condition is the empty stack condition.
Another problem worth discussing is as follows:
Let $E=\{a^ib^j| i \neq j \textrm{ and } 2i \neq j\}$. Show that $E$ is context free language.
This is pretty interesting, as this problem is the conglomeration of three different problems. To satisfy the second constraint, we can write $E$ as \[E=\{a^ib^j| j < i \textrm{ OR } i < j<2i \textrm{ OR } j>2i\}\]
Designing the CFG for the the two boundary cases ($ j < i $ and $j>2i$) is easy, and need not to be discussed. The interesting part is to design the CFG for the $i < j <2i$ part. A key observation is for this constraint to hold, $i$ and $j$ can be rewritten as $i=h+l$ and $j=h+2l$ for some $h,l > 0$. Then the language becomes $\{a^la^hb^hb^{2l}\}$, which is definitely a CFL. So union of these three CFLs is a CFL indeed. (
Thanx to course material of NYU).