What is the grammar G={V,L,S,P} behind the set of all strings containing an unequal number of 0s and 1s?

Find a phrase-structure grammar for each of these languages. g) The set of all strings containing an unequal number of 0s and 1s.



Solution 1:[1]

First, we can break this problem up by recognizing that an unequal number of 0s and 1s means that there are either more 0s, or more 1s. This suggests a grammar that can go either way:

S := R | T
R := (more 0s than 1s)
T := (more 1s than 0s)

The expressions for R and T should probably be pretty similar and just have the symbols reversed.

How can we guarantee there are more 0s than 1s, or vice versa? Well, we can insert at least one and maybe multiple 0s or 1s and then pad with strings that have the same number of 0s as 1s:

R := E0R | E0E
T := E1T | E1E

These will produce intermediate forms like E0E0E0E, E1E, etc. The idea here is that any string with exactly k more 0s than 1s (or 1s than 0s) can be written as k 0s (or 1s) separated by substrings with equal numbers of 0s and 1s. This seems reasonable but really should be proven (left as an exercise).
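The decomposition claim can be sketched in code. The following is a minimal illustration, not a proof: the function name `split_by_surplus_zeros` is hypothetical, and it assumes the input is a string over 0 and 1 with more 0s than 1s. It tracks the running surplus of 0s and cuts at the 0 that first brings the surplus to 1, 2, ..., k, so every piece between the cuts has equal numbers of 0s and 1s.

```python
def split_by_surplus_zeros(s):
    """Split s (more 0s than 1s) into k+1 balanced blocks separated by k single 0s."""
    total = s.count("0") - s.count("1")  # k, the overall surplus of 0s
    if total <= 0:
        raise ValueError("expected a string with more 0s than 1s")
    blocks, current = [], []
    height = 0       # (#0s - #1s) in the prefix read so far
    next_level = 1   # next surplus level whose first occurrence we cut at
    for ch in s:
        height += 1 if ch == "0" else -1
        if next_level <= total and ch == "0" and height == next_level:
            # This 0 is one of the k separator 0s; close the current block.
            blocks.append("".join(current))
            current = []
            next_level += 1
        else:
            current.append(ch)
    blocks.append("".join(current))
    return blocks


print(split_by_surplus_zeros("100"))    # ['10', '']
print(split_by_surplus_zeros("0010"))   # ['', '', '10']
```

Joining the returned blocks with single 0s gives back the original string, which matches the intermediate form E0E...0E. The same idea with the symbols swapped applies to strings with more 1s.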

All that remains is to give productions for strings with the same number of 0s and 1s (here e denotes the empty string):

E := EE | 0E1 | 1E0 | e

To see that this works, we can use induction on the length of the string. Base cases can include the shortest strings e, 01 and 10, which are easy to check. For the induction step, note that a nonempty string with equal numbers of 0s and 1s has a shortest nonempty prefix with equal numbers of 0s and 1s, and that the first and last symbols of that prefix must be different. If that prefix is the whole string, then the string can be obtained from a string of length two less by the production 0E1 or 1E0; otherwise, it can be obtained by the production EE on two shorter strings (the prefix and the rest of the string are both shorter and covered by the induction hypothesis).
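The inductive argument can also be read as a recursive procedure that builds a derivation from E. This is only a sketch: the names `derive_E` and `yield_of` and the nested-tuple representation are hypothetical choices made for illustration. It splits at the shortest nonempty prefix with equal numbers of 0s and 1s, using 0E1 or 1E0 when that prefix is the whole string and EE otherwise.

```python
def derive_E(s):
    """Return a nested-tuple derivation of the balanced string s from E."""
    if s.count("0") != s.count("1") or set(s) - {"0", "1"}:
        raise ValueError("expected equal numbers of 0s and 1s")
    if s == "":
        return ("e",)                      # E := e
    height = 0                             # (#0s - #1s) in the prefix read so far
    for i, ch in enumerate(s):
        height += 1 if ch == "0" else -1
        if height == 0:                    # shortest nonempty balanced prefix ends here
            break
    prefix, rest = s[: i + 1], s[i + 1:]
    if rest == "":
        # The prefix is the whole string, so its first and last symbols differ.
        return (s[0], derive_E(s[1:-1]), s[-1])        # E := 0E1 or E := 1E0
    return ("EE", derive_E(prefix), derive_E(rest))    # E := EE


def yield_of(tree):
    """Read the derived string back off a derivation tree."""
    if tree == ("e",):
        return ""
    if tree[0] == "EE":
        return yield_of(tree[1]) + yield_of(tree[2])
    return tree[0] + yield_of(tree[1]) + tree[2]


print(derive_E("0110"))
# ('EE', ('0', ('e',), '1'), ('1', ('e',), '0'))
```

Every recursive call is on a strictly shorter string, which is the same reason the induction goes through.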

The whole grammar turns out looking like:

S := R | T
R := E0R | E0E
T := E1T | E1E
E := EE | 0E1 | 1E0 | e
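As a sanity check rather than a proof, the grammar can be tested by brute force on short strings. The sketch below assumes T recurses on itself (T := E1T | E1E), so that T mirrors R; the function names and the length bound are hypothetical. It computes every string of length at most n derivable from each nonterminal by fixed-point iteration and compares the result with the strings that have unequal numbers of 0s and 1s.

```python
from itertools import product


def generated_up_to(n):
    """All strings of length <= n derivable from S in the grammar above."""
    # E := EE | 0E1 | 1E0 | e
    E = {""}
    while True:
        new = set(E)
        for x in E:
            for a, b in (("0", "1"), ("1", "0")):
                if len(x) + 2 <= n:
                    new.add(a + x + b)
            for y in E:
                if len(x) + len(y) <= n:
                    new.add(x + y)
        if new == E:
            break
        E = new

    def surplus(sym):
        # X := E sym X | E sym E   (R for sym "0", T for sym "1")
        X = set()
        while True:
            new = set(X)
            for e1 in E:
                for tail in E | X:
                    w = e1 + sym + tail
                    if len(w) <= n:
                        new.add(w)
            if new == X:
                return X
            X = new

    return surplus("0") | surplus("1")   # S := R | T


def unequal_up_to(n):
    """All binary strings of length <= n with unequal numbers of 0s and 1s."""
    words = ("".join(p) for k in range(n + 1) for p in product("01", repeat=k))
    return {w for w in words if w.count("0") != w.count("1")}


print(generated_up_to(6) == unequal_up_to(6))
```

Agreement on all strings up to a small length does not establish correctness for every length, but it would catch a production that recurses on the wrong nonterminal.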

Is this the shortest, most efficient, most unambiguous, etc. grammar for this language? Who knows, but it should work!

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Patrick87