'Construct an NFA/DFA/Regular Expression

Hey so im trying to make a NFA/DFA or Regular expression for this language. l = {Even-length Strings over the alphabet {0,1} of at least length 6 that begin and end with the same symbol.}

This is the NFA i have so far enter image description here



Solution 1:[1]

The proposed NFA isn't doing what you want. For example it accepts a string of nine 1's. It will also accept nine 1's followed by eight 0's.

For a regex, suppose for now we're only concerned with the "starting and ending with 1" case. We want always to add characters in groups of 2. Adding a second character is 1(0|1). The minimum length string has 6 characters and ends with 1. That suggests two 1's with four of any character in the middle: 1(0|1)(0|1)(0|1)(0|1)1. Now we need to allow any number of pairs of characters added in the middle so the length always remains even. This suggests 1 (0|1) (0|1) (0|1) (0|1) ((0|1)(0|1))* 1. To recap, this is all even length strings of length at least 6 that start and end in 1.

The same for first/last character 0 is just substitution: 0 (0|1) (0|1) (0|1) (0|1) ((0|1)(0|1))* 0

So an answer is these two "or"ed with alternation:

1 (0|1) (0|1) (0|1) (0|1) ((0|1)(0|1))* 1 | 0 (0|1) (0|1) (0|1) (0|1) ((0|1)(0|1))* 0

Note I've added spaces for readability in all regexes above.

The idea for an NFA is similar. If you are familiar with Thompson's algorithm, this is roughly what it produces from the regex.

NFA

Note this is nearly a DFA. There are a couple of states with non-deterministic transitions. You can apply one of the well-known NFA to DFA conversion algorithms to get it. Or you can just reason it out. A good exercise.

I've been assuming "pure" classical regexes. If you allow the normal regex library syntax, the final result can be shorter: 1[01]{2}([01]{2})+1|0[01]{2}([01]{2})+0

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1