'Construct Deterministic Accepter for binary string containing at any position

do you have any idea of designing a deterministic accepter where the set of all these binary strings contains at any position? The string is: 0100101

So, the accepted input could be: 0100101 or 01001011 or 10100101 or 0100101

I appreciate any help you can provide.



Solution 1:[1]

We could apply algorithms to go from the RE to an NFA, and then from the NFA to a DFA, and from that to a minimal DFA; or, we could attempt to just write out a good DFA directly. We'll try the latter as it's not too bad here and probably illustrative of how this can be done generally.

Our strategy will be to remember how much of the target string we've currently seen using states of the DFA. If we ever reach the state where we've seen the whole thing, well, that means we saw the substring in our input, and we an accept no matter what comes after. If we run out of input without making it to that state, we reject.

Any DFA needs an initial state. Let's call ours q0. We have seen no (useful) prefix of the target string so far in this state.

From q0, we can receive as input either 0 or 1. If we see 0, we've now seen the first symbol of the target string, so we're closer to the goal. We need a new state to record this fact; we can call this q1. If we see a 1 in q0, we are no closer to seeing our substring, since our substring must start with 0. So, we might as well stay in q0: we have seen no (useful) prefix of the target string so far.

From q1, we can see either 0 or 1. In the first case, we get no closer to finding our substring, but don't get farther away either: we can stay in q1. In the latter case, we get one step closer and need a new state, q2, to represent our progress.

From q2, seeing a 0 gets us a little closer and takes us to a new state, q3. Seeing a 1 though is bad: the substring 011 is not a prefix of our target at all, so we are back where we started, having seen no useful prefix. We must return to q0 and start over.

From q3, seeing a 0 gets us a little closer and takes us to a new state, q4. Seeing a 1 is bad but doesn't take us all the way back to the start; the last two symbols we saw were 01 and 01 is a prefix of our target, so we can safely return to q2 here.

Continuing on thusly, we get a DFA with the following transition table:

q    s    q'   comment
---  ---  ---  ----------------------------------
q0   0    q1   most recent prefix seen is empty
q0   1    q0   
q1   0    q1   most recent prefix seen is 0
q1   1    q2
q2   0    q3   most recent prefix seen is 01
q2   1    q0
q3   0    q4   most recent prefix seen is 010
q3   1    q2
q4   0    q1   most recent prefix seen is 0100
q4   1    q5
q5   0    q6   most recent prefix seen is 01001
q5   1    q0
q6   0    q1   most recent prefix seen is 010010
q6   1    q7
q7   0    q7   we saw 0100101 at least once
q7   1    q7

We don't need to transition out of q7 once we get there because getting there means we saw the whole target string at least once. We can accept anything after the target and, as long as the other transitions correctly remember partial matches as we go along, anything before the target as well.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Patrick87