Find a string when "hidden" in other characters

I want to search for strings that have been obfuscated inside larger strings, but only to a limited extent: the characters should appear in order within roughly 10-15 characters, and the match should be case-insensitive.

I found a solution that I think might start to do the trick for finding the strings, but it searches the entire target string, when I only want results where the matched characters are close together.

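def f(orig, word):
    # True if the letters of word appear in orig in order (case-sensitive)
    idx = 0
    for letter in word:
        x = orig.find(letter, idx)
        if x != -1:
            # continue after this match so one character is not matched twice
            idx = x + 1
        else:
            return False
    return True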

Let's say I have an original string of "4X5G", and I have the following targets:

  1. "Ipsum 47 loreix 5-g blue scuba rock." The characters 4, X, 5, G fall within 13 characters, so I would want to return this as a potential candidate.

  2. "Ipsum 47 loreix blue scuba 5-g rock." The characters 4, X, 5, G are spread over 24 characters, so I would NOT want to return this as a potential candidate.
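The distances in the two examples can be checked with a short sketch. The helper name `greedy_span` is hypothetical and not part of the question. It lowercases both strings, matches the letters left to right from the first occurrence of each, and counts the characters from the first matched letter to the last, inclusive.

```python
def greedy_span(target, word):
    """Return the inclusive length of the stretch of `target` that holds the
    letters of `word` in order, or None if they do not all occur in order.

    Hypothetical helper: matching is greedy from the first occurrence of
    each letter, so it may not find the tightest possible stretch.
    """
    target = target.lower()
    start = 0
    positions = []
    for letter in word.lower():
        found = target.find(letter, start)
        if found == -1:
            return None
        positions.append(found)
        start = found + 1  # never reuse the same character twice
    if not positions:
        return None
    return positions[-1] - positions[0] + 1


print(greedy_span("Ipsum 47 loreix 5-g blue scuba rock.", "4X5G"))  # 13
print(greedy_span("Ipsum 47 loreix blue scuba 5-g rock.", "4X5G"))  # 24
```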

I'm also worried that this might become incredibly computationally costly for large target strings.



Solution 1:[1]

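string = '4X5G'
candidate = 'Ipsum 47 loreix 5-g blue scuba rock.'.lower()

pos = 0           # index where the search for the next character starts
first_pos = None  # index of the first matched character
last_pos = None   # index of the last matched character
for el in string.lower():
    # find() returns -1 rather than raising when nothing is found from pos on
    pos = candidate.find(el, pos)
    if pos == -1:
        raise ValueError(f'{el!r} not found in candidate in the required order')
    if first_pos is None:
        first_pos = pos
    last_pos = pos
    pos += 1  # resume after this match so one character is not matched twice

# accept the candidate only if the matched characters are close together
if last_pos - first_pos < 15:
    print(candidate)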

First, we check that every character occurs in the candidate, moving the start index forward each time a character is found. Then we compare the first and last matched positions and accept the candidate if the difference between them is small enough (less than 15 here).
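One limitation of this approach is that it only starts from the first occurrence of the first character, so an earlier stray "4" can hide a tight match later in the text. A possible variant is sketched below, assuming "close together" means the whole match fits within `max_span` characters. Both `max_span` and the function name `find_hidden` are hypothetical. It restarts from every occurrence of the first character and only searches inside that window. For a fixed window size the work grows roughly linearly with the length of the target, which may ease the cost concern raised in the question.

```python
def find_hidden(target, word, max_span=15):
    """Return (start, end) indices of the first stretch of `target`, at most
    `max_span` characters long, that contains the letters of `word` in
    order (case-insensitive), or None if there is no such stretch.

    Hypothetical sketch; `max_span` counts characters inclusively.
    """
    target = target.lower()
    word = word.lower()
    if not word:
        return None

    start = target.find(word[0])
    while start != -1:
        limit = start + max_span      # exclusive upper bound of the window
        pos = start
        for letter in word[1:]:
            pos = target.find(letter, pos + 1, limit)
            if pos == -1:
                break                 # this window fails, try the next start
        else:
            return start, pos         # every letter fit inside the window
        start = target.find(word[0], start + 1)
    return None


print(find_hidden("Ipsum 47 loreix 5-g blue scuba rock.", "4X5G"))  # (6, 18)
print(find_hidden("Ipsum 47 loreix blue scuba 5-g rock.", "4X5G"))  # None
print(find_hidden("4 is my lucky number, id 4x5g", "4X5G"))         # (25, 28)
```

The third call shows the difference from a single greedy pass: the first "4" has no "x" within its window, so the search moves on to the second "4" and finds the match there.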

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Dzidzoiev Artem