How to find strings that match a substring in any order?

Assuming a list as follows:

list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']

and a substring:

to_find = 'eos'

I would like to find the string(s) in list_of_strings that match the substring. The output from list_of_strings should be ['seo', 'paseo', 'oes'] (since it has all the letters in the to_find substring).

I tried a couple of things:

a = next((string for string in list_of_strings if to_find in string), None) # gives NoneType object as output

and

result = [string for string in list_of_strings if to_find in string] # gives [] as output

but neither snippet works.

Can someone please tell me what mistake I am making?

Thanks



Solution 1:[1]

Do you need the letters of to_find to be next to each other or just all the letters should be in the word? Basically: does seabco match or not?

[Your question does not include this detail and you use "substring" a lot but also "since it has all the letters in the to_find", so I'm not sure how to interpret it.]

If seabco matches, then @Tim Biegeleisen's answer is the correct one. If the letters need to be next to each other (but in any order, of course), then look below:
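For the first interpretation (letters anywhere in the word, so seabco would match), a minimal sketch, not necessarily what the referenced answer does, is to compare letter counts with collections.Counter. The helper name has_all_letters is made up for illustration.

```python
from collections import Counter

def has_all_letters(string, to_find):
    # Hypothetical helper: True if `string` contains every letter of
    # `to_find` at least as many times, in any position.
    # Counter subtraction drops non-positive counts, so an empty result
    # means nothing from `to_find` is missing.
    return not (Counter(to_find) - Counter(string))

list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']
to_find = 'eos'

result = [s for s in list_of_strings if has_all_letters(s, to_find)]
print(result)  # ['seo', 'paseo', 'oes']
print(has_all_letters('seabco', to_find))  # True
```

The rest of this answer covers the second interpretation, where the letters must be adjacent.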


If to_find is relatively short, you can generate all permutations of its letters (n! of them, so here 3! = 6: eos, eso, oes, ose, seo, soe) and check each one with in.

import itertools
list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']
to_find = 'eos'

result = [string for string in list_of_strings if any("".join(perm) in string for perm in itertools.permutations(to_find))]

https://docs.python.org/3/library/itertools.html#itertools.permutations

We do "".join(perm) because perm is a tuple and we need a string.
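To make that concrete, here is a small sketch of what itertools.permutations yields for 'eos' and what the join turns it into. The variable names perms and candidates are only for illustration.

```python
import itertools

to_find = 'eos'

# Each permutation is a tuple of single-character strings.
perms = list(itertools.permutations(to_find))
print(perms[0])  # ('e', 'o', 's')
# A tuple cannot be used on the left of `in` with a string
# (`perms[0] in 'seo'` raises TypeError), hence the join.

candidates = ["".join(perm) for perm in perms]
print(candidates)  # ['eos', 'eso', 'oes', 'ose', 'seo', 'soe']
```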

>>> result = [string for string in list_of_strings if any("".join(perm) in string for perm in itertools.permutations(to_find))]
>>> result
['seo', 'paseo', 'oes']

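A less obvious approach with better complexity is to take every 3-character window of each string (so the letters stay next to each other) and compare its set of letters to the set of letters in to_find. Note that a set comparison ignores repeated letters, so as written this is only reliable when to_find consists of three distinct letters.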

list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']
to_find = 'eos'

result = [string for string in list_of_strings if any(set(three_substring)==set(to_find) for three_substring in zip(string, string[1:], string[2:]))]
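If to_find can have a different length or repeated letters, the same sliding-window idea can be sketched with collections.Counter instead of set, so that letter counts are compared as well. This is a generalization under that assumption, and the helper name contains_anagram is made up for illustration.

```python
from collections import Counter

def contains_anagram(string, to_find):
    # Hypothetical helper: True if some contiguous window of `string`
    # is a rearrangement of `to_find` (same letters, same counts).
    n = len(to_find)
    target = Counter(to_find)
    return any(
        Counter(string[i:i + n]) == target
        for i in range(len(string) - n + 1)
    )

list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']
to_find = 'eos'

result = [s for s in list_of_strings if contains_anagram(s, to_find)]
print(result)  # ['seo', 'paseo', 'oes']

# Repeated letters are handled: 'ffo' is not a rearrangement of 'foo',
# although set('ffo') == set('foo').
print(contains_anagram('ffo', 'foo'))  # False
```

Each window costs work proportional to len(to_find), so this avoids the n! growth of the permutation approach.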

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1