Separate a string by multiple separators and return the separators and separated strings
I want to split strings on separators that consist of more than one character and are stored in the variable sep_list.
My aim is to get the last separated substring, s1, together with the last separator, i.e. the one that has s1 on its right-hand side.
sep_list = ['→E', '¬E', '↓I']
string1 = "peter →E tom ¬E luis ↓I ed"
string2 = "sigrid →E jose l. ¬E jose t."
Applied to string1, the algorithm should return the string s1:
"↓I, ed"
and applied to string2, the algorithm should return the string s1:
"¬E, jose t."
What is a way to do that in Python?
Solution 1:[1]
Another way to do so using regex:
import re

sep_list = ['→E', '¬E', '↓I']
string1 = "peter →E tom ¬E luis ↓I ed"
string2 = "sigrid →E jose l. ¬E jose t."

def separate_string(data, seps):
    # Build an alternation of the separators, escaped so they match literally
    pattern = "|".join(re.escape(sep) for sep in seps)
    # Keep the (start, end) span of the last match only
    start, end = [m.span() for m in re.finditer(pattern, data)][-1]
    return f"{data[start:end]},{data[end:]}"

print(separate_string(string1, sep_list))  # ↓I, ed
print(separate_string(string2, sep_list))  # ¬E, jose t.
- We create a regex pattern by joining the separators with |.
- For each match in the string, we use m.span() to retrieve the start and end of the match. We only keep the last match.
- data[start:end] is the separator, while data[end:] is everything after it.
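The indexing with [-1] above raises an IndexError when the string contains no separator at all. As a minimal sketch of the same regex idea that covers that case, the variant below returns None when nothing matches and hands back the separator and the remainder as a tuple with surrounding whitespace stripped. The function name last_separator_split and the tuple return shape are assumptions made for illustration, not part of the original answer.

```python
import re

def last_separator_split(data, seps):
    """Return (separator, remainder) for the last separator in data, or None."""
    # Escape each separator so regex metacharacters are matched literally.
    pattern = "|".join(re.escape(sep) for sep in seps)
    last = None
    # finditer yields matches left to right; after the loop, `last` is the final one.
    for last in re.finditer(pattern, data):
        pass
    if last is None:
        return None  # no separator present in data
    return last.group(), data[last.end():].strip()

sep_list = ['→E', '¬E', '↓I']
print(last_separator_split("peter →E tom ¬E luis ↓I ed", sep_list))  # ('↓I', 'ed')
print(last_separator_split("no separators here", sep_list))          # None
```

Iterating instead of building a list also avoids storing every match when only the last one is needed.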
Solution 2:[2]
Assuming the separators may exist in any order (or not at all), you could do this:
sep_list = ['→E', '¬E', '↓I']
string1 = "peter →E tom ¬E luis ↓I ed"
string2 = "sigrid →E jose l. ¬E jose t."

def process(s):
    indexes = []
    # Record the position of every separator that occurs in s
    for sep in sep_list:
        if (index := s.find(sep)) >= 0:
            indexes.append((index, sep))
    if indexes:
        # The highest index belongs to the right-most separator found
        indexes.sort()
        t = indexes[-1]
        return f"{t[1]},{s[t[0]+len(t[1]):]}"

print(process(string1))
print(process(string2))
Output:
↓I, ed
¬E, jose t.
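One caveat with this approach: str.find returns the first occurrence of each separator, so if the same separator can appear more than once in a string, the result may not be the last separator overall. The example strings in the question never repeat a separator, so this is only an assumption about possible inputs. A minimal sketch that uses str.rfind instead (the function name last_separator is hypothetical):

```python
def last_separator(s, seps):
    """Return 'sep,rest' for the right-most separator in s, or None if none occurs."""
    # rfind gives the index of the last occurrence of each separator (-1 if absent).
    hits = [(s.rfind(sep), sep) for sep in seps]
    # The tuple with the largest index is the right-most separator.
    index, sep = max(hits, default=(-1, ""))
    if index < 0:
        return None
    return f"{sep},{s[index + len(sep):]}"

sep_list = ['→E', '¬E', '↓I']
print(last_separator("a →E b ¬E c →E d", sep_list))            # →E, d
print(last_separator("peter →E tom ¬E luis ↓I ed", sep_list))  # ↓I, ed
```

With str.find, the first call would report ¬E as the last separator, because the second →E is never seen.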
Solution 3:[3]
Update: This solution does not need the re module! Update #2: Shorter solution.
def run(string):
    sep_lst = ['→E', '¬E', '↓I']
    tokens = string.split()
    result = None
    # Each later separator overwrites the result, so the last one wins
    for i, token in enumerate(tokens):
        if token in sep_lst:
            result = f'{tokens[i]}, {" ".join(tokens[i+1:])}'
    return result

print(run("peter →E tom ¬E luis ↓I ed"))
print(run("sigrid →E jose l. ¬E jose t."))
Output:
↓I, ed
¬E, jose t.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Cubix48 |
| Solution 2 | Albert Winestein |
| Solution 3 | |
