Python regular expression: how to deal with multiple backslashes
I'm working with text data and having trouble removing multiple backslashes. I found that re.sub works quite well, so I wrote the code below to remove the escape characters \t, \n, \r, \f and \v:
temp_string = re.sub(r"[\t\n\r\f\v]"," ",string)
However, the code above can’t deal with the string below.
string = '\\\\r \\\\nLove the filtered water and crushed ice in the door.'
So I wrote this instead:
temp_string = re.sub(r"[\\\\t\\\\n\\\\r\\\\f\\\\v]"," ",string)
temp_string
But the result is wrong: it erases every v, f, n and so on.
I don't know why this happens.
I found that .replace("\\\\r", " ") works!
However, done this way, I would have to chain calls like:
.replace("\\\\r", " ")
.replace("\\\r", " ")
.replace("\\r", " ")
.replace("\r", " ")
.replace("\\\\t", " ")
…
I'm pretty sure there is a better way.
Solution 1:[1]
Since escape characters are not the same as characters with a backslash before them, you will need to define a mapping for the escape characters you want to replace.
import re

string = '\\\\r \\\\\nLove the \nfiltered \\twater \\and crushed ice in the door.'
# map each backslash-plus-letter pair to the real escape character
esc_map = {'\\n': '\n',
           '\\t': '\t',
           '\\r': '\r'}
# turn backslash-letter pairs into the real escape characters
for key, value in esc_map.items():
    string = string.replace(key, value)
# match an escape character together with any backslashes prefixed to it
re_str = r'\\*({})'.format(r'|'.join(esc_map.values()))
# remove the extra backslashes, keeping the escape character
string = re.sub(re_str, r'\1', string)
# replace each escape character with a space
string = re.sub(re_str, r' ', string)
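As a possible alternative, here is a minimal single-pass sketch, not part of the original answer. It first shows why the character class from the question erases ordinary letters: a class matches one character from the set, so the backslash and each of t, n, r, f, v are removed individually. It then uses alternation instead. The names `ESCAPES` and `strip_escapes` are hypothetical, and the sketch assumes that any backslash followed by one of those letters is unwanted, which would also hit text such as a Windows path like C:\new.

```python
import re

# A character class matches ONE character from the set, so this pattern
# replaces every backslash and every t, n, r, f, v on its own.
too_greedy = re.sub(r"[\\tnrfv]", " ", "\\nLove")  # the "v" in "Love" is lost

# Alternation instead (hypothetical pattern, an assumption of this sketch):
#   one or more backslashes followed by one of the letters t, n, r, f, v, or
#   zero or more backslashes followed by a real whitespace control character.
ESCAPES = re.compile(r"\\+[tnrfv]|\\*[\t\n\r\f\v]")

def strip_escapes(text):
    """Replace each literal or real escape sequence with a single space."""
    return ESCAPES.sub(" ", text)

string = '\\\\r \\\\nLove the filtered water and crushed ice in the door.'
temp_string = strip_escapes(string)
```

Because each match is replaced by exactly one space, consecutive sequences leave runs of spaces behind; if that matters, the result could be collapsed afterwards with something like `" ".join(temp_string.split())`.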
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | blackdrumb |
