Python regular expression: how to deal with multiple backslashes

I’m working with text data and having trouble removing sequences that contain multiple backslashes. I found that re.sub works quite well, so I wrote the code below to remove the whitespace escapes \t, \n, \r, \f and \v:

temp_string = re.sub(r"[\t\n\r\f\v]"," ",string)

However, the code above can’t deal with the string below.

string = '\\\\r \\\\nLove the filtered water and crushed ice in the door.'

So I changed it to this:

temp_string = re.sub(r"[\\\\t\\\\n\\\\r\\\\f\\\\v]"," ",string)
temp_string

But the result is not what I expected: it erases every v, f, n and so on. I don’t know why this happens.

I found that .replace("\\\\r"," ") works! However, this way I would have to chain calls like this:

.replace("\\\\r"," ")
.replace("\\\r"," ")
.replace("\\r"," ")
.replace("\r"," ")
.replace("\\\\t"," ")
…

I’m pretty sure there must be a better way.



Solution 1:[1]

An escape character such as a newline is a single character, which is not the same as a backslash followed by a letter. You therefore need to define a mapping for the escape sequences you want to replace.
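
To make that distinction concrete, here is a small illustrative check. It is not part of the original answer, and the variable names are made up for the example:

```python
# Illustrative check: an escape character vs. a backslash followed by a letter.
newline_char = '\n'     # one character: an actual line feed
backslash_n = '\\n'     # two characters: a backslash, then the letter n

print(len(newline_char))            # 1
print(len(backslash_n))             # 2
print(backslash_n == r'\n')         # True: a raw string keeps the backslash
print(newline_char == backslash_n)  # False
```

The code that follows converts the two-character form into the one-character form before cleaning up.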

import re

string = '\\\\r \\\\\nLove the \nfiltered \\twater \\and crushed ice in the door.'

esc_map = {'\\n': '\n',
           '\\t': '\t',
           '\\r': '\r'}

# convert backslash-letter pairs into the real escape characters
for key, value in esc_map.items():
    string = string.replace(key, value)

# match an escape character together with any backslashes in front of it
re_str = r'\\*({})'.format(r'|'.join(esc_map.values()))
# drop the extra backslashes, keeping only the escape character
string = re.sub(re_str,r'\1',string)
# replace an escape character with a space
string = re.sub(re_str,r' ',string)
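
A possible single-pass variant, offered as a sketch and not as part of the original answer: inside square brackets every character is a separate member of the set, which is why a class like [\\\\t\\\\n...] in a raw string matches any lone backslash, t, n, r, f or v. Using alternation avoids that. The names ESCAPE_RE and strip_escapes are hypothetical, and the sketch assumes that any backslash followed by t, n, r, f or v is unwanted, so text such as a Windows path like C:\new would also be altered.

```python
import re

# Hypothetical helper, not from the original answer.
# First alternative: one or more backslashes followed by the letter t, n, r, f or v.
# Second alternative: an actual whitespace escape character, together with
# any backslashes that precede it.
ESCAPE_RE = re.compile(r'\\+[tnrfv]|\\*[\t\n\r\f\v]')

def strip_escapes(text):
    """Replace each matched escape sequence with a single space."""
    return ESCAPE_RE.sub(' ', text)

sample = '\\\\r \\\\nLove the filtered water and crushed ice in the door.'
print(repr(strip_escapes(sample)))
# '   Love the filtered water and crushed ice in the door.'
```

Unlike the mapping approach, this also covers \f and \v from the question, and it leaves a sequence such as \and untouched because a is not one of the listed letters.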

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 blackdrumb