Error tokenizing remove pattern re.findall


I get an error while cleaning text; I tried the following code from the web:

import re

import numpy as np

def remove_pattern(text, pattern):
    # Find every match of the pattern, then strip each match from the text.
    r = re.findall(pattern, text)
    for i in r:
        # Escape the matched substring so it is substituted literally,
        # not reinterpreted as a regular expression.
        text = re.sub(re.escape(i), '', text)
    return text

df['remove_user'] = np.vectorize(remove_pattern)(df['Comment'], r"@[\w]*")

And I got this error:

[screenshot of the error; the traceback text is not reproduced in the original post]

Solution 1:[1]

Use the vectorized str.replace instead of np.vectorize here:

df["remove_user"] = df["Comment"].str.replace(r'\W+', '', regex=True)
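For context, a minimal runnable sketch of this approach on made-up data. The DataFrame and its Comment values are hypothetical, and the pattern is narrowed to r'@\w+' so that only the @-mentions are stripped; the answer's r'\W+' would remove every non-word character instead:

```python
import pandas as pd

# Hypothetical data standing in for the asker's DataFrame.
df = pd.DataFrame({"Comment": ["@alice great post!", "thanks @bob_99 :)"]})

# Vectorized replace over the whole column; no Python-level loop
# and no re.findall, so there is nothing to tokenize per match.
df["remove_user"] = df["Comment"].str.replace(r"@\w+", "", regex=True)

print(df["remove_user"].tolist())
# → [' great post!', 'thanks  :)']
```

Because str.replace applies the substitution to each cell directly, it also sidesteps the double-interpretation bug in the question's loop, where each matched string is fed back to re.sub as a pattern.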

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Tim Biegeleisen