Replace all the occurrences of specific words

Suppose that I have the following sentence:

bean likes to sell his beans

and I want to replace all occurrences of specific words with other words. For example, bean to robert and beans to cars.

I can't just use str.replace because in this case it'll change the beans to roberts.

>>> "bean likes to sell his beans".replace("bean","robert")
'robert likes to sell his roberts'

I need to replace whole words only, not occurrences of a word inside another word. I think regular expressions can do this, but I don't know how to do it correctly.



Solution 1:[1]

If you use regex, you can specify word boundaries with \b:

import re

sentence = 'bean likes to sell his beans'

sentence = re.sub(r'\bbean\b', 'robert', sentence)
# 'robert likes to sell his beans'

Here 'beans' is not changed (to 'roberts') because there is no word boundary between the 'n' and the 's': \b matches the empty string, but only at the beginning or end of a word.
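To see where \b does and does not match, a quick check on a few inputs can help. This is only an illustrative sketch: the helper name replace_bean and the sample strings are made up for this example and are not part of the original answer.

```python
import re

def replace_bean(text):
    # \b matches between a word character (letter, digit, underscore)
    # and a non-word character, or at either end of the string.
    return re.sub(r'\bbean\b', 'robert', text)

samples = ['bean', 'beans', 'bean, bean!', 'soybean', 'bean_bag', 'Bean']
for s in samples:
    print(repr(s), '->', repr(replace_bean(s)))
```

Only 'bean' and 'bean, bean!' are changed. Note that the underscore counts as a word character, so 'bean_bag' is left alone, and that the match is case-sensitive unless re.IGNORECASE is passed.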

The second replacement for completeness:

sentence = re.sub(r'\bbeans\b', 'cars', sentence)
# 'robert likes to sell his cars'

Solution 2:[2]

If you replace each word one at a time, you might replace words several times (and not get what you want). To avoid this, you can use a function or lambda:

import re

d = {'bean': 'robert', 'beans': 'cars'}
str_in = 'bean likes to sell his beans'
# Look up every word once; words that are not in d are returned unchanged.
str_out = re.sub(r'\b(\w+)\b', lambda m: d.get(m.group(1), m.group(1)), str_in)

That way, once bean is replaced by robert, it won't be modified again (even if robert is also in your input list of words).
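The difference shows up as soon as a replacement is itself one of the words to replace. The mapping below is a hypothetical example chosen to trigger that case (it swaps the two names); it is not from the question.

```python
import re

# Hypothetical mapping in which each replacement is also a key.
d = {'bean': 'robert', 'robert': 'bean'}
str_in = 'bean likes robert'

# One pass per word: the second pass rewrites the output of the first.
sequential = str_in
for old, new in d.items():
    sequential = re.sub(r'\b%s\b' % re.escape(old), new, sequential)

# Single pass: each word of the input is looked up exactly once.
single_pass = re.sub(r'\b(\w+)\b', lambda m: d.get(m.group(1), m.group(1)), str_in)

print(sequential)   # bean likes bean
print(single_pass)  # robert likes bean
```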

As suggested by georg, this answer uses dict.get(key, default_value) so that words missing from the dictionary are left unchanged. An alternative (also suggested by georg) matches only the dictionary keys:

str_out = re.sub(r'\b(%s)\b' % '|'.join(d.keys()), lambda m:d.get(m.group(1), m.group(1)), str_in)
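If the keys might contain regex metacharacters (a dot, for instance), it is probably safer to escape them before joining. A minimal sketch of the same idea wrapped in a function; replace_words is a hypothetical helper name, and the use of re.escape and the empty-mapping guard are additions, not part of the original answer.

```python
import re

def replace_words(text, mapping):
    # An empty mapping would build the pattern \b()\b, so return early.
    if not mapping:
        return text
    # Escape each key so that it is matched literally, then build one
    # alternation wrapped in word boundaries.
    pattern = re.compile(r'\b(%s)\b' % '|'.join(map(re.escape, mapping)))
    # Every match is one of the keys, so a plain lookup is enough.
    return pattern.sub(lambda m: mapping[m.group(1)], text)

d = {'bean': 'robert', 'beans': 'cars'}
print(replace_words('bean likes to sell his beans', d))
# robert likes to sell his cars
```

Because the alternation sits between two \b anchors, the order of the keys does not matter here: on 'beans' the shorter alternative 'bean' fails at the trailing \b and the regex engine falls back to 'beans'.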

Solution 3:[3]

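This is a dirty way to do it, using a fold (re must be imported as in the other solutions; on Python 3, reduce has to be imported from functools):

from functools import reduce
reduce(lambda x, y: re.sub(r'\b(' + y[0] + r')\b', y[1], x), [("bean", "robert"), ("beans", "cars")], "bean likes to sell his beans")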
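For readers less used to folds, the same computation can be written as an explicit loop. This is a sketch of an equivalent, not the answerer's code: replace_sequentially is a hypothetical name, and re.escape is added on the assumption that the words should be matched literally.

```python
import re

def replace_sequentially(text, pairs):
    # One full pass over the text per (old, new) pair, in list order,
    # which is what the reduce call does step by step.
    for old, new in pairs:
        text = re.sub(r'\b(' + re.escape(old) + r')\b', new, text)
    return text

pairs = [("bean", "robert"), ("beans", "cars")]
print(replace_sequentially("bean likes to sell his beans", pairs))
# robert likes to sell his cars
```

Since each pair triggers its own pass, a word produced by an earlier pair can still be rewritten by a later one.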

Solution 4:[4]

"bean likes to sell his beans".replace("beans", "cars").replace("bean", "robert")

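This replaces all instances of "beans" with "cars" and then all instances of "bean" with "robert". The order matters: the longer word has to be replaced first, otherwise "beans" would already have become "roberts". It works because .replace() returns a new string and leaves the original unchanged, so the calls can be chained. You can think of it in stages: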

 >>> first_string = "bean likes to sell his beans"
 >>> second_string = first_string.replace("beans", "cars")
 >>> third_string = second_string.replace("bean", "robert")
 >>> (first_string, second_string, third_string)
 ('bean likes to sell his beans', 'bean likes to sell his cars', 'robert likes to sell his cars')
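One caveat worth checking before relying on chained .replace() calls: they still match substrings, and the result depends on the order of the calls. The sentence below is a made-up example that adds "beanbag" to show both effects.

```python
text = "bean sells beans from a beanbag"

# Longer word first: "beans" is handled correctly,
# but the "bean" inside "beanbag" is replaced as well.
longer_first = text.replace("beans", "cars").replace("bean", "robert")

# Shorter word first: "beans" has already become "roberts",
# so the second call finds nothing to replace.
shorter_first = text.replace("bean", "robert").replace("beans", "cars")

print(longer_first)   # robert sells cars from a robertbag
print(shorter_first)  # robert sells roberts from a robertbag
```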

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3
Solution 4 Kevin London