Making multiple "any" more efficient
I am using any() to check whether a longer string (description) contains any of the keywords in several lists. The code works, but it feels like an inefficient way to do the comparison, and I would like feedback on how to make it more efficient.
def convert_category(description):
    categoryFood = ['COUNTDOWN', 'BAKE', 'MCDONALDS', 'ST PIERRE', 'PAK N SAVE', 'NEW WORLD']
    categoryDIY = ['BUNNINGS', 'MITRE10']
    containsFood = any(keyword in description for keyword in categoryFood)
    containsDIY = any(keyword in description for keyword in categoryDIY)
    if containsFood:
        return 'Food and Groceries'
    elif containsDIY:
        return 'Home and DIY'
    return ''
Solution 1:[1]
I would use a regular expression. They are optimized for this kind of problem - searching for any of multiple strings - and the hot part of the code is pushed into a fast library. With big enough strings you should notice the difference.
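import re

# categoryFood and categoryDIY are the keyword lists from the question.
# re.escape keeps any special characters in the keywords literal.
foodPattern = '|'.join(map(re.escape, categoryFood))
diyPattern = '|'.join(map(re.escape, categoryDIY))
containsFood = re.search(foodPattern, description) is not None
containsDIY = re.search(diyPattern, description) is not None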
You can easily extend this with word boundaries (\b) or similar features to make the keyword matching smarter, for example so that it only matches whole words.
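A minimal sketch of the whole-word variant, assuming the keyword lists from the question. The names CATEGORY_FOOD, CATEGORY_DIY, FOOD_PATTERN, DIY_PATTERN and _compile_keywords are hypothetical and not part of the original answer. The patterns are compiled once at module level so that repeated calls do not rebuild them.

```python
import re

# Keyword lists copied from the question.
CATEGORY_FOOD = ['COUNTDOWN', 'BAKE', 'MCDONALDS', 'ST PIERRE', 'PAK N SAVE', 'NEW WORLD']
CATEGORY_DIY = ['BUNNINGS', 'MITRE10']


def _compile_keywords(keywords):
    # Hypothetical helper: escape each keyword, join them as alternatives,
    # and require a word boundary on both sides of the match.
    alternatives = '|'.join(map(re.escape, keywords))
    return re.compile(r'\b(?:' + alternatives + r')\b')


FOOD_PATTERN = _compile_keywords(CATEGORY_FOOD)
DIY_PATTERN = _compile_keywords(CATEGORY_DIY)


def convert_category(description):
    # Food is checked first, matching the order used in the question.
    if FOOD_PATTERN.search(description):
        return 'Food and Groceries'
    if DIY_PATTERN.search(description):
        return 'Home and DIY'
    return ''
```

Note that whole-word matching changes behaviour compared with the plain substring test: 'BAKE' no longer matches inside 'BAKERY'. Whether that is desirable depends on the descriptions being categorised.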
Solution 2:[2]
From the sounds of things, the only way to make this faster is some negligible work to return earlier in some cases. Marking as answered and closing.
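One reading of this answer is to return as soon as a category matches, so that later lists are never scanned. The sketch below assumes the keyword lists from the question; the CATEGORIES table is a hypothetical name introduced here for illustration.

```python
# Hypothetical lookup table: (label, keywords) pairs in priority order.
CATEGORIES = [
    ('Food and Groceries',
     ('COUNTDOWN', 'BAKE', 'MCDONALDS', 'ST PIERRE', 'PAK N SAVE', 'NEW WORLD')),
    ('Home and DIY', ('BUNNINGS', 'MITRE10')),
]


def convert_category(description):
    for label, keywords in CATEGORIES:
        # any() stops at the first matching keyword, and the return
        # skips every remaining category once one has matched.
        if any(keyword in description for keyword in keywords):
            return label
    return ''
```

With only two short lists the saving is likely to be small, but the table form also makes it easier to add categories without writing another any() line for each one.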
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Jay J |
