Python match on EXACT word, no more, no less
I have simplified the code I am working on below to present the issue I am facing. I am new to Python, so this might show in my code; please bear with me.
import re
vlan_names = ['PL-BB', 'PL-BB-VoIP']
vlan_lines = ['create vlan "PL-BB"', 'configure vlan PL-BB tag 135', 'create vlan "PL-BB-VoIP"', 'create vlan "PL-BB-VoIP"']
vlan_config_lines = []
for vlan_line in vlan_lines:
    for vlan_name in vlan_names:
        if re.search(r'\b' + vlan_name + r'\b', vlan_line):
            vlan_config_lines.append(vlan_line.strip("\n"))
print(vlan_config_lines)
The result of this is below:
['create vlan "PL-BB"', 'configure vlan PL-BB tag 135', 'create vlan "PL-BB-VoIP"', 'create vlan "PL-BB-VoIP"', 'create vlan "PL-BB-VoIP"', 'create vlan "PL-BB-VoIP"']
The issue is that, while iterating, the regex search also matches the 'PL-BB' inside 'PL-BB-VoIP', which is why lines like this one are added twice:
'create vlan "PL-BB-VoIP"'
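A minimal check of why this happens, assuming the standard behaviour of `\b` in the `re` module: a word boundary sits between a word character (letter, digit or underscore) and a non-word character, and the hyphen counts as a non-word character.

```python
import re

line = 'create vlan "PL-BB-VoIP"'

# \b matches between a word character and a non-word character.
# The hyphen after "PL-BB" is a non-word character, so there is a
# boundary right there and the shorter name matches inside the longer one.
match = re.search(r'\bPL-BB\b', line)
print(match.group(0))  # PL-BB
print(match.span())    # (13, 18)
```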
What I am struggling with is how to stop this. I need an exact match in the comparison, no more and no less, which should hopefully stop the repetition. Can anyone help?
Many thanks in advance
Solution 1:[1]
How about trying a negative lookahead? We match the given vlan_name
and check that it is not immediately followed by a hyphen, which would make it part of a longer name.
import re
vlan_names = ['PL-BB', 'PL-BB-VoIP']
vlan_lines = ['create vlan "PL-BB"', 'configure vlan PL-BB tag 135', 'create vlan "PL-BB-VoIP"', 'create vlan "PL-BB-VoIP"']
vlan_config_lines = []
for vlan_line in vlan_lines:
    for vlan_name in vlan_names:
        if re.search(r'\b' + vlan_name + r'(?![-])\b', vlan_line):
            vlan_config_lines.append(vlan_line.strip("\n"))
            print(f"Found '{vlan_name}' in '{vlan_line.strip()}'")
print(vlan_config_lines)
Explained here on your example: https://regex101.com/r/GHNF1e/1
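A slightly more defensive variant, as a sketch only: it assumes VLAN names consist of word characters and hyphens, and that a name might contain regex metacharacters. The helper name `lines_for_vlan` is hypothetical and not part of the answer above. It uses `re.escape` on the name and guards both sides with lookarounds, so a prefix such as 'X-PL-BB' would be rejected as well.

```python
import re

def lines_for_vlan(vlan_name, vlan_lines):
    """Return the lines that mention vlan_name as a whole, hyphen-aware token."""
    # re.escape protects against names containing regex metacharacters.
    # The lookarounds reject a match that is directly preceded or followed
    # by a word character or a hyphen, so "PL-BB" is not found in "PL-BB-VoIP".
    pattern = re.compile(r'(?<![\w-])' + re.escape(vlan_name) + r'(?![\w-])')
    return [line for line in vlan_lines if pattern.search(line)]

vlan_lines = ['create vlan "PL-BB"', 'configure vlan PL-BB tag 135',
              'create vlan "PL-BB-VoIP"', 'create vlan "PL-BB-VoIP"']

print(lines_for_vlan('PL-BB', vlan_lines))
# ['create vlan "PL-BB"', 'configure vlan PL-BB tag 135']
print(lines_for_vlan('PL-BB-VoIP', vlan_lines))
# ['create vlan "PL-BB-VoIP"', 'create vlan "PL-BB-VoIP"']
```

Compiling one pattern per name also means each line is tested once per name, so a line is never appended twice for the same VLAN.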
Solution 2:[2]
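If you are looking for a complete word rather than a substring, use the split function, which converts a string into a list of words. The quoted variants of each name are added to the lookup list because some lines wrap the name in double quotes.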
vlan_names = ['PL-BB', 'PL-BB-VoIP']
vlan_names2 = vlan_names + [f'"{name}"' for name in vlan_names]
vlan_lines = ['create vlan "PL-BB"', 'configure vlan PL-BB tag 135', 'create vlan "PL-BB-VoIP"', 'create vlan "PL-BB-VoIP"']
matching_lines = [line for line in vlan_lines for word in line.split() if word in vlan_names2]
Output:
matching_lines=
['create vlan "PL-BB"',
'configure vlan PL-BB tag 135',
'create vlan "PL-BB-VoIP"',
'create vlan "PL-BB-VoIP"']
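The same idea can be sketched without building a second list of quoted names, assuming the only decoration around a name is a pair of double quotes. The helper name `mentions_vlan` is hypothetical. It strips the quotes from each token and intersects the tokens with the set of names.

```python
def mentions_vlan(line, vlan_names):
    """Return the VLAN names that appear in line as whole tokens."""
    # Strip surrounding double quotes so '"PL-BB"' compares equal to 'PL-BB'.
    tokens = {word.strip('"') for word in line.split()}
    return tokens & set(vlan_names)

vlan_names = ['PL-BB', 'PL-BB-VoIP']
vlan_lines = ['create vlan "PL-BB"', 'configure vlan PL-BB tag 135',
              'create vlan "PL-BB-VoIP"', 'create vlan "PL-BB-VoIP"']

# An empty set is falsy, so only lines with at least one match are kept.
matching_lines = [line for line in vlan_lines if mentions_vlan(line, vlan_names)]
print(matching_lines)
```

Unlike the nested comprehension, this keeps each line once even if it happens to contain more than one matching word.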
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | K4jt3c |
| Solution 2 | |