Regex: group first name and surname from a path string with Python
I need to extract the names from the following strings (folder_names). I made them into raw strings. Some examples:
'.\\\\Jens, Jensen\\\\Rechnungen\\\\Rechnungen 2020\\\\somefoldername'
'.\\Harald, Hardraala\\Rechnungen 2017'
'.\\A - H\\Johan, Johanson\\Rechnungen 2017'
'.\\\\Jens-Haudraf, Johan\\\\Rechnungen\\\\Rechnungen 2020\\\\anotherfoldername'
'.\\A - H\\Funke, Felix'
I want the names in one group, but I can't get it to work. This is what I came up with:
r'\\*(\w*\-{0,1},{0,1} {0,1}\w*)'
Solution 1:[1]
The following code extracts the names, assuming the format stays the same, i.e. a one-word name (possibly hyphenated), a comma and a space, then another one-word name (possibly hyphenated).
import re
strings = ['.\\\\Jens, Jensen\\\\Rechnungen\\\\Rechnungen 2020\\\\somefoldername',
'.\\Harald, Hardraala\\Rechnungen 2017',
'.\\A - H\\Johan, Johanson\\Rechnungen 2017',
'.\\\\Jens-Haudraf, Johan\\\\Rechnungen\\\\Rechnungen 2020\\\\anotherfoldername',
'.\\A - H\\Funke, Felix']
# Raw string so that \w is not treated as an invalid string escape.
matches = [re.search(r"[\w-]+, [\w-]+", s).group() for s in strings]
print(matches)
Output
['Jens, Jensen', 'Harald, Hardraala', 'Johan, Johanson', 'Jens-Haudraf, Johan', 'Funke, Felix']
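Note that `re.search` returns `None` when a string contains no match, in which case calling `.group()` raises `AttributeError`. A minimal sketch of a guard, assuming a hypothetical helper named `extract_name` that is not part of the original answer:

```python
import re

# Same pattern as above, compiled once as a raw string.
NAME_PATTERN = re.compile(r"[\w-]+, [\w-]+")


def extract_name(path):
    """Return the first 'Word, Word' pair in path, or None if there is none.

    extract_name is a hypothetical helper, not part of the original answer.
    """
    match = NAME_PATTERN.search(path)
    return match.group() if match else None


print(extract_name('.\\A - H\\Funke, Felix'))     # Funke, Felix
print(extract_name('.\\A - H\\Rechnungen 2017'))  # None
```

Because this pattern is not anchored to a backslash, it returns the first "word, word" pair found anywhere in the path, which may matter if other folder names contain commas.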
Solution 2:[2]
You could match a backslash followed by word characters with an optional hyphenated part. Then match a comma, a space, and word characters again.
The value is in the first capturing group.
Pattern
\\(\w+(?:-\w+)?, \w+)
In parts
\\ : Match a backslash
( : Capture group 1
\w+(?:-\w+)? : Match 1+ word chars with an optional - and 1+ word chars
, \w+ : Match a comma, space and 1+ word chars
) : Close group 1
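The same breakdown can be kept next to the pattern itself with `re.VERBOSE`. This is a sketch of an equivalent way to write the pattern, not the form used in the answer. In verbose mode unescaped whitespace is ignored, so the literal space after the comma is written as `[ ]`.

```python
import re

# Equivalent to r"\\(\w+(?:-\w+)?, \w+)", written with inline comments.
regex = re.compile(r"""
    \\              # a literal backslash
    (               # open capture group 1
      \w+           # 1+ word chars
      (?:-\w+)?     # optional hyphen followed by 1+ word chars
      ,[ ]          # a comma and a space
      \w+           # 1+ word chars
    )               # close capture group 1
""", re.VERBOSE)

match = regex.search('.\\\\Jens-Haudraf, Johan\\\\Rechnungen')
print(match.group(1))  # Jens-Haudraf, Johan
```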
Example code
import re
regex = r"\\(\w+(?:-\w+)?, \w+)"
strings = [
'.\\\\Jens, Jensen\\\\Rechnungen\\\\Rechnungen 2020\\\\somefoldername',
'.\\Harald, Hardraala\\Rechnungen 2017',
'.\\A - H\\Johan, Johanson\\Rechnungen 2017',
'.\\\\Jens-Haudraf, Johan\\\\Rechnungen\\\\Rechnungen 2020\\\\anotherfoldername',
'.\\A - H\\Funke, Felix'
]
for s in strings:
    matches = re.search(regex, s)
    # re.search returns None when nothing matches, so check first.
    if matches:
        print(matches.group(1))
Output
Jens, Jensen
Harald, Hardraala
Johan, Johanson
Jens-Haudraf, Johan
Funke, Felix
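The pattern above allows a hyphen only in the part before the comma. If hyphenated names can also appear after the comma (an assumption, since none of the example paths show this), a variant along these lines might help. The group names `before` and `after` and the sample path are made up for illustration.

```python
import re

# Assumption: each part is one or more word runs joined by single hyphens.
PART = r"\w+(?:-\w+)*"

# Named groups give access to each side of the comma separately.
regex = re.compile(rf"\\(?P<before>{PART}), (?P<after>{PART})")

# Made-up path, used only to exercise the hyphen-on-both-sides case.
m = regex.search('.\\A - H\\Meyer-Lang, Anna-Lena\\Rechnungen 2017')
if m:
    print(m.group("before"))  # Meyer-Lang
    print(m.group("after"))   # Anna-Lena
```

The example paths do not make clear which side of the comma is the first name and which is the surname, so the groups are named by position rather than by meaning.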
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Patrick von Glehn |
| Solution 2 | |
