Using regex to move some letters of a string to a new location in the same string, for a Series of strings in Python
I have a list of 4000 strings. The naming convention needs to be changed for each string and I do not want to go through and edit each one individually.
The list looks like this:
data = list()
data = ['V2-FG2110-EMA-COMPRESSION',
'V2-FG2110-SA-COMPRESSION',
'V2-FG2110-UMA-COMPRESSION',
'V2-FG2120-EMA-DISTRIBUTION',
'V2-FG2120-SA-DISTRIBUTION',
'V2-FG2120-UMA-DISTRIBUTION',
'V2-FG2140-EMA-HEATING',
'V2-FG2140-SA-HEATING',
'V2-FG2140-UMA-HEATING',
'V2-FG2150-EMA-COOLING',
'V2-FG2150-SA-COOLING',
'V2-FG2150-UMA-COOLING',
'V2-FG2160-EMA-TEMPERATURE CONTROL']
I need each 'SA', 'UMA' and 'EMA' to be moved to just before the -FG.
Desired output is:
V2-EMA-FG2110-Compression
V2-SA-FG2110-Compression
V2-UMA-FG2110-Compression
...
The V2-FG2 prefix does not change throughout the list, so I started there. I tried re.sub and re.search, but I am pretty new to Python and have gotten a mess of different results. Any help is appreciated.
Solution 1:[1]
You can split each string on the hyphens and rearrange the parts.
new_list = []
for word in data:
    arr = word.split('-')
    new_word = '%s-%s-%s-%s' % (arr[0], arr[2], arr[1], arr[3])
    new_list.append(new_word)
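As a hedged variation on the loop above, the same split-and-reorder idea can be wrapped in a small function. The function name reorder_fields and the use of maxsplit=3 are assumptions for illustration, not part of the original answer; limiting the split keeps a description that itself contains a hyphen in one piece. The last input below is a hypothetical name, not taken from the question's data.

```python
def reorder_fields(name):
    """Swap the second and third hyphen-separated fields of a name."""
    # maxsplit=3 yields at most four parts, so any further hyphens
    # stay inside the final (description) field.
    prefix, group, code, description = name.split('-', 3)
    return f'{prefix}-{code}-{group}-{description}'


sample = [
    'V2-FG2110-EMA-COMPRESSION',
    'V2-FG2160-EMA-TEMPERATURE CONTROL',
    'V2-FG2140-SA-PRE-HEATING',  # hypothetical name with an extra hyphen
]
renamed = [reorder_fields(name) for name in sample]
print(renamed)
```

A name with fewer than four fields raises ValueError during unpacking, which may be preferable to silently producing a malformed result.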
Solution 2:[2]
You can replace matches of the following regular expression with the contents of capture group 1:
(?<=^[A-Z]\d)(?=.*(-(?:EMA|SA|UMA))(?=-))|-(?:EMA|SA|UMA)(?=-)
The regular expression can be broken down as follows.
(?<=^[A-Z]\d)          # current string position must be preceded by a capital
                       # letter followed by a digit at the start of the string
(?=                    # begin a positive lookahead
  .*                   # match >= 0 chars other than a line terminator
  (-(?:EMA|SA|UMA))    # match a hyphen followed by one of the three strings
                       # and save to capture group 1
  (?=-)                # the next char must be a hyphen
)                      # end positive lookahead
|                      # or
-(?:EMA|SA|UMA)        # match a hyphen followed by one of the three strings
(?=-)                  # the next character must be a hyphen
(?=-) is a positive lookahead.
Evidently this may not work for versions of Python prior to 3.5, because the match in the second part of the alternation does not assign a value to capture group 1: "Before Python 3.5, backreferences to failed capture groups in Python re.sub were not populated with an empty string." This quote is from @WiktorStribiżew's answer at the link. For what it's worth, I confirmed that Ruby has the same behaviour ("V2-FG2110-EMA-COMPRESSION".gsub(rgx,'\1') #=> "V2-EMA-FG2110-COMPRESSION").
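A minimal Python sketch of the replacement described above, assuming Python 3.5 or later so that the unmatched capture group is substituted as an empty string. The names PATTERN, sample and renamed are illustrative only.

```python
import re

# First alternative: an empty match right after the leading "V2" that captures
# the "-EMA" / "-SA" / "-UMA" found later in the string (group 1).
# Second alternative: the original "-EMA" / "-SA" / "-UMA" itself; group 1
# does not take part, so it is replaced by an empty string.
PATTERN = re.compile(
    r'(?<=^[A-Z]\d)(?=.*(-(?:EMA|SA|UMA))(?=-))|-(?:EMA|SA|UMA)(?=-)'
)

sample = [
    'V2-FG2110-EMA-COMPRESSION',
    'V2-FG2120-SA-DISTRIBUTION',
    'V2-FG2160-EMA-TEMPERATURE CONTROL',
]

renamed = [PATTERN.sub(r'\1', name) for name in sample]
print(renamed)
# ['V2-EMA-FG2110-COMPRESSION', 'V2-SA-FG2120-DISTRIBUTION',
#  'V2-EMA-FG2160-TEMPERATURE CONTROL']
```

Each string is therefore changed in two places: the code is inserted after the prefix and deleted from its old position.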
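One could of course instead replace matches of (?<=^[A-Z]\d)(-[A-Z]{2}\d{4})(-(?:EMA|SA|UMA))(?=-) with $2 + $1, that is, the contents of capture group 2 followed by capture group 1 (written r'\2\1' in Python's re.sub). That's probably more sensible even if it's less interesting.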
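The simpler two-group swap might look like this in Python; it is a sketch only, and the names SWAP and swap_fields are made up for the example. Because both groups always take part in a match, it does not rely on how unmatched groups are substituted.

```python
import re

# Group 1: "-FG2110" style field (hyphen, two capitals, four digits).
# Group 2: "-EMA", "-SA" or "-UMA", which must be followed by another hyphen.
SWAP = re.compile(r'(?<=^[A-Z]\d)(-[A-Z]{2}\d{4})(-(?:EMA|SA|UMA))(?=-)')


def swap_fields(name):
    """Return name with the two captured fields exchanged."""
    return SWAP.sub(r'\2\1', name)


data = [
    'V2-FG2110-EMA-COMPRESSION',
    'V2-FG2140-UMA-HEATING',
    'V2-FG2160-EMA-TEMPERATURE CONTROL',
]
new_list = [swap_fields(name) for name in data]
print(new_list)
# ['V2-EMA-FG2110-COMPRESSION', 'V2-UMA-FG2140-HEATING',
#  'V2-EMA-FG2160-TEMPERATURE CONTROL']
```

Strings that do not fit the pattern are returned unchanged, so it is worth checking the result for names that were not rewritten.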
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Mael_Jourdain |
| Solution 2 |
