Pandas: substring search and replace column headers using a RegEx pattern in replace()
I have the following Pandas dataframes:
import pandas as pd

foo = {
    "country": ["United States", "Canada", "Japan", "Australia"],
    "code": [7, 2, 1, 4]
}
bar = {
    "country_code": ["France", "Germany", "Mexico"],
    "code": [3, 6, 8]
}
# Key is "country_id" so that it matches the df_3 output shown below
baz = {
    "country_id": ["China", "Thailand", "Israel"],
    "code": [9, 10, 11]
}
df_1 = pd.DataFrame(foo)
df_2 = pd.DataFrame(bar)
df_3 = pd.DataFrame(baz)
df_1
         country  code
0  United States     7
1         Canada     2
2          Japan     1
3      Australia     4
df_2
  country_code  code
0       France     3
1      Germany     6
2       Mexico     8
df_3
  country_id  code
0      China     9
1   Thailand    10
2     Israel    11
If a column header contains the substring country, I would like to replace the whole column name with simply country. For example, "country_code" would be replaced with "country", and "country_id" would also be replaced with "country".
I can do this easily enough with the following:
df_2.rename(columns={'country_code' : 'country'}, inplace=True)
df_3.rename(columns={'country_id' : 'country'}, inplace=True)
Or, as follows:
col_dict = {'country_code': 'country', 'country_id': 'country'}
df_2.columns = [col_dict.get(x, x) for x in df_2.columns]
df_3.columns = [col_dict.get(x, x) for x in df_3.columns]
These approaches work, but they presume that I know the column names beforehand (which I may not).
I tried using RegEx in the .replace() method:
df_3.rename(columns={col: col.replace('[(?i)country]', 'country') for col in df_3.columns}, inplace=True)
But, this failed with this error message:
TypeError: str.replace() takes no keyword arguments
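For context, Python's built-in str.replace() only does literal substring replacement and has no regex mode; regular expressions go through the re module instead. Below is a minimal sketch of that route, assuming any header that contains "country" (in any letter case) should become exactly "country". The pattern and the name country_pattern are my own assumptions for illustration, not something from the original attempt.

```python
import re

import pandas as pd

df_3 = pd.DataFrame({
    "country_id": ["China", "Thailand", "Israel"],
    "code": [9, 10, 11],
})

# Assumed pattern: match the whole header if it contains "country" anywhere,
# ignoring case, so the entire name is replaced rather than just a substring.
country_pattern = re.compile(r"^.*country.*$", flags=re.IGNORECASE)

# Pattern.sub() returns non-matching headers (such as "code") unchanged.
df_3 = df_3.rename(
    columns={col: country_pattern.sub("country", col) for col in df_3.columns}
)

print(list(df_3.columns))  # ['country', 'code']
```

Because the pattern is anchored with ^ and $, a match swaps out the full header, which is the behaviour described above for "country_code" and "country_id".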
Is it even possible to use RegEx in this way? Or, is there a more elegant approach?
Any assistance would be greatly appreciated. Thanks!
UPDATE:
The following approach works and is more robust than previous attempts:
for elem in df_3.columns:
    if 'country' in elem:
        df_3.rename(columns={elem: 'country'}, inplace=True)
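The same idea can probably be expressed without an explicit loop by using the vectorized string methods on the column index. The sketch below is only one possible approach: normalize_country_columns is a hypothetical helper name, and the case-insensitive pattern is an assumption about what should count as a match. It also guards against a case the loop does not cover, namely two headers in the same frame both containing "country", which would leave duplicate column names.

```python
import re

import pandas as pd


def normalize_country_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: rename any header containing 'country' to 'country'."""
    renamed = df.copy()
    # Index.str.replace with regex=True applies the pattern to every header.
    renamed.columns = renamed.columns.str.replace(
        r"^.*country.*$", "country", regex=True, flags=re.IGNORECASE
    )
    # Fail loudly if the rename produced duplicate column names.
    if renamed.columns.duplicated().any():
        raise ValueError("duplicate column names after renaming")
    return renamed


df_2 = pd.DataFrame({"country_code": ["France", "Germany", "Mexico"], "code": [3, 6, 8]})
df_3 = pd.DataFrame({"country_id": ["China", "Thailand", "Israel"], "code": [9, 10, 11]})

# Apply the same rename to each frame without knowing the headers in advance.
df_2, df_3 = (normalize_country_columns(df) for df in (df_2, df_3))
```

Returning a copy rather than using inplace=True keeps the original frames untouched, which may or may not be what you want here.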
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
