'keep only one column during merge when column names are similar
I perform a join like this:
df = df.merge(df2, left_on=["code"], right_on=["countryCode"],
how='left')
Both of my dataframes have a column with the same name money. After the join, the new df has two columns
money_x and money_y
one from each table joined. How can I alter this such that for any duplicate columns, only the second one stays with the original one. For example, in this case, only money_y from the second df should stay and the original name money should be kept instead of money_y.
Solution 1:[1]
This will make it where the 2nd df in the merge will essentially be the same while the first df in the merge will now have a '_y' included
df = df.merge(df2, left_on=["code"], right_on=["countryCode"],
how='left', suffixes=('_y', ''))
Solution 2:[2]
Select only the unique columns from the first DataFrame before join:
df1 = df[list(set(df.columns).difference(set(df2.columns)))]
output = df1.join(df2, left_on="code", right_on="countryCode"])
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | ArchAngelPwn |
| Solution 2 | not_speshal |
