Pandas Merging on two columns
I am new to pandas and want to merge two tables on two columns. A row can only be identified by both columns in combination.
Example:
Table 1:
Index A B C D
1     a a c d
2     a b e f
3     a c g h

Table 2:
Index A B C
1     a b j
2     a c k

Result:
Index A B C D E
1     a a c d na
2     a b e f j
3     a c g h k
I tried something like:
df_new = df_1.merge(df_2, on=['A', 'B'])
But I got the error "B is not unique".
(In the real case, the tables contain every value in A and B multiple times, but the combination is unique.)
Many thanks in advance.
Solution 1:[1]
One option is to first combine the two columns into a single key column, for example:
a_dataframe["AB"] = a_dataframe["A"] + a_dataframe["B"]
Do the same in the other DataFrame, merge on the new column, and then add the rest of the columns. There could be a simpler solution.
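A minimal sketch of the combined-key idea, using the example tables from the question and assuming A and B are string columns. The "|" separator and the renaming of the second table's C to E are illustrative choices, not part of the original answer. A separator avoids accidental collisions such as "ab" + "c" versus "a" + "bc".

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["a", "a", "a"],
                    "B": ["a", "b", "c"],
                    "C": ["c", "e", "g"],
                    "D": ["d", "f", "h"]})
df2 = pd.DataFrame({"A": ["a", "a"],
                    "B": ["b", "c"],
                    "C": ["j", "k"]})

# Build one key column from A and B in both frames (separator is an assumption).
df1["AB"] = df1["A"] + "|" + df1["B"]
df2["AB"] = df2["A"] + "|" + df2["B"]

# Take only the key and the new column from df2, renamed to E to avoid a name clash.
right = df2[["AB", "C"]].rename(columns={"C": "E"})

# Left merge keeps every row of df1; the helper key is dropped afterwards.
merged = df1.merge(right, on="AB", how="left").drop(columns="AB")
print(merged)
```

Merging directly on a list of columns, as in Solution 2, gives the same result without the helper column.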
Solution 2:[2]
This works for me:
import pandas as pd
df1 = pd.DataFrame({"A": ["a", "a", "a"],
                    "B": ["a", "b", "c"],
                    "C": ["c", "e", "g"],
                    "D": ["d", "f", "h"]})
df2 = pd.DataFrame({"A": ["a", "a"],
                    "B": ["b", "c"],
                    "C": ["j", "k"]})
# Both frames have a column C, so the merge suffixes them as C_x and C_y; rename C_y to E.
pd.merge(df1, df2, on=["A", "B"], how="left").rename(columns={"C_x": "C", "C_y": "E"})
Output:
   A  B  C  D    E
0  a  a  c  d  NaN
1  a  b  e  f    j
2  a  c  g  h    k
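Since the question states that only the combination of A and B is unique, it may help to check that before merging. The sketch below uses the same example data. Renaming C to E up front and passing validate="one_to_one" are suggestions added here, not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["a", "a", "a"],
                    "B": ["a", "b", "c"],
                    "C": ["c", "e", "g"],
                    "D": ["d", "f", "h"]})
df2 = pd.DataFrame({"A": ["a", "a"],
                    "B": ["b", "c"],
                    "C": ["j", "k"]})

# True if any (A, B) pair occurs more than once in a frame.
dup_left = df1.duplicated(subset=["A", "B"]).any()
dup_right = df2.duplicated(subset=["A", "B"]).any()
print(dup_left, dup_right)

# Rename the overlapping column first so no _x/_y suffixes are created,
# and let pandas verify that the key pair is unique on both sides.
result = pd.merge(df1, df2.rename(columns={"C": "E"}),
                  on=["A", "B"], how="left", validate="one_to_one")
print(result)
```

If the pair were not unique in either table, the validate argument would make the merge raise pandas.errors.MergeError instead of silently duplicating rows.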
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | marc_s |
| Solution 2 | Marco_CH |
