Pandas Merging on two columns

I am new to pandas and want to merge two tables on two columns. A row can only be identified by both columns in combination.

Example:
Table 1                Table 2
Index A B C D          Index A B C
1     a a c d          1     a b j
2     a b e f          2     a c k
3     a c g h


Result:

Table
Index A B C D E
1     a a c d na
2     a b e f j
3     a c g h k


I tried something like:

df_new = df_1.merge(df_2, on=['A', 'B'])

But I got the error "B is not unique".

(In the real case the tables contain every value in A and B multiple times, but the combination is unique.)

Many thanks in advance.



Solution 1:[1]

One option is to combine the two key columns into a single key column first, using this code as an example.

a_dataframe["AB"] = a_dataframe["A"] + a_dataframe["B"]

Do the same in the other dataframe, merge on the new "AB" column, and then add the rest of the columns. There could be a simpler solution.
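A minimal sketch of that combined-key idea, using the example data from the question. The "|" separator and the names `right` and `merged` are my own assumptions, not part of the original answer; the separator is only there so that values like "ab" + "c" and "a" + "bc" cannot collide.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["a", "a", "a"],
                    "B": ["a", "b", "c"],
                    "C": ["c", "e", "g"],
                    "D": ["d", "f", "h"]})

df2 = pd.DataFrame({"A": ["a", "a"],
                    "B": ["b", "c"],
                    "C": ["j", "k"]})

# Build the same combined key in both frames (separator is an assumption).
df1["AB"] = df1["A"] + "|" + df1["B"]
df2["AB"] = df2["A"] + "|" + df2["B"]

# Take only the key and the value column from df2, renamed to E.
right = df2[["AB", "C"]].rename(columns={"C": "E"})

# Left merge keeps every row of df1; the helper key is dropped afterwards.
merged = df1.merge(right, on="AB", how="left").drop(columns="AB")
print(merged)
```

This only works as written when both key columns hold strings. Merging directly on a list of columns, as in the other answer, avoids the helper column altogether.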

Solution 2:[2]

In my case it works:

import pandas as pd

df1 = pd.DataFrame({"A":["a","a","a"], 
                    "B":["a", "b", "c"], 
                    "C":["c", "e", "g"],
                    "D":["d", "f", "h"]})

df2 = pd.DataFrame({"A":["a", "a"], 
                    "B":["b", "c"], 
                    "C":["j", "k"]})

Output:

pd.merge(df1, df2, on=["A", "B"], how="left").rename(columns={"C_x":"C", "C_y":"E"})

    A   B   C   D   E
0   a   a   c   d   NaN
1   a   b   e   f   j
2   a   c   g   h   k
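Since the question says the (A, B) combination should be unique in the real data, it may be worth checking that before merging. The sketch below is one possible way, not the only one: `DataFrame.duplicated` reports repeated key pairs, and the `validate="one_to_one"` argument of `pd.merge` raises a `MergeError` if the keys are not unique on either side. The variable names `dup1`, `dup2` and `merged` are made up for the illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["a", "a", "a"],
                    "B": ["a", "b", "c"],
                    "C": ["c", "e", "g"],
                    "D": ["d", "f", "h"]})

df2 = pd.DataFrame({"A": ["a", "a"],
                    "B": ["b", "c"],
                    "C": ["j", "k"]})

# True if any (A, B) pair occurs more than once in a frame.
dup1 = df1.duplicated(subset=["A", "B"]).any()
dup2 = df2.duplicated(subset=["A", "B"]).any()
print(dup1, dup2)

# validate="one_to_one" makes pandas raise a MergeError
# if the key pair is not unique in both frames.
merged = pd.merge(df1, df2, on=["A", "B"], how="left",
                  validate="one_to_one")
merged = merged.rename(columns={"C_x": "C", "C_y": "E"})
print(merged)
```

If the real tables do contain repeated (A, B) pairs, the merge would otherwise silently multiply rows, so failing early is usually easier to debug.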

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 marc_s
Solution 2 Marco_CH