'Values from two columns appear as a single column in another dataframe

One dataframe looks like this:

Thing       Number1      Number2     
   43           STK         
   64           BOX          
   32           STK          BOX
   46           THG          
   34           BOX          THG
...

And another one like this:

Thing        Package       Value1      Value2
   43            STK            1           1
   64            BOX            2           1
   64            STK            1           1
   32            STK            1           1
   32            BOX            2           3
   42            THG            1           1
   34            BOX            1           1
   34            THG            1           3
   34            STK            2           1
...

I want to eliminate the rows in the second dataframe that do not appear in the first (like for 64 I only have BOX). I merged the two and eliminated the rows where the value in Package could not be found in either of Number1 or Number2, but is there any more efficient way of doing it?



Solution 1:[1]

I think your solution is efficient.

df = (df2.merge(df1, on='Thing')
        .query("Package == Number1 | Package == Number2")
        .drop(['Number1','Number2'], axis=1))
print (df)
   Thing Package  Value1  Value2
0     43     STK       1       1
1     64     BOX       2       1
3     32     STK       1       1
4     32     BOX       2       3
5     34     BOX       1       1
6     34     THG       1       3

Added alternative for multiple Number columns in df1.

Use DataFrame.merge with DataFrame.melt:

df = df2.merge(df1.melt('Thing', value_name='Package')[['Thing','Package']])
print (df)
   Thing Package  Value1  Value2
0     43     STK       1       1
1     64     BOX       2       1
2     32     STK       1       1
3     32     BOX       2       3
4     34     BOX       1       1
5     34     THG       1       3

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1