Comparing two lists of lists with a dataframe column in Python

I want to compare two lists of lists with a dataframe column.
list1 = [['r2','r4','r6'],['r6','r7']]
list2 = [['p4','p5','p8'],['p86','p21','p0','p94']]

Dataset:

rid pid value
r2 p0 banana
r2 p4 chocolate
r4 p89 apple
r6 p5 milk
r7 p0 bread

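Expected output:

[['chocolate','milk'],['bread']]

Since r2 and p4 occur in list1[0] and list2[0] respectively, and also appear together in the same row of the dataset, chocolate must be stored. Similarly, r6 and p5 occur in list1[0] and list2[0] and share a row in the dataset, so milk must be stored. Likewise, r7 and p0 occur in list1[1] and list2[1] and share a row, so bread goes into the second sublist.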



Solution 1:[1]

Answer

result = []
for l1, l2 in zip(list1, list2):
    # Keep rows whose rid is in l1 and whose pid is in l2, then collect their values.
    res = df.loc[df["rid"].isin(l1) & df["pid"].isin(l2), "value"].tolist()
    result.append(res)
print(result)
[['chocolate', 'milk'], ['bread']]

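Explanation

  • zip pairs up the sublists of list1 and list2 by position, which is equivalent to
for i in range(len(list1)):
    l1 = list1[i]
    l2 = list2[i]
  • df["rid"].isin(l1) & df["pid"].isin(l2) combines the two conditions element-wise with the & (and) operator, so a row is kept only when its rid is in l1 and its pid is in l2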
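To make the mask step concrete, here is a minimal sketch that builds the two boolean Series separately for the first pair of sublists. The DataFrame is rebuilt from the question's data; the names rid_mask, pid_mask, both and matched are only illustrative and do not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "rid": ["r2", "r2", "r4", "r6", "r7"],
    "pid": ["p0", "p4", "p89", "p5", "p0"],
    "value": ["banana", "chocolate", "apple", "milk", "bread"],
})
l1 = ["r2", "r4", "r6"]  # list1[0]
l2 = ["p4", "p5", "p8"]  # list2[0]

rid_mask = df["rid"].isin(l1)  # True where the row's rid is one of l1
pid_mask = df["pid"].isin(l2)  # True where the row's pid is one of l2
both = rid_mask & pid_mask     # element-wise AND of the two boolean Series

matched = df.loc[both, "value"].tolist()
print(rid_mask.tolist())  # [True, True, True, True, False]
print(pid_mask.tolist())  # [False, True, False, True, False]
print(matched)            # ['chocolate', 'milk']
```

Only the rows where both masks are True survive, which is why banana (rid matches, pid does not) is dropped.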

Attention

  • list1 and list2 must have the same length; otherwise, zip stops at the end of the shorter list and ignores the remaining elements of the longer one.
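The truncation can be seen with a small pure-Python sketch. The third sublist in list1 and the helper zip_checked are hypothetical additions for illustration, not part of the original answer.

```python
# list1 has three sublists (the last one is a made-up extra), list2 only two.
list1 = [["r2", "r4", "r6"], ["r6", "r7"], ["r9"]]
list2 = [["p4", "p5", "p8"], ["p86", "p21", "p0", "p94"]]

pairs = list(zip(list1, list2))
print(len(pairs))  # 2, the unmatched ["r9"] is silently dropped


def zip_checked(a, b):
    """Pair up a and b, raising instead of truncating on a length mismatch."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} != {len(b)}")
    return list(zip(a, b))
```

If you would rather fail loudly than lose data, a guard like zip_checked is one option. On Python 3.10 and later, zip(list1, list2, strict=True) raises a ValueError for the same situation.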

Solution 2:[2]

You can do it as follows:

from itertools import product
import pandas as pd

df = pd.DataFrame({'rid': {0: 'r2', 1: 'r2', 2: 'r4', 3: 'r6', 4: 'r7'},
 'pid': {0: 'p0', 1: 'p4', 2: 'p89', 3: 'p5', 4: 'p0'},
 'value': {0: 'banana', 1: 'chocolate', 2: 'apple', 3: 'milk', 4: 'bread'}})
list1 = [['r2','r4','r6'],['r6','r7']]
list2 = [['p4','p5','p8'],['p86','p21','p0','p94']]

# Generate all possible associations.
associations = (product(l1, l2) for l1, l2 in zip(list1, list2))

# Index for speed and convenience of the lookup.
df = df.set_index(['rid', 'pid']).sort_index()

output = [[df.loc[assoc, 'value'] for assoc in assoc_list if assoc in df.index] 
          for assoc_list in associations]

print(output)
[['chocolate', 'milk'], ['bread']]
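To see what the product step generates, here is a rough sketch of the same idea with a plain dict standing in for the (rid, pid)-indexed DataFrame. The lookup dict and the name first_pairs are illustrative assumptions, and the dict only works because each (rid, pid) pair occurs once in the sample data.

```python
from itertools import product

# Plain dict keyed by (rid, pid); a stand-in for the indexed DataFrame.
lookup = {
    ("r2", "p0"): "banana",
    ("r2", "p4"): "chocolate",
    ("r4", "p89"): "apple",
    ("r6", "p5"): "milk",
    ("r7", "p0"): "bread",
}
list1 = [["r2", "r4", "r6"], ["r6", "r7"]]
list2 = [["p4", "p5", "p8"], ["p86", "p21", "p0", "p94"]]

# Every (rid, pid) combination for the first pair of sublists.
first_pairs = list(product(list1[0], list2[0]))
print(len(first_pairs))  # 9 candidate pairs (3 rids x 3 pids)

# Keep only the combinations that actually exist as a row.
output = [
    [lookup[pair] for pair in product(l1, l2) if pair in lookup]
    for l1, l2 in zip(list1, list2)
]
print(output)  # [['chocolate', 'milk'], ['bread']]
```

Note that results come out in product order (rids of l1 first, then pids of l2), whereas the isin approach in Solution 1 returns them in DataFrame row order. For this data the two orders happen to coincide.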

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 FavorMylikes
Solution 2