Python Pandas check dataframe groupby, how many people have the same book combinations
I have a list of people, each of whom is given at least 2 books (up to 4 are possible). I want to do a groupby and check the frequency of each combination of books received. For example, with columns [ID, books], if ID 1 has books A and B, I want to know how many people received the combination of A and B.
Technically, if someone has books A, B, C, they have the combinations (A,B), (A,C), (B,C), and (A,B,C).
Input:
import pandas as pd

df = pd.DataFrame({'user': [1, 1, 2, 2, 3, 3, 3],
                   'disease': ['a', 'b', 'b', 'c', 'a', 'b', 'c']})
Solution 1:[1]
You can use set operations.
Identify users with a given target combination:
target = {'a', 'b'}
df.groupby('user')['disease'].agg(lambda x: target.issubset(x))
Output:
user
1 True
2 False
3 True
Name: disease, dtype: bool
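If you need the matching user IDs and not just the flags, the boolean result can be used as a mask. This is a minimal sketch using the sample data from the question; the names `mask` and `matching_users` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'user': [1, 1, 2, 2, 3, 3, 3],
                   'disease': ['a', 'b', 'b', 'c', 'a', 'b', 'c']})

target = {'a', 'b'}

# One boolean per user: True when the user's items include every target item.
mask = df.groupby('user')['disease'].agg(lambda x: target.issubset(x))

# Keep only the index labels (user IDs) whose flag is True.
matching_users = mask[mask].index.tolist()
print(matching_users)  # [1, 3]
```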
Count the number of users that match the target:
target = {'a', 'b'}
df.groupby('user')['disease'].agg(lambda x: target.issubset(x)).sum()
Output: 2
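The approach above checks one target at a time. To get the frequency of every combination at once, as the question describes, one possible sketch is to enumerate each user's combinations with `itertools.combinations` and tally them with `collections.Counter`. This is not part of the original answer; the helper `count_combinations` and its parameters are hypothetical names, and it assumes each user has only a handful of items (the number of combinations grows exponentially with the item count).

```python
from collections import Counter
from itertools import combinations

import pandas as pd

df = pd.DataFrame({'user': [1, 1, 2, 2, 3, 3, 3],
                   'disease': ['a', 'b', 'b', 'c', 'a', 'b', 'c']})


def count_combinations(frame, key='user', item='disease', min_size=2):
    """Count how many users hold each combination of at least min_size items."""
    counts = Counter()
    for _, items in frame.groupby(key)[item]:
        # Deduplicate and sort so each combination has one canonical tuple form.
        unique_items = sorted(set(items))
        for size in range(min_size, len(unique_items) + 1):
            # Each user contributes at most 1 to any given combination.
            counts.update(combinations(unique_items, size))
    return counts


combo_counts = count_combinations(df)
for combo, n_users in sorted(combo_counts.items()):
    print(combo, n_users)
```

With the sample data, `combo_counts[('a', 'b')]` is 2, which agrees with the single-target count above.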
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | mozway |