'pandas dataframe groupby apply multi columns and get count

I have a excel like this:

year a b
2021 12 23
2021 31 0
2021 15 21
2021 14 0
2022 32 0
2022 24 15
2022 28 29
2022 33 0

I wanna get count of condition: a>=30 and b==0 group by year the final output like this:

2021 1

2022 2

I wanna use pandas dataframe to implement this, can anyone help? I'm quite new to python



Solution 1:[1]

For count matched rows chain both conditions by & for bitwise AND and aggregate sum, Trues are processing like 1 and False like 0:

df1 = ((df.a>=30) & (df.b==0)).astype(int)
           .groupby(df['year']).sum().reset_index(name='count')
print (df1)
   year  count
0  2021      1
1  2022      2

Similar idea with helper column:

df1 = (df.assign(count = ((df.a>=30) & (df.b==0)).astype(int))
         .groupby('year', as_index=False)['count']
         .sum())

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1