'Filter out entire group if all values in group are zero

Using pandas, I want to filter out all groups that contain only zero values

So in pseudo-code something like this

df.groupby('my_group')['values'].filter(all(iszero))

Example input dataframe could be something like this

df = pd.DataFrame({'my_group': ['A', 'B', 'C', 'D']*3, 'values': [0 if (x % 4 == 0 or x == 11) else random.random() for x in range(12)]})
   my_group    values
0         A  0.000000
1         B  0.286104
2         C  0.359804
3         D  0.596152
4         A  0.000000
5         B  0.560742
6         C  0.534575
7         D  0.251302
8         A  0.000000
9         B  0.445010
10        C  0.750434
11        D  0.000000

Here, group A contains all zero values, so it should be filtered out. Group D also has a zero value in row 11, but in other rows it has non-zero values, so it shouldn't be filtered out



Solution 1:[1]

IIUC use a condition to keep the rows. For this if any value in the group is not equal (ne) to zero, then keep the group:

df2 = df.groupby('my_group').filter(lambda g: g['values'].ne(0).any())

output:

   my_group    values
1         B  0.286104
2         C  0.359804
3         D  0.596152
5         B  0.560742
6         C  0.534575
7         D  0.251302
9         B  0.445010
10        C  0.750434
11        D  0.000000

Or to get only the indices:

idx = df.groupby('my_group')['values'].filter(lambda s: s.ne(0).any()).index

output: Int64Index([1, 2, 3, 5, 6, 7, 9, 10, 11], dtype='int64')

Solution 2:[2]

You can use:

>>> df[df.groupby('my_group')['values'].transform('any')]
   my_group    values
1         B  0.507089
2         C  0.846842
3         D  0.953003
5         B  0.085316
6         C  0.482732
7         D  0.764508
9         B  0.879005
10        C  0.717571
11        D  0.000000

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mozway
Solution 2 Corralien