How to simplify a pandas dataframe based on a threshold value

Here's my dataframe:

      A       B        C        D
1     0    0.41     0.35     0.61
2     0    0.41     0.35        0
3     0    0.21        0        0
4  0.11     0.4     0.53        0

I want to display only the rows and columns that contain at least one value greater than 0.5, like this:

       C        D
1    0.35     0.61
4    0.53        0

How should I do that?



Solution 1:[1]

Use DataFrame.gt to test for values greater than the threshold, DataFrame.any to check whether at least one value matches along each axis, and then filter with DataFrame.loc:

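m = df.gt(0.5)  # boolean mask: True where a value is greater than 0.5
df1 = df.loc[m.any(axis=1), m.any()]  # keep rows and columns with at least one True
print(df1)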
     C     D
1  0.35  0.61
4  0.53  0.00
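For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the question and applies the same mask. The index labels 1 to 4 and the float values are read off the printed table, so treat the construction as an assumption about how the original df was built.

```python
import pandas as pd

# Rebuild the question's frame; index labels 1..4 are taken from the printed table.
df = pd.DataFrame(
    {
        "A": [0, 0, 0, 0.11],
        "B": [0.41, 0.41, 0.21, 0.4],
        "C": [0.35, 0.35, 0, 0.53],
        "D": [0.61, 0, 0, 0],
    },
    index=[1, 2, 3, 4],
)

m = df.gt(0.5)                         # True where a cell is greater than 0.5
df1 = df.loc[m.any(axis=1), m.any()]   # rows with a hit, then columns with a hit
print(df1)
```

Note that gt is a strict comparison, so a cell equal to exactly 0.5 would not count; DataFrame.ge would include it.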

Solution 2:[2]

You can use double boolean indexing. Compute a boolean mask with gt, then check whether any value is True along each axis and use the two results to select rows and columns with loc:

m = df.gt(0.5)
# pass axis by keyword: recent pandas versions no longer accept it positionally
df.loc[m.any(axis=1), m.any(axis=0)]

output:

      C     D
1  0.35  0.61
4  0.53  0.00

Intermediates:

m:

       A      B      C      D
1  False  False  False   True
2  False  False  False  False
3  False  False  False  False
4  False  False   True  False

m.any(axis=1):

1     True
2    False
3    False
4     True
dtype: bool

m.any(axis=0):

A    False
B    False
C     True
D     True
dtype: bool
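If the threshold needs to vary, the same two-mask selection can be wrapped in a small function. This is only a sketch: the name filter_above and its signature are hypothetical (not part of pandas), and the frame is rebuilt from the question's printed table.

```python
import pandas as pd


def filter_above(df: pd.DataFrame, threshold: float) -> pd.DataFrame:
    """Keep rows and columns that hold at least one value > threshold.

    Hypothetical helper for illustration; not a pandas API.
    """
    mask = df.gt(threshold)
    return df.loc[mask.any(axis=1), mask.any(axis=0)]


# Frame rebuilt from the question's printed table (index labels 1..4 assumed).
df = pd.DataFrame(
    {
        "A": [0, 0, 0, 0.11],
        "B": [0.41, 0.41, 0.21, 0.4],
        "C": [0.35, 0.35, 0, 0.53],
        "D": [0.61, 0, 0, 0],
    },
    index=[1, 2, 3, 4],
)

at_half = filter_above(df, 0.5)   # rows 1 and 4, columns C and D
stricter = filter_above(df, 0.6)  # only row 1, column D survives
nothing = filter_above(df, 1.0)   # no cell qualifies, so the result is empty
print(at_half)
```

When no cell exceeds the threshold, both masks are all False and the result is an empty frame, which callers may want to check for.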

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 jezrael
Solution 2