Python DataFrame: delete rows after comparing multiple column values with a value

I have a data frame with many columns of float values. I want to delete a row if any of its columns has a value below 20.

code:

import numpy as np
import pandas as pd

xdf = pd.DataFrame({'A': np.random.uniform(low=-50, high=53.3, size=5),
                    'B': np.random.uniform(low=10, high=130, size=5),
                    'C': np.random.uniform(low=-50, high=130, size=5),
                    'D': np.random.uniform(low=-100, high=200, size=5)})

xdf =  
           A          B           C           D
0  -9.270533  42.098425   91.125009  148.350655
1  17.771411  55.564825  106.396381  -89.082831
2 -22.602563  99.330643   17.590466   73.985202
3  15.890920  76.011631   52.366311  194.023063
4  35.202379  41.973846   32.576890  100.523902

# my code
cols = xdf.columns  # compare every column
xdf[xdf[cols].ge(20).all(axis=1)]

Out[17]: 
           A          B         C           D
4  35.202379  41.973846  32.57689  100.523902

Expected output: drop a row if any column has a value below 20

xdf =  
           A          B           C           D
4  35.202379  41.973846   32.576890  100.523902 

Is this the best way of doing it?



Solution 1:[1]

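NumPy works on a plain array without pandas' index handling, so it is often faster for purely numeric comparisons. Build the array with one variable per column, so that axis=1 checks across each row, and try this:

a = np.column_stack([np.random.uniform(low=-50, high=53.3, size=5),
    np.random.uniform(low=10, high=130, size=5),
    np.random.uniform(low=-50, high=130, size=5),
    np.random.uniform(low=-100, high=200, size=5)])

# keep only the rows in which every value is at least 20
print(a[np.all(a >= 20, axis=1)])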

If you want to stick with pandas, another idea would be:

xdfFiltered = xdf.loc[(xdf["A"] >= 20) & (xdf["B"] >= 20) & (xdf["C"] >= 20) & (xdf["D"] >= 20)]
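
As a rough check that the pandas and NumPy routes agree, here is a minimal sketch that uses the values printed in the question instead of random numbers, so the result is reproducible. The names threshold, kept_pandas and kept_numpy are made up for this illustration, and "below 20" is read as "strictly less than 20", so a value of exactly 20 is kept.

```python
import numpy as np
import pandas as pd

# Values copied from the frame printed in the question.
xdf = pd.DataFrame({
    "A": [-9.270533, 17.771411, -22.602563, 15.890920, 35.202379],
    "B": [42.098425, 55.564825, 99.330643, 76.011631, 41.973846],
    "C": [91.125009, 106.396381, 17.590466, 52.366311, 32.576890],
    "D": [148.350655, -89.082831, 73.985202, 194.023063, 100.523902],
})

threshold = 20  # assumed cut-off: rows with any value below this are dropped

# Column-agnostic pandas filter: keep rows where every column is >= threshold.
kept_pandas = xdf[xdf.ge(threshold).all(axis=1)]

# Same mask computed on the underlying NumPy array, then applied to the frame.
mask = (xdf.to_numpy() >= threshold).all(axis=1)
kept_numpy = xdf[mask]

print(kept_pandas)
```

Both versions avoid listing the columns by name, which matters once the frame has more than a handful of columns.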

Solution 2:[2]

You can use the numpy equivalent of .ge instead:

xdf.loc[np.greater_equal(xdf, 20).all(axis=1)]
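
One detail that may matter if the frame can contain missing values: "keep rows where every value is at least 20" and "drop rows where any value is below 20" are not the same once NaN is involved, because every comparison with NaN is False. The sketch below uses a small made-up frame (df, keep_all_ge and keep_no_lt are illustrative names, not from the question) to show the difference.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 has a missing value, row 2 has a value below 20.
df = pd.DataFrame({
    "A": [35.0, 25.0, 10.0],
    "B": [42.0, np.nan, 50.0],
})

# NaN >= 20 is False, so the row with the missing value is dropped too.
keep_all_ge = df[df.ge(20).all(axis=1)]

# NaN < 20 is also False, so the row with the missing value survives.
keep_no_lt = df[~df.lt(20).any(axis=1)]

print(keep_all_ge)
print(keep_no_lt)
```

With no missing values the two masks select the same rows, so for the data in the question either form works.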

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Daniel Seger
Solution 2