How to remove columns with more than 90% zeros in a pandas DataFrame

I can't find the problem in my code. The task is to remove every column in which more than 90% of the values are zero. Here is the code:

num_vars = data.select_dtypes(include=['float64', 'int64'])
num_vars.shape 
(1904, 500)
# Removing variables with >90% 0's
for i in num_vars.columns:
    if ((len(num_vars[i].loc[num_vars[i]==0])/len(num_vars))>0.9): #checking if 90% data is zero
        num_vars.drop(i,axis=1,inplace=True) #delete the column
num_vars.shape 
(1904, 500)

As shown above, num_vars.shape is unchanged even after running the loop that should drop the columns with more than 90% zeros. I'm not sure where the problem is. Please advise.



Solution 1:[1]

This should do the job:

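# Fraction of zeros in each column; True marks the columns to keep.
# <= keeps a column with exactly 90% zeros, since only columns with more than 90% should go.
mask = (num_vars == 0).sum() / len(num_vars) <= 0.9
new_num_vars = num_vars[num_vars.columns[mask]]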
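
To see the masking approach on something small enough to check by hand, here is a minimal sketch. The toy DataFrame, its column names, and the variable names `zero_fraction`, `keep`, and `filtered` are made up for illustration and are not from the question. It uses `.mean()` on the boolean frame, which is equivalent to dividing the per-column sum by the number of rows.

```python
import pandas as pd

# Hypothetical toy data (not from the question): 20 rows, 4 columns
df = pd.DataFrame({
    "a": range(1, 21),         # no zeros
    "b": [0] * 20,             # 100% zeros
    "c": [0] * 19 + [5],       # 95% zeros
    "d": [0] * 10 + [1] * 10,  # 50% zeros
})

# (df == 0) is a boolean frame; its column-wise mean is the fraction of zeros
zero_fraction = (df == 0).mean()

# Keep columns where at most 90% of the values are zero
keep = zero_fraction <= 0.9
filtered = df.loc[:, keep]

print(zero_fraction)
print(filtered.columns.tolist())  # ['a', 'd']
```

Printing `zero_fraction` before filtering also helps when the shape does not change: it shows directly whether any column actually crosses the threshold. Note that missing values (NaN) compare unequal to 0, so they are not counted as zeros here.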

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 D.Manasreh