Filter dataframe rows based on column values percentile
I have a pandas dataframe df like this:
ID | Weight | A |
---|---|---|
a | 0.15 | 1 |
a | 0.25 | 3 |
a | 0.02 | 2 |
a | 0.07 | 3 |
b | 0.01 | 1 |
b | 0.025 | 5 |
b | 0.07 | 7 |
b | 0.06 | 4 |
b | 0.12 | 2 |
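I want to remove rows per ID so that the rows I keep account for at least a given share of that ID's total weight (80% here, but it can vary). Within each ID, rows are taken from the largest weight down, and as soon as adding a row pushes the cumulative share past 80%, the rest of the rows with the same ID are removed. For example, df['ID'] == 'a' has four rows with a total weight of 0.49. Keeping only the rows with weights 0.25 and 0.15 gives 0.40 / 0.49 = 81.6%, which crosses 80%, so the other two rows are dropped.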
After the operation, df should look like this:
ID | Weight | A |
---|---|---|
a | 0.15 | 1 |
a | 0.25 | 3 |
b | 0.07 | 7 |
b | 0.06 | 4 |
b | 0.12 | 2 |
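
One way to get this result is sketched below. It is not from the original post: the function name `keep_top_weight_share` and the `threshold` parameter are made up for illustration, and it assumes that rows are ranked by descending Weight within each ID (as the expected output implies) and that the dataframe index is unique. A row is kept when the share accumulated by the heavier rows before it is still below the threshold, so the row that crosses the threshold is included.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID": ["a", "a", "a", "a", "b", "b", "b", "b", "b"],
    "Weight": [0.15, 0.25, 0.02, 0.07, 0.01, 0.025, 0.07, 0.06, 0.12],
    "A": [1, 3, 2, 3, 1, 5, 7, 4, 2],
})


def keep_top_weight_share(frame, threshold=0.8):
    """Keep, per ID, the heaviest rows until their share of the ID's
    total Weight reaches `threshold` (hypothetical helper, assumes a
    unique index)."""
    # Rank rows from heaviest to lightest; mergesort is stable, so ties
    # keep their original relative order.
    ordered = frame.sort_values("Weight", ascending=False, kind="mergesort")
    weights = ordered.groupby("ID")["Weight"]
    total = weights.transform("sum")
    # Share of the group's weight accumulated before each row is added.
    share_before = (weights.cumsum() - ordered["Weight"]) / total
    # Keep a row while the threshold has not been reached yet; this
    # includes the row that pushes the cumulative share past it.
    keep = share_before < threshold
    # Realign the mask to the original row order before filtering.
    return frame[keep.reindex(frame.index)]


result = keep_top_weight_share(df, threshold=0.8)
print(result)
```

On the sample data this keeps the rows with weights 0.15 and 0.25 for ID a and 0.07, 0.06 and 0.12 for ID b, in their original order, which matches the table above. Changing `threshold` changes the share of weight to retain.
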
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow