How to identify the top 30% of salaries in a Python dataframe
I have a dataframe as follows:
| Employee | Salary |
|---|---|
| Tony | 50 |
| Alan | 45 |
| Lee | 60 |
| David | 35 |
| Steve | 65 |
| Paul | 48 |
| Micky | 62 |
| George | 80 |
| Nigel | 64 |
| John | 42 |
The task is to add a new column that labels each employee by salary:
- The top 30% get the value "high"
- The next 40% get "average"
- The rest get "low"
Identifying the top N would be easy, but I don't understand how to handle a percentage such as the top 30%. Can anyone help me with Python code for this?
Solution 1:[1]
A percentage only expresses a proportion, so the number of people it covers depends on how many people are in your list.
Therefore, the top 30% can be translated into a number of people.
Assume your data has N employees. Taking the top 30% of salaries is the same as taking the 30 x N / 100 people with the highest salaries. With the 10 employees above, that is 3 people.
If you sort your data by salary in descending order, the only thing left to do is set "high" for the first 30 x N / 100 people, "average" for the next 40 x N / 100, and "low" for the rest.
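The steps above could be sketched in pandas roughly as follows. This is only one possible way to do it, not necessarily what the answer had in mind. The column name `Level`, rounding the group sizes down with integer division, and breaking ties by row order are all assumptions made for this example.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Employee": ["Tony", "Alan", "Lee", "David", "Steve",
                 "Paul", "Micky", "George", "Nigel", "John"],
    "Salary": [50, 45, 60, 35, 65, 48, 62, 80, 64, 42],
})

n = len(df)
# Turn the percentages into head counts (assumption: round down).
n_high = n * 30 // 100     # 3 of 10
n_average = n * 40 // 100  # 4 of 10

# Rank 1 is the highest salary. method="first" gives every row a
# distinct rank, so equal salaries are split by row order (assumption).
rank = df["Salary"].rank(method="first", ascending=False)

# "Level" is a hypothetical name for the new column.
df["Level"] = "low"
df.loc[rank <= n_high + n_average, "Level"] = "average"
df.loc[rank <= n_high, "Level"] = "high"

print(df.sort_values("Salary", ascending=False))
```

Ranking instead of physically sorting keeps the original row order of the dataframe. Because every row gets a distinct rank, the group sizes are exactly the computed counts even when two employees have the same salary.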
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Maxime Bouhadana |
