'label rows in top 20% sql
Assuming this sample data in a sql table:
people income
Mary 100
Joe 60
Sam 70
Bob 85
Sarah 85
I am trying to create a column which will identify each row as within the top 20% of income. So, the expected output is:
people income performers
Mary 100 Y
Joe 60 N
Sam 70 N
Bob 85 Y
Sarah 85 Y
I have tried TOP (20) PERCENT and a CASE statement, no luck. Just tried even extracting the top 20% and kept getting errors as well.
In Pandas, I got this, no problem. In SQL, it's like underwater brain surgery using oven mitts.
Solution 1:[1]
If you're looking to compare each record to the max value of income, you can use a CASE expression to compare the current record:
SELECT people, income,
CASE
WHEN income >= (SELECT MAX(income) FROM Foo) * 0.8
THEN 'Y' ELSE 'N'
END AS performers
FROM Foo;
people income performer
Mary 100 Y
Joe 60 N
Sam 70 N
Bob 85 Y
Sarah 85 Y
Solution 2:[2]
You can use a case statement
select
people,
income,
case when income >= (select max(income) from Table) * .8 then 'Yes' else 'No' end as performer
from Table
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 |
