Select Value from largest index for each year [duplicate]
I'm new to Python and working through a problem where I need to pull the value at the largest Index for each Year. Here is an example table:
Year | Index | D_Value |
---|---|---|
2010 | 13 | 85 |
2010 | 14 | 92 |
2010 | 15 | 76 |
2011 | 9 | 68 |
2011 | 10 | 73 |
2012 | 100 | 94 |
2012 | 101 | 89 |
So, the desired output would look like this:
Year | Index | D_Value |
---|---|---|
2010 | 15 | 76 |
2011 | 10 | 73 |
2012 | 101 | 89 |
I've tried researching how to apply max() and .loc, but I'm not sure what the best approach is for this scenario. Any help would be greatly appreciated. The code below generates the test table.
import pandas as pd
data = {'Year':[2010,2010,2010,2011,2011,2012,2012],'Index':[13,14,15,9,10,100,101],'D_Value':[85,92,76,68,73,94,89]}
df = pd.DataFrame(data)
print(df)
Solution 1:[1]
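You can combine groupby with rank: rank the Index values within each Year in descending order, then keep the rows ranked first. Note that this adds a helper Rank column to df.
# Rank Index within each Year, largest first; method='min' keeps tied maxima at rank 1
df['Rank'] = df.groupby(by='Year')['Index'].rank(ascending=False, method='min')
# Keep only the top-ranked row(s) for each Year
print(df[df['Rank'] == 1])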
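Since the question mentions max() and .loc, here is an alternative sketch that is not part of the original answer: groupby with idxmax finds the row label of the largest Index in each Year, and .loc then selects those rows. The variable name top_rows is only illustrative, and the approach assumes the DataFrame has unique row labels (as the default RangeIndex here does).

```python
import pandas as pd

# Test table from the question
data = {
    'Year': [2010, 2010, 2010, 2011, 2011, 2012, 2012],
    'Index': [13, 14, 15, 9, 10, 100, 101],
    'D_Value': [85, 92, 76, 68, 73, 94, 89],
}
df = pd.DataFrame(data)

# For each Year, idxmax returns the row label where Index is largest
labels = df.groupby('Year')['Index'].idxmax()

# Select those rows; top_rows is an illustrative name, not from the original post
top_rows = df.loc[labels]
print(top_rows)
```

Unlike the rank approach, this does not add a helper column, and it returns exactly one row per Year even if the largest Index is tied (the first occurrence is kept).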
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | dkantor |