Pandas groupby max not returning max value for some columns
The program is:
import numpy as np
import pandas as pd
p = {
    'item': ['apple', 'apple', 'orange', 'orange', 'guns', 'guns', 'guns'],
    'Days': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
    'sales': [100, 80, 200, 100, 5, 10, 5],
}
df = pd.DataFrame(p)
print(df)
x = df.groupby('item')
print(x.max())
But the output is:

The highest sales for guns occurred on Sat, so why does pandas show Sun?
Solution 1:[1]
max, when called on a groupby, computes the max of each column independently. So 10 is the largest of [5, 10, 5], and Sun is the largest (in string sort order) of ['Fri', 'Sat', 'Sun']. The two values do not have to come from the same row.
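To see the per-column behaviour in isolation, here is a minimal sketch that rebuilds the question's data and reduces the two columns of the guns group separately. The variable names `guns`, `max_day`, and `max_sales` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'item': ['apple', 'apple', 'orange', 'orange', 'guns', 'guns', 'guns'],
    'Days': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
    'sales': [100, 80, 200, 100, 5, 10, 5],
})

# Look only at the rows for one group.
guns = df[df['item'] == 'guns']

# Each column is reduced on its own, so the results can come from different rows.
max_day = guns['Days'].max()     # string comparison: 'Sun' > 'Sat' > 'Fri'
max_sales = guns['sales'].max()  # numeric comparison: 10 > 5

print(max_day, max_sales)  # Sun 10
```

The row with sales 10 has Days 'Sat', but 'Sun' wins the string comparison, which is why the grouped max pairs Sun with 10.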
I think you want to use idxmax and .loc:
filtered = df.loc[df.groupby('item')['sales'].idxmax()]
Output:
     item Days  sales
0   apple  Mon    100
5    guns  Sat     10
2  orange  Wed    200
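The one-liner above can be split into two steps to show what idxmax returns. This is only a sketch on the question's data, and `best_rows` is a name chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'item': ['apple', 'apple', 'orange', 'orange', 'guns', 'guns', 'guns'],
    'Days': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
    'sales': [100, 80, 200, 100, 5, 10, 5],
})

# Step 1: for each item, get the index label of the row with the highest sales.
best_rows = df.groupby('item')['sales'].idxmax()

# Step 2: select those whole rows, so Days and sales stay paired.
filtered = df.loc[best_rows]

print(filtered['Days'].tolist())  # ['Mon', 'Sat', 'Wed']
```

If several rows in a group tie for the highest sales, idxmax returns the label of the first one, so each group still yields exactly one row.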
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | richardec |
