'check to see if first row of a group consist of said groups greatest value in pandas and subset data as such

So let's say I have the following df

pd.DataFrame(
    {
        'store':['a','a','a','a','b','b','b','b','c','c','c','c'],
        'avg_sales':[100,20,30,25,50,60,70,80,90,60,40,50],
        'month':[0,1,2,3,0,1,2,3,0,1,2,3]
     
     })

    store   avg_sales   month
0   a   100 0
1   a   20  1
2   a   30  2
3   a   25  3
4   b   50  0
5   b   60  1
6   b   70  2
7   b   80  3
8   c   90  0
9   c   60  1
10  c   40  2
11  c   50  3

What I am trying to do is find all stores where their first month (month==0) is the greatest avg_sales value. Thus, my desired output is to just keep store a and c since their stores had the greatest avg_sale on month==0:

    store   avg_sales   month
0   a   100 0
1   a   20  1
2   a   30  2
3   a   25  3
8   c   90  0
9   c   60  1
10  c   40  2
11  c   50  3

I actually started out with a datetime for each month and used the following code to get the cumcounts:

df['month'] = df.sort_values(by=['store','date']).groupby('store').cumcount()

Now I just need to figure out the logic that only keeps stores where their month==0 is their greatest avg_sales number. Obviously you can do something simple like .loc[], but it seems like you need to do another subset within the loc to find out if month==0 is actually the greatest sales number for each group? I'm a bit confused. I could just use np.where() but that wouldn't account for the multiple groups out-of-the-box...

I can use the following code to make a column that consists of the max avg_sales number for each store:

df['max_sales'] = df.groupby('store').avg_sales.transform('max')


store   avg_sales   month   max_sales
0   a   100 0   100
1   a   20  1   100
2   a   30  2   100
3   a   25  3   100
4   b   50  0   80
5   b   60  1   80
6   b   70  2   80
7   b   80  3   80
8   c   90  0   90
9   c   60  1   90
10  c   40  2   90
11  c   50  3   90

Now I am stuck on the last step of the logic.

EDIT: just fixing grammar

python pandas

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source

'check to see if first row of a group consist of said groups greatest value in pandas and subset data as such

Sources

Related Questions