'How to group one column and rank values by grouped totals of another column

Situation

I have a dataframe of tweets pulled from the Twitter API. Each tweet has an author_id parameter, and a retweets and an engagements parameter.

I'd like to group the tweets by author_id, as some authors have posted more than one tweet, and then rank them by the total number of engagements from the sum of each author's tweets.

Example Dataframe

index text author_id retweets engagement
2022-03-20 Max Muncy is great! 100 3 2
2022-03-21 The worst Dodger is Max muncy. 101 1 8
2022-03-24 Max Muncy was great, but not so much now. 101 1 4
2022-03-28 What a fantastic guy, that Max muncy. 102 4 5
2022-03-31 Max Muncy, my vote for MVP. 103 8 17

What I tried

df01a.groupby(['author_id']).sum()

That returned this:

author_id text retweets engagement
100 Max Muncy is great! 3 2
101 The worst Dodger is Max muncy.Max Muncy was great, but not so much now. 11 84
102 What a fantastic guy, that Max muncy. 4 5
103 Max Muncy, my vote for MVP. 8 17

Assessment

The first thing I notice is that the retweet and engagement figures are concatenating instead of summing, and I've not been able to sort out why, as the sum method seems pretty straight-forward.

Thank you for any input on what I might be doing wrong here.

UPDATE - 4/5

Just noticed the note on datatypes so posting that here:

text          object
author_id     object
retweets      object
engagement    object
dtype: object


Solution 1:[1]

First make sure the engagement column is numerical and not a string:

df['engagement'] = pd.to_numeric(df['engagement'])

From what you are saying you do not need to use groupby here. Simply sort by author_id and engagment.

df.sort_values(['author_id', 'engagement'], ascending=[True, True])

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 le_camerone