Efficient way to get the N largest values of a column

I need to get the w highest values of a column, grouping by country.

The code below is working:

w = 100
df.groupby('country').apply(lambda x: x.sort_values('x', ascending=False).head(w))

Is there a way to make this code more efficient? My dataset is huge, about 30 million rows.

Solution 1:^[1]

w = 100
df.groupby('country')['x'].nlargest(w)

According to the documentation for Series.nlargest:

Faster than .sort_values(ascending=False).head(n) for small n relative to the size of the Series object.

Since your w = 100 is small relative to 30 million rows, nlargest should be faster, as long as each country's group is itself much larger than w.
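For reference, here is a minimal sketch of the nlargest approach on made-up data. The country codes, the row count, and the small w are assumptions chosen only for illustration; the column names 'country' and 'x' follow the question. It also shows one possible way to get the full rows back, which assumes the DataFrame index is unique.

```python
import numpy as np
import pandas as pd

# Made-up sample data (assumption): 1000 rows spread over three countries.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "country": rng.choice(["BR", "US", "DE"], size=1000),
    "x": rng.random(1000),
})

w = 5  # kept small here; the question uses w = 100

# Select the column first: nlargest is a SeriesGroupBy method.
# The result is a Series indexed by (country, original row label).
top_values = df.groupby("country")["x"].nlargest(w)

# Optional: recover the full rows through the original row labels,
# which sit in the last level of the result's index.
top_rows = df.loc[top_values.index.get_level_values(-1)]

print(top_values)
print(top_rows)
```

Note that nlargest returns only the values of x, so the df.loc step is needed only if you also want the other columns.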

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Solution	Source
Solution 1	Ynjxsjmh