Pandas GroupBy error occurs only in large dataset

I use the following code to select the rows with the max value in each group:

set_f = set.loc[set.reset_index().groupby(['Scan Number'])['dda246displmils'].idxmax()]

This works perfectly fine with a dataset of ~1M rows, but I get this error when I try to group 38M rows:

KeyError: 'Passing list-likes to .loc or [] with any missing labels is no longer supported, see https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike'

What is the reason? Is there any other option for a bigger dataset?

Thanks, Paulina



Solution 1:[1]

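The problem is that idxmax() returns labels from the new index created by reset_index() (0, 1, 2, ...), but .loc is then applied to the original set, which still has its old index. If any of those new labels is missing from the original index, .loc raises this KeyError. With the smaller dataset the labels presumably all happened to exist in the original index, so no error was raised.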

The solution is to assign the result of reset_index() to a variable first, then use that same DataFrame for both groupby and loc:

# Reset the index once, so that groupby/idxmax and .loc see the same labels
df = set.reset_index()
set_f = df.loc[df.groupby(['Scan Number'])['dda246displmils'].idxmax()]
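To make the mismatch easier to see, here is a minimal sketch on toy data. The values and the index labels 10 to 13 are made up for illustration, and the frame is called df_orig (a hypothetical name) instead of set so that it does not shadow the Python built-in. It assumes a pandas version in which .loc raises KeyError for missing labels, as in the question.

```python
import pandas as pd

# Hypothetical toy data; the index labels deliberately differ from 0..n-1.
df_orig = pd.DataFrame(
    {
        "Scan Number": [1, 1, 2, 2],
        "dda246displmils": [0.5, 0.9, 0.3, 0.1],
    },
    index=[10, 11, 12, 13],
)

# The pattern from the question: idxmax() runs on the reset frame and returns
# the labels 1 and 2, which do not exist in df_orig.index, so .loc fails.
try:
    df_orig.loc[df_orig.reset_index().groupby("Scan Number")["dda246displmils"].idxmax()]
    mismatch_raised = False
except KeyError:
    mismatch_raised = True

# The fix from the answer: group and select on the same, reset frame.
df = df_orig.reset_index()
best = df.loc[df.groupby("Scan Number")["dda246displmils"].idxmax()]

# Possible alternative, assuming the original index labels are unique:
# skip reset_index() entirely, so idxmax() returns labels of df_orig itself.
best_orig = df_orig.loc[df_orig.groupby("Scan Number")["dda246displmils"].idxmax()]
```

Both best and best_orig pick the same rows here. The variant without reset_index() is only safe when the original index has no duplicate labels; otherwise .loc may return more than one row per group.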

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 jezrael