Pandas GroupBy error occurs only in large dataset
I use the following code to select the rows with the max value in each group:
set_f = set.loc[set.reset_index().groupby(['Scan Number'])['dda246displmils'].idxmax()]
This works perfectly fine with a dataset of ~1M rows, but I get this error when I try to group 38M rows:
KeyError: 'Passing list-likes to .loc or [] with any missing labels is no longer supported, see https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike'
What is the reason? Is there another option for a bigger dataset?
Thanks, Paulina
Solution 1:[1]
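The problem is that idxmax() returns index labels from the new frame created by reset_index(), but you then use those labels to select rows from the original set, whose index has different labels, so .loc raises the error.
The solution is to assign the result of reset_index() to a variable and use it for both the groupby and the loc: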
df = set.reset_index()
set_f = df.loc[df.groupby(['Scan Number'])['dda246displmils'].idxmax()]
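To illustrate, here is a minimal sketch on made-up data (the values and the index labels 10, 20, 30, 40 are assumptions for the demo, as is the name df_raw). It suggests one plausible reason the error shows up only on the larger dataset: if the smaller frame happens to have a default 0..n-1 index, the labels returned after reset_index() coincide with the original ones, while a frame with a non-default index (for example after filtering or concatenation) does not have them. The last statement shows a possible alternative using sort_values and drop_duplicates, which is not part of the original answer.

```python
import pandas as pd

# Hypothetical toy data; the index is deliberately NOT the default 0..n-1
df_raw = pd.DataFrame(
    {
        "Scan Number": [1, 1, 2, 2],
        "dda246displmils": [0.5, 0.9, 0.3, 0.1],
    },
    index=[10, 20, 30, 40],
)

# Original pattern: labels come from the reset frame (1 and 2),
# but .loc looks them up in the original index (10, 20, 30, 40)
try:
    df_raw.loc[df_raw.reset_index().groupby("Scan Number")["dda246displmils"].idxmax()]
    raised = False
except KeyError:
    raised = True

# Fix from the answer: use the same reset frame for groupby and for .loc
df = df_raw.reset_index()  # the old index is kept as a column named "index"
result = df.loc[df.groupby("Scan Number")["dda246displmils"].idxmax()]

# Possible alternative (an assumption, not from the answer): sort descending
# and keep the first row per group; original index labels are preserved
alt = (
    df_raw.sort_values("dda246displmils", ascending=False)
    .drop_duplicates("Scan Number")
    .sort_index()
)
```

If the original index is unique, calling groupby directly on the original frame without reset_index() should also keep the labels consistent.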
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | jezrael |
