'How can I grab rows with max date from Pandas dataframe?

I have a Pandas dataframe that looks like this:

and I want to grab for each distinct ID, the row with the max date so that my final results looks something like this:

My date column is of data type 'object'. I have tried grouping and then trying to grab the max like the following:

idx = df.groupby(['ID','Item'])['date'].transform(max) == df_Trans['date']
df_new = df[idx]

However I am unable to get the desired result.

python pandas

Solution 1:^[1]

The last bit of code from piRSquared's answer is wrong.

We are trying to get distinct IDs, so the column used in drop_duplicates should be 'ID'. keep='last' would then retrieve the last (and max) date for each ID.

df.sort_values(['ID', 'date']).drop_duplicates('ID', keep='last')

Solution 2:^[2]

My answer is a generalization of piRSquared's answer:

manykey indicates the keys from which the mapping is desired (many-to)
onekey indicates the keys to which the mapping is desired (-to-one)

sortkey is sortable key and it follows asc set to True (as python standard)

def get_last(df:pd.DataFrame,manykey:list[str],onekey:list[str],sortkey,asc=True):
     return df.sort_values(sortkey,asc).drop_duplicates(subset=manykey, keep='last')[manykey+onekey]

In your case the answer should be:

get_last(df,["id"],["item"],"date")

Note that I am using the onekey explicitly because I want to drop the rest of the keys (if they are in the table) and create a mapping.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Nimantha
Solution 2	Nimantha

'How can I grab rows with max date from Pandas dataframe?

Solution 1:[1]

Solution 2:[2]

Sources

Related Questions

Solution 1:^[1]

Solution 2:^[2]