How do I get the maximum value of a column in pandas for a given value in another column? E.g., what is the max date per category?

I have a CSV file with a date column and a category column, and I want to create a second table showing only the maximum value for each category for each day. I can get the maximum in a single line, but that is the maximum for the entire dataset, not per category per day, and I need a new pandas DataFrame to store all the results.

Example below:

```python
import pandas as pd

dict1 = {'Category': ['A', 'A', 'A',
                      'B', 'B', 'B',
                      'B',
                      'A', 'A', 'A',
                      'B', 'B', 'B',
                      'B'],

         'Date': ['2018-01-02', '2018-01-02', '2018-01-02', '2018-01-02', '2018-01-02', '2018-01-02', '2018-01-02',
                  '2018-01-03', '2018-01-03', '2018-01-03', '2018-01-03', '2018-01-03', '2018-01-03', '2018-01-03'],

         'Ending Time': ['2018-01-02 20:51:54', '2018-01-02 20:58:54', '2018-01-02 21:01:02', '2018-01-02 22:01:02',
                         '2018-01-02 21:01:02', '2018-01-02 22:01:02', '2018-01-02 23:01:02',
                         '2018-01-03 12:01:02', '2018-01-03 13:01:02', '2018-01-03 15:22:02', '2018-01-03 16:23:02',
                         '2018-01-03 17:01:02', '2018-01-03 18:01:02', '2018-01-03 19:01:02']}

df = pd.DataFrame(dict1)

# The format string must match the data, which uses '-' separators
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')
df['Ending Time'] = pd.to_datetime(df['Ending Time'], format='%Y-%m-%d %H:%M:%S')

print(df.head())
# This returns the row with the maximum over the whole dataset, not per category per day
print(df[df['Ending Time'] == df['Ending Time'].max()])
```
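One possible way to get the maximum per category per day, sketched here and not taken from an answer on the original page, is to group on both columns and take the maximum of 'Ending Time'. The sample below uses a subset of the rows above so that it runs on its own; the names `max_per_group` and `latest_rows` are chosen for illustration only.

```python
import pandas as pd

# Subset of the question's data: two rows per category per day
df = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B', 'A', 'A', 'B', 'B'],
    'Date': ['2018-01-02', '2018-01-02', '2018-01-02', '2018-01-02',
             '2018-01-03', '2018-01-03', '2018-01-03', '2018-01-03'],
    'Ending Time': ['2018-01-02 20:51:54', '2018-01-02 21:01:02',
                    '2018-01-02 22:01:02', '2018-01-02 23:01:02',
                    '2018-01-03 12:01:02', '2018-01-03 15:22:02',
                    '2018-01-03 16:23:02', '2018-01-03 19:01:02'],
})
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')
df['Ending Time'] = pd.to_datetime(df['Ending Time'], format='%Y-%m-%d %H:%M:%S')

# New DataFrame with one row per (Category, Date) pair,
# holding the latest 'Ending Time' found in that group
max_per_group = (
    df.groupby(['Category', 'Date'])['Ending Time']
      .max()
      .reset_index()
)
print(max_per_group)

# Alternative: keep the complete original rows that hold each group's maximum
latest_rows = df.loc[df.groupby(['Category', 'Date'])['Ending Time'].idxmax()]
print(latest_rows)
```

The first form is enough when only the maximum itself is needed. The `idxmax` form may be more useful if the real CSV has further columns that should be carried along with the matching row.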

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
