How do I get the maximum value of a column in pandas for a given value in another column? E.g., what is the max date per category?
I have a CSV file with a date column and a category column, and I want to create a second table showing only the maximum value for each category for each day. I can get the maximum in a single line, but it is the maximum for the entire dataset, not per category per day, and I need a new pandas DataFrame to store all the results.
Example below:
```python
import pandas as pd

dict1 = {
    'Category': ['A', 'A', 'A', 'B', 'B', 'B', 'B',
                 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Date': ['2018-01-02', '2018-01-02', '2018-01-02', '2018-01-02',
             '2018-01-02', '2018-01-02', '2018-01-02',
             '2018-01-03', '2018-01-03', '2018-01-03', '2018-01-03',
             '2018-01-03', '2018-01-03', '2018-01-03'],
    'Ending Time': ['2018-01-02 20:51:54', '2018-01-02 20:58:54', '2018-01-02 21:01:02',
                    '2018-01-02 22:01:02', '2018-01-02 21:01:02', '2018-01-02 22:01:02',
                    '2018-01-02 23:01:02',
                    '2018-01-03 12:01:02', '2018-01-03 13:01:02', '2018-01-03 15:22:02',
                    '2018-01-03 16:23:02', '2018-01-03 17:01:02', '2018-01-03 18:01:02',
                    '2018-01-03 19:01:02'],
}
df = pd.DataFrame(dict1)
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')
df['Ending Time'] = pd.to_datetime(df['Ending Time'], format='%Y-%m-%d %H:%M:%S')
print(df.head())
# Returns only the row holding the overall maximum, not one row per category per day
print(df[df['Ending Time'] == df['Ending Time'].max()])
```
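One possible approach, offered as a minimal sketch rather than a definitive answer: group by both `Category` and `Date`, then take the maximum of `Ending Time` within each group. The frame below is a smaller stand-in using a subset of the rows from the example above, and the name `max_per_group` is only illustrative.

```python
import pandas as pd

# Smaller stand-in for the question's data (a subset of the original rows).
df = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B', 'A', 'A', 'B', 'B'],
    'Date': ['2018-01-02', '2018-01-02', '2018-01-02', '2018-01-02',
             '2018-01-03', '2018-01-03', '2018-01-03', '2018-01-03'],
    'Ending Time': ['2018-01-02 20:51:54', '2018-01-02 21:01:02',
                    '2018-01-02 22:01:02', '2018-01-02 23:01:02',
                    '2018-01-03 15:22:02', '2018-01-03 13:01:02',
                    '2018-01-03 19:01:02', '2018-01-03 16:23:02'],
})
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')
df['Ending Time'] = pd.to_datetime(df['Ending Time'], format='%Y-%m-%d %H:%M:%S')

# as_index=False keeps Category and Date as ordinary columns, so the result
# is a new DataFrame with one row per (Category, Date) pair.
max_per_group = df.groupby(['Category', 'Date'], as_index=False)['Ending Time'].max()
print(max_per_group)
```

The result has three columns (`Category`, `Date`, `Ending Time`) and one row per category per day. If the full original rows are needed instead of just the maximum values, `df.loc[df.groupby(['Category', 'Date'])['Ending Time'].idxmax()]` is a common alternative.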
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
