'Pandas: Resample dataframe column, get discrete feature that corresponds to max value

Sample data:

import pandas as pd
import numpy as np
import datetime

data = {'value': [1,2,4,3], 'names': ['joe', 'bob', 'joe', 'bob']}
start, end = datetime.datetime(2015, 1, 1), datetime.datetime(2015, 1, 4)
test = pd.DataFrame(data=data, index=pd.DatetimeIndex(start=start, end=end, 
       freq="D"), columns=["value", "names"])

gives:

          value names
2015-01-01  1   joe
2015-01-02  2   bob
2015-01-03  4   joe
2015-01-04  3   bob

I want to resample by '2D' and get the max value, something like:

df.resample('2D')

The expected result should be:

          value names
 2015-01-01 2   bob
 2015-01-03 4   joe

Can anyone help me?



Solution 1:[1]

Use apply and return the row with maximal value. It will get labeled via the resample

test.resample('2D').apply(lambda df: df.loc[df.value.idxmax()])

            value names
2015-01-01      2   bob
2015-01-03      4   joe

Solution 2:[2]

The idxmax works well unless there are missing values in the dates. For example, if you resample every day, and one day has no values, instead of returning Nan, idxmax will raise an error.

The following is how to overcome the problems

def map_resample_columns(original_df, resample_df, key_col, cols):
    """
    The function will add the col back to resampled_df
    input: resample_df is resampled from original df based on key_col
    cols: list of columns from original_df to be added back to resample_df    
    """
    for col in cols:
        record_info = []
        for idx, row in resample_df.iterrows():
            val = row[key_col]
            if not np.isnan(val):
                record_info.append(original_df[original_df[key_col] == val][col].tolist()[0])
            else:
                record_info.append(np.nan)
        resample_df[col] = record_info
    return resample_df

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Jinbo