Calculate the average date every x rows

Previously I posted a question asking how to calculate the average value for every 10 rows, and Zero's answer worked: Calculating the average value for every 10 cells in each column by pandas

However, that approach raises an error when it comes to averaging the date/time column.

import glob

import numpy as np
import pandas as pd

location2='C:\\Users\\Poon\\Downloads\\20211014_SBS_BEMS\\20211014_SBS_BEMS\\Test1044.csv'
csvfiles2=glob.glob(location2)

df3=pd.DataFrame()

for file_new_2 in csvfiles2: 
    df3=pd.read_csv(file_new_2)

    df4=pd.concat([pd.to_datetime(df3.iloc[:,0]), df3.iloc[:, 1:].apply(pd.to_numeric)], axis = 1)
    df4.dropna(inplace = True)
    df4= df4.groupby(np.arange(len(df4))//10).mean()

print(df4)

The error message is

Unable to parse string "2019-05-19 00:00:00" at position 0

I guess the datetimes produced by pd.to_datetime cannot be summed up and then divided by 10?
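(As an aside: the obstacle is the averaging itself, since older pandas versions refuse to take the mean of a datetime column. One workaround, sketched here on a made-up frame shaped like the data below, with invented column names, is to average the timestamps through their int64 nanosecond representation.)

```python
import numpy as np
import pandas as pd

# Made-up frame shaped like the data in the question:
# one timestamp per minute plus a numeric reading.
df = pd.DataFrame({
    "dt": pd.date_range("2019-05-19", periods=20, freq="min"),
    "val": [8840.0] * 20,
})

# Group every 10 rows. Numeric columns average directly; the timestamps
# are averaged via their int64 nanosecond representation, which also
# works on pandas versions that cannot take the mean of datetimes.
groups = df.groupby(np.arange(len(df)) // 10)
out = groups[["val"]].mean()
out["dt"] = pd.to_datetime(groups["dt"].apply(lambda s: s.astype("int64").mean()))

print(out)
# group 0 → dt 2019-05-19 00:04:30, group 1 → dt 2019-05-19 00:14:30
```

On recent pandas versions, `groups["dt"].mean()` works directly and the int64 detour is unnecessary.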

Here are some sample rows from my file (in total there are about 100k rows):

19/5/2019 0:00  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:01  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:02  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:03  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:04  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:05  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:06  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:07  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:08  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:09  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:10  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:11  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:12  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:13  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:14  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:15  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:16  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:17  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:18  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:19  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:20  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:21  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:22  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:23  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:24  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:25  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:26  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:27  8840    20  237 64.93   82.35   16.15   46.88


Solution 1:[1]

You can use resample:

>>> (df3.assign(**{'dt': pd.to_datetime(df3.iloc[:, 0])})[1:]
        .resample('10T', on='dt').mean())

                          1     2      4      5      6
dt                                                    
2019-05-19 00:00:00  8840.0  20.0  82.35  16.15  46.88
2019-05-19 00:10:00  8840.0  20.0  82.35  16.15  46.88
2019-05-19 00:20:00  8840.0  20.0  82.35  16.15  46.88

This is easier if your columns have names.
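For instance, reading a headerless file with explicit column names (the names here are invented) keeps the resample readable; note `dayfirst=True` for the `19/5/2019` format:

```python
import io

import pandas as pd

# A headerless CSV shaped like the sample data in the question;
# the column names are invented for illustration.
raw = io.StringIO("\n".join(f"19/5/2019 0:{m:02d},8840,20,237" for m in range(25)))
df = pd.read_csv(raw, header=None, names=["dt", "power", "temp", "volt"])

# dayfirst=True so 19/5/2019 is parsed as 19 May, not month 19.
df["dt"] = pd.to_datetime(df["dt"], dayfirst=True)

# Average into 10-minute buckets keyed by the named datetime column.
out = df.resample("10min", on="dt").mean()
print(out)
```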

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Corralien