'Pandas Time difference of each column

I have this dataframe:

ID STEP 1 STEP 2 ... STEP 40
1 2022-03-08 09:23:35 2022-03-08 10:23:35 ... 2022-03-19 09:23:35
2 2022-03-08 09:23:35 2022-03-08 11:23:35 ... 2022-03-18 09:23:35

I want to turn it into this:

ID Time 1-2 Time 1-3 ... Time 39-40
1 296 days 08:08:05 297 days 08:08:05 ... 297 days 08:08:05
2 296 days 08:08:05 297 days 08:08:05 ... 297 days 08:08:05

I created a new dataframe:

index=[]
for item1 in range (1,40):
    for item2 in range (1,40):
        if item1  < item2:
            index.append("Time from Step "+ str(item1)+" to step "+ str(item2))

df = pd.DataFrame(-1, index=np.arange(len(caseList)), columns=index)

But I don´t know how to fill in the times.



Solution 1:[1]

Use vectorial code.

If you don't have datetime type, uncomment the second step:

out = (
 df.set_index('ID')
   #.apply(lambda c: pd.to_datetime(c))
   .diff(axis=1).iloc[:, 1:]
   .set_axis([f'Time {i+1}-{i+2}' for i in range(len(df.columns)-2)], axis=1)
   .reset_index()
)

output:

   ID        Time 1-2         Time 2-3
0   1 0 days 01:00:00 10 days 23:00:00
1   2 0 days 02:00:00  9 days 22:00:00

NB. I used "STEP 40" as "STEP 3" for the example here

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mozway