'Create interval column, cut time interval and add remaining time as a new line to pandas dataframe

I have a dataframe with two datetime columns (StartTime and EndTime). Based on these cols I would like to create intervals. For example if the value of StartTime is 16:08:23 and the value of EndTime is 16:30:00 create a column with the label "13-17" .

# create df
df = pd.DataFrame(
    {"StartTime": ["2011-01-01 16:08:23"], "EndTime": ["2011-01-01 16:30:00"]}
)

# format datetime
df["StartTime"] = pd.to_datetime(df["StartTime"], format="%Y-%m-%d %H:%M:%S")
df["EndTime"] = pd.to_datetime(df["EndTime"], format="%Y-%m-%d %H:%M:%S")

# extract time
df["UsageFrom_hour"] = df["StartTime"].dt.hour
df["UsageUntil_hour"] = df["EndTime"].dt.hour

# extract time
df["UsageFrom_min"] = df["StartTime"].dt.minute
df["UsageUntil_min"] = df["EndTime"].dt.minute

# add leading zero
df["UsageFrom_min"] = df["UsageFrom_min"].astype(str).str.zfill(2)
df["UsageUntil_min"] = df["UsageUntil_min"].astype(str).str.zfill(2)

# combine hours and minutes and convert as float
df["From_hour_min"] = (df.UsageFrom_hour.astype(str) + "." + df.UsageFrom_min).astype(
    float
)
df["Until_hour_min"] = (
    df.UsageUntil_hour.astype(str) + "." + df.UsageUntil_min
).astype(float)

# assign labels in new col
conditions = [
    (df["From_hour_min"] > 13)
    & (df["From_hour_min"] <= 17)
    & (df["Until_hour_min"] > 13)
    & (df["Until_hour_min"] <= 17),
]

values = ["13-17"]

df["interval_label"] = np.select(conditions, values)

Output:

    StartTime           EndTime             interval_label
0   2011-01-01 16:08:23 2011-01-01 16:30:00 13-17

This works so far. However, if the value of StartTime is 16:08:23 and the value of EndTime is 17:35:00, I need to somehow truncate the time and add the time after 17:00:00 to a new row and assign that row to the label "17-18".

Needed Output:

    StartTime           EndTime             interval_label
0   2011-01-01 16:08:23 2011-01-01 17:00:00 13-17
1   2011-01-01 17:00:01 2011-01-01 17:35:00 17-18

It is even more complicated when I have a time segment that goes over different intervals. For example: 16:30:00-23:05:00. Here I have to cut off several times and add new lines. The labels I have to assign are also not always equal to 1 hour (e.g. "6-9").

Does anyone have an idea how I can solve this?

Thanks a lot.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source