How to turn time from string form to something pandas can recognize as time?

1 hour 29 mins
46 secs
1 min 47 secs
2 mins 19 secs
6 days 18 hours ...
How do I turn data like these, in string form, into something pandas can recognise? I've been thinking of something like regular expressions, but that seems a bit far-fetched. I would appreciate any help. Stay safe.
Solution 1:[1]
Regex is actually a good candidate to solve this. Using your test dataset and slightly generalizing for months and years like so...
import pandas as pd

df = pd.DataFrame(
    columns=["raw_time"],
    data=[
        "1 hour 29 mins",
        "46 secs",
        "1 min 47 secs",
        "2 mins 19 secs",
        "6 days 18 hours",
        "2 years 2 months 3 hours",
        "1 year 1 month 1 day 1 hours 1 min 1 sec",
        "3 years 4 months 2 days 7 hours 38 mins 42 secs",
    ],
)
...we can use the snippet below to parse each string and convert it to seconds. From there it should be easy to convert to any time object you need.
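import re
from typing import Optional

# Watch out for the accuracy of these constants: a month is taken
# as 30 days and a year as 12 such months (360 days).
N_SECONDS = {
    "years": 12 * 30 * 24 * 3600,
    "months": 30 * 24 * 3600,
    "days": 24 * 3600,
    "hours": 3600,
    "minutes": 60,
    "seconds": 1,
}

# Every unit is optional, but units must appear from largest to smallest.
pattern = (
    r"((?P<years>\d+)(\syear[s]?))? ?((?P<months>\d+)(\smonth[s]?))? "
    r"?((?P<days>\d+)(\sday[s]?))? ?((?P<hours>\d+)(\shour[s]?))? "
    r"?((?P<minutes>\d+)(\smin[s]?))? ?((?P<seconds>\d+)(\ssec[s]?))?"
)


def parse_string_to_seconds(time_str: str) -> Optional[int]:
    match = re.match(pattern, time_str)
    # All groups are optional, so the pattern matches any string (possibly
    # with an empty match); treat "no unit captured" as a parse failure.
    if not match or not any(match.groupdict().values()):
        return None
    times_match = {k: int(v) if v else 0 for k, v in match.groupdict().items()}
    return sum(times_match[k] * N_SECONDS[k] for k in N_SECONDS)


df["time_seconds"] = df["raw_time"].apply(parse_string_to_seconds)
df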
>>>                                           raw_time  time_seconds
>>> 0                                   1 hour 29 mins          5340
>>> 1                                          46 secs            46
>>> 2                                    1 min 47 secs           107
>>> 3                                   2 mins 19 secs           139
>>> 4                                  6 days 18 hours        583200
>>> 5                         2 years 2 months 3 hours      67402800
>>> 6         1 year 1 month 1 day 1 hours 1 min 1 sec      33786061
>>> 7  3 years 4 months 2 days 7 hours 38 mins 42 secs     103880322
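As for turning the seconds into "any time object you need", one option is pandas' own timedelta type. The sketch below is not part of the original answer. It assumes the seconds column is already computed, and the column names time_delta, whole_days and ends_at, as well as the start date, are made up for illustration.

```python
import pandas as pd

# Seconds as computed above for the first five rows of the example data.
df = pd.DataFrame({"time_seconds": [5340, 46, 107, 139, 583200]})

# unit="s" tells pandas to interpret the integers as seconds.
df["time_delta"] = pd.to_timedelta(df["time_seconds"], unit="s")

# A timedelta column supports the .dt accessor and arithmetic with timestamps.
df["whole_days"] = df["time_delta"].dt.days
df["ends_at"] = pd.Timestamp("2021-01-01") + df["time_delta"]  # arbitrary start date

print(df)
```

Keep in mind that the year and month values going into the seconds are approximations (360 and 30 days), so timedeltas built from them inherit that inaccuracy.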
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | swimmer |
