Efficient way of converting year_week to datetime in pandas
I have a pandas df with two columns year and week_number.
df = pd.DataFrame({'year': [2019, 2020, 2021, 2022], 'week_number':[3,12,38,42]})
df
   year  week_number
0  2019            3
1  2020           12
2  2021           38
3  2022           42
I know I can apply something like the following to each row to convert the values to datetime. However, is there a more efficient way to do this for large dataframes and store the results in a third column?
import datetime
single_day = "2013-26"
converted_date = datetime.datetime.strptime(single_day + '-1', "%Y-%W-%w")
print(converted_date)
Solution 1:[1]
I wouldn't say your way is inefficient, but if you want a fully vectorized way that needs no extra library and adds the result to your dataframe as a new column, this might be what you're looking for:
import pandas as pd
df = pd.DataFrame({'year': [2019, 2020, 2021, 2022], 'week_number':[3,12,38,42]})
df['date'] = pd.to_datetime((df['year']*100+df['week_number']).astype(str) + '0', format='%Y%W%w')
df
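One detail worth checking in the format string above is the trailing weekday digit. With %w, 0 means Sunday and 1 means Monday, and because %W weeks start on Monday, appending '0' should give the Sunday that ends the week, while the '-1' used in the question gives the Monday that starts it. A small standard-library sketch of the difference (the sample year and week are only illustrative):

```python
from datetime import datetime

# %W counts weeks starting on Monday; days before the first Monday fall in week 0.
# %w is the weekday as a digit: 0 = Sunday, 1 = Monday, ..., 6 = Saturday.
monday = datetime.strptime("2019-03-1", "%Y-%W-%w")
sunday = datetime.strptime("2019-03-0", "%Y-%W-%w")

print(monday.date())  # 2019-01-21
print(sunday.date())  # 2019-01-27
```

If the start of the week is what you want, swapping the appended '0' for '1' is probably the simplest change.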
Solution 2:[2]
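If you are on Python >= 3.8, use datetime.date.fromisocalendar. It is also available as datetime.datetime.fromisocalendar. Note that this uses ISO 8601 week numbering, which can differ from the %W numbering used in the question.
# 11 May 2022 is a Wednesday in the 19th week
>>> from datetime import date
>>> date.fromisocalendar(2022, 19, 3)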
datetime.date(2022, 5, 11)
As a new column:
df['date'] = df[['year', 'week_number']].apply(lambda row: date.fromisocalendar(int(row['year']), int(row['week_number']), 1), axis=1)
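Since apply with axis=1 still calls a Python function once per row, a vectorized ISO-week variant may be preferable on large frames. The sketch below assumes a pandas version whose to_datetime accepts the ISO directives %G, %V and %u, and it assumes the Monday of each week is wanted. The column name date_iso is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'year': [2019, 2020, 2021, 2022], 'week_number': [3, 12, 38, 42]})

# Build strings such as "2019-03-1": ISO year, zero-padded ISO week, ISO weekday (1 = Monday).
iso_strings = (
    df['year'].astype(str)
    + '-' + df['week_number'].astype(str).str.zfill(2)
    + '-1'
)

# %G = ISO year, %V = ISO week number, %u = ISO weekday (1..7).
df['date_iso'] = pd.to_datetime(iso_strings, format='%G-%V-%u')
print(df)
```

The results should match what date.fromisocalendar returns row by row, but the parsing happens in a single to_datetime call instead of a Python-level loop.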
Solution 3:[3]
Use apply to loop over the rows (axis=1) with a lambda function that concatenates the two columns into a string, and then do exactly what you did above :) Perhaps this isn't the answer you were looking for, though, since you are after the most efficient solution. However, it does the job!
from datetime import datetime
df['convert_date'] = df.apply(lambda x: datetime.strptime(f"{x.year}-{x.week_number}-1", "%Y-%W-%w"), axis=1)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | DataMonkey |
| Solution 2 | ivvija |
| Solution 3 |
