'Check if timestamp is in separate DataFrame
I have a DataFrame with a large amount of datetime stamps in it - called df:
Time Charge PV Energy
1 27/06/2021 04:45 NaN 1.5 1.8
2 28/06/2021 04:46 NaN 1.7 1.5
3 21/07/2021 04:47 NaN 2.5 3.1
4 29/07/2021 04:48 NaN 1.9 2.1
5 05/08/2021 04:49 NaN 1.4 1.1
I want to add a column that checks if the date is within a separate data frame (to see if they day is a holiday). This is the DataFrame containing the holidays (called df_hols)
Date Holiday
0 2021-05-29 1
1 2021-07-21 1
2 2021-08-22 1
3 2021-08-23 1
4 2021-08-30 1
5 2021-08-30 1
6 2021-09-09 1
7 2021-09-19 1
I am currently using:
for date in df_hols['Date']:
df['Holiday'] = np.where(df['Time'].dt.date == date, True, False)
and this works but is very slow.
I have tried
df['Holiday'] = np.where(df['Time'].dt.date.isin(df_hols['Date']), True, False)
But this returned all False.
Solution 1:[1]
You could get the series of of the holidays, and use isin on the dates extracted from the Time column to check inclusion:
df['Holiday'] = (pd.to_datetime(df['Time']).dt.date.astype(str)
.isin(df_hols['Date'])
)
output:
Time Charge PV Energy Holiday
1 27/06/2021 04:45 NaN 1.5 1.8 False
2 28/06/2021 04:46 NaN 1.7 1.5 False
3 21/07/2021 04:47 NaN 2.5 3.1 True
4 29/07/2021 04:48 NaN 1.9 2.1 False
5 05/08/2021 04:49 NaN 1.4 1.1 False
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | mozway |
