'Check if timestamp is in separate DataFrame

I have a DataFrame with a large amount of datetime stamps in it - called df:

    Time                Charge  PV      Energy
1   27/06/2021 04:45    NaN     1.5     1.8
2   28/06/2021 04:46    NaN     1.7     1.5
3   21/07/2021 04:47    NaN     2.5     3.1
4   29/07/2021 04:48    NaN     1.9     2.1
5   05/08/2021 04:49    NaN     1.4     1.1

I want to add a column that checks if the date is within a separate data frame (to see if they day is a holiday). This is the DataFrame containing the holidays (called df_hols)

    Date        Holiday
0   2021-05-29  1
1   2021-07-21  1
2   2021-08-22  1
3   2021-08-23  1
4   2021-08-30  1
5   2021-08-30  1
6   2021-09-09  1
7   2021-09-19  1

I am currently using:

for date in df_hols['Date']:
    df['Holiday'] = np.where(df['Time'].dt.date == date, True, False)

and this works but is very slow.

I have tried

df['Holiday'] = np.where(df['Time'].dt.date.isin(df_hols['Date']), True, False)

But this returned all False.



Solution 1:[1]

You could get the series of of the holidays, and use isin on the dates extracted from the Time column to check inclusion:

df['Holiday'] = (pd.to_datetime(df['Time']).dt.date.astype(str)
                   .isin(df_hols['Date'])
                 )

output:

               Time  Charge   PV  Energy  Holiday
1  27/06/2021 04:45     NaN  1.5     1.8    False
2  28/06/2021 04:46     NaN  1.7     1.5    False
3  21/07/2021 04:47     NaN  2.5     3.1     True
4  29/07/2021 04:48     NaN  1.9     2.1    False
5  05/08/2021 04:49     NaN  1.4     1.1    False

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mozway