'How to set the value of a cell in a dataframe based on values in another dataframe?
I have these two dataframes:
ALL_df
Out[20]:
value dow doy ... start to date_helper
timestamp ...
2019-01-01 00:00:00+00:00 0.015262 1 1 ... 1 1 2019-01-01
2019-01-01 01:00:00+00:00 0.016534 1 1 ... 1 1 2019-01-01
2019-01-01 02:00:00+00:00 0.013991 1 1 ... 1 1 2019-01-01
2019-01-01 03:00:00+00:00 0.038156 1 1 ... 1 1 2019-01-01
2019-01-01 04:00:00+00:00 0.021622 1 1 ... 1 1 2019-01-01
... ... ... ... ... .. ...
2019-12-30 19:00:00+00:00 0.181823 0 364 ... 2 2 2019-12-30
2019-12-30 20:00:00+00:00 0.144769 0 364 ... 2 1 2019-12-30
2019-12-30 21:00:00+00:00 0.052565 0 364 ... 1 1 2019-12-30
2019-12-30 22:00:00+00:00 0.084449 0 364 ... 1 1 2019-12-30
2019-12-30 23:00:00+00:00 0.076693 0 364 ... 1 2 2019-12-30
[192192 rows x 19 columns]
ferientage_df
Out[19]:
Unnamed: 0 Bundesland year Ferien dates
0 0 Baden-W�rttemberg 1995 Osterferien 1995-04-18
1 1 Baden-W�rttemberg 1995 Osterferien 1995-04-19
2 2 Baden-W�rttemberg 1995 Osterferien 1995-04-20
3 3 Baden-W�rttemberg 1995 Osterferien 1995-04-21
4 4 Baden-W�rttemberg 1995 Osterferien 1995-04-22
... ... ... ... ...
45125 45125 Schleswig-Holstein 2023 Weihnachtsferien 2024-01-02
45126 45126 Schleswig-Holstein 2023 Weihnachtsferien 2024-01-03
45127 45127 Schleswig-Holstein 2023 Weihnachtsferien 2024-01-04
45128 45128 Schleswig-Holstein 2023 Weihnachtsferien 2024-01-05
45129 45129 Schleswig-Holstein 2023 Weihnachtsferien 2024-01-06
[45130 rows x 5 columns]
I want to add a column, called "Holiday" to ALL_df based on if the date in date_helper exists in the column dates of the ferientage_df (German holidays)
Is it possible to do this in 1-2 lines or do I have to loop through the whole thing somehow?
Solution 1:[1]
GIven the following toy dataframes:
import pandas as pd
ALL_df = pd.DataFrame(
{
"value": {
0: "0.016534",
1: "0.013991",
2: "0.038156",
3: "0.021622",
},
"date_helper": {
0: "2019-01-01",
1: "1995-04-18",
2: "2019-01-01",
3: "2024-01-03",
},
}
)
ferientage_df = pd.DataFrame(
{
"Ferien": {
0: "Osterferien",
1: "Osterferien",
2: "Weihnachtsferien",
3: "Weihnachtsferien",
},
"dates": {
0: "1995-04-18",
1: "1995-04-19",
2: "2024-01-03",
3: "2024-01-04",
},
}
)
You can do this:
ALL_df = ALL_df.assign(
Holiday=lambda df: df["date_helper"].isin(ferientage_df["dates"])
)
print(ALL_df)
# Output
value date_helper Holiday
0 0.016534 2019-01-01 False
1 0.013991 1995-04-18 True
2 0.038156 2019-01-01 False
3 0.021622 2024-01-03 True
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Laurent |
