'How to fill a new Dataframe based on another Dataframe [duplicate]
Below is a dataframes for example :
Employee= pd.DataFrame(
{"EmployeeKey": ["1", "2", "3"],
"End_Contract": [None, "2022-02-02", None],
"Contract": ["35", "35", "35"]}
)
I use a calendar like the following :
Calendar = pd.DataFrame({'Year':[2022,2022,2022],
'Month ':[1,2,3],
'Working day':[21,20,21]})
I try to fill a new dataframe from the last two dataframes.
the final result I'm trying to find
| EmployeeKey | EndContract | Contract| Year | Month | Working day |
| -------- | ----------- |------- |------- | ------- | --------------
| 1 | None | 35 | 2022 | 1 | 21 |
| 1 | None | 35 | 2022 | 2 | 20 |
| 1 | None | 35 | 2022 | 3 | 21 |
| 2 | 2022-02-02 | 35 | 2022 | 1 | 21 |
| 2 | 2022-02-02 | 35 | 2022 | 2 | 20 |
| 2 | 2022-02-02 | 35 | 2022 | 3 | 21 |
| 3 | None | 35 | 2022 | 1 | 21 |
| 3 | None | 35 | 2022 | 2 | 20 |
| 3 | None | 35 | 2022 | 3 | 21 |
I can't find a way to get this result !
Solution 1:[1]
Use:
cal = pd.DataFrame({'year':[2022,2021], 'month':[2,3], 'working days':[21,21]})
Employee['End_Contract']=pd.to_datetime(Employee['End_Contract'])
Employee['y']=Employee['End_Contract'].dt.year
Employee['m']=Employee['End_Contract'].dt.month
Employee.merge(cal, left_on=['y', 'm'], right_on=['year', 'month'])
Based on your comment, if I understand you well, you can use the how parameter with outer value. From pandas documentation:
outer: use union of keys from both frames, similar to a SQL full outer join; sort keys lexicographically.
So:
Employee.merge(cal, left_on=['y', 'm'], right_on=['year', 'month'], how = 'outer')
The result:
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |

