'How do I make a new data frame from two existing data frames in Pandas?
I have two data frames df_att_transposed and df_1. df_1 has Supervisor and Name columns. The second data frame, df_att_transposed has attendance for all the names. My first dataframe constructor df_1 is -
{'Sup': {0: 'Sup-1', 1: 'Sup-1', 2: 'Sup-1', 3: 'Sup-1', 4: 'Sup-2',....},'Name': {0: 'Name-1', 1: 'Name-2', 2: 'Name-3'...}}
My second data frame constructor df_att_transposed is -
{0: {0: Timestamp('2022-01-31 00:00:00'), 1: Timestamp('2022-02-01 00:00:00'), 2: Timestamp('2022-02-02 00:00:00'),...} 1: {0: 'P', 1: 'P', 2: 'P', 3: 'P', 4: 'P'...}}
In my new data frame I am adding the date (the 0th column) and the attendance column of the respective person from df_att_transposed. From df_1 I am taking a row at a time containing the name and supervisor in each iteration of a for loop and adding it to all the rows in the data frame. This is how I am making a new dataframe -
rows = df_1.shape[0]
for i in range(rows):
new_df = df_att_transposed.iloc[:,[0,i+1]].copy()
new_df['Sup'] = df_1.iloc[i]['Sup']
new_df['Name'] = df_1.iloc[i]['Name']
The above code adds the data but it only keeps the data from the last iteration in new_df. For every iteration the previous data gets replaced. How can I keep the data from all the iterations together in the new data frame?
Edit: (i)The output should look like this -
Sup Name 0 1
Sup-1 Name-1 28/3/2022 P
Sup-1 Name-1 27/3/2022 P
Sup-1 Name-1 26/3/2022 P
.....
Sup-2 Name-2 28/3/2022 P
Sup-2 Name-2 27/3/2022 P
Sup-2 Name-2 26/3/2022 P
......
The code I have written takes a name/row from df_1 and two columns from df_att_transposed, writes the data in new_df. Then in the next iteration, another row is taken from df_1 and two columns from df_att_transposed again, and the data that was previously stored in new_df gets overwritten. I want the data to be appended in every iteration not replaced.
(ii) df_att_transposed looks like this(for say 5 employees) -
0 1 2 3 4 5
0 28/3/2022 P P P P P
1 27/3/2022 P P P P P
2 26/3/2022 P P P P P
....
Solution 1:[1]
Are you looking for a cross merge?
new_df = df_1.merge(df_att_transposed, how='cross')
print(new_df)
# Output
Sup Name 0 1
0 Sup-1 Name-1 2022-01-31 P
1 Sup-1 Name-1 2022-02-01 P
2 Sup-1 Name-1 2022-02-02 P
3 Sup-1 Name-2 2022-01-31 P
4 Sup-1 Name-2 2022-02-01 P
5 Sup-1 Name-2 2022-02-02 P
6 Sup-1 Name-3 2022-01-31 P
7 Sup-1 Name-3 2022-02-01 P
8 Sup-1 Name-3 2022-02-02 P
9 Sup-2 Name-4 2022-01-31 P
10 Sup-2 Name-4 2022-02-01 P
11 Sup-2 Name-4 2022-02-02 P
12 Sup-2 Name-5 2022-01-31 P
13 Sup-2 Name-5 2022-02-01 P
14 Sup-2 Name-5 2022-02-02 P
Setup
>>> df_1
Sup Name
0 Sup-1 Name-1
1 Sup-1 Name-2
2 Sup-1 Name-3
3 Sup-2 Name-4
4 Sup-2 Name-5
>>> df_att_transposed
0 1
0 2022-01-31 P
1 2022-02-01 P
2 2022-02-02 P
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Corralien |
