Python for loop: appending data from 12 columns to 1 column
I am trying to determine what this loop does and if it can be done in a clearer way.
My understanding: on each iteration, the loop takes one of the user ID fields (Col1 to Col12) and writes it to the ID_final field. It then uses ID_final to join to the "data" dataframe, and appends the result to the final_hr dataframe.
import pandas as pd

# hr and data are existing DataFrames (not shown here)
column_list = pd.DataFrame(['Col1', 'Col2', 'Col3', 'Col4', 'Col5',
                            'Col6', 'Col7', 'Col8', 'Col9', 'Col10',
                            'Col11', 'Col12'])
final_hr = pd.DataFrame()
for column in range(len(column_list)):
    hr_new = hr.copy()
    # Drop rows where the current ID column is NaN before the merge
    hr_new.dropna(subset=[column_list.iloc[column, 0]], inplace=True)
    # Create a new column to merge on
    hr_new['ID_final'] = hr_new[column_list.iloc[column, 0]]
    # Normalize the ID: strip whitespace and convert to upper case
    hr_new['ID_final'] = hr_new['ID_final'].str.strip().str.upper()
    # Merge data
    merged_data = pd.merge(hr_new, data, how='left', left_on='ID_final', right_on='OtherID')
    # Concatenate all data together
    # (DataFrame.append was removed in pandas 2.0; there, use pd.concat([final_hr, merged_data]))
    final_hr = final_hr.append(merged_data)
HR_attestation = final_hr.drop_duplicates()
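On the "clearer way" part, here is one possible sketch rather than a definitive rewrite. It assumes hr has a unique index, and it uses small made-up frames (the Name and Dept columns and all values are hypothetical) because the real hr and data are not shown. The idea is to reshape the ID columns into one long ID_final column with melt, normalize it once, and merge once.

```python
import pandas as pd

# Hypothetical sample data; the real hr and data frames are not shown in the post.
hr = pd.DataFrame({
    "Name": ["Ann", "Bob"],
    "Col1": ["ab1 ", None],
    "Col2": [None, "cd2"],
    "Col3": ["ef3", "CD2"],
})
data = pd.DataFrame({"OtherID": ["AB1", "CD2"], "Dept": ["Sales", "IT"]})

id_columns = ["Col1", "Col2", "Col3"]  # would be Col1..Col12 in the original

# One row per (original hr row, non-null ID column); the hr index is kept
# so the long table can be joined back to the full hr rows.
long_ids = (
    hr[id_columns]
    .melt(ignore_index=False, value_name="ID_final")
    .dropna(subset=["ID_final"])
)
# Same normalization as in the loop: strip whitespace, upper-case.
long_ids["ID_final"] = long_ids["ID_final"].str.strip().str.upper()

# Repeat each hr row once per ID it carries, then do a single left merge.
expanded = hr.join(long_ids[["ID_final"]], how="inner")
result = (
    expanded
    .merge(data, how="left", left_on="ID_final", right_on="OtherID")
    .drop_duplicates()
)
print(result)
```

The rows should match what the loop produces, although their order differs: the loop groups rows by ID column, while this version groups them by original hr row.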
I have trouble understanding how this exactly works.
In this output, I am unsure how the user "BADAWI6" from Col4 becomes the ID_final. Why isn't Col1 or Col7 becoming the ID_final here?
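A likely explanation, hedged because the output is not reproduced here: the loop does not choose a single column per row. Every iteration copies all of hr, so a row with values in Col1, Col4 and Col7 ends up in the result three times, once per column, each with a different ID_final. The row showing "BADAWI6" would then be the copy made during the Col4 iteration, and the copies for Col1 and Col7 should appear elsewhere in the output. A minimal sketch with a made-up one-row frame (only BADAWI6 comes from the question; the other IDs and the Dept column are hypothetical, and pd.concat stands in for the append call):

```python
import pandas as pd

# Hypothetical one-row sample: only BADAWI6 comes from the question,
# the other IDs and the Dept column are made up for illustration.
hr = pd.DataFrame({
    "Col1": ["smith1"],
    "Col4": ["badawi6"],
    "Col7": ["jones7"],
})
data = pd.DataFrame({"OtherID": ["BADAWI6"], "Dept": ["Finance"]})

pieces = []
for col in ["Col1", "Col4", "Col7"]:
    # Keep only rows that have a value in the current ID column.
    hr_new = hr.dropna(subset=[col]).copy()
    # The current column becomes ID_final for this iteration.
    hr_new["ID_final"] = hr_new[col].str.strip().str.upper()
    pieces.append(
        hr_new.merge(data, how="left", left_on="ID_final", right_on="OtherID")
    )

# Stack the per-column results; drop_duplicates keeps all three copies
# because their ID_final values differ.
final_hr = pd.concat(pieces, ignore_index=True).drop_duplicates()
print(final_hr[["Col1", "Col4", "Col7", "ID_final", "OtherID"]])
```

The same hr row comes out three times, with ID_final set to SMITH1, BADAWI6 and JONES7 in turn. In this sample only the BADAWI6 copy finds a match in data, so it is the only one with OtherID filled in.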
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow

