'Python - Sum each column into a row

I need help with looping/if statements in Python. I have a df called "main" and another df that I created called "estimates". I would like to sum the columns based on certain conditions in main and populate it in "estimates" df.

(main)

ID Score W B H A PE F1 F2 F3 F4
RDL 2 9 4 5 5 0 0 0 0 0
RDL 4 8 2 5 2 0 1 0 1 1
BAP 6 7 5 8 5 0 0 0 0 1
BAP 8 10 9 2 7 1 1 0 0 1
RDL 10 3 0 1 7 1 1 0 0 0

Then I have another df (estimates) that is empty but would like to fill based on some conditions from the above:

ID Type W B H A Yes/No

if ID=RDL and PE=0, sum all and put "RON" in Type and "Yes" in Yes/No so the results in "estimates" should be:

ID Type W B H A Yes/No
RDL RON 19 6 10 7 Yes

I hope I explained this right but if not, please let me know so I can clarify!

Thank you!



Solution 1:[1]

Usually, you want to avoid looping over rows in pandas because it is less efficient. You can do this instead:

summary = df[(df['ID']=='RDL') & (df['PE']==0)].sum()
df_estimates.loc[df['ID']=='RDL', 'Type'] = 'RON'
df_estimates.loc[df['ID']=='RDL', 'W'] = summary['W']
df_estimates.loc[df['ID']=='RDL', 'B'] = summary['B']
df_estimates.loc[df['ID']=='RDL', 'H'] = summary['H']
df_estimates.loc[df['ID']=='RDL', 'A'] = summary['A']
df_estimates.loc[df['ID']=='RDL', 'Yes/No'] = 'Yes'

print(df_estimates)
ID  Type    W   B   H   A   Yes/No
0   RDL RON 17  6   10  7   Yes

EDIT FOLLOWING COMMENT:

Sorry, I didn't realize that you wanted to populate an empty dataframe.

Instead of creating and empty dataframe defining the column names, you can create the empty dataframe without column names and then create the columns on the fly when assinging them a value:

# df is your main dataframe

# select rows based on your condition and sum all
summary = df[(df['ID']=='RDL') & (df['PE']==0)].sum()

# create empty dataframe
emptytable = pd.DataFrame()

# create columns and assign their values
emptytable['ID'] = ['RON']
emptytable['Yes/No'] = ['Yes']
emptytable['ID'] = ['RDL']

emptytable['W'] = summary['W']
emptytable['B'] = summary['B']
emptytable['H'] = summary['H']
emptytable['A'] = summary['A']

emptytable

     ID     Yes/No  W   B   H   A
0   RDL     Yes     17  6   10  7

Another possibility is to create the estimates dataframe with the values you want (RDL and Yes on your example), and then use merge:

df_estimates = pd.DataFrame([['RDL', 'Yes']], columns=['ID', 'Yes/No'])

df_estimates

    ID  Yes/No
0   RDL     Yes

summary = df[(df['ID']=='RDL') & (df['PE']==0)]
summary = summary.groupby(['ID'])['W', 'B', 'H', 'A'].sum()

summary

         W  B   H   A
ID              
RDL     17  6   10  7


df_estimates.merge(summary, on='ID')

     ID     Yes/No  W   B   H   A
0   RDL     Yes     17  6   10  7

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1