Dynamically create and name DataFrames using a function?

Suppose I have a survey and want to calculate the Net Promoter Score (NPS) for different cuts of respondents. My data may look something like this:

import pandas as pd
data = [[1,111,1,1,35,'F','UK','High'], [1,112,0,1,42,'F','Saudi Arabia','Low'], [1,113,1,1,17,'M','Belize','High'],[1,1234,1,1,35,'F','Saudi Arabia','High'],[2,1854,1,1,35,'M','Belize','Low'],[2,1445,1,1,35,'F','UK','Low']]
df = pd.DataFrame(data, columns = ['survey_num','id_num','nps_sum','nps_count','age','gender','country','income_level'])
df

I want to be able to write a function that cycles through this data and does the following each time:

col_list = ['survey_num', 'nps_sum', 'nps_count']
# Start from df (the base case) or from a filtered subset of it
df_customname = df[col_list]

# Total the NPS columns per survey, then convert to a percentage
df_customname = df_customname.groupby('survey_num').sum()
df_customname['nps_customname'] = (df_customname['nps_sum'] / df_customname['nps_count']) * 100
df_customname = df_customname.sort_values(by=['survey_num'], ascending=True)
df_customname = df_customname.drop(['nps_sum', 'nps_count'], axis=1)
df_customname

The reason I need this to be dynamic is that I need to repeat this process for different cuts of the data. For example, I want to be able to filter for gender = F AND country = Saudi Arabia, or just gender = M, or just income_level = High. I then want to left join each result to the output for the unfiltered data, which is currently called df_customname (this is my base case, so it may just be called 'all').
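To make the filter-then-left-join step concrete, here is a minimal sketch for a single cut. The helper name nps_by_survey, the mask name is_saudi_f, and the variable names are hypothetical and only for illustration; it assumes the sample data shown earlier.

```python
import pandas as pd

data = [[1,111,1,1,35,'F','UK','High'], [1,112,0,1,42,'F','Saudi Arabia','Low'],
        [1,113,1,1,17,'M','Belize','High'], [1,1234,1,1,35,'F','Saudi Arabia','High'],
        [2,1854,1,1,35,'M','Belize','Low'], [2,1445,1,1,35,'F','UK','Low']]
df = pd.DataFrame(data, columns=['survey_num','id_num','nps_sum','nps_count',
                                 'age','gender','country','income_level'])

def nps_by_survey(frame, name):
    """Hypothetical helper: NPS per survey_num as a one-column DataFrame named nps_<name>."""
    totals = frame.groupby('survey_num')[['nps_sum', 'nps_count']].sum()
    nps = totals['nps_sum'] / totals['nps_count'] * 100
    return nps.rename(f'nps_{name}').to_frame().sort_index()

# Base case: every respondent
nps_all = nps_by_survey(df, 'all')

# One cut: gender = F AND country = Saudi Arabia, written as a boolean mask
is_saudi_f = (df['gender'] == 'F') & (df['country'] == 'Saudi Arabia')
nps_saudi_f = nps_by_survey(df[is_saudi_f], 'saudi_f')

# Left join on the survey_num index; surveys with no rows in the cut become NaN
combined = nps_all.join(nps_saudi_f, how='left')
print(combined)
```

Because the join is a left join onto the base case, every survey_num stays in the result even when a cut has no respondents for that survey.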

After running the function a few times, defining my cuts each time, the final table would look like this:

data = [[1,66.67,83.5,22.5,47.7,74.1],[2,75.67,23.5,24.5,76.7,91.1]]
df_final = pd.DataFrame(data, columns = ['survey_num','nps_all','nps_saudi_f','nps_m','nps_high','nps_40plus'])
df_final

Note that there may be better ways to do this, but I'm looking for the quickest/simplest possible way that is as close to this as possible. I don't yet know what my cuts will be, but there are likely to be a lot of them, so the easier it is to define them outside the function, have the function run that code, and then left join the result to the original df, the better.
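One possible way to keep the cuts outside the function is a dictionary that maps each output suffix to a filter, with the function looping over it. This is only a sketch under assumptions: the cuts dictionary, the nps_table function, and the choice of DataFrame.query strings as the filter format are all hypothetical, and the "40plus" cut is assumed to mean age >= 40.

```python
import pandas as pd

data = [[1,111,1,1,35,'F','UK','High'], [1,112,0,1,42,'F','Saudi Arabia','Low'],
        [1,113,1,1,17,'M','Belize','High'], [1,1234,1,1,35,'F','Saudi Arabia','High'],
        [2,1854,1,1,35,'M','Belize','Low'], [2,1445,1,1,35,'F','UK','Low']]
df = pd.DataFrame(data, columns=['survey_num','id_num','nps_sum','nps_count',
                                 'age','gender','country','income_level'])

# Hypothetical cut definitions: column suffix -> query string (None means no filter)
cuts = {
    'all': None,
    'saudi_f': "gender == 'F' and country == 'Saudi Arabia'",
    'm': "gender == 'M'",
    'high': "income_level == 'High'",
    '40plus': "age >= 40",  # assumed meaning of "40plus"
}

def nps_table(frame, cuts):
    """Build one nps_<name> column per cut, left joined onto every survey_num."""
    surveys = pd.Index(sorted(frame['survey_num'].unique()), name='survey_num')
    result = pd.DataFrame(index=surveys)
    for name, condition in cuts.items():
        subset = frame if condition is None else frame.query(condition)
        totals = subset.groupby('survey_num')[['nps_sum', 'nps_count']].sum()
        nps = (totals['nps_sum'] / totals['nps_count'] * 100).rename(f'nps_{name}')
        # Left join keeps all surveys; a survey missing from the cut gets NaN
        result = result.join(nps, how='left')
    return result.rename_axis('survey_num').reset_index()

df_final = nps_table(df, cuts)
print(df_final)
```

Adding a new cut then only means adding one entry to cuts; the function itself does not change. Query strings are just one option, and boolean masks or callables would work with the same loop.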

Thank you!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
