'Calculating percentages for multiple columns

I have a dataset with the following structure

index	candidato	Page Name	Post Created Date	Total Interactions	Likes	Shares	Comments	Love	Angry
0	António Costa	Observador	2022-01-03	4500	340	400	433	545	565

There are 9 different candidato (candidates) and 27 different Page Name

Full dataset can be found here

I need to find a way to calculate, for each Page Name, the totals and the percentage of Total Interactions, Likes, Shares, Comments, Love, and Angry that will result in a DataFrame with the following structure

candidato	Page Name	Total Interactions	Total Interactions Percentage	Total Likes	Total Likes Percentage	Other Columns	Other Columns Percentage
António Costa	Observador	6500	34	23	1%	540	23%
Rui Rio	Observador	4500	23	value	percentage	value	percentage

The reason why I need to calculate this is in order to produce a percent stacked bar chart such as this one:

What is the best way to achieve this with Pandas? Thank you in advance for your help.

Disclosure This question is to help in a non-for-profit project that analyzes media behaviour, and bias, towards Portuguese candidates to the 2022 general elections. The prior report was made using Google Sheets but analyzing the datasets with Python is the best way, since I plan on doing this every 3 months.

The GitHub repo can be found here, where you can access all datasets and code used.

Solution 1:^[1]

After getting the data via:

!wget https://raw.githubusercontent.com/JorgeMiguelGomes/LEG2022_MediaMonitor/main/legislativas_2022_media_monitor_29jan2022/data_products/legislativas_2022_all_candidates_filtered.csv

and reading in csv to dataframe:

df = pd.read_csv("legislativas_2022_all_candidates_filtered.csv")
df = df.drop(columns=["Unnamed: 0"])

You can groupby Page Name and sum the values in the columns you mentioned:

col_list = ["Total Interactions","Likes","Shares","Comments","Love","Angry"]
totals = df.groupby("Page Name")[col_list].sum()

The first 3 lines will look like:

              Total Interactions  Likes  Shares  Comments  Love  Angry
Page Name                                                             
Agência Lusa                4048   1464     285      1011    79    127
CM TV                      21813   7934    1050      7824   401    627
CNN Portugal              216437  59776    8755    107879  3640   5198

To get percentages, you can do:

percentages = totals/df[col_list].sum()*100

Sample from result:

              Total Interactions      Likes  ...       Love      Angry
Page Name                                    ...                      
Agência Lusa            0.324301   0.403282  ...   0.408945   0.371085
CM TV                   1.747521   2.185548  ...   2.075784   1.832048
CNN Portugal           17.339582  16.466265  ...  18.842530  15.188172

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	zabop

'Calculating percentages for multiple columns

Solution 1:[1]

Sources

Related Questions

Solution 1:^[1]