'Pandas sum multi-index columns with same name

I know that I can sum index's by:

df["name1"]+df["name2"]

But how does sum work when the two index names are the same?

Given the following CSV:

,,College 1,,,,,,,,,,,,College 2,,,,,,,,,,,,College 3,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,Music,,,,Geography,,,,Business,,,,Mathematics,,,,Biology,,,,Geography,,,,Business,,,,Biology,,,,Technology,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,F,P,M,D,F,P,M,D,F,P,M,D,F,P,M,D,F,P,M,D,F,P,M,D,F,P,M,D,F,P,M,D,F,P,M,D
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Year 1,M,0,5,7,9,2,18,5,10,4,9,6,2,4,14,18,11,10,19,18,20,3,17,19,13,4,9,6,2,0,10,11,14,4,12,12,5
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,F,0,13,14,11,0,6,8,6,2,12,14,9,9,17,12,18,6,17,16,14,0,4,2,5,2,12,14,9,10,11,18,20,0,5,7,8
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Year 2,M,5,10,6,6,1,20,5,18,4,9,6,2,10,13,15,19,2,18,16,13,1,19,5,12,4,9,6,2,1,13,15,18,3,19,8,16
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,F,1,11,14,15,0,9,9,2,2,12,14,9,7,17,18,14,9,18,13,14,0,9,2,10,2,12,14,9,0,17,19,19,0,4,6,4
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Evening,M,4,10,6,5,3,13,19,5,4,9,6,2,8,17,10,18,3,11,20,11,4,18,17,20,4,9,6,2,8,12,16,13,4,19,18,7
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,F,4,12,12,13,0,9,3,8,2,12,14,9,0,18,11,18,1,13,13,10,0,6,2,8,2,12,14,9,9,16,20,13,0,10,5,6

I can clean the file and setup a multi-index with pandas and numpy:

df = pd.read_csv("CollegeGrades2.csv", index_col=[0,1], header=[0,1,2], skiprows=lambda x: x%2 == 1)
df.columns = pd.MultiIndex.from_frame(df.columns.to_frame().apply(lambda x: np.where(x.str.contains('Unnamed'), np.nan, x)).ffill())
df.index = pd.MultiIndex.from_frame(df.index.to_frame().ffill())
df.groupby(level=0, sort=False).sum()

enter image description here

However my issue is that I want to total the subjects e.g. College 1 Geography + College 3 Geography and display them in the following output:

Output

I have tried separating them out into different data frames, summing them and then concatenating them but in doing so I lose the headings, for example:

music = df2["College 1", "Music"]
geography = df2["College 1", "Geography"] + df2["College 1", "Geography"]
pd.concat([music,geography], axis=1).groupby(level=0, sort=False).sum()

result

How I sum the subjects while maintaining my desired output? Any help would be appreciated.

Thank you.



Solution 1:[1]

You can also group by the column:

df.groupby(level=[1, 2], axis=1).sum().groupby(level=0).sum()

Result:

1       Biology             Business             Geography            Mathematics             Music            Technology           
2             D   F   M   P        D   F   M   P         D  F   M   P           D   F   M   P     D  F   M   P          D  F   M   P
0                                                                                                                                   
Evening      47  21  69  52       22  12  40  42        41  7  41  46          36   8  21  35    18  8  18  22         13  4  23  29
Year 1       68  26  63  57       22  12  40  42        34  5  34  45          29  13  30  31    20  0  21  18         13  4  19  17
Year 2       64  12  63  66       22  12  40  42        42  2  21  57          33  17  33  30    21  6  20  21         20  3  14  23

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Code Different