'ValueError: grouper for xxx not 1-dimensional with pandas pivot_table()

I am working on olympics dataset and want to create another dataframe that has total number of athletes and total number of medals won by type for each country. Using following pivot_table gives me an error "ValueError: Grouper for 'ID' not 1-dimensional" pd.pivot_table(olymp, index='NOC', columns=['ID','Medal'], values=['ID','Medal'], aggfunc={'ID':pd.Series.nunique,'Medal':'count'}).sort_values(by='Medal')

Result should have one row for each country with columns for totalAthletes, gold, silver, bronze. Not sure how to go about it using pivot_table. I can do this using merge of crosstab but would like to use just one pivottable statement. Here is what original df looks like.

Olympics dataframe



Solution 1:[1]

Update

I would like to get the medal breakdown as well e.g. gold, silver, bronze. Also I need unique count of athlete id's so I use nunique since one athlete may participate in multiple events. Same with medal, ignoring NA values

IIUC:

out = df.pivot_table('ID', 'NOC', 'Medal', aggfunc='count', fill_value=0)
out['ID'] = df[df['Medal'].notna()].groupby('NOC')['ID'].nunique()

Output:

>>> out
Medal  Bronze  Gold  Silver   ID
NOC                             
AFG         2     0       0    1
AHO         0     0       1    1
ALG         8     5       4   14
ANZ         5    20       4   25
ARG        91    91      92  231
..        ...   ...     ...  ...
VIE         0     1       3    3
WIF         5     0       0    4
YUG        93   130     167  317
ZAM         1     0       1    2
ZIM         1    17       4   16

[149 rows x 4 columns]


Old answer

You can't have the same column for columns and values:

out = olymp.pivot_table(index='NOC', values=['ID','Medal'],
                        aggfunc={'ID':pd.Series.nunique, 'Medal':'count'}) \
           .sort_values('Medal', ascending=False)
print(out)

# Output
       ID  Medal
NOC             
USA  9653   5637
URS  2948   2503
GER  4872   2165
GBR  6281   2068
FRA  6170   1777
..    ...    ...
GAM    33      0
GBS    15      0
GEQ    26      0
PNG    61      0
LBA    68      0

[230 rows x 2 columns]

Another way to get the result above:

out = olym.groupby('NOC').agg({'ID': pd.Series.nunique, 'Medal': 'count'}) \
          .sort_values('Medal', ascending=False)
print(out)

# Output
       ID  Medal
NOC             
USA  9653   5637
URS  2948   2503
GER  4872   2165
GBR  6281   2068
FRA  6170   1777
..    ...    ...
GAM    33      0
GBS    15      0
GEQ    26      0
PNG    61      0
LBA    68      0

[230 rows x 2 columns]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1