Pandas: how to group rows by a dictionary of {row: group}

I have a dataframe with n rows:

1 2 3 
3 4 1
5 3 2
9 8 2
7 2 6
0 0 0
4 4 4
8 4 1
...

and a dictionary in which each key is a row index and each value is the group that row belongs to:

d = {0 : 0 , 1: 0, 2 : 0, 3 : 1, 4 : 1, 5: 2, 6: 2}

I want to group the rows according to this dictionary and then take the mean of each group, so that I get:

  3 3 2   #This is the mean of rows 0,1,2 from the original df, as d[0]=d[1]=d[2]=0
  8 5 4
  2 2 2
  8 4 1

What is the best way to do so?



Solution 1:[1]

Simply pass the dictionary to groupby. Each index value is replaced by the dictionary value for the matching key, and the rows are grouped on the result:

df.groupby(d).mean()

output:

       a    b    c
0.0  3.0  3.0  2.0
1.0  8.0  5.0  4.0
2.0  2.0  2.0  2.0
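
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The column names a, b, c are taken from the input shown at the end of this answer; the variable name means is just an illustrative choice.

```python
import pandas as pd

# Input frame from the question, with the column names used in this answer.
df = pd.DataFrame(
    {
        "a": [1, 3, 5, 9, 7, 0, 4, 8],
        "b": [2, 4, 3, 8, 2, 0, 4, 4],
        "c": [3, 1, 2, 2, 6, 0, 4, 1],
    }
)

# Row index -> group label. Row 7 deliberately has no entry.
d = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}

# Each index value is looked up in d; rows without a key are dropped by default.
means = df.groupby(d).mean()
print(means)
```

Row 7 does not appear in the result because it has no key in d and groupby drops missing group labels by default.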

If you also want to keep the rows whose index has no key in the dictionary, pass dropna=False to groupby. Those rows are collected in the NaN group:

df.groupby(d, dropna=False).mean()

output:

       a    b    c
0.0  3.0  3.0  2.0
1.0  8.0  5.0  4.0
2.0  2.0  2.0  2.0
NaN  8.0  4.0  1.0
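
To see where the NaN group comes from, it can help to build the group labels explicitly. The sketch below uses Index.map to do the lookup by hand, which I am assuming mirrors what groupby does with a dictionary; the names labels and result are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "a": [1, 3, 5, 9, 7, 0, 4, 8],
        "b": [2, 4, 3, 8, 2, 0, 4, 4],
        "c": [3, 1, 2, 2, 6, 0, 4, 1],
    }
)
d = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}

# Look up every index value in d; index 7 has no key, so its label is NaN.
labels = df.index.map(d)
print(labels)

# Grouping on the explicit labels, keeping the NaN label as its own group.
result = df.groupby(labels, dropna=False).mean()
print(result)
```

The label for row 7 is NaN, so with dropna=False that row ends up alone in the last group.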

To get a default range index instead of the group labels as the index:

df.groupby(d, dropna=False, as_index=False).mean()

output:

     a    b    c
0  3.0  3.0  2.0
1  8.0  5.0  4.0
2  2.0  2.0  2.0
3  8.0  4.0  1.0
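
An alternative sketch that should give the same shape of result is to drop the group labels after aggregating with reset_index(drop=True). This is a swapped-in technique, not the as_index=False call above; I am suggesting it only because it makes the intent explicit. The name result is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "a": [1, 3, 5, 9, 7, 0, 4, 8],
        "b": [2, 4, 3, 8, 2, 0, 4, 4],
        "c": [3, 1, 2, 2, 6, 0, 4, 1],
    }
)
d = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}

# Aggregate with the group labels as index, then discard them
# in favour of a fresh 0..n-1 range index.
result = df.groupby(d, dropna=False).mean().reset_index(drop=True)
print(result)
```

Note that the group labels are lost this way, so the row order is the only remaining link back to the groups.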

Input used:

   a  b  c
0  1  2  3
1  3  4  1
2  5  3  2
3  9  8  2
4  7  2  6
5  0  0  0
6  4  4  4
7  8  4  1

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1